everything out before using it.
This is not in response to any known bug, but rather just a
pre-emptive, defensive move to help prevent bugs in code that forgets
to initialize a field.
This commit was SVN r28343.
representation is not correctly optimized (it is off by the extent).
During the data representation process, if the opportunity to merge several
items appears, we replace them with the new merged element. However, if one
of the components of this merged element came from a "loop representation",
then the new first element of this loop must have its displacement moved by the
extent of the loop.
This commit was SVN r28319.
Provide some nice error messages if we fail to set the limits. Since the user had to specifically request we set the limit, treat failure as an error-out situation.
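
A minimal sketch of the error-out policy described above, assuming the limit is applied with POSIX getrlimit()/setrlimit(). The helper name set_limit_or_fail and the message wording are hypothetical and are not taken from the Open MPI source.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Hypothetical helper: apply a soft limit that the user explicitly asked for.
 * Returns 0 on success. On failure it prints a message and returns -1 so the
 * caller can abort instead of silently continuing. */
int set_limit_or_fail(int resource, const char *name, rlim_t value)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0) {
        fprintf(stderr, "cannot read %s limit: %s\n", name, strerror(errno));
        return -1;
    }
    rl.rlim_cur = value;  /* change only the soft limit */
    if (setrlimit(resource, &rl) != 0) {
        fprintf(stderr, "cannot set %s limit to %llu: %s\n",
                name, (unsigned long long) value, strerror(errno));
        return -1;
    }
    return 0;
}
```

A caller would treat a -1 return as fatal, since the limit was requested explicitly.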
This commit was SVN r28288.
Notes:
- This commit also eliminates the need for an available components list in use
in several frameworks. None of the code in question was making use of the
priority field of the priority component list item so these extra lists were
removed.
- Cleaned up selection code in several frameworks to sort lists using opal_list_sort.
- Cleans up the ompi/orte-info functions. Expose the functions that construct the
list of params so they can be used elsewhere.
patches for mtl/portals4 from brian
missed a few output variables in openib
This commit was SVN r28241.
Other changes:
- Added a flag to the MCA variable system to indicate a variable should go away
when its group does. Both mca_base_framework_var_register() and
mca_base_component_var_register() set this flag.
Notes:
- mca_base_components_open is deprecated. It will be removed in a future commit.
- All frameworks should use MCA_BASE_FRAMEWORK_DECLARE to declare their
framework structure.
- All calls to framework open/close functions should be changed to use the
mca_base_framework_* functions.
- Instead of special-casing installdirs a flag was added to prevent calling
into the variable system when opening a framework.
- Ralph: Clarify the functional definition of the "register" function in the
MCA framework object - it had the same name as another function that does a
totally different thing.
- As per discussion with Ralph the behavior of mca_base_framework_register()
is to always call mca_base_framework_components_register() if the framework's
register function was successful. This removed the need for frameworks to
have to call this function directly.
This commit was SVN r28237.
Features:
- Support for an override parameter file (openmpi-mca-param-override.conf).
Variable values in this file cannot be overridden by any file or environment
value.
- Support for boolean, unsigned, and unsigned long long variables.
- Support for true/false values.
- Support for enumerations on integer variables.
- Support for MPIT scope, verbosity, and binding.
- Support for command line source.
- Support for setting variable source via the environment using
OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
- Cleaner API.
- Support for variable groups (equivalent to MPIT categories).
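
As an illustration of the OMPI_MCA_SOURCE_<var name> feature above, the value is either "command" or "file:filename". The following is only a sketch of how such a value might be parsed. The enum and function names are hypothetical and are not the MCA variable system's API.

```c
#include <string.h>

/* Hypothetical source kinds; not the real MCA enum. */
typedef enum { SRC_INVALID, SRC_COMMAND, SRC_FILE } var_source_t;

/* Parse "command" or "file:<filename>". For SRC_FILE, *filename points into
 * the input string just past the "file:" prefix; otherwise it is NULL. */
var_source_t parse_var_source(const char *value, const char **filename)
{
    *filename = NULL;
    if (value == NULL) {
        return SRC_INVALID;
    }
    if (strcmp(value, "command") == 0) {
        return SRC_COMMAND;
    }
    if (strncmp(value, "file:", 5) == 0 && value[5] != '\0') {
        *filename = value + 5;
        return SRC_FILE;
    }
    return SRC_INVALID;
}
```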
Notes:
- Variables must be created with a backing store (char **, int *, or bool *)
that must live at least as long as the variable.
- Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE flag enables the use of
mca_base_var_set_value() to change the value.
- String values are duplicated when the variable is registered. It is up to
the caller to free the original value if necessary. The new value will be
freed by the mca_base_var system and must not be freed by the user.
- Variables with constant scope may not be settable.
- Variable groups (and all associated variables) are deregistered when the
component is closed or the component repository item is freed. This
prevents a segmentation fault from accessing a variable after its component
is unloaded.
- After some discussion we decided we should remove the automatic registration
of component priority variables. Few components actually made use of this
feature.
- The enumerator interface was updated to be general enough to handle
future uses of the interface.
- The code to generate ompi_info output has been moved into the MCA variable
system. See mca_base_var_dump().
opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system
This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.
This commit was SVN r28236.
binding. This fix was included in the upstream 1.6 series, but not
the upstream 1.5 series, and was therefore missed when we brought
1.5.2 to OMPI.
This commit was SVN r28212.
The following SVN revision numbers were found above:
r28040 --> open-mpi/ompi@3d44f97572
At the same time, fix a minor issue where the init hook was being called twice, once by the libc malloc and once by our malloc by removing the call from our malloc.
This commit was SVN r28202.
library to multiple libraries that are implicitly sucked into the executable
as a dependency of libmpi. The initialize hook isn't visible to libc on some
Linux distributions when it's in libopal and libopal isn't explicitly linked
into the executable. The fix is to have a duplicate initialize hook in
libmpi as well as libopal. *sigh*.
This commit was SVN r28164.
A few changes were required to support this move:
1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).
2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.
3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you separately specify the "store" and "fetch" ordering.
4. the ORTE grpcomm and ess/pmi components, and the nidmap code, were updated to work with the new db framework and to specify internal/global storage options.
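
The fetch ordering described in item 3 (internal storage first, then the global system) can be sketched as follows. This is a toy model under stated assumptions: the function-pointer type, the two toy stores, and db_fetch are hypothetical names, and the 64-bit key merely stands in for the opaque identifier mentioned in item 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a db component's fetch entry point: returns 0 and
 * sets *out when the key is known, -1 otherwise. */
typedef int (*fetch_fn_t)(uint64_t id, const char *key, int *out);

/* Toy "internal" store: only knows process id 1. */
int local_fetch(uint64_t id, const char *key, int *out)
{
    (void) key;
    if (id == 1) { *out = 10; return 0; }
    return -1;
}

/* Toy "global" store: knows process ids 1 and 2. */
int global_fetch(uint64_t id, const char *key, int *out)
{
    (void) key;
    if (id == 1 || id == 2) { *out = 20; return 0; }
    return -1;
}

/* Try each component in fetch order; the first one that has the data wins. */
int db_fetch(const fetch_fn_t *order, size_t n,
             uint64_t id, const char *key, int *out)
{
    for (size_t i = 0; i < n; i++) {
        if (order[i](id, key, out) == 0) {
            return 0;
        }
    }
    return -1;
}
```

With the order { local_fetch, global_fetch }, data already held locally never triggers a global lookup.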
No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.
This commit was SVN r28112.
* Clean up ${includedir} and ${libdir} for script wrapper compilers
* Update script wrapper compilers to work like the C wrapper compilers w.r.t static and dynamic linking
* Remove the ORTE script wrapper compilers since they didn't support the ${includedir} stuff and Ralph said they weren't used anymore.
This commit was SVN r28052.
processor_bind to see if we're bound to a single core.
If not, THEN check lgroup affinity. Already CMR'ed to
v1.6 (trac 3507) and fixed upstream in hwloc (r5295).
This commit was SVN r28040.
The following SVN revision numbers were found above:
r5295 --> open-mpi/ompi@6df8cb0f02
Leif would like to revamp the ARM support in a different way, and will
submit a patch to do so in the future.
This commit was SVN r27960.
The following SVN revision numbers were found above:
r27882 --> open-mpi/ompi@8649b5eece
flags, and mca flags are kept separate until the very end. The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro. MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components were
made very small to match the one use case where it makes sense.
This commit was SVN r27950.
actually care if opal_pointer_array is limited to handle_max already passes
that in as the max_size during init, so don't need it there. The arch
constant was a bit more difficult, so pass that in during MPI init and
leave empty otherwise.
This is to help with the effort to allow building ompi against an external
opal or orte.
This commit was SVN r27817.
it that the others did: move the "I won!" code up into the POST_CONFIG
macro. Also, fix a long-standing typo when restoring the $CPPFLAGS (!).
This commit was SVN r27813.
STOP_AT_FIRST. And move the side-effect-inducing code in
hwloc142/configure.m4 up to POST_CONFIG.
Also change the priority of the external hwloc component to 90 so that
it is evaluated before the internal component (as a direct result of
changing to STOP_AT_FIRST).
This commit was SVN r27796.
The following SVN revision numbers were found above:
r27794 --> open-mpi/ompi@569a60c2de
event framework to STOP_AT_FIRST, and then moves a bunch of
side-effect-inducing code in the libevent2019 configure.m4 up to
POST_CONFIG.
== More detail ==
Change the event framework from STOP_AT_FIRST_PRIORITY to
STOP_AT_FIRST. This means that only one component can win (vs.
STOP_AT_FIRST_PRIORITY, in which multiple components of the same
priority can all win).
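
The difference between the two modes can be modeled in C as below. The real selection happens in the m4 configure machinery, so this is only a sketch of the semantics. It assumes components are already sorted by descending priority, and every type and function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical component record for illustration only. */
typedef struct {
    const char *name;
    int priority;
    int usable;   /* nonzero if the component's configure check passed */
    int won;      /* set by select_winners() */
} comp_t;

typedef enum { STOP_AT_FIRST, STOP_AT_FIRST_PRIORITY } select_mode_t;

/* comps must be sorted by descending priority. Marks winners and returns how
 * many there are. */
size_t select_winners(comp_t *comps, size_t n, select_mode_t mode)
{
    size_t winners = 0;
    int win_prio = 0;

    for (size_t i = 0; i < n; i++) {
        comps[i].won = 0;
        if (!comps[i].usable) {
            continue;
        }
        if (winners == 0) {
            win_prio = comps[i].priority;      /* first usable component */
        } else if (mode == STOP_AT_FIRST || comps[i].priority != win_prio) {
            continue;                          /* later components lose */
        }
        comps[i].won = 1;
        winners++;
    }
    return winners;
}
```

Under STOP_AT_FIRST the result is always zero or one winner, which is the simplification described in the text.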
You still need to ensure that there are no side-effects from the
winner, however, so check for winning during POST_CONFIG, and set
things like the base_include there.
This simplifies the configury quite a bit -- you don't have to assume
that multiple components can win: zero or one components will win.
Also change the libevent 2019 priority to 50 so that some other
(developer-specific/local) component could win, if it wanted to.
This commit was SVN r27794.
party configure.in scripts to be configure.ac so that Automake stops
complaining about them.
This commit was SVN r27791.
The following SVN revision numbers were found above:
r27790 --> open-mpi/ompi@675a2f5c48
using the modex or RML to share sm initialization information, have node rank 0
create a file containing initialization information in a well-known place. Then
during add_procs, the rest of the node processes requiring sm BTL initialization
will just read from that file to complete their initialization.
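
A rough sketch of the write-then-read scheme, assuming a plain text file. The record fields, file format, and function names are hypothetical, the real sm BTL stores different information, and the synchronization needed to make readers wait until the file is complete is omitted here.

```c
#include <stdio.h>

/* Hypothetical init record; not the real sm BTL layout. */
typedef struct {
    int num_procs;
    unsigned long seg_size;
} sm_init_info_t;

/* Called by node rank 0: publish the init information in a file. */
int sm_write_init_file(const char *path, const sm_init_info_t *info)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }
    int ok = fprintf(fp, "%d %lu\n", info->num_procs, info->seg_size) > 0;
    if (fclose(fp) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Called by the other node-local processes: read the information back. */
int sm_read_init_file(const char *path, sm_init_info_t *info)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    int n = fscanf(fp, "%d %lu", &info->num_procs, &info->seg_size);
    fclose(fp);
    return n == 2 ? 0 : -1;
}
```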
This commit was SVN r27789.
Carns, change to use access(.., F_OK) instead of stat() to check for
the presence of files.
Also remove redundant check for FAKEROOTKEY, and update all comments
to match.
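
For reference, a presence check with POSIX access() looks roughly like this. The helper name file_exists is hypothetical. Unlike stat(), access(path, F_OK) only tests for existence and does not fill in a struct stat.

```c
#include <unistd.h>

/* Returns 1 if path exists as seen by the calling process, 0 otherwise. */
int file_exists(const char *path)
{
    return access(path, F_OK) == 0;
}
```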
This commit was SVN r27785.
config/ directory. We split them apart a while ago in the hopes that
it would simplify things, but it didn't really (e.g., because there
were still some ompi/opal .m4 files in the top-level config/
directory, resulting in developer confusion about where any given m4 macro
was defined).
So this commit consolidates them back into the top-level directory for
simplicity.
There are still (at least) two changes that would be nice to make:
1. Split any generated .m4 file (e.g., autogen-generated .m4 files)
into a separate directory somewhere so that a top-level -Iconfig/
will only get our explicitly defined macros, not the autogen stuff
(e.g., with libevent2019 needing to get the visibility macro, but
NOT all the autogen-generated inclusion of component configure.m4
files).
1. Change configure to be of the form:
{{{
# ...a small amount of preamble/setup...
OPAL_SETUP
m4_ifdef([project_orte], [ORTE_SETUP])
m4_ifdef([project_ompi], [OMPI_SETUP])
# ...a small amount of finishing stuff...
}}}
I doubt we'll ever get anything as clean as that, but that would be
the goal to shoot for.
This commit was SVN r27704.
additional functionality. Rationale (refs trac:3422):
* Normal MPI applications only ever use the MPI API. Hence, -lmpi is
sufficient (they'll never directly call ORTE or OPAL
functions). This is arguably the most common case.
* That being said, we do have some test programs (e.g., those in
orte/test/mpi) that call MPI functions but also call ORTE/OPAL
functions. I've also written the occasional MPI test program that
calls opal_output, for example (there even might be a few tests in
the IBM test suite that directly call ORTE/OPAL functions).
* Even though this is not a common case, these applications should
also compile/link with mpicc.
* So we should add a --openmpi:linkall option that will also link
in whatever is necessary to call ORTE/OPAL functions
* Yes, we could hard-code "-lopen-rte -lopen-pal" in Makefiles, but
we do reserve the right to change those library names and/or add
others someday, so it's better to abstract out the names and let
the wrapper supply whatever is necessary.
* ORTE programs, however, are different. They almost always call OPAL
functions (e.g., if they want to send a message, they must use the
OPAL DSS). As such, it seems like the ORTE programs should always
link in OPAL.
Therefore:
* Add undocumented --openmpi:linkall flag to the wrapper compilers.
See the comment in opal_wrapper.c for an explanation of what it
does. This flag is only intended for Open MPI developers -- not
end users. That's why it's undocumented.
* Update orte/test/mpi/Makefile.am to add --openmpi:linkall
* Make ortecc/ortec++'s wrapper data text files always explicitly
link in libopen-pal
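
The effect of the flag can be sketched as below. The flag string and the library names come from the text above; the function names are hypothetical, and the real wrappers take their library lists from the wrapper data files rather than from hard-coded strings.

```c
#include <string.h>

/* Scan the wrapper's arguments for the developer-only flag. */
int wants_linkall(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--openmpi:linkall") == 0) {
            return 1;
        }
    }
    return 0;
}

/* Illustrative only: with linkall, also name the ORTE and OPAL libraries. */
const char *mpi_link_libs(int linkall)
{
    return linkall ? "-lmpi -lopen-rte -lopen-pal" : "-lmpi";
}
```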
This commit was SVN r27670.
The following SVN revision numbers were found above:
r27668 --> open-mpi/ompi@cf845897aa
The following Trac tickets were found above:
Ticket 3422 --> https://svn.open-mpi.org/trac/ompi/ticket/3422
1. Restore libopen-pal.la, libopen-rte.la, and libmpi.la to be
separate entities (i.e., don't have libopen-rte.la include
libopen-pal.la, and don't have libmpi.la include libopen-pal.la).
Yay!
1. Consequently, make the wrapper compilers look for flags indicating
that the user wants to compile statically (currently: -static,
!--static, -Bstatic, and "-Wl," in front of all of those). If so,
follow a 6-way matrix for determining which libraries to
list on the underlying command line.
1. To support that, add the name of a token static and dynamic
library to look for in each of the wrapper compiler data files.
1. Fix a long-standing typo in the opalcc wrapper data file.
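
The static-flag detection in the second item can be sketched as follows. The flag list is the one given above; the function name is hypothetical, and the 6-way library matrix itself is not reproduced here.

```c
#include <string.h>

/* Returns 1 if arg is one of the static-link flags listed above, with or
 * without a leading "-Wl," prefix; 0 otherwise. */
int is_static_flag(const char *arg)
{
    static const char *flags[] = { "-static", "--static", "-Bstatic" };

    if (strncmp(arg, "-Wl,", 4) == 0) {
        arg += 4;  /* look at what follows the linker pass-through prefix */
    }
    for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
        if (strcmp(arg, flags[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```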
This commit was SVN r27662.
The old behavior of mca_base_param_deregister could cause the indices of other MCA parameters to change. This could potentially cause problems if an MCA user saves and later references an affected index.
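
One way to keep indices stable is to mark a slot as deregistered rather than removing it from the array. The sketch below shows that idea with a toy table. It is an assumption about the general technique, not a description of the actual mca_base_param implementation, and all names in it are hypothetical.

```c
#include <stddef.h>

#define MAX_PARAMS 8

typedef struct {
    const char *name;
    int registered;
} param_t;

typedef struct {
    param_t items[MAX_PARAMS];
    size_t count;
} param_table_t;

/* Append a parameter; returns its index, or -1 if the table is full. */
int param_register(param_table_t *t, const char *name)
{
    if (t->count >= MAX_PARAMS) {
        return -1;
    }
    t->items[t->count].name = name;
    t->items[t->count].registered = 1;
    return (int) t->count++;
}

/* Mark the slot unused instead of compacting the array, so the indices of
 * all other parameters stay valid. */
int param_deregister(param_table_t *t, int index)
{
    if (index < 0 || (size_t) index >= t->count || !t->items[index].registered) {
        return -1;
    }
    t->items[index].registered = 0;
    return 0;
}

/* Look up a name by a previously saved index; NULL if deregistered. */
const char *param_name(const param_table_t *t, int index)
{
    if (index < 0 || (size_t) index >= t->count || !t->items[index].registered) {
        return NULL;
    }
    return t->items[index].name;
}
```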
This commit was SVN r27633.
Reasoning: The old behavior was a little confusing. mca_base_components_open does not open an output stream, so it is a little unexpected that mca_base_components_close closes one. In addition, several frameworks (that don't use mca_base_components_close) failed to close their output in the framework close function, and others closed their output a second time. This change is an improvement to the semantics of mca_base_components_open/close, as they are now symmetric in their functionality.
This commit was SVN r27570.
pml/v:
- If vprotocol is not being used vprotocol_include_list is leaked. Assume vprotocol never takes ownership (see below) and always free the string.
coll/ml:
- (patch verified) calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
- Need to free mca_coll_ml_component.config_file_name on component close.
btl/openib:
- calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
vprotocol/base:
- There was no way for pml/v to determine if vprotocol took ownership of vprotocol_include_list. Fix by never taking ownership (always use strdup).
mca/base:
- param_lookup causes storage->stringval to be a newly allocated string if the MCA parameter has a string value. Ensure this string is always freed.
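
The leak pattern fixed in coll/ml and btl/openib can be illustrated with toy stand-ins. The functions below are NOT the real mca_base_param_reg_string or mca_base_param_lookup_string; they only mimic the property stated above, that each call hands the caller a newly allocated string.

```c
#include <stdlib.h>
#include <string.h>

/* Portable string copy; the caller owns the result. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

static const char *stored_value = "default";

/* Toy stand-ins: both return a fresh copy that the caller must free. */
int toy_reg_string(char **out)
{
    *out = dup_string(stored_value);
    return *out != NULL ? 0 : -1;
}

int toy_lookup_string(char **out)
{
    *out = dup_string(stored_value);
    return *out != NULL ? 0 : -1;
}

/* Correct pattern: register once and keep that copy. */
char *component_open(void)
{
    char *value = NULL;
    if (toy_reg_string(&value) != 0) {
        return NULL;
    }
    /* Calling toy_lookup_string(&value) here would overwrite the pointer
     * and leak the copy obtained from toy_reg_string(). */
    return value;
}

/* Free the stored copy when the component closes. */
void component_close(char *value)
{
    free(value);
}
```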
cmr:v1.7
This commit was SVN r27569.
It appears the problem was not with the command line parser but with the rsh plm. I don't know why this problem was not occurring before the command line parser changes, but it appears to be resolved now.
This commit was SVN r27527.
The following SVN revision numbers were found above:
r27451 --> open-mpi/ompi@d59034e6ef
r27456 --> open-mpi/ompi@ecdbf34937
Not sure what happened here, but the resulting trunk wouldn't even configure. After spending time fixing that problem, I found it wouldn't compile due to multiple syntax errors that had been introduced in both the OPAL and OMPI layer. This raised questions as to the completeness of the work.
Given that the author is departing, I pinged Jeff about it and we agreed to revert this for now. Hopefully, it can either be fixed by the author prior to actual departure, or someone else can pick it up (now that it is in the history) and fix it.
This commit was SVN r27511.
The following SVN revision numbers were found above:
r27508 --> open-mpi/ompi@12c3c743de
r27509 --> open-mpi/ompi@79e4a8ca38
r27510 --> open-mpi/ompi@1ad5ff625a