This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.
Please note that configure requirements on components HAVE CHANGED. For example. a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.
This commit was SVN r23764.
NOTE: mmap is still the default.
Some highlights:
o Silent component failover.
o The sysv component will only be queried for selection if it is placed before
the mmap component (for example, -mca mpi_common_sm sysv,posix,mmap). In the
default case, sysv will never be queried/selected.
o Per some on-list discussion, now unlinking mmaped file in both mmap and posix
components (see: "System V Shared Memory for Open MPI: Request for Community
Input and Testing" thread).
o Assuming local process homogeneity with respect to all utilized shared
memory facilities. That is, if one local process deems a particular shared
memory facility acceptable, then ALL local processes should be able to
utilize that facility. As it stands, this is an important point because one
process dictates to all other local processes which common sm component will
be selected based on its own, local run-time test.
o Addressed some of George's code reuse concerns.
This commit was SVN r23633.
Configure Option:
--enable-sysv
MCA Parameter:
mpi_common_sm
mpi_common_sm accepts a comma delimited list of: [sysv],mmap (order
dependent). The first component that is successfully selected is used. For
example, -mca mpi_common_sm sysv,mmap will first try sysv. If sysv is not
successfully selected, then mmap will be used. mmap will be used if
mpi_common_sm is not provided.
Notes:
Please make certain that your system's shmmax limit, or equivalent, is larger
than mpool_sm_min_size. Otherwise, shmget may fail.
This commit was SVN r23260.
(OMPI_ERR_* = OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a
SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns
back the native error code.
* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
(OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
decode 'ret' to get the native error code.
This commit was SVN r23162.
If file does not exist, check the directory it lives in...
Maybe used by caller, trying to open mmap() on NFS, Lustre or
Panasas (thanks Sam).
For now, this is used to warn about the usage of mmap on such FS.
Please note, that Ralph mentioned the orte_no_session_dir parameter.
The help message includes a reference to this.
Tested on NFS and Lustre on Linux on
smoky: mpirun --mca orte_tmpdir_base $HOME/tmp -np 2 ./mpi_stub
jaguar: mpirun ... --mca orte_tmpdir_base /tmp/work/$USER ...
Fixes trac:1354
This should cmr:v1.5 once it has soaked and is shown to work on
Solaris
This commit was SVN r22604.
The following Trac tickets were found above:
Ticket 1354 --> https://svn.open-mpi.org/trac/ompi/ticket/1354
This commit does a bunch of things:
* Address all remaining code review items from CMR #2023:
* Defer mmap setup to be lazy; only set it up the first time we
invoke a collective. In this way, we don't penalize apps that
make lots of communicators but don't invoke collectives on them
(per #2027).
* Remove the extra assignments of mca_coll_sm_one (fixing a
convertor count setup that was the real problem).
* Remove another extra/unnecessary assignment.
* Increase libevent polling frequency when using the RML to
bootstrap mmap'ed memory.
* Fix a minor procs-related memory leak in btl_sm.
* Commit a datatype fix that George and I discovered along the way to
fixing the coll sm.
* Improve error messages when mmap fails, potentially trying to
de-alloc any allocated memory when that happens.
* Fix a previously-unnoticed confusion between extent and true_extent
in coll sm reduce.
This commit was SVN r22049.
The following Trac tickets were found above:
Ticket 2023 --> https://svn.open-mpi.org/trac/ompi/ticket/2023