To enable the epochs and the resilient orte code, use the configure flag:
--enable-resilient-orte
This will define both:
ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE
This commit was SVN r25093.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.
Please note that configure requirements on components HAVE CHANGED. For example. a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.
This commit was SVN r23764.
* change comm_init API - no need to pass local rank groups, fca calculates that on its own.
* remove local rank list from module - libfca maintains that now.
* in fca_bcast and fca_reduce - pass root rank index and let libfca figure out the local rank index.
This commit was SVN r23716.
* fixup lookup of supported ops by name:
in ompi 1.5.x the op string representation were changed from MPI_XXX to MPI_OP_XXX (relative to OMPI 1.4.x)
* keep compat between diff versions of FCA
* better error handling (return error if symbol not found)
* register to opal_progress and call fca_progress API
This commit was SVN r23597.
case individual entries aren't used, but dynamic rules are enabled
(i.e., at least one or more of them are not NULL, meaning that they'll
all be assumed to be either NULL or a valid value).
This commit was SVN r23361.
#define CACHE_LINE_SIZE to 128. This name has a conflict on NetBSD,
and it seems kinda odd to have a header file that ''only'' defines a
single value. Also, we'll soon be raising hwloc to be a first-class
item, so having this file around seemed kinda weird.
Therefore, I replaced CACHE_LINE_SIZE with opal_cache_line_size, an
int (in opal/runtime/opal_init.c and opal/runtime/opal.h) on the
rationale that we can fill this in at runtime with hwloc info (trunk
and v1.5/beyond, only). The only place we ''needed'' a compile-time
CACHE_LINE_SIZE was in the BTL SM (for struct padding), so I made a
new BTL_SM_ preprocessor macro with the old CACHE_LINE_SIZE value
(128). That use isn't suitable for run-time hwloc information,
anyway.
This commit was SVN r23349.
Configure Option:
--enable-sysv
MCA Parameter:
mpi_common_sm
mpi_common_sm accepts a comma delimited list of: [sysv],mmap (order
dependent). The first component that is successfully selected is used. For
example, -mca mpi_common_sm sysv,mmap will first try sysv. If sysv is not
successfully selected, then mmap will be used. mmap will be used if
mpi_common_sm is not provided.
Notes:
Please make certain that your system's shmmax limit, or equivalent, is larger
than mpool_sm_min_size. Otherwise, shmget may fail.
This commit was SVN r23260.
make sure that we do not call coll_gather and coll_bcast in the very same
instances, since some collective (intra) modules do not seem to like the fact
if they are called for scount or rcount being zero (for regular
intra-communicator operations, this is handled on the MPI API layer).
Fixes trac:2405
This commit was SVN r23188.
The following Trac tickets were found above:
Ticket 2405 --> https://svn.open-mpi.org/trac/ompi/ticket/2405
(OMPI_ERR_* = OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a
SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns
back the native error code.
* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
(OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
decode 'ret' to get the native error code.
This commit was SVN r23162.
INTERNAL to EXTRA_RETAIN, because not all "internal" communicators
have this flag set (only internal communicators with CIDs less than
their parent). Hence, what this flag ''really'' means is that there
was an extra RETAIN performed on it. So name the flag just that --
EXTRA_RETAIN -- indicating that an extra RETAIN has occurred.
This commit was SVN r22690.
The following SVN revision numbers were found above:
r22671 --> open-mpi/ompi@61dee816db
communicator that we created has a lower CID than the parent comm. This can
happen when using the hierarch collective communication module or for
inter-communicators (since we make a duplicate of the original communicator).
This is not a problem as long as the user calls MPI_Comm_free on the parent
communicator. However, if the communicators are not freed by the user but
released by Open MPI in MPI_Finalize, we walk through the list of still
available communicators and free them one by one. Thus, local_comm is freed
before the actual inter-communicator. However, the local_comm pointer in the
inter communicator will still contain the 'previous' address of the local_comm
and thus this will lead to a segmentation violation. In order to prevent that
from happening, we increase the reference counter local_comm by one if its CID
is lower than the parent. We cannot increase however its reference counter if
the CID of local_comm is larger than the CID of the inter communicators, since
a regular MPI_Comm_free would leave in that the case the local_comm hanging
around and thus we would not recycle CID's properly, which was the reason and
the cause for this trouble.
This commit fixes tickets 2094 and 2166. Note however, that I want to close
them manually, since a slightly different patch is required for the 1.4
series. This commit will have to be applied for the 1.5 series. And I will
need a volunteer to review it.
This commit was SVN r22671.
other process should ignore this value. Thanks to Michael Hofmann
for investigating this issue.
This commit closes trac:2268.
This commit was SVN r22639.
The following Trac tickets were found above:
Ticket 2268 --> https://svn.open-mpi.org/trac/ompi/ticket/2268