As documented in #4563 and #3697, there is an issue on ARM and
POWER platforms when the atomic fifo assembly isn't inlined,
which manifests as a hang. Document the issue and the
work-around until a proper fix is committed.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
If we detect that someone has given us an incorrect node name, provide a helpful message telling them so, as it is almost certainly a typo.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
This commit moves the backing files to /dev/shm to avoid limitations
that may be set on /tmp. The files are registered with pmix to ensure
they are cleaned up after an erroneous exit.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 48101278160672317ade352365592f56ef3b8977)
If available, have apps use the registration capability to clean up their session directories. Set up the capability for vader to register its shared memory file location, but leave the actual registration to someone familiar with that code.
Final cleanup to track uid/gid, and update the opal/pmix API to pass flags for "ignore" and "leave top directory alone".
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Somehow, the code for passing a daemon's parent was accidentally removed, thus breaking the tree-spawn callback sequence and causing all daemons to phone directly home. Note that this is noticeably slower than no-tree-spawn for small clusters where direct ssh launch of the child daemons from the HNP doesn't overload the available file descriptors.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
'-' is neither an alpha character nor a digit, but it is a valid hostname
character and should be handled as an alpha character. Otherwise, nodes
such as node-001 do not get "compressed" in the regex.
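The effect of treating '-' as an alpha character can be sketched as follows. This is a minimal illustration, not the actual regex code; is_prefix_char, hostname_prefix_len, and splits_into_prefix_and_digits are hypothetical names, and the sketch assumes compression needs a name to split into a non-numeric prefix followed by digits only.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: characters allowed in the non-numeric prefix.
 * '-' is accepted alongside letters. */
int is_prefix_char(char c)
{
    return isalpha((unsigned char)c) || c == '-';
}

/* Length of the leading run of prefix characters in name. */
size_t hostname_prefix_len(const char *name)
{
    size_t i = 0;
    while (name[i] != '\0' && is_prefix_char(name[i])) {
        i++;
    }
    return i;
}

/* Returns 1 if name is a prefix followed by one or more digits and
 * nothing else, which is the shape assumed to be compressible here. */
int splits_into_prefix_and_digits(const char *name)
{
    size_t i = hostname_prefix_len(name);
    if (name[i] == '\0') {
        return 0; /* no numeric suffix at all */
    }
    for (; name[i] != '\0'; i++) {
        if (!isdigit((unsigned char)name[i])) {
            return 0;
        }
    }
    return 1;
}
```

With '-' accepted, node-001 splits into the prefix "node-" and the digits "001". If is_prefix_char accepted only letters, the prefix would end at "node" and the remainder "-001" would not be all digits.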
Refs open-mpi/ompi#4621
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Pull in changes from the v2.0.x, v2.x, and v3.0.x release branches
so that master includes all items from published releases.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
It is possible to have parts of an in-use registered region be passed
to munmap or madvise. This does not necessarily mean the user has made
an error but does mean the entire region should be invalidated. This
commit checks that the munmap or madvise base matches the beginning of
the cached region. If it does and the region is in-use then we print
an error. There will certainly be false negatives where a user
unmaps something that really is in use, but that is preferable to a
false positive.
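The decision described above can be sketched as a small classifier. This is an illustration under assumptions, not the actual registration cache code: struct cached_region, enum release_action, and classify_release are hypothetical names, and addresses are modelled as plain integers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached registration, not the real cache structure. */
struct cached_region {
    uintptr_t base;   /* start address of the registered region */
    size_t    len;    /* length in bytes */
    bool      in_use; /* true while the registration is referenced */
};

enum release_action {
    RELEASE_NO_OVERLAP,  /* released range does not touch the region */
    RELEASE_INVALIDATE,  /* overlap: invalidate the whole region quietly */
    RELEASE_REPORT_ERROR /* overlap at the region base while in use */
};

/* Classify a munmap/madvise of [base, base + len) against one region. */
enum release_action classify_release(const struct cached_region *r,
                                     uintptr_t base, size_t len)
{
    uintptr_t r_end = r->base + r->len;
    uintptr_t end = base + len;

    if (len == 0 || r->len == 0 || end <= r->base || base >= r_end) {
        return RELEASE_NO_OVERLAP;
    }
    /* Only a release that starts exactly at the region base is reported. */
    if (r->in_use && base == r->base) {
        return RELEASE_REPORT_ERROR;
    }
    return RELEASE_INVALIDATE;
}
```

A release that starts in the middle of an in-use region yields RELEASE_INVALIDATE rather than an error, which corresponds to the accepted false-negative case.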
References #4509
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit updates the configure code for Open MPI to check for C11
support. The features requested are atomics and thread-local
storage.
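A probe for these two features could look roughly like the following. This is a sketch under assumptions, not the actual configure test; c11_probe and the variable names are hypothetical. It relies only on the standard stdatomic.h header and the _Thread_local keyword from C11.

```c
#include <stdatomic.h>

/* One copy per thread, using C11 thread-local storage. */
static _Thread_local int tls_value = 0;

/* Shared counter using C11 atomics. */
static atomic_int shared_value = 0;

/* Exercises both features; compiles only with a C11-capable compiler. */
int c11_probe(void)
{
    tls_value = 1;
    atomic_store(&shared_value, 1);
    atomic_fetch_add(&shared_value, 1);
    return tls_value + atomic_load(&shared_value);
}
```

If this translation unit compiles and links, the compiler accepts both the atomics and the thread-local storage syntax.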
References #3879
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
There were multiple paths that could lead to a fast box
allocation. One of them made little sense (in-place send) so it has
been removed to allow a rework of the fast-box send function. This
should fix a number of issues with hanging/crashing when using the
vader btl.
References #4260
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The OMPI_FORTRAN_USEMPIF08_MOD macro was removed in open-mpi/ompi@791bcee6c0,
so this macro is now manually expanded to mpi/fortran/use-mpi-f08/mod.
Thanks to Nathan T. Weeks for reporting.
Refs open-mpi/ompi#3605
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Set the key of all mpool_tree_item objects so they can be retrieved
in mpool_base_free and then returned to the
mca_mpool_base_tree_item_free_list free list.
Refs. open-mpi/ompi#4567
Thanks Philip Blakely for the bug report.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Fix ability to build examples from 32-bit builds. Remove the implicit
rule usage, so that we know what flags are being used. Make the override
of the FLAGS variables additive so that we don't wipe out FLAGS variables
set in the environment.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
This commit removes eax and edx from the clobber list. Older versions
of gcc tolerated this, but gcc 7 does not. The clobbers are not required
because eax and edx are already specified in output constraints.
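The constraint pattern can be illustrated with a short sketch. The commit text does not say which instruction is involved, so rdtsc is used here purely as an assumed example, and read_cycle_counter_sketch is a hypothetical name. GCC requires that clobbers do not overlap input or output operands, so registers bound by "=a" and "=d" are left out of the clobber list.

```c
#include <stdint.h>

/* Reads the x86 time-stamp counter; returns 0 on other platforms. */
uint64_t read_cycle_counter_sketch(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t lo, hi;
    /* eax and edx are named by the "=a" and "=d" output constraints,
     * so they are not repeated in the clobber list. */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0; /* placeholder: no inline assembly on this platform */
#endif
}
```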
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>