This commit adds code to detect when procs are unreachable when using
the dynamic add_procs functionality.
Fixes#1501
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
* Replace all tabs with spaces
* Remove extraneous extra blank line at the end (in some cases, we
were getting *2* blank lines at the end)
* Use `echo " "` instead of `echo` (which may not be portable)
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The reference counting was broken which led PMIx_Finalize
to release resources early. This fixes the "use after free" scenarios
that I encountered.
(based on commit pmix/master@abfaa4c)
This reverts commit open-mpi/ompi@f7257a8310.
Ensure that we properly cleanup the session directory tree. Prior code had issues with symlinks, especially if the file that the link points to was already removed as we traverse the tree. Also found that the dirent checks for directory type weren't fully portable, and so fall back to the stat-based approach which is known to be portable.
Fix singularity singletons by detecting we are in a container and properly setting the pmix selection to pick the isolated component. Remove a stale restriction blocking use of the sm btl
Or at least that was the origin of the issue. It turns out
we were freeing the wrong buffer (but as it only happen in the
case of an error we never noticed).
This patch addresses most (if not all) @derbeyn concerns
expressed on #1015. I added checks for the requests allocation
in all functions, ompi_coll_base_free_reqs is called with the
right number of requests, I removed the unnecessary basic_module_comm_t
and use the base_module_comm_t instead, I remove all uses of the
COLL_BASE_BCAST_USE_BLOCKING define, and other minor fixes.
This commit brings the scif btl up to date with changes made on master
to rework the mpool and rcache frameworks.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Turns out there are some cases where the Cray
wlm_detect_get_active may return NULL, in which
case fallback to wlm_detect_get_default method
is suggested. Make use of the fallback to
avoid segfaults under some circumstances in the
ALPS plm selection method.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This commit uses the new flag "enumerator" to support comma-delimited
lists of flags for both the btl and btl atomic flags. After this
commit is is valid to specify something like -mca btl_foo_flags
self,put,get,in-place. All non-deprecated flags are supported by the
enumerator.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds a new type of enumerator meant to support flag
values. The enumerator parses comma-delimited strings and matches
each string or value to a list of valid flags. Additionally, the
enumerator does some basic checks to see if 1) a flag is valid in the
enumerator, and 2) if any conflicting flags are specified.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds the data necessesary for supporting dynamic add_procs
to the rdma message (opal_process_name_t). The endpoint lookup
function has been updated to match the code in udcm.
Closesopen-mpi/ompi#1468.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>