This reverts commit open-mpi/ompi@f7257a8310.
Ensure that we properly cleanup the session directory tree. Prior code had issues with symlinks, especially if the file that the link points to was already removed as we traverse the tree. Also found that the dirent checks for directory type weren't fully portable, and so fall back to the stat-based approach which is known to be portable.
Fix singularity singletons by detecting we are in a container and properly setting the pmix selection to pick the isolated component. Remove a stale restriction blocking use of the sm btl
Or at least that was the origin of the issue. It turns out
we were freeing the wrong buffer (but as it only happen in the
case of an error we never noticed).
This patch addresses most (if not all) @derbeyn concerns
expressed on #1015. I added checks for the requests allocation
in all functions, ompi_coll_base_free_reqs is called with the
right number of requests, I removed the unnecessary basic_module_comm_t
and use the base_module_comm_t instead, I remove all uses of the
COLL_BASE_BCAST_USE_BLOCKING define, and other minor fixes.
This commit brings the scif btl up to date with changes made on master
to rework the mpool and rcache frameworks.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit uses the new flag "enumerator" to support comma-delimited
lists of flags for both the btl and btl atomic flags. After this
commit is is valid to specify something like -mca btl_foo_flags
self,put,get,in-place. All non-deprecated flags are supported by the
enumerator.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds a new type of enumerator meant to support flag
values. The enumerator parses comma-delimited strings and matches
each string or value to a list of valid flags. Additionally, the
enumerator does some basic checks to see if 1) a flag is valid in the
enumerator, and 2) if any conflicting flags are specified.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds the data necessesary for supporting dynamic add_procs
to the rdma message (opal_process_name_t). The endpoint lookup
function has been updated to match the code in udcm.
Closesopen-mpi/ompi#1468.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
A bus error occurs in sm OSC under the following conditions.
- sparc64 or any other architectures which need strict alignment.
- `MPI_WIN_POST` or `MPI_WIN_START` is called for a window created
by sm OSC.
- The communicator size is odd and greater than 3.
The lines 283-285 in current `ompi/mca/osc/sm/osc_sm_component.c` has
the following code.
```c
module->global_state = (ompi_osc_sm_global_state_t *) (module->segment_base);
module->node_states = (ompi_osc_sm_node_state_t *) (module->global_state + 1);
module->posts[0] = (uint64_t *) (module->node_states + comm_size);
```
The size of `ompi_osc_sm_node_state_t` is multiples of 4 but not
multiples of 8. So if `comm_size` is odd, `module->posts[0]` does
not aligned to 8. This causes a bus error when accessing
`module->posts[i][j]`.
This patch fixes the alignment of `module->posts[0]` by setting
`module->posts[0]` first.
Fix CID 1345825 (1 of 1): Dereference before null check (REVERSE_INULL):
ib_proc should not be NULL in this case. Removed the check and added a
check for NULL after OBJ_NEW.
CID 1269821 (1 of 1): Dereference null return value (NULL_RETURNS):
I labeled this one as a false positive (which it is) but the code in
question could stand be be cleaned up.
Fix CID 1356424 (1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):
While trying to silence another Coverity issue another was
flagged. Protect the close of fd with if (fd >= 0).
CID 70772 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70773 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70774 (1 of 1): Dereference null return value (NULL_RETURNS):
None of these are errors and are intentional but now that we have a
list release function use that to make these go away. The cleanup is
similar to CID 1269821.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>