OPAL free lists can be initialized with a fragment size that differs
from the size of objects from a class. This allows the free list code
to support OPAL objects that have flexible array members.
Unfortunately the free list code will throw out the desired length in
some cases. The code in question was committed in
open-mpi/ompi@90fb58de. The side effects of this are varied and can
cause segmentation faults, assert failures, hangs, etc. This commit
adds a check to ensure the requested size is at least as large as the
class size and makes opal_free_list allocations always honor the
requested fragment size (as long as it is larger than the class
size).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
1. Fix: old v1.6-era code reset the stats-emitting event to fire twice
for each time period.
1. Add the usNIC device name to the output for differentiating the
output in multi-rail scenarios.
This commit revives the component retention functionality that was
removed as part of the component repository rewrite. The new
mca_base_component_repository_retain_component function works by
preventing the dlclosing of a dynamic component until a matching call
to mca_base_component_repository_release is made.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
to continue current default behavior.
Also add an MCA param pmix_base_collect_data to direct that the blocking fence shall return all data to each process. Obviously, this param has no effect if async_
modex is used.
Move the fake usnic IBV provider out of common/verbs and into a new
common/verbs_usnic component that is always statically linked into
libopen-pal. The fake provider is registered with libibverbs at run
time, but there is no *un*register IBV API. Hence, we can't let the
code containing this provider be dlclosed -- which means it needs to
be statically linked into libopen-pal.
Fixesopen-mpi/ompi#1060.
if CFLAGS and/or CPPFLAGS are passed to the ompi configure command line, pmix1xx
configure will not use the correct ones previously passed in the environment
see discussion started at http://www.open-mpi.org/community/lists/devel/2015/10/18159.php
Thanks Siegmar Gross for bringing this to our attention
This commit fixes the following bugs:
- On send failure release newly allocated message.
- In the destructor for udcm_message_sent_t always remove the send
timeout event from the event base. Failure to do this can lead to
memory corruption since the destructor may be called from an event
callback.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This was fixed on my btl 3.0 branch but the changeset got lost in a
rebase. Fixes issues with lock ups when using osc/rdma.
References open-mpi/ompi#1010
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The mca_base_select function uses returned priorities to select the
best component/module. This priority may be of use to the caller so
pass that information back in an optional argument. If the priority is
not needed pass NULL.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>