1. Fix: old v1.6-era code reset the stats-emitting event to fire twice
for each time period.
1. Add the usNIC device name to the output for differentiating the
output in multi-rail scenarios.
This commit revives the component retention functionality that was
removed as part of the component repository rewrite. The new
mca_base_component_repository_retain_component function works by
preventing the dlclosing of a dynamic component until a matching call
to mca_base_component_repository_release is made.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
to continue current default behavior.
Also add an MCA param pmix_base_collect_data to direct that the blocking fence shall return all data to each process. Obviously, this param has no effect if async_
modex is used.
Move the fake usnic IBV provider out of common/verbs and into a new
common/verbs_usnic component that is always statically linked into
libopen-pal. The fake provider is registered with libibverbs at run
time, but there is no *un*register IBV API. Hence, we can't let the
code containing this provider be dlclosed -- which means it needs to
be statically linked into libopen-pal.
Fixesopen-mpi/ompi#1060.
if CFLAGS and/or CPPFLAGS are passed to the ompi configure command line, pmix1xx
configure will not use the correct ones previously passed in the environment
see discussion started at http://www.open-mpi.org/community/lists/devel/2015/10/18159.php
Thanks Siegmar Gross for bringing this to our attention
This commit fixes the following bugs:
- On send failure release newly allocated message.
- In the destructor for udcm_message_sent_t always remove the send
timeout event from the event base. Failure to do this can lead to
memory corruption since the destructor may be called from an event
callback.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This was fixed on my btl 3.0 branch but the changeset got lost in a
rebase. Fixes issues with lock ups when using osc/rdma.
References open-mpi/ompi#1010
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The mca_base_select function uses returned priorities to select the
best component/module. This priority may be of use to the caller so
pass that information back in an optional argument. If the priority is
not needed pass NULL.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The handling of RDMA get alignment in ugni BTL for Aries
(cray xc) was wrong, resulting in very poor bandwidth
for ugni BTL on aries.
Verified using osu_bw now gives sensible bandwidth on
Aries.
Fixes#1005
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This commit removes the service and async event threads from the
openib btl. Both threads are replaced by opal progress thread
support. The run_in_main function is now supported by allocating an
event and adding it to the sync event base. This ensures that the
requested function is called as part of opal_progress.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds a access_flags argument to the mpool registration
function. This flag indicates what kind of access is being requested:
local write, remote read, remote write, and remote atomic. The values
of the registration access flags in the btl are tied to the new flags
in the mpool. All mpools have been updated to include the new argument
but only the grdma and udreg mpools have been updated to make use of
the access flags. In both mpools existing registrations are checked
for sufficient access before being returned. If a registration does
not contain sufficient access it is marked as invalid and a new
registration is generated.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>