Per open-mpi/ompi#381, convert the specific intialization of opal_pmix
to use the generic "= { 0 }" initializer. This form can be used to
initialize any type when the intent is just to zero out / assign
*some* value.
Embedding libfabric is a temporary measure; I'm removing some warning
notifications so that the output isn't so cluttered (we're getting
the real warnings fixed upstream, but the OMPI community doesn't
really care/need to see the warnings in the meantime).
This commit adds an MCA variable to select Portals4 logical
addressing, populates the logical-to-physical mapping table and
initializes the NI in this mode.
For static builds, we need to also set
<framework>_<component>_WRAPPER_EXTRA_LIBS so that the wrappers know
what other libraries to add to link executables.
Ensure to count *this* process when checking for how many VFs we need
on the local server.
(cherry picked from commit 386c01934e98cb8dcb48ff648ecdfb0c8677baa9)
There was a mismatch between the structure for mca_rcache_vma_t and
the OBJ_CLASS_INSTANCE. One was opal_list_item_t and the other was
ompi_free_list_item_t. The super class in the structure looks like it
is the correct one. Changed the superclass in OBJ_CLASS_INSTANCE to
match.
If there are not enough resources (e.g., low VFs), we can end up
calling finalize_one_channel() on the same channel multiple times. So
ensure to NULL out fields that we have freed already so that we do not
try to free them a second time.
Fixes CSCus26648.
Fix the ordering so that we obtain the usnic netmask information
*before* we do the filtering based on CIDR-specified networks.
Also requires upstream Github libfabric commit 3976745.
Fixes CSCus22495.
We had several problems in the old code:
1. We were specifying an arbitrary timeout (100 ms) and then abandoning
all remaining pending AV insert operations. We would then free the
endpoint buffer that we gave to fi_av_insert(), usually causing
libfabric's progress thread to write to a freed buffer.
2. We were claiming in a show_help message that the timeout was
controllable via an MCA parameter. This commit removes that
parameter, since there's no good method for us to specify a timeout
like this to libfabric right now.
3. We also weren't waiting for the correct number of fi_av_insert()
operations to complete. We were waiting for nprocs, which is
accidentally fine for 2 procs on separate hosts, but not for most
other proc counts.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>