The fi_fabric function appears to free the provider string passed in
in the fabric_attr. This causes MCA to free an invalid pointer when
the parameter is freed.
References #374
Commit open-mpi/ompi@1a3597aam changed the type of the `convertor`
variable from `ompi_osc_base_convertor_t` (which contained an
`opal_convertor_t`) to an `opal_convertor_t`. Hence, using memchecker
to ensure that the inner convertor of the `ompi_osc_base_convertor_t`
is considered initialized is now unnecessary.
Background: In order to support atomics, each btl needs to provide support
for communicating with self unless the btl module can guarantee global
atomicity. Before this commit, bml/r2 discarded any btl with lower
exclusivity than an existing send btl. For loopback, this caused the BML
to discard every btl other than self.
The new behavior is as follows:
- If an existing send btl has higher exclusivity, then the btl will not be
added to the send btl list for the endpoint.
- If a btl provides RDMA support then it is always added to the rdma btl
list.
- bml_btl weight for send btls is now calculated across all send btls.
- bml_btl weight for rdma btls is now calculated across all rdma btls.
With this change self should still win as the only send btl for loopback
without disqualifying other btls (ugni, openib) for atomic operations.
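The list rules above can be sketched roughly as follows. This is a simplified illustration, not the bml/r2 code: `sketch_btl_t`, `sketch_endpoint_t`, and `sketch_add_btl` are hypothetical names, the weight calculation is left out, and the sketch does not model removing a btl that is already on the send list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real bml/r2 types. */
typedef struct {
    const char *name;
    int exclusivity;     /* higher value wins for the send list */
    bool supports_rdma;
} sketch_btl_t;

#define SKETCH_MAX_BTLS 8

typedef struct {
    const sketch_btl_t *send[SKETCH_MAX_BTLS];
    size_t send_count;
    const sketch_btl_t *rdma[SKETCH_MAX_BTLS];
    size_t rdma_count;
} sketch_endpoint_t;

/* Offer one btl to an endpoint, following the rules listed above. */
static void sketch_add_btl(sketch_endpoint_t *ep, const sketch_btl_t *btl)
{
    bool outranked = false;

    /* Rule 1: an existing send btl with higher exclusivity keeps this
     * btl off the send list. */
    for (size_t i = 0; i < ep->send_count; ++i) {
        if (ep->send[i]->exclusivity > btl->exclusivity) {
            outranked = true;
            break;
        }
    }
    if (!outranked && ep->send_count < SKETCH_MAX_BTLS) {
        ep->send[ep->send_count++] = btl;
    }

    /* Rule 2: RDMA-capable btls always join the rdma list, regardless
     * of what happened on the send list. */
    if (btl->supports_rdma && ep->rdma_count < SKETCH_MAX_BTLS) {
        ep->rdma[ep->rdma_count++] = btl;
    }
}
```

With made-up exclusivity values, offering a "self" btl first and an RDMA-capable btl with lower exclusivity second leaves self as the only send btl while the second btl still lands on the rdma list.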
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
A little background. Historically ob1 always registered the entire memory
region when the RGET protocol was in use. This changed when Mellanox
added support to fragment RGET using the btl_prepare_dst function. Now
that the BTL layer has changed to split out the limits of get/put there
is explicit fragmentation code in ob1. Before this commit the registration
was still done per RGET fragment.
This commit will attempt to register the entire region before creating
RGET fragments. If the registration is successful, then all RGET
fragments will use this registration; otherwise they will each attempt
to register their own segment of the receive buffer. If that fails
enough times, each fragment will give up and fall back on send/recv.
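The fallback order described above might look something like this. It is a minimal sketch under stated assumptions, not the ob1 implementation: the callback type `sketch_register_fn`, the `max_attempts` limit, and every `sketch_` name are hypothetical, and the real code works on BTL registration handles rather than booleans.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical registration hook: returns true if [offset, offset + length)
 * could be registered. Stands in for the real BTL registration call. */
typedef bool (*sketch_register_fn)(size_t offset, size_t length, void *ctx);

typedef enum {
    SKETCH_FRAG_SHARED_REG, /* reuses the whole-region registration */
    SKETCH_FRAG_OWN_REG,    /* registered only its own segment */
    SKETCH_FRAG_SEND_RECV   /* gave up and fell back on send/recv */
} sketch_frag_mode_t;

/* Decide how each RGET fragment obtains its registration.
 * Returns the number of fragments planned, or 0 on invalid arguments. */
static size_t sketch_plan_rget(size_t total, size_t frag_size, int max_attempts,
                               sketch_register_fn reg, void *ctx,
                               sketch_frag_mode_t *modes, size_t max_frags)
{
    if (0 == total || 0 == frag_size || NULL == reg || NULL == modes) {
        return 0;
    }
    size_t nfrags = (total + frag_size - 1) / frag_size;
    if (nfrags > max_frags) {
        return 0;
    }

    /* First try to register the entire region once. */
    bool whole = reg(0, total, ctx);

    for (size_t i = 0; i < nfrags; ++i) {
        if (whole) {
            modes[i] = SKETCH_FRAG_SHARED_REG;
            continue;
        }
        size_t offset = i * frag_size;
        size_t length = (total - offset < frag_size) ? total - offset : frag_size;

        /* Otherwise each fragment tries its own segment a bounded number
         * of times before falling back on send/recv. */
        modes[i] = SKETCH_FRAG_SEND_RECV;
        for (int attempt = 0; attempt < max_attempts; ++attempt) {
            if (reg(offset, length, ctx)) {
                modes[i] = SKETCH_FRAG_OWN_REG;
                break;
            }
        }
    }
    return nfrags;
}

/* Example hook for experimentation: succeeds only when the requested
 * length is at most the limit pointed to by ctx. */
static bool sketch_reg_up_to_limit(size_t offset, size_t length, void *ctx)
{
    (void) offset;
    return length <= *(const size_t *) ctx;
}
```

The example hook makes it easy to exercise all three outcomes by varying the limit relative to the region and fragment sizes.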
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Coverity identified that we treated the possibility that one of the
message buffers could be NULL in some places (because strdup() could
fail), but not in others.
So just use stack buffers that will never be NULL.
This was CID 1269914.
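As an illustration of the pattern only (the buffer size and the `sketch_copy_msg` helper are hypothetical, not taken from the changed file), copying into a caller-owned stack buffer removes the NULL case that strdup() introduces:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Copy src into a caller-provided buffer, truncating if necessary.
 * The result is always NUL-terminated when len > 0.
 * Returns true if the whole string fit, false if it was truncated. */
static bool sketch_copy_msg(char *dst, size_t len, const char *src)
{
    int n = snprintf(dst, len, "%s", src);
    return n >= 0 && (size_t) n < len;
}

/* Typical use: the buffer lives on the stack, so there is no allocation
 * that could fail and no pointer that could be NULL:
 *
 *     char msg[64];
 *     sketch_copy_msg(msg, sizeof(msg), "some text");
 */
```

The trade-off is a fixed maximum length, so overly long messages are truncated rather than allocated in full.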
This is a bit of overkill, but I'm cleaning out a bunch of other
libltdl-support assumptions, so I might as well do this one, too. The
test isn't built if we don't have libltdl support, but it had
half-hearted #if protection in it to make it safe to build even if we
didn't have libltdl support. This commit finishes that half-hearted
support.
With certain datatypes the opal_datatype_unpack method for performing
the accumulate operation does not work. This commit modifies the
accumulate code in the osc base to use opal_convertor_raw instead.
Fixes #385
Squash compiler warnings now showing up in the
query methods for the mtls. Cast pointers to the different
mtl-module-specific types to mca_base_module_t.
Also, fix up a missing extern in mtl_psm_types.h.
This was causing "multiple definition" errors when building
the mca_mtl_psm.so shared library.
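For readers unfamiliar with this class of link error, the general pattern is sketched below. The variable name is hypothetical and is not the symbol from mtl_psm_types.h; both halves are shown in one file here only so the sketch compiles on its own.

```c
/* What a shared header would contain: a declaration only. Because of
 * "extern", including the header emits no storage, so any number of
 * source files can include it safely. */
extern int sketch_shared_state;

/* What exactly one source file would contain: the single definition. */
int sketch_shared_state = 42;

/* Any file that includes the header can then use the object. */
static int sketch_read_state(void)
{
    return sketch_shared_state;
}
```

Without `extern` in the header, each source file that includes it may emit its own definition of the object, and depending on how the toolchain treats such duplicates the link step can fail with "multiple definition" errors.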