The fi_fabric function appears to free the provider string passed in
in the fabric_attr. This causes MCA to free an invalid pointer when
the parameter is freed.
References #374
When configured --with-devel-headers, there's now 2 "osd.h" header
files in libfabric (in different dirs). Automake's "install" target
didn't like this, and errored out.
Since embedding libfabric is a temporary measure, just avoid the
problem by not installing any libfabric headers.
Commit open-mpi/ompi@1a3597aam changed the type of the `convertor`
variable from `ompi_osc_base_convertor_t` (which contained an
`opal_convertor_t`) to an `opal_convertor_t`. Hence, using memchecker
to ensure that the inner convertor of the `ompi_osc_base_convertor_t`
is considered initialized is now unnecessary.
The previous safe_system() is quite thorough, but much more than we
really need in this script. Use a far simpler version that is
significantly easier to maintain.
Also log some of the critical steps that a human can examine the
output upon a failure (e.g., when the failure happens when launched
via cron).
Finally, set the version number to not include "openmpi-", because
Coverity has a limited-width display of the version number.
The previous condition just called assert(), which is a no-op in
non-debug builds. Change it to print a message and then call abort()
to really actually above.
This was CID 1270155.
Add the functions that changed between BTL 2.0 and 3.0 into compat.h
and compat.c:
* module.btl_prepare_src: the signature and body of this method
changed between 2.0 and 3.0. However, the functions that this
method calls did *not* need to change, so they are copied over
wholesale (with the exception that they no longer accept the unused
`registration` parameter).
* module.btl_prepare_dst: this method does not exist in BTL 3.0.
* module.btl_put: the signature and body of this method changed
between 2.0 and 3.0.
This commit make spml/yoda compatible with BTL 3.0. This is meant as a
starting point only. More work will be needed to make optimial use of
the new interface.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Background: In order to support atomics each btl needs to provide support
for communicating with self unless the btl module can guarantee global
atomicity. Before this commit bml/r2 discarded any BTL with lower
exclusivity than an existing send btl. This would cause the BML to
discard any btl other than self.
The new behavior is as follows:
- If an exisiting send btl has higher exclusivity then the btl will not be
added to the send btl list for the endpoint.
- If a btl provides RDMA support then it is always added to the rdma btl
list.
- bml_btl weight for send btls is now calculated across all send btls.
- bml_btl weight for rdma btls is now calculated across all rdma btls.
With this change self should still win as the only send btl for loopback
without disqualifying other btls (ugni, openib) for atomic operations.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
A little background. Historically ob1 always registered the entire memory
region when the RGET protocol was in use. This changed when Mellanox
added support to fragment RGET using the btl_prepare_dst function. Now
that the BTL layer has changed to split out the limits of get/put there
is explicit fragmentation code in ob1. Before this commit the registration
was still done per RGET fragment.
This commit will attempt to register the entire region before creating
RGET fragments. If the registration is successfull then all RGET
fragments will use this registration otherwise they will each attempt
to register their own segment of the receive buffer. If that fails
enough times each fragment will give up and fall back on send/recv.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>