Fix CID 1315271: Constant expression result
The intent of this conditional is to not produce a peruse event for
probe or mprobe requests. Coverity is correct that the expression is
always true. Changed the || to && to fix. Also moved the conditional
within an OMPI_WANT_PERUSE to ensure the conditional is not evaluated
if peruse is disabled.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit contains the following changes:
- pml/ob1: use the bml accessor function when requesting a bml
endpoint. this will ensure that bml endpoints are only created when
needed. for example, a bml endpoint is not requested and not
allocated when receiving an eager message from a peer.
- pml/ob1: change the pml_procs array in the ob1 communicator to a
proc pointer array. at the cost of a single level of extra
redirection this will allow us to allocate pml procs on demand.
- pml/ob1: add an accessor function to access the pml proc structure
for a given peer. this function will allocate the proc if it
doesn't already exist.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit should resolve an issue seen with CUDA-aware support. The
problem came in with BTL 3.0. Before 3.0 the size of the copy was
stored in the incoming segment's des_remote_count field. This field
does not exist in BTL 3.0 so I stored the value in the
des_segment_count field. This caused problems with the cuda support
code. To fix the issue the endpoint pointer is now stored in the in
fragment's endpoint pointer which free's up the segment's des_cbdata
pointer for storing the transfer size.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
A little background. Historically ob1 always registered the entire memory
region when the RGET protocol was in use. This changed when Mellanox
added support to fragment RGET using the btl_prepare_dst function. Now
that the BTL layer has changed to split out the limits of get/put there
is explicit fragmentation code in ob1. Before this commit the registration
was still done per RGET fragment.
This commit will attempt to register the entire region before creating
RGET fragments. If the registration is successfull then all RGET
fragments will use this registration otherwise they will each attempt
to register their own segment of the receive buffer. If that fails
enough times each fragment will give up and fall back on send/recv.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
attempting to break the get into multiple rdma fragments
A little background. Historically ob1 always registered the entire memory
region when the RGET protocol was in use. This changed when Mellanox
added support to fragment RGET using the btl_prepare_dst function. Now
that the BTL layer has changed to split out the limits of get/put there
is explicit fragmentation code in ob1. Before this commit the registration
was still done per RGET fragment.
This commit will attempt to register the entire region before creating
RGET fragments. If the registration is successfull then all RGET
fragments will use this registration otherwise they will each attempt
to register their own segment of the receive buffer. If that fails
enough times each fragment will give up and fall back on send/recv.
opal_mutex_t must be OBJ_DESTRUCTed in order to avoid
a memory leak (pthread_mutex_init allocates memory under
Cygwin, so pthread_mutex_destroy is mandatory)
Thanks to Marco Atzeri for reporting this issue
WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL
All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.
This commit was SVN r32317.
The wrong descriptor field was used when calculating the size received when
using the RDMA rendevous protcol.
This commit was SVN r32232.
The following SVN revision numbers were found above:
r32196 --> open-mpi/ompi@a14e0f10d4
mca_btl_base_segment_t and replace them with des_local and des_remote
This change also updates the BTL version to 3.0.0. This commit does
not represent the final version of BTL 3.0.0. More changes are coming.
In making this change I updated all of the BTLs as well as BTL user's
to use the new structure members. Please evaluate your component to
ensure the changes are correct.
RFC text:
This is the first of several BTL interface changes I am proposing for
the 1.9/2.0 release series.
What: Change naming of btl descriptor members. I propose we change
des_src and des_dst (and their associated counts) to be des_local and
des_remote. For receive callbacks the des_local member will be used to
communicate the segment information to the callback. The proposed change
will include updating all of the doxygen in btl.h as well as updating
all BTLs and BTL users to use the new naming scheme.
Why: My btl usage makes use of both put and get operations on the same
descriptor. With the current naming scheme I need to ensure that there
is consistency beteen the segments described in des_src and des_dst
depending on whether a put or get operation is executed. Additionally,
the current naming prevents BTLs that do not require prepare/RMA matched
operations (do not set MCA_BTL_FLAGS_RDMA_MATCHED) from executing
multiple simultaneous put AND get operations. At the moment the
descriptor can only be used with one or the other. The naming change
makes it easier for BTL users to setup/modify descriptors for RMA
operations as the local segment and remote segment are always in the
same member field. The only issue I forsee with this change is that it
will require a little more work to move BTL fixes to the 1.8 release
series.
This commit was SVN r32196.
Per RFC. There are two optimizations in this commit:
- Allocate requests for blocking sends and receives on the stack. This
bypasses the request free list and saves two atomics on the critical path.
This change improves the small message ping-pong by 50-200ns on both AMD
and Intel CPUs.
- For small messages try to use the btl sendi function before intializing a
send request. If the sendi fails or the btl does not have a sendi function
silently fallback on the standard send path.
cmr=v1.7.5:reviewer=brbarret
This commit was SVN r30343.
configure-time dynamic allocation of flags. The net result for platforms
which only support BTL-based communication is a reduction of 8*nprocs bytes
per process. Platforms which support both MTLs and BTLs will not see
a space reduction, but will now be able to safely run both the MTL and BTL
side-by-side, which will prove useful.
This commit was SVN r29100.
value to signal that the operation of retrieving the element from the free list
failed. However in this case the returned pointer was set to NULL as well, so the
error code was redundant. Moreover, this was a continuous source of warnings when
the picky mode is on.
The attached parch remove the rc argument from the OMPI_FREE_LIST_GET and
OMPI_FREE_LIST_WAIT macros, and change to check if the item is NULL instead of
using the return code.
This commit was SVN r28722.
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.
Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.
This commit was SVN r26242.
Uses new CUDA IPC support. Also, a few minor changes in PML to take
advantage of it.
This code has no effect unless user asks for it explicitly via
configure arguments. Otherwise, it is either #ifdef'ed out or
not compiled.
This commit was SVN r26039.