mca_pml_ob1_recv_request_put_frag is used to request a put from the peer if get fails
mca_pml_ob1_recv_request_ack_send_btl is used to send an acknowledgement, not data
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
- Add support for fallback to previous coll module on non-commutative operations (#30)
- Replace mutexes by atomic operations.
- Use the correct nbc request type (for both ibcast and ireduce)
* coll/base: document type casts in ompi_coll_base_retain_*
- add module-wide topology cache
- use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
- reduce number of memory allocations
- call the default request completion.
- Remove the requests from the Fortran lookup conversion tables before completing
and free it.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>
The upper 2 bits of an ompi tag encode the synchronize send and
synchronize send ack.
Because the mtl_ofi_create_recv_tag_CQD and mtl_ofi_create_recv_tag
functions both use ompi_mtl_ofi.sync_proto_mask instead of
ompi_mtl_ofi.sync_send when generating their "ignore" masks, they hide
the ack bit, turning the tag into an "any tag receive"
This is an issue because ssend is implemented by doing a send and
receive internally. So if there happens to be an outstanding posted
receive posted before the ssend, that receive will end up consuming the
internal message intended for the ssend's internal receive.
Updating mtl_ofi_create_recv_tag_CQD and mtl_ofi_create_recv_tag functions
to use ompi_mtl_ofi.sync_send fixes this.
Authored-by: John L. Byrne <john.l.byrne@hpe.com>
Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
Convert the MPI_Status_f082f, MPI_Status_f082c, and MPI_Status_f2c man
pages to Markdown. Fix some typos and improve the text a bit along
the way.
Left the raw NROFF redirect pages MPI_Status_f2f08, MPI_Status_c2f08,
and MPI_Status_c2f files as they were -- they're 1-line redirects, and
it seems simpler to leave those (vs. duplicating the Markdown).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Only in C bindings:
- MPI_Status_c2f08()
- MPI_Status_f082c()
In all bindings but mpif.h
- MPI_Status_f082f()
- MPI_Status_f2f08()
and the PMPI_* related subroutines
As initially inteded by the MPI forum, the Fortran to/from Fortran 2008
conversion subtoutines are *not* implemented in the mpif.h bindings.
See the discussion at https://github.com/mpi-forum/mpi-issues/issues/298
Refs. open-mpi/ompi#1475
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Make the C MPI_F08_status type definition match the updated
mpi_f08 type(MPI_Status) definition.
This fix the inconsistency introduced in open-mpi/ompi@98bc7af7d4
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Reduce scatter block and reduce scatter algorithms were hitting
correctness issues for non commutative strided tests. We will revert to
the original default algorithms for those two collectives (basic linear
and non overlapping respectively) in the non commutative op case.
See #8010
Signed-off-by: William Zhang <wilzhang@amazon.com>
As it is possible to have multiple outstanding non-blocking collectives
provided by different collective modules, we need a consistent
mechanism to allow them to select unique tags for each instance of a
collective.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
* piggybacking Bull functionalities
* coll/adapt: Fix naming conventions and C11 atomic use
This commit fixes some naming convention issues, such as function names
which should follow the naming ompi_coll_adapt instead of
mca_coll_adapt, reserved for component and module naming (cf. tuned
collective component);
It also fixes the use of _Atomic construct, which is only valid in C11.
OPAL constructs have already been adapted to that use, so use
opal_atomic_* types instead.
* coll/adapt: Remove unused component field in module
This commit removes an unneeded field referencing the component in the
module of adapt, as it is already available through the
mca_coll_adapt_component global variable.
Signed-off-by: Marc Sergent <marc.sergent@atos.net>
Co-authored-by: Lemarinier, Pierre <pierre.lemarinier@atos.net>
Co-authored-by: pierrele <31764860+pierrele@users.noreply.github.com>
the ofi mtl mrecv was not properly setting the message in/out
arg to MPI_MRECV to MPI_MESSAGE_NULL.
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
MPI-4 is finally cleaning up its language: an MPI "exception" does not
actually exist. The only thing that exists is an MPI "error" (and
associated handlers). This commit replaces all relevant uses of the
word "exception" with "error". Note that this is still applicable in
versions of the MPI standard less than MPI-4.0 (indeed, nearly all the
cases fixed in this commit are just changes to comments, anyway).
One exception to this is the Java bindings, where there's an
MPIException class. In hindsight, it probably should have been named
MPIError, but changing it now would break anyone who is using the Java
bindings.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The C++ bindings were removed a while ago;
MPI::ERRORS_THROW_EXCEPTIONS and MPI_ERRORS_THROW_EXCEPTIONS no longer
exist.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
make things happen before the terminal call
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
The btl/ofi does not currently utilize the common ofi include/exclude
list. Added verification code similar to the mtl/ofi that will check if
the info object is in the include or exclude list. If it isn't in the
include list or is in the exclude list, validate_info will return
OPAL_ERROR. The btl/ofi will no longer pass a provider name as a hint
when calling getinfo, instead filtering the provider during
validate_info.
This patch also moves the is_in_list MTL function into common code and
adds additional debugging output to the BTL to match the MTL standard.
Signed-off-by: William Zhang <wilzhang@amazon.com>
The default algorithm selections were out of date and not performing
well. After gathering data from OMPI developers, new default algorithm
decisions were selected for:
allgather
allgatherv
allreduce
alltoall
alltoallv
barrier
bcast
gather
reduce
reduce_scatter_block
reduce_scatter
scatter
These results were gathered using the ompi-collectives-tuning package
and then averaged amongst the results gathered from multiple OMPI
developers on their clusters.
You can access the graphs and averaged data here:
https://drive.google.com/drive/folders/1MV5E9gN-5tootoWoh62aoXmN0jiWiqh3
Signed-off-by: William Zhang <wilzhang@amazon.com>
default
And other streamlining of aborting behavior.
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Remove OMPI_COMM_ERRORS and use NOHANDLE macros instead.
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
route unbound errors to self error handler
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Do not raise the error handler from within components
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Use the same env to transmit the initial error handler to spawnees
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Add logic to handle different architectural capabilities
Detect the compiler flags necessary to build specialized
versions of the MPI_OP. Once the different flavors (AVX512,
AVX2, AVX) are built, detect at runtime which is the best
match with the current processor capabilities.
Add validation checks for loadu 256 and 512 bits.
Add validation tests for MPI_Op.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: dongzhong <zhongdong0321@hotmail.com>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
It is important to have the mpi_f08 Type(MPI_Status) be the same
length (in bytes) as the mpif.h status (which is an array of
MPI_STATUS_SIZE INTEGERs). The reason is because MPI_Status_ctof()
basically does the following:
MPI_Fint *f_status = ...;
int *s = (int*) &c_status;
for i=0..sizeof(MPI_Status)/sizeof(int)
f_status[i] = c_status[i];
Meaning: the Fortran status needs to be able to hold as many INTEGERs
are there are C int's that can fit in sizeof(MPI_Status) bytes.
This is because a Fortran INTEGER may be larger than a C int (e.g.,
Fortran 8 bytes vs. C 4 bytes). Hence, the assignment on the Fortran
side will take sizeof(INTEGER) bytes for each sizeof(int) bytes in the
C MPI_Status.
This commit pads out the mpi_f08 Type(MPI_Status) with enough INTEGERs
to make it the same size as an array of MPI_TYPE_SIZE INTEGERs.
Hence, MPI_Status_ctof() will work properly, regardless of whether it
is assinging to an mpi_f08 Type(MPI_Status) or an mpif.h array of
MPI_STATUS_SIZE INTEGERs.
Thanks to @ahaichen for reporting the issue.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit updates the btl interface to change the parameters
passed to receive callbacks. The interface used to pass the tag,
a btl base descriptor, and the callback context. Most of the
values in the btl base descriptor were unused and only helped
simplify the callbacks from the self btl. All of the arguments
have now been replaced with a single receive callback descriptor.
This descriptor contains the incoming endpoint, data segment(s),
tag, and callback context. All btls have been updated to use
the new callback and the btl interface version has been bumped
to v3.2.0.
As part of this change the descriptor argument (and the segments
contained within it) have been marked as const. The were treated
as const before but this change could allow the compiler to make
better optimization decisions and will enforce that the callback
does not attempt to change the data in the descriptor.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>