The rdma_frag attached to the send request was not correctly released
upon request completion, leaking until MPI_Finalize. A quick solution
would have been to add RDMA_FRAG_RETURN at different locations on the
send request completion, but it would have unnecessarily made the
sendreq completion path more complex. Instead, I added the length to
the RDMA fragment so that it can be completed during the remote ack.
Be more explicit on the comment.
The rdma_frag can only be freed once when the peer forced a protocol
change (from RDMA GET to send/recv). Otherwise the fragment will be
returned once all data pertaining to it has been trasnferred.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either using atomic
operations such as shmem_atomic_fetch or can use point-to-point synchronization
routines such as shmem_wait_until and shmem_test.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
There are a couple MPI_Alltoallv calls in ad_gpfs_aggrs.c where the
send/recv data comes from places like req[r].lens, and the send
buffer and send displacements for example were being calculated as
sbuf = pick one of the reqs: req[bottom].lens
sdisps[r] = req[r].lens - req[bottom].lens
which might be okay if the .lens was data inside of req[] so they'd
all be close to each other. But each .lens field is just a pointer
that's malloced, so those addresses can be all over the place, so the
integer-sized sdisps[] isn't safe.
I changed it to have a new extra array sbuf and rbuf for those two
Alltoallv calls, and copied the data into the sbuf from the same
locations it used to be setting up the sdisps[] at, and after the
Alltoallv I copy the data out of the new rbuf into the same
locations it used to be setting up the rdisps[] at.
For what it's worth I was able to get this to fail -np 2 on a GPFS
filesystem with hints romio_cb_write enable. I didn't whittle the
test down to something small, but it was failing in an
MPI_File_write_all call.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
the shortfloat extension is only made of header files,
and hence do not require a library to be built.
Refs. open-mpi/ompi#6205
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Do not require an archive when the OMPI_MPIEXT_<ext>_HAVE_OBJECT
macro is defined to 0.
See `ompi/mpiext/example/configure.m4`.
Allow some extensions to be built on OS X since the creation of
archives with no files is not permitted.
Refs. open-mpi/ompi#6205
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
abstract out the io_array structure to be used in common_ompio_build_io_array function.
This is preparation for a future component that would like to use the same function,
but not modify the io_array stored on the file handle itself.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
In case of using a btl_put in ob1, the handle of the locally registered
memory is sent with a PUT control message. In the current master code
the sent handle is necessary the handle in the frag but if the handle
has been successfully registered in the request, the frag structure does
not have any valid handle and all fragments use the request one.
I suggest to check if the handle in the fragment is valid and if not to
send the handle from the request.
Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
In the case the btl_get fails Ob1 tries to fallback on btl_put first but
the return code was ignored. So the code fell back on both btl_put and
btl_send.
Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
In fint_2_int.h there are some conversion macros for logicals. It has
one path for OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT where a new array
would be allocated and the conversions then might expand to
c_array[i] = (array[i] == 0 ? 0 : 1)
and another path for OMPI_SIZEOF_FORTRAN_LOGICAL == SIZEOF_INT where it
does things "in place", so the same conversion there would just be
array[i] = (array[i] == 0 ? 0 : 1)
The problem is some of the logical arrays being converted are INPUT
arguments. And it's possible for some compilers to even put the argument
in read-only memory so the above "in place" conversion SEGV's. A
testcase I have used
call MPI_CART_SUB(oldcomm, (/.true.,.false./), newcomm, ierr)
and gfortran put the second arg in read-only mem.
In cart_sub_f.c you can trace the ompi_fortran_logical_t *remain_dims arg.
remain_dims[] is for input only, but the file uses
OMPI_LOGICAL_ARRAY_NAME_DECL(remain_dims);
OMPI_ARRAY_LOGICAL_2_INT(remain_dims, ndims);
PMPI_Cart_sub(..., OMPI_LOGICAL_ARRAY_NAME_CONVERT(remain_dims), ...);
OMPI_ARRAY_INT_2_LOGICAL(remain_dims, ndims);
to convert it to c-ints make a C call then restore it to Fortran logicals
before returning.
It's not always wrong to convert purely in-place, eg cart_get_f.c has
a periods[] that's exclusively for OUTPUT and it would be fine with the
macros as they were. But I still say the macros are invalid because they
don't distinguish whether they're being used on INPUT or OUTPUT args and
thus they can't be used in a way that's legal for both cases.
It might be possible to fix the macros by adding more of them so that
cart_create_f.c and cart_get_f.c would use different macros that give
more context. But my fix here is just to turn off the first block and
make all paths run as if OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT.
The main macros that get enlarged by this change are
define OMPI_ARRAY_LOGICAL_2_INT_ALLOC : mallocs now
define OMPI_ARRAY_LOGICAL_2_INT : also mallocs now
But these are only used in 4 places, three of which are the purpose of
this checkin, to avoid the former in-place expansion of an INPUT arg:
cart_create_f.c
cart_map_f.c
cart_sub_f.c
and one of which is an OUPUT arg that was fine and that gets
unnecessarily expanded into a separate array by this checkin.
cart_get_f.c
So I think an unnecessary malloc in cart_get_f.c is the only downside
to this change, where the logicals array argument could have been used
and converted in place.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
Update provided by Gilles Gouaillardet to keep the in-place option
if OMPI_FORTRAN_VALUE_TRUE == 1 where no conversion is needed.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This is not fixing any issue, it is simply preventing a sefault if the
communicator creation has not happened as expected. Thus, this code path
should never really be hit in a correct MPI application with a valid
communicator creation support.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
This is so when a debugger attaches using MPIR, it can step out of this stack back into main.
This cannot be done with certain aggressive optimisations and missing debug information.
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Co-authored-by: Jeff Squyres <jsquyres@cisco.com>
The types of count, disp, and extent passed into
ompi_datatype_add() should be size_t, ptrdiff_t and ptrdiff_t,
respectively. This prevents integer overflows and errors in
computing the size of large indexed datatypes.
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
- there was a set of UCX related issues reported which caused
by mmap API hooks conflicts. We added diagnostic of such
problems to simplify bug-resolving pipeline
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
in schizo/ompi, sets the new OMPI_MCA_mpi_oversubscribe environment
variable according to the node oversubscription state.
This MCA parameter is used to set the default value of the
mpi_yield_when_idle parameter.
This two steps tango is needed so the mpi_yield_when_idle setting
is always honored when set in a config file.
Refs. open-mpi/ompi#6433
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This commit DELETES the removed MPI1 functions and datatypes from
both the mpi.h header and from the library (they were deleted from the
MPI standard in MPI-3.0).
WARNING: This changes the MPI API in a non-backwards compatible way.
This also removes the configure option that was added in Open
MPI v4.0.x, requiring users to change their apps if they are
using any of these almost 20 year old APIs.
This commit removes the following MPI1 removed functions and datatypes:
MPI_Address
MPI_Errhandler_create
MPI_Errhandler_get
MPI_Errhandler_set
MPI_Type_extent
MPI_Type_hindexed
MPI_Type_hvector
MPI_Type_struct
MPI_Type_UB
MPI_Type_LB
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
Refs https://github.com/open-mpi/ompi/issues/6278.
This commit is intended to be cherry-picked to v4.0.x and
the following commit will ammend to this functionality for
master's removal.
Changes the prototypes for MPI removed functions in the
following ways:
There are 4 cases:
1) User wants MPI-1 compatibility (--enable-mpi1-compatibility)
MPI_Address (and friends) are declared in mpi.h with
deprecation notice
2) User does not want MPI-1 compatibility, and has a C11-capable
compiler
Declare an MPI_Address (etc.) macro in mpi.h, which will
cause a compile-time error using _Static_assert C11 feature
3) User does not want MPI-1 compatibility, and does not have a
C11-capable compiler, but the compiler supports error function
attributes.
Declare an MPI_Address (etc.) macro in mpi.h, which will
cause a compile-time error using error function attribute.
4) User does not want MPI-1 compatibility, and does not have a
C11-capable compiler, or a compiler that supports error
function attributes.
Do not declare MPI_Address (etc.) in mpi.h at all.
Unless the user is compiling with something like -Werror,
this will allow the user's code to compile. We are
choosing this because it seems like a losing battle to
make some kind of compile time error that is friendly to
the user (and doesn't make it look like mpi.h itself is broken).
On v4.0.x, this will allow the user code to both compile
(albeit with a warning) and link (because the MPI_Address
will be in the MPI library because we are preserving ABI
back to 3.0.x).
On master/v5.0.x, this will allow the user code to compile,
but it will fail to link (because the MPI_Address symbol will
not be in the MPI library).
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
commit c6070fd2e broke building fortran bindings
with PGI compilers. Turns out PGI compilers need
to link in the *.o from a module file whether or
not there are module subroutines defined or not in
the module file.
Related to #6411
Signed-off-by: Howard Pritchard <howardp@lanl.gov>