Use linear with sync alltoall algorithm for certain message/comm size
ranges. Does not affect default fixed decision, unless HPCX (with its
custom parameters) is used or corresponding mca is set.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 404c4800688548b021bda68bdf10792424e6b1c5)
These issues were introduced in the recent commit b71af0eca0.
This commit fixes Coverity CID 1451661 and 1451660.
Though `c_info` part was an actual bug, the `c_sendtypes` part was not.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit open-mpi/ompi@facf8c5e98)
- ignore sendcounts, sendispls and sendtypes arguments when MPI_IN_PLACE is used
- use the right size when an inter-communicator is used.
Thanks Markus Geimer for reporting this.
Refs. open-mpi/ompi#5459
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@cdaed89d04)
MPI standard states a user MPI_Op and/or user MPI_Datatype can be free'd
after a call to a non blocking collective and before the non-blocking
collective completes.
Retain user (only) MPI_Op and MPI_Datatype when the non blocking call is
invoked, and set a request callback so they are free'd when the MPI_Request
completes.
Thanks Thomas Ponweiser for reporting this
Fixesopen-mpi/ompi#2151Fixesopen-mpi/ompi#1304
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@0fe756d416)
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Correctly bubble up errors in NBC collective operations
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
The error field of requests needs to be rearmed at start, not at create
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
(cherry picked from commit open-mpi/ompi@65660e5999)
In common_ompi_aggregators calc_cost routine:
do not cast the real division to an int intermediately.
This patch removes the obsolete int variable c and assigns
the result of the P_a/P_x division directly to n_as.
With the intermediate int c variable, n_as gets 0 if P_a < P_x,
resulting in a division by 0 when computing n_s.
Signed-off-by: Harald Klimach <harald.klimach@uni-siegen.de>
(cherry picked from commit e222a04ae57e5d09b8559f3c111de1f10a47246a)
The procedure names don't contain "_f08" of Fortran 2008 bindings of
Persistent Collective Operations(mpiext/pcollreq/use-mpi-f08).
This fix adds "_f08" to the procedure names of pcollreq/use-mpi-f08,
same as other Fortran 2008 routines in `ompi/mpi/fortran/use-mpi-f08/mod`.
Signed-off-by: Tsubasa Yanagibashi <fj2505dt@aa.jp.fujitsu.com>
(cherry picked from commit 3148b0cfaa04843e7219acb8c7e04f43f6d219fe)
Use the PVAR ctx to save the SPC index, so that no lookup nor
restriction on the SPC vars position is imposed.
Make sure the PVAR are always registered.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
When using an empty fileview, a division by zero bug can occur in ompio. Not entirely sure why the problem did not show up previously, but some recent changes trigger that bug in one of our tests.
This pr is part of a fix applied in commit f6b3a0a
Fixes Issue #6703
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
If user sets HCOLL_EXTERNAL_UCM_EVENTS=1 then we try init opal
memory framework and register a mem release cb. Otherwise, rely on ucx.
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
- in case if multithreading requested but not supported
disable PML UCX
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit a3578d9ece2b40a349529e7b223df50b0aac64aa)
Atomic lock must progress local worker while obtaining the remote lock,
otherwise an active message which actually releases the lock might not
be processed while polling on local memory location.
(picked from master 9d1994b)
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
The rdma_frag attached to the send request was not correctly released
upon request completion, leaking until MPI_Finalize. A quick solution
would have been to add RDMA_FRAG_RETURN at different locations on the
send request completion, but it would have unnecessarily made the
sendreq completion path more complex. Instead, I added the length to
the RDMA fragment so that it can be completed during the remote ack.
Be more explicit on the comment.
The rdma_frag can only be freed once when the peer forced a protocol
change (from RDMA GET to send/recv). Otherwise the fragment will be
returned once all data pertaining to it has been trasnferred.
NOTE: Had to add a typedef for "opal_atomic_size_t" from master into
opal/threads/thread_usage.h into this cherry pick (it is in
opal/include/opal_stdatomic.h on master, but that file does not exist
here on the v4.0.x branch).
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit a16cf0e4dd6df4dea820fecedd5920df632935b8)
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
In case of using a btl_put in ob1, the handle of the locally registered
memory is sent with a PUT control message. In the current master code
the sent handle is necessary the handle in the frag but if the handle
has been successfully registered in the request, the frag structure does
not have any valid handle and all fragments use the request one.
I suggest to check if the handle in the fragment is valid and if not to
send the handle from the request.
Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
(cherry picked from commit e630046a4b82bc01379fb055af4c0e414c2a8e8f)
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either using atomic
operations such as shmem_atomic_fetch or can use point-to-point synchronization
routines such as shmem_wait_until and shmem_test.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 2ef5bd8b3671f1e10caf00d06d66d120eac9c5be)
unless configure'd with --enable-mpi1-compatibility
This is a one-off commit for the v4.0.x branch since these symbols were
simply removed from master.
Thanks Lisandro Dalcin for reporting this.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
There are a couple MPI_Alltoallv calls in ad_gpfs_aggrs.c where the
send/recv data comes from places like req[r].lens, and the send
buffer and send displacements for example were being calculated as
sbuf = pick one of the reqs: req[bottom].lens
sdisps[r] = req[r].lens - req[bottom].lens
which might be okay if the .lens was data inside of req[] so they'd
all be close to each other. But each .lens field is just a pointer
that's malloced, so those addresses can be all over the place, so the
integer-sized sdisps[] isn't safe.
I changed it to have a new extra array sbuf and rbuf for those two
Alltoallv calls, and copied the data into the sbuf from the same
locations it used to be setting up the sdisps[] at, and after the
Alltoallv I copy the data out of the new rbuf into the same
locations it used to be setting up the rdisps[] at.
For what it's worth I was able to get this to fail -np 2 on a GPFS
filesystem with hints romio_cb_write enable. I didn't whittle the
test down to something small, but it was failing in an
MPI_File_write_all call.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
(cherry picked from commit d85cac8f1a11495415b67ecab69d2ae1cd19d155)
In the case the btl_get fails Ob1 tries to fallback on btl_put first but
the return code was ignored. So the code fell back on both btl_put and
btl_send.
Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
(cherry picked from commit 9c689f2225d29aa152627f39bab841afead254af)
We missed an assert to check if ALLOW_OVERTAKE is set or not before
validating the sequence number and this will cause deadlock.
Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
(cherry picked from commit 0263456cf4e99efc67d38acd100cf948e0399d63)
In fint_2_int.h there are some conversion macros for logicals. It has
one path for OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT where a new array
would be allocated and the conversions then might expand to
c_array[i] = (array[i] == 0 ? 0 : 1)
and another path for OMPI_SIZEOF_FORTRAN_LOGICAL == SIZEOF_INT where it
does things "in place", so the same conversion there would just be
array[i] = (array[i] == 0 ? 0 : 1)
The problem is some of the logical arrays being converted are INPUT
arguments. And it's possible for some compilers to even put the argument
in read-only memory so the above "in place" conversion SEGV's. A
testcase I have used
call MPI_CART_SUB(oldcomm, (/.true.,.false./), newcomm, ierr)
and gfortran put the second arg in read-only mem.
In cart_sub_f.c you can trace the ompi_fortran_logical_t *remain_dims arg.
remain_dims[] is for input only, but the file uses
OMPI_LOGICAL_ARRAY_NAME_DECL(remain_dims);
OMPI_ARRAY_LOGICAL_2_INT(remain_dims, ndims);
PMPI_Cart_sub(..., OMPI_LOGICAL_ARRAY_NAME_CONVERT(remain_dims), ...);
OMPI_ARRAY_INT_2_LOGICAL(remain_dims, ndims);
to convert it to c-ints make a C call then restore it to Fortran logicals
before returning.
It's not always wrong to convert purely in-place, eg cart_get_f.c has
a periods[] that's exclusively for OUTPUT and it would be fine with the
macros as they were. But I still say the macros are invalid because they
don't distinguish whether they're being used on INPUT or OUTPUT args and
thus they can't be used in a way that's legal for both cases.
It might be possible to fix the macros by adding more of them so that
cart_create_f.c and cart_get_f.c would use different macros that give
more context. But my fix here is just to turn off the first block and
make all paths run as if OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT.
The main macros that get enlarged by this change are
define OMPI_ARRAY_LOGICAL_2_INT_ALLOC : mallocs now
define OMPI_ARRAY_LOGICAL_2_INT : also mallocs now
But these are only used in 4 places, three of which are the purpose of
this checkin, to avoid the former in-place expansion of an INPUT arg:
cart_create_f.c
cart_map_f.c
cart_sub_f.c
and one of which is an OUPUT arg that was fine and that gets
unnecessarily expanded into a separate array by this checkin.
cart_get_f.c
So I think an unnecessary malloc in cart_get_f.c is the only downside
to this change, where the logicals array argument could have been used
and converted in place.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
Update provided by Gilles Gouaillardet to keep the in-place option
if OMPI_FORTRAN_VALUE_TRUE == 1 where no conversion is needed.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 0a7f1e3cc58dabe536df00ae9c97f7e9d27103ad)
This is so when a debugger attaches using MPIR, it can step out of this stack back into main.
This cannot be done with certain aggressive optimisations and missing debug information.
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Co-authored-by: Jeff Squyres <jsquyres@cisco.com>
(cherry-picked from 20f5840)
- there was a set of UCX related issues reported which caused
by mmap API hooks conflicts. We added diagnostic of such
problems to simplify bug-resolving pipeline
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit d8e3562bae700d84873c1d5ca9c45c846d7387ed)