The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either with
atomic operations such as shmem_atomic_fetch or with point-to-point
synchronization routines such as shmem_wait_until and shmem_test.
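For illustration, a minimal sketch of how a PE might wait on such a counter,
assuming the counter is a symmetric long and that a single update is expected;
the OpenSHMEM 1.4 routines used are the ones named above, and the asynchronous
transfer itself is elided:

    #include <shmem.h>

    static long recv_counter = 0;     /* symmetric counter object */

    int main(void)
    {
        shmem_init();

        /* ... initiate the asynchronous transfer; it increments
           recv_counter on each PE once that PE's target buffer
           has been delivered ... */

        /* Option 1: block until the counter reaches the expected value. */
        shmem_long_wait_until(&recv_counter, SHMEM_CMP_EQ, 1);

        /* Option 2: poll the counter atomically instead of blocking.    */
        /* while (shmem_long_atomic_fetch(&recv_counter, shmem_my_pe()) < 1) ; */

        shmem_finalize();
        return 0;
    }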
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 2ef5bd8b3671f1e10caf00d06d66d120eac9c5be)
unless configure'd with --enable-mpi1-compatibility
This is a one-off commit for the v4.0.x branch since these symbols were
simply removed from master.
Thanks Lisandro Dalcin for reporting this.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
There are a couple MPI_Alltoallv calls in ad_gpfs_aggrs.c where the
send/recv data comes from places like req[r].lens, and the send
buffer and send displacements for example were being calculated as
sbuf = pick one of the reqs: req[bottom].lens
sdisps[r] = req[r].lens - req[bottom].lens
which might be okay if .lens were data stored inside req[] so the
addresses would all be close to each other. But each .lens field is
just a separately malloc'ed pointer, so those addresses can be far
apart, and the integer-sized sdisps[] isn't safe.
I changed those two Alltoallv calls to use new, separate sbuf and
rbuf arrays: the data is copied into sbuf from the same locations
the code used to point sdisps[] at, and after the Alltoallv the data
is copied out of the new rbuf into the same locations the code used
to point rdisps[] at.
For what it's worth, I was able to get this to fail with -np 2 on a
GPFS filesystem with the hint romio_cb_write set to enable. I didn't
whittle the test down to something small, but it was failing in an
MPI_File_write_all call.
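For reference, a simplified, hypothetical sketch of that pattern (names like
req[].lens and len_count stand in for the real ad_gpfs_aggrs.c variables): the
data is packed into one contiguous send buffer so the displacements are small
offsets into that buffer rather than differences of unrelated malloc'ed pointers.

    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    struct request { int *lens; };            /* stand-in for the real req[] */

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int nprocs;
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        const int len_count = 4;              /* entries exchanged per peer */
        struct request *req = malloc(nprocs * sizeof(*req));
        for (int r = 0; r < nprocs; r++) {
            req[r].lens = calloc(len_count, sizeof(int));  /* separately malloc'ed */
        }

        int *sbuf   = malloc(nprocs * len_count * sizeof(int));
        int *rbuf   = malloc(nprocs * len_count * sizeof(int));
        int *scnts  = malloc(nprocs * sizeof(int));
        int *sdisps = malloc(nprocs * sizeof(int));
        int *rcnts  = malloc(nprocs * sizeof(int));
        int *rdisps = malloc(nprocs * sizeof(int));

        for (int r = 0; r < nprocs; r++) {
            scnts[r]  = rcnts[r]  = len_count;
            sdisps[r] = rdisps[r] = r * len_count;   /* small offsets into sbuf/rbuf */
            memcpy(sbuf + sdisps[r], req[r].lens, len_count * sizeof(int));
        }

        MPI_Alltoallv(sbuf, scnts, sdisps, MPI_INT,
                      rbuf, rcnts, rdisps, MPI_INT, MPI_COMM_WORLD);

        /* Copy the received data back out to the same per-request
         * locations that the old code pointed rdisps[] at. */
        for (int r = 0; r < nprocs; r++) {
            memcpy(req[r].lens, rbuf + rdisps[r], len_count * sizeof(int));
        }

        MPI_Finalize();
        return 0;
    }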
Signed-off-by: Mark Allen <markalle@us.ibm.com>
(cherry picked from commit d85cac8f1a11495415b67ecab69d2ae1cd19d155)
In the case where btl_get fails, Ob1 tries to fall back on btl_put
first, but the return code was ignored, so the code fell back on both
btl_put and btl_send.
Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
(cherry picked from commit 9c689f2225d29aa152627f39bab841afead254af)
We missed checking whether the ALLOW_OVERTAKE assertion is set before
validating the sequence number, which can cause a deadlock.
Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
(cherry picked from commit 0263456cf4e99efc67d38acd100cf948e0399d63)
In fint_2_int.h there are some conversion macros for logicals. It has
one path for OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT where a new array
would be allocated and the conversions then might expand to
c_array[i] = (array[i] == 0 ? 0 : 1)
and another path for OMPI_SIZEOF_FORTRAN_LOGICAL == SIZEOF_INT where it
does things "in place", so the same conversion there would just be
array[i] = (array[i] == 0 ? 0 : 1)
The problem is that some of the logical arrays being converted are
INPUT arguments, and some compilers may even put the argument in
read-only memory, so the above "in place" conversion SEGVs. A test
case I used was
call MPI_CART_SUB(oldcomm, (/.true.,.false./), newcomm, ierr)
and gfortran put the second arg in read-only mem.
In cart_sub_f.c you can trace the ompi_fortran_logical_t *remain_dims arg.
remain_dims[] is for input only, but the file uses
OMPI_LOGICAL_ARRAY_NAME_DECL(remain_dims);
OMPI_ARRAY_LOGICAL_2_INT(remain_dims, ndims);
PMPI_Cart_sub(..., OMPI_LOGICAL_ARRAY_NAME_CONVERT(remain_dims), ...);
OMPI_ARRAY_INT_2_LOGICAL(remain_dims, ndims);
to convert it to C ints, make the C call, and then restore it to
Fortran logicals before returning.
It's not always wrong to convert purely in place; e.g., cart_get_f.c
has a periods[] that's exclusively for OUTPUT and it would be fine with the
macros as they were. But I still say the macros are invalid because they
don't distinguish whether they're being used on INPUT or OUTPUT args and
thus they can't be used in a way that's legal for both cases.
It might be possible to fix the macros by adding more of them so that
cart_create_f.c and cart_get_f.c would use different macros that give
more context. But my fix here is just to turn off the first block and
make all paths run as if OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT.
The main macros that get enlarged by this change are
define OMPI_ARRAY_LOGICAL_2_INT_ALLOC : mallocs now
define OMPI_ARRAY_LOGICAL_2_INT : also mallocs now
But these are only used in 4 places, three of which are the purpose of
this checkin, to avoid the former in-place expansion of an INPUT arg:
cart_create_f.c
cart_map_f.c
cart_sub_f.c
and one of which is an OUTPUT arg that was fine and that gets
unnecessarily expanded into a separate array by this checkin.
cart_get_f.c
So I think an unnecessary malloc in cart_get_f.c is the only downside
to this change, where the logicals array argument could have been used
and converted in place.
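Roughly, the two conversion strategies look like this (hypothetical macro names
and bodies, not the exact fint_2_int.h text); only the out-of-place variant is
safe for INPUT arguments:

    #include <stdlib.h>

    /* Out-of-place: allocate a temporary int array and convert into it.
     * The caller's Fortran logical array is never written, so this is
     * safe even for an INPUT argument that the compiler placed in
     * read-only memory. */
    #define LOGICAL_2_INT_ALLOC(c_array, array, n)            \
        do {                                                  \
            (c_array) = malloc((n) * sizeof(int));            \
            for (int _i = 0; _i < (n); _i++)                  \
                (c_array)[_i] = ((array)[_i] == 0 ? 0 : 1);   \
        } while (0)

    /* In-place: rewrite the caller's array.  Only legal when the
     * argument is writable (an OUTPUT/scratch array); doing this to an
     * INPUT argument can SEGV, as with the MPI_CART_SUB example above. */
    #define LOGICAL_2_INT_INPLACE(array, n)                   \
        do {                                                  \
            for (int _i = 0; _i < (n); _i++)                  \
                (array)[_i] = ((array)[_i] == 0 ? 0 : 1);     \
        } while (0)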
Signed-off-by: Mark Allen <markalle@us.ibm.com>
Update provided by Gilles Gouaillardet to keep the in-place option
if OMPI_FORTRAN_VALUE_TRUE == 1 where no conversion is needed.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 0a7f1e3cc58dabe536df00ae9c97f7e9d27103ad)
This is so that when a debugger attaches using MPIR, it can step out of this stack back into main.
That is not possible under certain aggressive optimisations or when debug information is missing.
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Co-authored-by: Jeff Squyres <jsquyres@cisco.com>
(cherry-picked from 20f5840)
- a set of UCX-related issues was reported that were caused by
conflicting mmap API hooks. We added diagnostics for such problems
to simplify the bug-resolution pipeline
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit d8e3562bae700d84873c1d5ca9c45c846d7387ed)
The types of count, disp, and extent passed into
ompi_datatype_add() should be size_t, ptrdiff_t and ptrdiff_t,
respectively. This prevents integer overflows and errors in
computing the size of large indexed datatypes.
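A small illustration (not the OMPI code itself) of the kind of overflow this
guards against when int is used for the count/extent arithmetic of a large
datatype:

    #include <stdio.h>
    #include <stddef.h>

    int main(void)
    {
        /* One billion elements with an 8-byte extent: 8,000,000,000 bytes. */
        size_t    count  = 1000000000;
        ptrdiff_t extent = 8;

        int       bad  = (int) (count * extent);      /* truncated: wrong size */
        ptrdiff_t good = (ptrdiff_t) count * extent;  /* exact on 64-bit ABIs  */

        printf("as int: %d   as ptrdiff_t: %td\n", bad, good);
        return 0;
    }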
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
(cherry picked from commit b61e6242d3aa494d2b25c9ebaea9ed980b02aa9b)
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
It's so easy to misspell compatability (sic) that we need
to have ompi_info help us out.
Related to #6470
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit a5ba48c21839e0aab4c96afa97466a10f8bdc721)
Refs https://github.com/open-mpi/ompi/issues/6278.
This commit is intended to be cherry-picked to v4.0.x; a follow-on
commit will amend this functionality for the removal of these
symbols on master.
Changes the prototypes for MPI removed functions in the
following ways:
There are 4 cases:
1) User wants MPI-1 compatibility (--enable-mpi1-compatibility)
MPI_Address (and friends) are declared in mpi.h with
deprecation notice
2) User does not want MPI-1 compatibility, and has a C11-capable
compiler
Declare an MPI_Address (etc.) macro in mpi.h, which will
cause a compile-time error using the C11 _Static_assert
feature (see the sketch below)
3) User does not want MPI-1 compatibility, and does not have a
C11-capable compiler, but the compiler supports error function
attributes.
Declare an MPI_Address (etc.) macro in mpi.h, which will
cause a compile-time error using error function attribute.
4) User does not want MPI-1 compatibility, and does not have a
C11-capable compiler, or a compiler that supports error
function attributes.
Do not declare MPI_Address (etc.) in mpi.h at all.
Unless the user is compiling with something like -Werror,
this will allow the user's code to compile. We are
choosing this because it seems like a losing battle to
make some kind of compile time error that is friendly to
the user (and doesn't make it look like mpi.h itself is broken).
On v4.0.x, this will allow the user code to both compile
(albeit with a warning) and link (because the MPI_Address
will be in the MPI library because we are preserving ABI
back to 3.0.x).
On master/v5.0.x, this will allow the user code to compile,
but it will fail to link (because the MPI_Address symbol will
not be in the MPI library).
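For cases 2 and 3, an illustrative sketch of the two compile-time-error
techniques (these are not the literal mpi.h macros):

    /* Illustrative only -- not the literal mpi.h definitions. */

    #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
    /* Case 2: any use of MPI_Address(...) expands to a failing
     * _Static_assert that carries a helpful message. */
    #define MPI_Address(...) \
        _Static_assert(0, "MPI_Address was removed in MPI-3.0; use MPI_Get_address")

    #elif defined(__GNUC__)
    /* Case 3: route calls to a declaration carrying the error function
     * attribute, so any surviving call site fails to compile. */
    __attribute__((__error__("MPI_Address was removed in MPI-3.0; use MPI_Get_address")))
    int removed_MPI_Address(void *location, void *address);
    #define MPI_Address(location, address) removed_MPI_Address((location), (address))
    #endif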
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
(cherry-picked from 3136a1706cdacb3f7530937da1dcfe15d2febc79)
Commit c6070fd2e broke building the Fortran bindings
with PGI compilers. It turns out PGI compilers need
to link in the *.o from a module file whether or not
any module subroutines are defined in the module file.
Related to #6411
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 266bc3aced5ff9019f01faef1ed01dd463fafd41)
For remote-node peers, pack a smaller worker address, which contains
network device addresses only. This reduces the amount of OOB traffic
during startup.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 751d88192d05edb7e1912bab4e48643c6f9e1574)
mark the "self" peer OMPI_OSC_RDMA_PEER_LOCAL_BASE when
the window is dynamically created and use_cpu_atomics is set
in order to correctly handle communications to self.
Thanks Bart Janssens for reporting this issue.
Refs. open-mpi/ompi#6394
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(back-ported from commit open-mpi/ompi@fe05fcc11a)
Update the OPAL_CHECK_OFI configury macro:
- Make it safe to call the macro multiple times:
- The checks only execute the first time it is invoked
- On subsequent invocations, it just emits a friendly "checking..."
message so that the configure output is sensible/logical
- With the goal of ultimately removing opal/mca/common/ofi, rename the
output variables from OPAL_CHECK_OFI to be
opal_ofi_{happy|CPPFLAGS|LDFLAGS|LIBS}.
- Update btl/usnic and mtl/ofi for these new conventions.
- Also, don't use AC_REQUIRE to invoke OPAL_CHECK_OFI because that
causes the macro to be invoked at a fairly random time, which makes
configure stdout confusing / hard to grok.
- Remove a little leftover cruft in OPAL_CHECK_OFI, too (which
resulted in an indentation change, making the change to
opal_check_ofi.m4 look larger than it really is).
Thanks Alastair McKinstry for the report and initial fix.
Thanks Rashika Kheria for the reminder.
Updated from master cherry pick: the OFI BTL does not exist on the
v4.0.x branch. Therefore, did not include the OFI BTL changes on
master in this cherry pick.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit f5e1a672ccd5db127e85e1e8f6bcfeb8a8b04527)
When compiling mpi.h with a modern C++ compiler and a high degree of
pickiness (e.g., -Wold-style-cast), casting using (void*) in the
OMPI_PREDEFINED_GLOBAL and MPI_STATUS*_IGNORE macros will emit
warnings. So if we're compiling with a C++ compiler, use C++'s
static_cast<> instead of (void*).
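A minimal sketch of the cast pattern (the macro name here is illustrative, not
the actual mpi.h definition):

    /* Illustrative macro, not the actual mpi.h definition. */
    #if defined(__cplusplus)
    #define EXAMPLE_PREDEFINED_GLOBAL(type, global) \
            (static_cast<type>(static_cast<void *>(&(global))))
    #else
    #define EXAMPLE_PREDEFINED_GLOBAL(type, global) \
            ((type) ((void *) &(global)))
    #endif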
Thanks to @shadow-fax for identifying the issue.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 30afdcead915c9c5a62305b93460e2ba8cc6f801)
Similar to #6286, rounding the number of bytes through a single-precision floating point value in order to round up the result of a division is a potential risk due to rounding errors.
- remove the floating point operations used to round up
- remove the floating point conversion used to round down (the native behavior of integer division)
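A small sketch of the arithmetic: rounding up with pure integer operations
avoids the single-precision detour entirely (the sizes used here are arbitrary).

    #include <stdio.h>
    #include <stddef.h>

    int main(void)
    {
        size_t bytes = 536870913;                 /* arbitrary example size */
        size_t block = 4096;                      /* arbitrary block size   */

        size_t blocks_up = (bytes + block - 1) / block;  /* round up, no FP */
        size_t blocks_dn = bytes / block;                /* round down      */

        printf("up: %zu  down: %zu\n", blocks_up, blocks_dn);
        return 0;
    }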
Signed-off-by: René Widera <r.widera@hzdr.de>
(cherry picked from commit a91fab80a1e55e1df15f649e18d247e5d4654eb9)
This commit fixes a problem reported on the mailing list with
individual writes larger than 512 MB.
The culprit is a floating point division of two large, close values.
Changing the datatypes from float to double (which is what is being
used in the fcoll components) fixes the problem.
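A quick illustration of why single precision breaks down right around 512 MB:
above 2^24, a float can no longer represent every integer exactly, so the
quotient of two large, close byte counts can collapse to 1.0.

    #include <stdio.h>

    int main(void)
    {
        long a = 536870913;                   /* 512 MB + 1 byte */
        long b = 536870912;                   /* 512 MB          */

        float  qf = (float) a / (float) b;    /* both round to 2^29, so qf == 1.0 */
        double qd = (double) a / (double) b;  /* keeps the difference             */

        printf("float: %.10f  double: %.10f\n", qf, qd);
        return 0;
    }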
See issue #6285 and
https://forum.hdfgroup.org/t/cannot-write-more-than-512-mb-in-1d/5118
Thanks to Axel Huebl and René Widera for reporting the issue.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit c0f8ce0fff4684b670135043dd150abc9d83d988)
treematch/km_partitioning.c does #include "config.h",
but there is no such file when the embedded treematch is used.
In order to prevent the embedded treematch from incorrectly using
the config.h from the embedded hwloc, generate a dummy config.h.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 0aeb27f77650d3ee97e17e770c9e5aa487d5e1f5)
ACCUMULATE, unlike REDUCE, can be used with derived
datatypes with predefined operations, with some
restrictions outlined in MPI-3 11.3.4: the derived
datatype must be composed entirely of one predefined
datatype (so you can do all the construction you want,
but at the bottom, you can only use one datatype, say,
MPI_INT).
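As a hedged sketch of what that restriction permits, the following uses a
derived datatype built entirely from MPI_INT with MPI_Accumulate and MPI_SUM
(buffer sizes, counts, and the target rank are arbitrary):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int win_buf[16] = {0}, origin[16];
        for (int i = 0; i < 16; i++) origin[i] = i;

        MPI_Win win;
        MPI_Win_create(win_buf, sizeof(win_buf), sizeof(int),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        /* Derived datatype composed entirely of MPI_INT: allowed for
         * MPI_Accumulate with a predefined op such as MPI_SUM. */
        MPI_Datatype vec;
        MPI_Type_vector(4, 2, 4, MPI_INT, &vec);
        MPI_Type_commit(&vec);

        MPI_Win_fence(0, win);
        MPI_Accumulate(origin, 1, vec, 0 /* target rank */, 0, 1, vec,
                       MPI_SUM, win);
        MPI_Win_fence(0, win);

        MPI_Type_free(&vec);
        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }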
Refs. open-mpi/ompi#6275
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(back-ported from commit open-mpi/ompi@bc1cab5498)
Valgrind warns that *newtype is uninitialized when calling from
Fortran as e.g.
use mpi
integer :: t, err
call MPI_Type_create_f90_integer(5, t, err)
Since newtype is intent(out), this should not happen. There is
no reason to convert the type using PMPI_Type_f2c, only to overwrite
it immediately afterwards. The other type_create_* functions
did not convert newtype.
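A simplified sketch of the wrapper pattern in question (not the exact OMPI
mpif-h source): because newtype is intent(out), the C handle can simply start
out uninitialized instead of being converted from the incoming Fortran handle.

    #include <mpi.h>

    /* Simplified sketch, not the exact mpif-h wrapper. */
    void example_type_create_f90_integer_f(MPI_Fint *r, MPI_Fint *newtype,
                                           MPI_Fint *ierr)
    {
        /* The old code did:  MPI_Datatype c_new = PMPI_Type_f2c(*newtype);
         * which reads the caller's uninitialized intent(out) argument.    */
        MPI_Datatype c_new;                  /* deliberately left unset here */

        int c_ierr = PMPI_Type_create_f90_integer((int) *r, &c_new);
        *ierr = (MPI_Fint) c_ierr;

        if (MPI_SUCCESS == c_ierr) {
            *newtype = PMPI_Type_c2f(c_new);
        }
    }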
The valgrind warnings:
==28441== Conditional jump or move depends on uninitialised value(s)
==28441== at 0x581B555: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [...])
==28441== by 0x400C46: main (in [...])
==28441==
==28441== Conditional jump or move depends on uninitialised value(s)
==28441== at 0x581B563: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [..])
==28441== by 0x400C46: main (in [...])
==28441==
==28441== Use of uninitialised value of size 8
==28441== at 0x581B577: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [...])
==28441== by 0x400C46: main (in [...])
==28441==
Signed-off-by: Risto Toijala <risto.toijala@gmail.com>
(cherry picked from commit f14a0f4fc981a488150ac7426683e94645f9fdf7)
Refs https://github.com/open-mpi/ompi/issues/6227. Thanks to
George Marselis for reporting.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 62321be186dd7d3efcedc2e801f226f6660ea0c4)
Adding the implementations of the functions that were removed
from the MPI standard to the build list, regardless of the
state of OMPI_ENABLE_MPI1_COMPAT.
According to the README, we want the OMPI_ENABLE_MPI1_COMPAT
configure flag to control which MPI prototypes are exposed in
mpi.h, NOT which are built into the MPI library. Those will
remain in the MPI library until a future major release (5.0?).
NOTE: for the Fortran implementations, we define
OMPI_OMIT_MPI1_COMPAT_DECLS to 0 rather than
OMPI_ENABLE_MPI1_COMPAT to 1. I'm not sure why, but
this seems to work correctly.
Also changing the removed MPI_Errhandler_create implementation
to use the non-removed MPI_Comm_errhandler_function prototype
(the prototype is unchanged from MPI_Comm_errhandler_fn).
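For reference, a sketch of that declaration, using the non-removed typedef:

    int MPI_Errhandler_create(MPI_Comm_errhandler_function *function,
                              MPI_Errhandler *errhandler);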
NOTE: This commit is *NOT* a cherry-pick from master, because
on master, we are no longer building those symbols by
default, but on v4.0.x we _ARE_ still building these
symbols by default. This is because the v4.0.x branch
is to remain backwards compatible with v3.0.x, while at
the same time removing the "removed" symbols from mpi.h
(unless the user configures with --enable-mpi1-compatibility)
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>