- added synchronized flush operation on quiet call.
- flush is implemented using get operation
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 0b108411f8)
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either using atomic
operations such as shmem_atomic_fetch or can use point-to-point synchronization
routines such as shmem_wait_until and shmem_test.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 2ef5bd8b36)
The new proc group is created from the "world_group" based on the
ranks mapping which can be directly taken from proc_name->vpid.
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
- there was a set of UCX related issues reported which caused
by mmap API hooks conflicts. We added diagnostic of such
problems to simplify bug-resolving pipeline
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit d8e3562bae)
pshmem.h now includes shmem.h (since open-mpi/ompi@f46130cd20) and some macros were removed at that time.
Use the OSHMEM_HAVE_C11 macro (defined in shmem.h) instead of the
previous OSHMEMP_HAVE_C11 macrso previously defined in pshmem.h
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 5ea939aa54)
Fixes scoll_basic failures with shmem_verifier, caused by recent changes
in handling of zero-size collectives.
- Check for zero-size length only for fixed size collect (shmem_fcollect),
but not for variable-size collect (shmem_collect)
- Add 'nlong_type' parameter to internal broadcast function, to indicate
whether the 'nlong' parameter is valid on non-root PEs, since it's
used by shmem_collect algorithm. Before this change, some components
assumed it's true (scoll_mpi) while others assumed it's false
(scoll_basic).
- In scoll_basic, if nlong_type==false, do not exit if nlong==0, since
this parameter may not be the same on all PEs.
- In scoll_mpi, fallback to scoll_basic if nlong_type==false, since MPI
requires the 'count' argument of MPI_Bcast to be valid on all ranks.
(Picked from master 939162e)
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
- according spec 1.4, annex C shmem collectives should process
calls where number of elements is zero independently from pointer
value
- added zero-count processing - it just call barrier to
sync ranks
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 9de128afaf)
- added <cr> to split API groups to simplify human processing
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 6e78102089)
- added missing file to profile makefile
- constants SHMEM_CTX_* are shifted into public header
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 4a3e83780c)
mca_scoll_basic_alltoall() passed (pSync + 1) to barrier function, but
the value of _SHMEM_ALLTOALL_SYNC_SIZE is 1, which made the barrier
function use an invalid memory location. In particular, this location
was not initialized to _SHMEM_SYNC_VALUE, which broke the barrier
algorithm and it did not complete: One PE could read 0 from its peer and
assume the peer already started the barrier, and then write 1 to the
peer. Then, the peer entered the barrier and overwrote the 1 with 0, and
then it waited forever to see '1' in its pSync.
Found with shmem_verifier test suite.
(picked from master 6754bf1)
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
- added synonim to common ucx variables to allow
to print it in opal_info -a
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit e00f7a68ba)
- to avoid value mix used C99 style of object initializations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 2806504290)
- there is C11 naming conflict - atomic_init is C macro
which cause building issue
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 3295b23800)