- according spec 1.4, annex C shmem collectives should process
calls where number of elements is zero independently from pointer
value
- added zero-count processing - it just call barrier to
sync ranks
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 9de128afaf)
mca_scoll_basic_alltoall() passed (pSync + 1) to barrier function, but
the value of _SHMEM_ALLTOALL_SYNC_SIZE is 1, which made the barrier
function use an invalid memory location. In particular, this location
was not initialized to _SHMEM_SYNC_VALUE, which broke the barrier
algorithm and it did not complete: One PE could read 0 from its peer and
assume the peer already started the barrier, and then write 1 to the
peer. Then, the peer entered the barrier and overwrote the 1 with 0, and
then it waited forever to see '1' in its pSync.
Found with shmem_verifier test suite.
(picked from master 6754bf1)
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
- added synonim to common ucx variables to allow
to print it in opal_info -a
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit e00f7a68ba)
- to avoid value mix used C99 style of object initializations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 2806504290)
- there is C11 naming conflict - atomic_init is C macro
which cause building issue
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 3295b23800)
- added implementation and/or/xor operations for post and
fetch-op notations
- implemented basic and UCX transports, mxm added
NON-IMPLEMENTED wrapper
- updated C interfaces only (fortran will be added later)
- existing API is not updated to spec v1.4
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
- added common logging infrastructure for all
UCX modules
- all UCX modules are switched to new infra
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
- removed atomic-basic-specific operand from module API
- added own calls for add and swap operations
- minor optimization for UCX atomics
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
- some common functionality of del_procs calls is moved into
mca_common module
- blocking ucp_put call is replaced by non-blocking routine
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
- method mca_spml_ucx_get_mkey_slow is moved into .c module,
added pointer to this method into mca_spml_ucx_t structure
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
- some spml calls are marked as inline to exclude cross-module
dependency
- updated get-key call to get link to local module
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
- added opal_progress call to wait function to avoid
possible [dead]lock issues
- wait call is declared as inline
- minor fixes
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
+ Add quiet method to SPML, so it can have different implementation with
fence.
+ Use ucp_worker_fence for spml_fence method of UCX SPML
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
This patch disables the oshmem layer if there are no SPMLs that
will build. With the limited set of SPMLs available to support
oshmem, many builds end up installing an oshmem library that we
know will not work. There has been a bit of customer confusion
over oshmem, hopefully this will lead customers in the right
direction.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>