8a329a797c
mca_scoll_basic_alltoall() passed (pSync + 1) to the barrier function,
but _SHMEM_ALLTOALL_SYNC_SIZE is 1, so the barrier function used a
memory location outside the pSync array. In particular, this location
was not initialized to _SHMEM_SYNC_VALUE, which broke the barrier
algorithm and kept it from completing: one PE could read 0 from its
peer, assume the peer had already started the barrier, and write 1 to
the peer. The peer then entered the barrier, overwrote the 1 with 0, and
waited forever to see 1 in its pSync.
Found with the shmem_verifier test suite.
(picked from master
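
The interleaving described in the commit message can be replayed with a small single-threaded model. This is only a sketch under stated assumptions: it does not use OpenSHMEM or the scoll_basic code, the function `peer_is_released` and the constants `SYNC_VALUE`, `STARTED` and `RELEASED` are hypothetical names, and `-1` is an arbitrary stand-in for _SHMEM_SYNC_VALUE. The 0/1 protocol is simplified from the description above rather than copied from the real barrier.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not Open MPI identifiers. */
#define SYNC_VALUE (-1L) /* assumed placeholder for _SHMEM_SYNC_VALUE */
#define STARTED    0L    /* peer announces "I entered the barrier"   */
#define RELEASED   1L    /* other PE answers "you may leave"         */

/*
 * Replays one fixed interleaving for two PEs and reports whether the
 * peer would ever observe RELEASED in its sync word.
 * initial_word is the content of the peer's sync word before the
 * barrier starts: SYNC_VALUE when pSync was initialized correctly,
 * or whatever happens to be in memory when (pSync + 1) points past a
 * one-element pSync array.
 */
bool peer_is_released(long initial_word)
{
    long peer_sync = initial_word;
    bool pe_a_signalled = false;

    /* PE A arrives first and inspects the peer's sync word. */
    if (peer_sync == STARTED) {
        /* A believes the peer is already waiting and releases it. */
        peer_sync = RELEASED;
        pe_a_signalled = true;
    }

    /* The peer enters the barrier and announces itself. */
    peer_sync = STARTED;

    /* If A has not signalled yet, it now sees the announcement. */
    if (!pe_a_signalled && peer_sync == STARTED) {
        peer_sync = RELEASED;
    }

    /* The peer spins until it reads RELEASED; false means it hangs. */
    return peer_sync == RELEASED;
}
```

In this model, `peer_is_released(SYNC_VALUE)` returns true, while `peer_is_released(0L)` returns false: with a stray 0 in the sync word, PE A signals too early, the peer overwrites the signal, and nobody writes 1 again.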
| Name |
|---|
| Makefile.am |
| scoll_basic_alltoall.c |
| scoll_basic_barrier.c |
| scoll_basic_broadcast.c |
| scoll_basic_collect.c |
| scoll_basic_component.c |
| scoll_basic_module.c |
| scoll_basic_reduce.c |
| scoll_basic.h |