openmpi/ompi/mca
Nathan Hjelm eed7b45db5 osc/rdma: fix issue identified by Berk Hess
osc/rdma uses counters to determine whether all messages have been received
before exiting synchronization calls. The problem is that the active
target counter is always increasing (never zeroed). If more than 2^31-1
messages are sent, the counter overflows (in itself this isn't an error),
and the overflow causes test/wait to return before the communication
is complete. There is an additional error in the use of the fragment
flush function: if PSCW synchronization is in use, this function must not
be called unless a post message has arrived.
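
The counter problem described above can be illustrated with a minimal sketch in C. This is not the osc/rdma code: the types and functions (monotonic_counter_t, epoch_counter_t, wrap_add, and the rest) are hypothetical names made up for illustration, and the assumption is that completion is detected by comparing a received count against an expected count. It shows how a never-zeroed signed 32-bit counter can report completion too early once it wraps, and how zeroing the counter at the start of each epoch avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a counter that is never zeroed between epochs. */
typedef struct {
    int32_t received;  /* messages received since the window was created */
    int32_t expected;  /* value `received` must reach to end the epoch */
} monotonic_counter_t;

/* Add with wraparound but without signed-overflow undefined behavior.
 * The uint32_t -> int32_t conversion is implementation-defined before C23
 * and wraps on mainstream compilers. */
static int32_t wrap_add(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Begin an epoch that expects `count` more messages. */
static void monotonic_begin(monotonic_counter_t *c, int32_t count)
{
    c->expected = wrap_add(c->received, count);
}

/* Naive completion check: wrong once `expected` has wrapped negative. */
static bool monotonic_complete(const monotonic_counter_t *c)
{
    return c->received >= c->expected;
}

/* Hypothetical alternative: counters that are reset for every epoch. */
typedef struct {
    int32_t received;  /* messages received in the current epoch */
    int32_t expected;  /* messages expected in the current epoch */
} epoch_counter_t;

static void epoch_start(epoch_counter_t *c, int32_t count)
{
    assert(count >= 0);
    c->received = 0;   /* zeroing keeps the values far from INT32_MAX */
    c->expected = count;
}

static void epoch_message_arrived(epoch_counter_t *c)
{
    c->received++;
}

static bool epoch_complete(const epoch_counter_t *c)
{
    return c->received >= c->expected;
}
```

With received at INT32_MAX - 1 and four more messages expected, the expected value wraps to a negative number, so monotonic_complete reports completion before any of the four messages has arrived. The per-epoch variant stays incomplete until all four arrivals are counted.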

Relevant mailing list thread: http://www.open-mpi.org/community/lists/devel/2014/10/16016.php

This commit fixes both issues. Tested against MTT and issue reproducer.

Closes #224.
2014-10-07 11:45:22 -06:00
..
bcol silence warnings 2014-08-11 07:36:46 +00:00
bml Fix the "unreachable" message so it outputs the correct hostname for the remote proc. Cleanup some of the pmix stuff when running corner cases of errors 2014-08-22 19:20:45 +00:00
coll Fix iallgather problem with intercommunicators 2014-10-02 11:45:17 -06:00
common George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) 2014-07-26 00:47:28 +00:00
crcp helpfiles: remove empty helpfiles 2014-08-08 13:33:47 +00:00
dpm Per the PMIx RFC: 2014-08-21 18:56:47 +00:00
fbtl implementation of non-blocking read/write operations through aio 2014-09-23 21:27:57 +00:00
fcoll make the zero byte read/write scenarios work without the contiguous flag. 2014-09-09 16:26:14 +00:00
fs George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) 2014-07-26 00:47:28 +00:00
io distscript: remove configure.params and autogen.subdirs kruft 2014-10-02 11:32:54 -07:00
mtl Bring over changes to MXM from pmix branch: 2014-09-03 18:22:11 +00:00
op distscript: remove configure.params and autogen.subdirs kruft 2014-10-02 11:32:54 -07:00
osc osc/rdma: fix issue identified by Berk Hess 2014-10-07 11:45:22 -06:00
pml Per the PMIx RFC: 2014-08-21 18:56:47 +00:00
pubsub Per the PMIx RFC: 2014-08-21 18:56:47 +00:00
rte Per the PMIx RFC: 2014-08-21 18:56:47 +00:00
sbgp check-help-strings cleanup 2014-08-11 03:19:57 +00:00
sharedfp George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) 2014-07-26 00:47:28 +00:00
topo distscript: remove configure.params and autogen.subdirs kruft 2014-10-02 11:32:54 -07:00
vprotocol ompi_mpi_abort had one extra argument that was never used. Clean it up. 2014-07-03 00:34:44 +00:00