903762e194
The osc/sm component was using a simple counter to determine if all expected posts had arrived to start a PSCW access epoch. This is incorrect as a post may arrive from a peer that isn't part of the current start group. There are many ways this could have been fixed. This commit adds an n^2 bitmap. When a process posts it sets a bit in the bitmap associated with the access rank to indicate the post is complete. The access rank checks for and clears the bits associated with all the processes in the start group. The bitmap requires comm_size ^ 2 bits of space. This should be managable as most nodes have relatively small numbers of processes. If this changes another algorigthm can be implemented. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov> |
||
---|---|---|
.. | ||
base | ||
portals4 | ||
pt2pt | ||
rdma | ||
sm | ||
Makefile.am | ||
osc.h |