This commit fixes the following bugs:
- Allow a BTL to be used for communication if it can communicate with
all non-self peers and it supports global atomic visibility. In
this case, CPU atomics can be used for self and the BTL for any
other peer.
- It was possible to get into a state where different threads of an
MPI process could issue conflicting accumulate operations to a
remote peer. To eliminate this race we now update the peer flags
atomically.
- Queue up and re-issue put operations that fail during a BTL
callback. Such failures can occur during an accumulate operation
and were previously an unhandled error case.
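
For illustration only, the peer-flag race described in the second item
can be closed with a single atomic read-modify-write. The sketch below
uses C11 atomics; the type example_peer_t, the bit
PEER_FLAG_ACCUMULATING, and both helper names are hypothetical and are
not taken from the actual code touched by this commit.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit and peer type, for illustration only. */
#define PEER_FLAG_ACCUMULATING 0x1u

typedef struct {
    _Atomic uint32_t flags;
} example_peer_t;

/* Atomically set the bit. Returns true only for the one thread that
 * changed it from clear to set, so two threads can never both believe
 * they own the accumulate operation to this peer. */
static bool peer_try_set_accumulating(example_peer_t *peer)
{
    uint32_t old = atomic_fetch_or(&peer->flags, PEER_FLAG_ACCUMULATING);
    return (old & PEER_FLAG_ACCUMULATING) == 0;
}

/* Atomically clear the bit without disturbing any other flag bits. */
static void peer_clear_accumulating(example_peer_t *peer)
{
    atomic_fetch_and(&peer->flags, ~PEER_FLAG_ACCUMULATING);
}
```

A non-atomic "load, test, store" sequence on the flags word would let
two threads both observe the bit as clear; the fetch-or collapses the
test and the update into one step.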

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>