Jeff Squyres 6ae45b34fc usnic: check connectivity on first communication to a peer
Previously, we were only checking connectivity upon the first "send" to
a peer.  But this ignores the case where the first communication to a
peer is actually an ACK -- i.e., we successfully received something
from the peer and we need to send an ACK back.  So we need to verify
that the ACK will actually get there.

Specifically, certain asymmetric routing cases can lead to a hang if
we don't check the connectivity in both directions.  E.g., the
sender may be able to get traffic to the receiver while the receiver
is unable to get traffic back to the sender because it made a
different routing decision than the sender.

In this case, the connectivity checker from the sender could succeed
(because the connectivity checker will ACK along the same path on
which the ping was received), but sending a BTL ACK could fail
(because the BTL ACK will be sent back along the path chosen by the
graph algorithm, which, in an erroneous asymmetric routing scenario,
may be different/wrong).

Hence, we want to trigger the connectivity checker at the first
communication from A->B, which may either be a BTL send or an ACK.
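
The idea can be sketched roughly as follows.  This is only an
illustration, not the actual usnic BTL code: the struct and every
function name below (peer_endpoint_t, start_connectivity_check,
ensure_connectivity_checked, send_fragment, send_ack) are hypothetical.
It assumes a per-peer flag that is consulted before any outgoing
traffic, so an ACK triggers the check just as a send would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-peer state; not the real usnic endpoint struct. */
typedef struct {
    int  peer_id;
    bool connectivity_checked;  /* true once a check has been requested */
    int  checks_started;        /* counter kept only for illustration */
} peer_endpoint_t;

/* Hypothetical stand-in for asking the connectivity checker to ping
   the peer.  Here it only records that a check was requested. */
static void start_connectivity_check(peer_endpoint_t *ep)
{
    ep->checks_started++;
    printf("connectivity check started for peer %d\n", ep->peer_id);
}

/* Called before ANY outgoing traffic to the peer, so the check fires
   on the first communication regardless of its type. */
static void ensure_connectivity_checked(peer_endpoint_t *ep)
{
    assert(ep != NULL);
    if (!ep->connectivity_checked) {
        ep->connectivity_checked = true;
        start_connectivity_check(ep);
    }
}

/* Outgoing BTL send (the only path that triggered the check before). */
static void send_fragment(peer_endpoint_t *ep)
{
    ensure_connectivity_checked(ep);
    /* ... the fragment would be posted here ... */
}

/* Outgoing ACK: now also triggers the check if it is the first
   traffic sent to this peer. */
static void send_ack(peer_endpoint_t *ep)
{
    ensure_connectivity_checked(ep);
    /* ... the ACK would be posted here ... */
}
```

With this shape, a receiver whose first traffic to a peer is an ACK
still requests the check exactly once, and later sends to the same
peer do not request it again.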

Reviewed by Dave Goodell.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32309.
2014-07-24 21:32:56 +00:00