
usnic: update connectivity checker help message

Show an example of using the btl_usnic_connectivity_map option.  Also,
mention that another reason for the "total connectivity failure" may
be due to asymmetric / unexpected routing.

Reviewed by Dave Goodell.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32465.
This commit is contained in:
Jeff Squyres 2014-08-08 17:18:29 +00:00
parent c5b2f9c8a5
commit 323b9f346c


@@ -174,9 +174,18 @@ Your MPI job is going to abort now.
 Large message size: %u
 
 Note that this behavior usually indicates that the MTU of some network
-link is too small between these two interfaces. You should verify that
-UDP traffic with payloads up to the "large message size" listed above
-can flow between the specified interfaces on these servers.
+link is too small between these two interfaces.
+
+You should verify that UDP traffic with payloads up to the "large
+message size" listed above can flow between these two interfaces. You
+should also verify that Open MPI is choosing to pair IP interfaces
+consistently. For example:
+
+    mpirun --mca btl_usnic_connectivity_map mymap ...
+
+Check the resulting "mymap*" files to see the exact pairing of IP
+interfaces. Inconsistent results may be indicative of underlying
+network misconfigurations.
 #
 [connectivity error: small bad, large ok]
 The Open MPI usNIC BTL was unable to establish full connectivity
@@ -199,15 +208,28 @@ Your MPI job is going to abort now.
 
 This is a very strange network error, and should not occur in most
 situations. You may be experiencing high amounts of congestion, or
-this may indicate some kind of network misconfiguration. You should
-verify that UDP traffic with payloads up to the "large message size"
-listed above can flow between the specified interfaces on these
-servers.
+this may indicate some kind of network misconfiguration.
+
+You should verify that UDP traffic with payloads up to the "large
+message size" listed above can flow between these two interfaces. You
+should also verify that Open MPI is choosing to pair IP interfaces
+consistently. For example:
+
+    mpirun --mca btl_usnic_connectivity_map mymap ...
+
+Check the resulting "mymap*" files to see the exact pairing of IP
+interfaces. Inconsistent results may be indicative of underlying
+network misconfigurations.
 #
 [connectivity error: small bad, large bad]
 The Open MPI usNIC BTL was unable to establish any connectivity
 between at least one pair of interfaces on servers in the MPI job.
-Specifically, no UDP messages seemed to flow between the interfaces.
+This can happen for several reasons, including:
+
+1. No UDP traffic is able to flow between the interfaces listed below.
+2. There is asymmetric routing between the interfaces listed below,
+   leading Open MPI to discard UDP traffic it thinks is from an
+   unexpected source.
 
 Your MPI job is going to abort now.
@@ -223,9 +245,18 @@ Your MPI job is going to abort now.
 Large message size: %u
 
 Note that this behavior usually indicates some kind of network
-misconfiguration. You should verify that UDP traffic with payloads up
-to the "large message size" listed above can flow between the
-specified interfaces on these servers.
+misconfiguration.
+
+You should verify that UDP traffic with payloads up to the "large
+message size" listed above can flow between these two interfaces. You
+should also verify that Open MPI is choosing to pair IP interfaces
+consistently. For example:
+
+    mpirun --mca btl_usnic_connectivity_map mymap ...
+
+Check the resulting "mymap*" files to see the exact pairing of IP
+interfaces. Inconsistent results may be indicative of underlying
+network misconfigurations.
 #
 [ibv_create_ah timeout]
 The usnic BTL failed to create addresses for remote peers within the
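
The help text asks the user to verify that UDP traffic with payloads up to the "large message size" can flow between two interfaces, and notes that replies from an unexpected source are discarded. The following is a minimal sketch of such a check in Python, not part of this commit or of Open MPI. The function names (echo_once, probe_udp, demo), the loopback address, and the 4000-byte payload are all hypothetical choices for illustration. It uses ordinary kernel UDP sockets, so it may not exercise the exact path that usNIC traffic takes; treat a passing result as a rough sanity check only.

```python
import socket
import threading


def echo_once(server_sock):
    """Receive one datagram and send it back to whoever sent it."""
    try:
        data, sender = server_sock.recvfrom(65535)
        server_sock.sendto(data, sender)
    except OSError:
        # Timed out or failed to send; the probe side will report failure.
        pass


def probe_udp(peer, payload_size, timeout=2.0):
    """Return True if `peer` echoes back a payload of `payload_size` bytes.

    A reply that arrives from any address other than `peer` is treated
    as a failure, loosely mirroring the "unexpected source" case in the
    help message.
    """
    payload = bytes(payload_size)  # payload_size zero bytes
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, peer)
            reply, sender = sock.recvfrom(65535)
        except OSError:
            # Covers timeouts and "message too long" style send errors.
            return False
    return sender == peer and reply == payload


def demo(payload_size=4000):
    """Run the probe against a local echo socket (hypothetical values)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.settimeout(2.0)
        server.bind(("127.0.0.1", 0))  # let the OS pick a free port
        peer = server.getsockname()
        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()
        ok = probe_udp(peer, payload_size)
        worker.join()
    return ok


if __name__ == "__main__":
    print("large payload echoed:", demo())
```

To adapt this to a real pair of servers, one would run echo_once on one host bound to the interface in question and call probe_udp from the other host with the size reported as "Large message size" in the error output.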