usnic: update connectivity checker help message
Show an example of using the btl_usnic_connectivity_map option. Also, mention that another reason for the "total connectivity failure" may be due to asymmetric / unexpected routing.

Reviewed by Dave Goodell.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32465.
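The "asymmetric / unexpected routing" case can be illustrated with a small model. This is only a sketch: the function name `unexpected_sources` and the sample addresses are hypothetical, and this is not Open MPI's actual checking logic. It assumes a checker that probes one address and discards any reply whose source address differs from the address it probed.

```python
import ipaddress


def unexpected_sources(probes):
    """Return the (probed, reply_src) pairs whose addresses differ.

    `probes` is an iterable of (probed_addr, reply_src_addr) strings.
    A mismatch models a reply that took a different return path and
    therefore appears to come from an unexpected source.
    """
    mismatches = []
    for probed, reply_src in probes:
        # Compare parsed addresses so textual variants of the same
        # address are not reported as mismatches.
        if ipaddress.ip_address(probed) != ipaddress.ip_address(reply_src):
            mismatches.append((probed, reply_src))
    return mismatches


# Hypothetical observations: the second reply came back from another subnet.
observed = [("10.0.0.2", "10.0.0.2"), ("10.0.0.3", "10.1.0.3")]
print(unexpected_sources(observed))
```

Under this model, a checker that drops every mismatched reply sees no connectivity at all on that pair, even though UDP traffic is flowing.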
Parent: c5b2f9c8a5
Commit: 323b9f346c
@@ -174,9 +174,18 @@ Your MPI job is going to abort now.
 Large message size: %u
 
 Note that this behavior usually indicates that the MTU of some network
-link is too small between these two interfaces. You should verify that
-UDP traffic with payloads up to the "large message size" listed above
-can flow between the specified interfaces on these servers.
+link is too small between these two interfaces.
+
+You should verify that UDP traffic with payloads up to the "large
+message size" listed above can flow between these two interfaces. You
+should also verify that Open MPI is choosing to pair IP interfaces
+consistently. For example:
+
+mpirun --mca btl_usnic_connectivity_map mymap ...
+
+Check the resulting "mymap*" files to see the exact pairing of IP
+interfaces. Inconsistent results may be indicative of underlying
+network misconfigurations.
 #
 [connectivity error: small bad, large ok]
 The Open MPI usNIC BTL was unable to establish full connectivity
@@ -199,15 +208,28 @@ Your MPI job is going to abort now.
 
 This is a very strange network error, and should not occur in most
 situations. You may be experiencing high amounts of congestion, or
-this may indicate some kind of network misconfiguration. You should
-verify that UDP traffic with payloads up to the "large message size"
-listed above can flow between the specified interfaces on these
-servers.
+this may indicate some kind of network misconfiguration.
+
+You should verify that UDP traffic with payloads up to the "large
+message size" listed above can flow between these two interfaces. You
+should also verify that Open MPI is choosing to pair IP interfaces
+consistently. For example:
+
+mpirun --mca btl_usnic_connectivity_map mymap ...
+
+Check the resulting "mymap*" files to see the exact pairing of IP
+interfaces. Inconsistent results may be indicative of underlying
+network misconfigurations.
 #
 [connectivity error: small bad, large bad]
 The Open MPI usNIC BTL was unable to establish any connectivity
 between at least one pair of interfaces on servers in the MPI job.
-Specifically, no UDP messages seemed to flow between the interfaces.
+This can happen for several reasons, including:
+
+1. No UDP traffic is able to flow between the interfaces listed below.
+2. There is asymmetric routing between the interfaces listed below,
+leading Open MPI to discard UDP traffic it thinks is from an
+unexpected source.
 
 Your MPI job is going to abort now.
 
@@ -223,9 +245,18 @@ Your MPI job is going to abort now.
 Large message size: %u
 
 Note that this behavior usually indicates some kind of network
-misconfiguration. You should verify that UDP traffic with payloads up
-to the "large message size" listed above can flow between the
-specified interfaces on these servers.
+misconfiguration.
+
+You should verify that UDP traffic with payloads up to the "large
+message size" listed above can flow between these two interfaces. You
+should also verify that Open MPI is choosing to pair IP interfaces
+consistently. For example:
+
+mpirun --mca btl_usnic_connectivity_map mymap ...
+
+Check the resulting "mymap*" files to see the exact pairing of IP
+interfaces. Inconsistent results may be indicative of underlying
+network misconfigurations.
 #
 [ibv_create_ah timeout]
 The usnic BTL failed to create addresses for remote peers within the
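The help text asks the user to verify that UDP traffic with payloads up to the "large message size" can flow between two interfaces. One way to do this by hand is sketched below. It is a generic UDP echo probe, not Open MPI's own connectivity checker. The function names, the loopback address, and the 4096-byte payload are all assumptions for illustration. In practice, the echo side would run on one server and the probe on the other, with the real interface addresses and the size reported in the error message.

```python
import socket
import threading


def start_echo_server(bind_addr="127.0.0.1"):
    """Start a one-shot UDP echo server; return (thread, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, 0))      # port 0: let the OS pick a free port
    sock.settimeout(5.0)
    port = sock.getsockname()[1]

    def serve():
        try:
            data, peer = sock.recvfrom(65535)
            sock.sendto(data, peer)    # echo the datagram back unchanged
        except OSError:
            pass                       # timed out or socket error: give up
        finally:
            sock.close()

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    return thread, port


def udp_payload_flows(host, port, payload_size, timeout=2.0):
    """Return True if a datagram of payload_size bytes is echoed back intact."""
    payload = bytes(payload_size)      # payload_size zero bytes
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            data, _ = sock.recvfrom(65535)
        except OSError:                # includes socket.timeout
            return False
    return data == payload


if __name__ == "__main__":
    # Loopback demonstration only; substitute real addresses and sizes.
    _, port = start_echo_server()
    print(udp_payload_flows("127.0.0.1", port, 4096))
```

If a small payload is echoed but one near the "large message size" is not, that is consistent with the MTU explanation given in the "small ok, large bad" help message.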