90799f82cd
This commit was SVN r10220.
42 lines
1.8 KiB
Plaintext
# -*- text -*-
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
#                         University Research and Technology
#                         Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
#                         of Tennessee Research Foundation. All rights
#                         reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
#                         University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2006 The Regents of the University of California.
#                         All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the US/English general help file for Open MPI.
#
[btl_mvapi:retry-exceeded]
The retry count is a down counter initialized when the queue pair (QP) is
created. The retry count is defined in the InfiniBand Spec 1.2 (12.7.38):
The total number of times that the sender wishes the receiver to retry
timeout, packet sequence, etc. errors before posting a completion error.

Note that two MCA parameters are involved here:
btl_mvapi_ib_retry_count - The number of times the sender will attempt to
retry (defaults to 7, the maximum value).

btl_mvapi_ib_timeout - The local ack timeout parameter (defaults to 10). The
actual timeout value used is calculated as:
(4.096 microseconds * 2^btl_mvapi_ib_timeout).
See InfiniBand Spec 1.2 (12.7.34) for more details.

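As a rough illustration of the timeout formula above, the following Python
sketch evaluates 4.096 microseconds * 2^btl_mvapi_ib_timeout for a few values.
The helper name ack_timeout_us is hypothetical and is not part of Open MPI;
the sketch only restates the arithmetic and makes no claim about how the
hardware applies the value.

```python
# Base unit of the local ack timeout in microseconds, taken from the
# formula quoted in the help text above.
BASE_TIMEOUT_US = 4.096


def ack_timeout_us(ib_timeout: int) -> float:
    """Return 4.096 us * 2^ib_timeout (hypothetical helper, not an Open MPI API)."""
    return BASE_TIMEOUT_US * (2 ** ib_timeout)


if __name__ == "__main__":
    # 10 is the default mentioned above; the other values are arbitrary examples.
    for exponent in (10, 14, 20):
        millis = ack_timeout_us(exponent) / 1000.0
        print(f"btl_mvapi_ib_timeout={exponent}: {millis:.3f} ms")
```

With the default of 10 the formula gives about 4.19 milliseconds, and each
increment of the parameter doubles the timeout.
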
What to do next:
Note the hosts on which this error has occurred. It has been observed that
rebooting or removing a particular host from the job can resolve this issue.
If you can identify a specific cause or have additional troubleshooting
information, please report it to devel@open-mpi.org.