# -*- text -*-
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
#                         University Research and Technology
#                         Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
#                         of Tennessee Research Foundation. All rights
#                         reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
#                         University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2006 The Regents of the University of California.
#                         All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the US/English general help file for Open MPI.
#
[btl_openib:retry-exceeded]
The InfiniBand retry count between two MPI processes has been
exceeded. "Retry count" is defined in the InfiniBand spec 1.2
(section 12.7.38):

    The total number of times that the sender wishes the receiver to
    retry timeout, packet sequence, etc. errors before posting a
    completion error.

This error typically means that there is something awry within the
InfiniBand fabric itself. You should note the hosts on which this
error has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.

Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:

* btl_openib_ib_retry_count - The number of times the sender will
  attempt to retry (defaulted to 7, the maximum value).

* btl_openib_ib_timeout - The local ACK timeout parameter (defaulted
  to 10). The actual timeout value used is calculated as:

     4.096 microseconds * (2^btl_openib_ib_timeout)

  See the InfiniBand spec 1.2 (section 12.7.34) for more details.
[btl_openib:leave_pinned_multi_port]
# Until ticket #142 is fixed
This release of Open MPI does not support setting the
"mpi_leave_pinned" parameter to a true value when using multiple HCA
ports.

This warning is emitted when multiple HCA ports are detected and
"mpi_leave_pinned" is set to a true value. It informs you that Open
MPI will automatically disregard all HCA ports beyond the first one
(i.e., the "btl_openib_max_btls" MCA parameter has been overridden and
set to 1).

You may silence this warning by setting the
"btl_openib_warn_leave_pinned_multi_port" MCA parameter to 0.
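
As a sketch of one way to do this, assuming a Bourne-style shell and
Open MPI's convention of reading MCA parameters from environment
variables named OMPI_MCA_<parameter name>, the warning could be
silenced as shown here. The application name and process count in the
commented mpirun line are placeholders for illustration only.

```shell
# Assumption: Open MPI picks up MCA parameters from environment
# variables of the form OMPI_MCA_<parameter name>.
export OMPI_MCA_btl_openib_warn_leave_pinned_multi_port=0

# Roughly equivalent per-run form; "./my_mpi_app" and "-np 4" are
# hypothetical placeholders, not values taken from this help file:
#   mpirun --mca btl_openib_warn_leave_pinned_multi_port 0 -np 4 ./my_mpi_app
```

The environment-variable form applies to every job launched from that
shell, whereas the mpirun form affects only a single run.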