openmpi/ompi/mca/btl/openib/help-mpi-btl-openib.txt
Jeff Squyres 6004e177e0 Fixes trac:1133: if you specify a max freelist size that is too small,
you'll get a helpful error message and the openib BTL will deactivate
itself.

This commit was SVN r16133.

The following Trac tickets were found above:
  Ticket 1133 --> https://svn.open-mpi.org/trac/ompi/ticket/1133
2007-09-14 21:42:56 +00:00

# -*- text -*-
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
# University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2006 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2006-2007 Cisco Systems, Inc. All rights reserved.
# Copyright (c) 2007 Mellanox Technologies. All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the US/English help file for Open MPI's OpenFabrics support
# (the openib BTL).
#
[ini file:file not found]
The Open MPI OpenIB BTL component was unable to find or read an INI
file that was requested via the btl_openib_hca_param_files MCA
parameter. Please check this file and/or modify the
btl_openib_hca_param_files MCA parameter:
%s
#
[ini file:not in a section]
While parsing an OpenIB BTL parameter file, values were found that
were not in a valid INI section. These values will be ignored. Please
re-check this file:
%s
At line %d, near the following text:
%s
#
[ini file:unexpected token]
While parsing an OpenIB BTL parameter file, unexpected tokens were
found (this may cause significant portions of the INI file to be
ignored). Please re-check this file:
%s
At line %d, near the following text:
%s
#
[ini file:expected equals]
While parsing an OpenIB BTL parameter file, an equals sign ("=") was
expected but was not found (this may cause significant portions of
the INI file to be ignored). Please re-check this file:
%s
At line %d, near the following text:
%s
#
[ini file:expected newline]
While parsing an OpenIB BTL parameter file, a newline was expected
but was not found (this may cause significant portions of the INI
file to be ignored). Please re-check this file:
%s
At line %d, near the following text:
%s
#
[ini file:unknown field]
While parsing an OpenIB BTL parameter file, an unrecognized field name
was found. Please re-check this file:
%s
At line %d, the field named:
%s
This field, and any other unrecognized fields, will be skipped.
#
[no hca params found]
WARNING: No HCA parameters were found for the HCA that Open MPI
detected:
Hostname: %s
HCA vendor ID: 0x%04x
HCA vendor part ID: %d
Default HCA parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_hca_param_files MCA parameter to set values for your HCA.
NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_hca_params_found to 0.
#
[init-fail-no-mem]
The OpenIB BTL failed to initialize while trying to allocate some
locked memory. This typically indicates that the memlock limits
are set too low. For most HPC installations, the memlock limits
should be set to "unlimited". The failure occurred here:
Host: %s
OMPI source: %s:%d
Function: %s()
Device: %s
Memlock limit: %s
You may need to consult with your system administrator to get this
problem fixed. This FAQ entry on the Open MPI web site may also be
helpful:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
#
[init-fail-create-q]
The OpenIB BTL failed to initialize while trying to create an internal
queue. This typically indicates a failed OpenFabrics installation or
faulty hardware. The failure occurred here:
Host: %s
OMPI source: %s:%d
Function: %s()
Error: %s (errno=%d)
Device: %s
You may need to consult with your system administrator to get this
problem fixed.
#
[btl_openib:retry-exceeded]
The InfiniBand retry count between two MPI processes has been
exceeded. "Retry count" is defined in the InfiniBand spec 1.2
(section 12.7.38):
The total number of times that the sender wishes the receiver to
retry timeout, packet sequence, etc. errors before posting a
completion error.
This error typically means that there is something awry within the
InfiniBand fabric itself. You should note the hosts on which this
error has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.
Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:
  * btl_openib_ib_retry_count - The number of times the sender will
    attempt to retry (defaults to 7, the maximum value).
  * btl_openib_ib_timeout - The local ACK timeout parameter (defaults
    to 10). The actual timeout value used is calculated as:
      4.096 microseconds * (2^btl_openib_ib_timeout)
See the InfiniBand spec 1.2 (section 12.7.34) for more details.
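# Illustration only (not part of the message shown to users): a minimal
# Python sketch of the timeout formula above. The function name
# local_ack_timeout_us is hypothetical and is not an Open MPI API; it
# simply evaluates 4.096 microseconds * 2^btl_openib_ib_timeout.

```python
def local_ack_timeout_us(ib_timeout: int) -> float:
    """Return 4.096 microseconds * 2^ib_timeout, as stated in the help text."""
    if ib_timeout < 0:
        raise ValueError("ib_timeout must be non-negative")
    return 4.096 * (2 ** ib_timeout)


if __name__ == "__main__":
    # Default btl_openib_ib_timeout of 10: 4.096 * 1024 microseconds.
    print(f"{local_ack_timeout_us(10):.3f} us")
```

# With the default value of 10 the formula gives 4194.304 microseconds,
# i.e. roughly 4.2 milliseconds per attempt.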
#
[no active ports found]
WARNING: There is at least one IB HCA on host '%s', but no active
ports were detected. This is almost certainly not what you wanted.
Check your cables and SM configuration.
#
[error in hca init]
WARNING: There were errors during IB HCA initialization on host '%s'.
#
[default subnet prefix]
WARNING: There is more than one active port on host '%s', but the
default subnet GID prefix was detected on more than one of these
ports. If these ports are connected to different physical IB
networks, this configuration will fail in Open MPI. This version of
Open MPI requires that every physically separate IB subnet that is
used between connected MPI processes have a different subnet ID
value.
Please see this FAQ entry for more details:
http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid
NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_default_gid_prefix to 0.
#
[ibv_fork requested but not supported]
WARNING: fork() support was requested for the openib BTL, but it is
not supported on the host %s. Deactivating the openib BTL.
#
[ibv_fork_init fail]
WARNING: fork() support was requested for the openib BTL, but the
library call ibv_fork_init() failed on the host %s.
Deactivating the openib BTL.
#
[wrong buffer alignment]
Invalid buffer alignment %d configured on host '%s'. The alignment
must be greater than zero and a power of two. The default value %d
will be used instead.
#
[of error event]
The OpenFabrics stack has reported a network error event.
Open MPI will try to continue, but your job may end up failing.
Host: %s
MPI process PID: %d
Error number: %d
Error description: %s
This error may indicate connectivity problems within the fabric;
please contact your system administrator.
#
[of unknown event]
The OpenFabrics stack has reported an unknown network error event.
Open MPI will try to continue, but the job may end up failing.
Host: %s
MPI process PID: %d
Error number: %d
This error may indicate that you are using an OpenFabrics library
version that is not currently supported by Open MPI. You might try
recompiling Open MPI against your OpenFabrics library installation to
get more information.
#
[specified include and exclude]
ERROR: You have specified both the btl_openib_if_include and
btl_openib_if_exclude MCA parameters. These two parameters are
mutually exclusive; you can only specify one or the other.
For reference, the values that you specified are:
btl_openib_if_include: %s
btl_openib_if_exclude: %s
#
[nonexistent port]
WARNING: One or more nonexistent HCAs/ports were specified:
Host: %s
MCA parameter: btl_openib_if_%sclude
Nonexistent entities: %s
These entities will be ignored. You can disable this warning by
setting the btl_openib_warn_nonexistent_if MCA parameter to 0.
#
[invalid mca param value]
WARNING: An invalid MCA parameter value was found for the OpenFabrics
(openib) BTL.
Problem: %s
Resolution: %s
#
[no qps in receive_queues]
WARNING: No queue pairs were defined in the btl_openib_receive_queues
MCA parameter. At least one queue pair must be defined. The openib
BTL will therefore be deactivated for this run.
Host: %s
#
[invalid qp type in receive_queues]
WARNING: An invalid queue pair type was specified in the
btl_openib_receive_queues MCA parameter. The openib BTL will be
deactivated for this run.
Valid queue pair types are "P" for per-peer and "S" for shared receive
queue.
Host: %s
btl_openib_receive_queues: %s
Bad specification: %s
#
[invalid pp qp specification]
WARNING: An invalid per-peer receive queue specification was detected
as part of the btl_openib_receive_queues MCA parameter. The openib
BTL will therefore be deactivated for this run.
Per-peer receive queues require between 1 and 5 parameters:
  1. Buffer size in bytes (mandatory)
  2. Number of buffers (optional; defaults to 8)
  3. Low buffer count watermark (optional; defaults to (num_buffers / 2))
  4. Credit window size (optional; defaults to (low_watermark / 2))
  5. Number of buffers reserved for credit messages (optional;
     defaults to (num_buffers*2-1)/credit_window)
Example: P,128,256,128,16
  - 128 byte buffers
  - 256 buffers to receive incoming MPI messages
  - When the number of available buffers reaches 128, re-post 128 more
    buffers to reach a total of 256
  - If the number of available credits reaches 16, send an explicit
    credit message to the sender
  - Defaulting to ((256 * 2) - 1) / 16 = 31; this many buffers are
    reserved for explicit credit messages
Host: %s
Bad queue specification: %s
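# Illustration only (not part of the message shown to users): a minimal
# Python sketch of how the per-peer defaults listed above could be
# filled in. The names PerPeerQueue and parse_per_peer are hypothetical
# and are not Open MPI APIs. Integer division is an assumption, chosen
# because it reproduces the value 31 from the worked example.

```python
from typing import NamedTuple


class PerPeerQueue(NamedTuple):
    buffer_size: int
    num_buffers: int
    low_watermark: int
    credit_window: int
    reserved: int


def parse_per_peer(spec: str) -> PerPeerQueue:
    """Parse a 'P,...' specification, applying the documented defaults."""
    fields = [f.strip() for f in spec.split(",")]
    if fields[0] != "P":
        raise ValueError(f"not a per-peer specification: {spec!r}")
    params = [int(f) for f in fields[1:]]
    if not 1 <= len(params) <= 5:
        raise ValueError("per-peer queues take between 1 and 5 parameters")

    size = params[0]                                      # mandatory
    num = params[1] if len(params) > 1 else 8             # default 8
    low = params[2] if len(params) > 2 else num // 2      # num_buffers / 2
    window = params[3] if len(params) > 3 else low // 2   # low_watermark / 2
    if len(params) > 4:
        reserved = params[4]
    else:
        if window <= 0:
            raise ValueError("credit window must be positive")
        reserved = (num * 2 - 1) // window                # (num*2-1)/window
    return PerPeerQueue(size, num, low, window, reserved)


if __name__ == "__main__":
    print(parse_per_peer("P,128,256,128,16"))
```

# For the example P,128,256,128,16 the reserved count is derived as
# ((256 * 2) - 1) / 16 = 31, matching the description above.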
#
[invalid srq specification]
WARNING: An invalid shared receive queue specification was detected as
part of the btl_openib_receive_queues MCA parameter. The openib BTL
will therefore be deactivated for this run.
Shared receive queues can take between 2 and 4 parameters:
  1. Buffer size in bytes (mandatory)
  2. Number of buffers (optional; defaults to 16)
  3. Low buffer count watermark (optional; defaults to (num_buffers / 2))
  4. Maximum number of outstanding sends a sender can have (optional;
     defaults to (low_watermark / 4))
Example: S,1024,256,128,32
  - 1024 byte buffers
  - 256 buffers to receive incoming MPI messages
  - When the number of available buffers reaches 128, re-post 128 more
    buffers to reach a total of 256
  - A sender will not send to a peer unless it has fewer than 32
    outstanding sends to that peer.
Host: %s
Bad queue specification: %s
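# Illustration only (not part of the message shown to users): a minimal
# Python sketch of the shared receive queue defaults and the send limit
# described above. The names srq_defaults and may_send are hypothetical
# and are not Open MPI APIs; integer division is an assumption.

```python
def srq_defaults(buffer_size, num_buffers=None, low_watermark=None,
                 max_pending_sends=None):
    """Fill in the optional SRQ parameters with the documented defaults."""
    num = 16 if num_buffers is None else num_buffers
    low = num // 2 if low_watermark is None else low_watermark
    pending = low // 4 if max_pending_sends is None else max_pending_sends
    return (buffer_size, num, low, pending)


def may_send(outstanding_sends, max_pending_sends):
    """A sender may send only while it has fewer than the maximum
    number of outstanding sends to that peer."""
    return outstanding_sends < max_pending_sends


if __name__ == "__main__":
    print(srq_defaults(1024, 256, 128, 32))
    print(may_send(31, 32), may_send(32, 32))
```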
#
[rd_num must be > rd_low]
WARNING: The number of buffers for a queue pair specified via the
btl_openib_receive_queues MCA parameter must be greater than the low
buffer count watermark. The openib BTL will therefore be deactivated
for this run.
Host: %s
Bad queue specification: %s
#
[biggest qp size is too small]
WARNING: The largest queue pair buffer size specified in the
btl_openib_receive_queues MCA parameter is smaller than the maximum
send size (i.e., the btl_openib_max_send_size MCA parameter), meaning
that no queue is large enough to receive the largest possible incoming
message fragment. The openib BTL will therefore be deactivated for
this run.
Host: %s
Largest buffer size: %d
Maximum send fragment size: %d
#
[biggest qp size is too big]
WARNING: The largest queue pair buffer size specified in the
btl_openib_receive_queues MCA parameter is larger than the maximum
send size (i.e., the btl_openib_max_send_size MCA parameter). This
means that memory will be wasted because the largest possible incoming
message fragment will not fill a buffer allocated for incoming
fragments.
Host: %s
Largest buffer size: %d
Maximum send fragment size: %d
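# Illustration only (not part of the message shown to users): a minimal
# Python sketch of the two buffer-size checks described in this message
# and the preceding one. The function name check_largest_buffer and its
# return strings are hypothetical; they only mirror the documented
# outcomes (deactivate when too small, warn when too big).

```python
def check_largest_buffer(buffer_sizes, max_send_size):
    """Compare the largest receive buffer size with the maximum send size."""
    if not buffer_sizes:
        raise ValueError("at least one queue pair must be defined")
    largest = max(buffer_sizes)
    if largest < max_send_size:
        # No queue can hold the largest possible incoming fragment.
        return "deactivate"
    if largest > max_send_size:
        # Works, but the largest buffers are never completely filled.
        return "warn"
    return "ok"


if __name__ == "__main__":
    print(check_largest_buffer([128, 1024, 65536], 65536))
```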
#
[freelist too small]
WARNING: The maximum freelist size that was specified was too small
for the requested receive queue sizes. The maximum freelist size must
be at least equal to the sum of the largest number of buffers posted
to a single queue and the corresponding number of reserved/credit
buffers for that queue. It is suggested that the maximum be quite a
bit larger than this for performance reasons.
Host: %s
Specified freelist size: %d
Minimum required freelist size: %d
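# Illustration only (not part of the message shown to users): a minimal
# Python sketch of the minimum described above, under one reading of
# the rule: take the queue with the most posted buffers and add that
# queue's reserved/credit buffers. The function names and the
# (num_buffers, reserved) pair layout are hypothetical.

```python
def min_freelist_size(queues):
    """queues: list of (num_buffers, reserved) pairs, one per queue."""
    if not queues:
        raise ValueError("at least one queue must be defined")
    # Pick the queue that posts the most buffers, then add its reserve.
    num_buffers, reserved = max(queues, key=lambda q: q[0])
    return num_buffers + reserved


def freelist_large_enough(max_freelist_size, queues):
    """True when the specified maximum meets the minimum requirement."""
    return max_freelist_size >= min_freelist_size(queues)


if __name__ == "__main__":
    queues = [(256, 31), (16, 0)]
    print(min_freelist_size(queues))
```

# As the message notes, the maximum should in practice be well above
# this minimum for performance reasons.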