2006-06-06 01:24:42 +04:00
# -*- text -*-
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
# University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2006 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2006-2007 Cisco Systems, Inc. All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the US/English general help file for Open MPI.
#
Bring over all the work from the /tmp/ib-hw-detect branch. In
addition to my design and testing, it was conceptually approved by
Gil, Gleb, Pasha, Brad, and Galen. Functionally [probably somewhat
lightly] tested by Galen. We may still have to shake out some bugs
during the next few months, but it seems to be working for all the
cases that I can throw at it.
Here's a summary of the changes from that branch:
* Move MCA parameter registration to a new file (btl_openib_mca.c):
* Properly check the return status of registering MCA params
* Check for valid values of MCA parameters
* Make help strings better
* Otherwise, the only default value of an MCA param that was
changed was max_btls; it went from 4 to -1 (meaning: use all
available)
* Properly prototyped internal functions in _component.c
* Made a bunch of functions static that didn't need to be public
* Renamed to remove "mca_" prefix from static functions
* Call new MCA param registration function
* Call new INI file read/lookup/finalize functions
* Updated a bunch of macros to be "BTL_" instead of "ORTE_"
* Be a little more consistent with return values
* Handle -1 for the max_btls MCA param
* Fixed a free() that should have been an OBJ_RELEASE()
* Some re-indenting
* Added INI-file parsing
* New flex file: btl_openib_ini.l
* New default HCA params .ini file (probably to be expanded over
time by other HCA vendors)
* Added more show_help messages for parsing problems
* Read in INI files and cache the values for later lookup
* When component opens an HCA, lookup to see if any corresponding
values were found in the INI files (ID'ed by the HCA vendor_id
and vendor_part_id)
* Added btl_openib_verbose MCA param that shows what the INI-file
stuff does (e.g., shows which MTU your HCA ends up using)
* Added btl_openib_hca_param_files as a colon-delimited list of INI
files to check for values during startup (in order,
left-to-right, just like the MCA base directory param).
* MTU is currently the only value supported in this framework.
* It is not a fatal error if we don't find params for the HCA in
the INI file(s). Instead, just print a warning. New MCA param
btl_openib_warn_no_hca_params_found can be used to disable
printing the warning.
* Add MTU to peer negotiation when making a connection
* Exchange maximum MTU; select the lesser of the two
This commit was SVN r11182.
2006-08-14 23:30:37 +04:00
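The MTU negotiation summarized above ("exchange maximum MTU; select the lesser of the two") can be illustrated with a minimal sketch. This is not the code from the commit: the function name `negotiate_mtu` and the constant `VALID_IB_MTUS` are hypothetical, and the only behavior assumed is that each peer advertises its maximum MTU and both sides settle on the smaller value.

```python
# Hypothetical sketch of "exchange maximum MTU; select the lesser of the two".
# InfiniBand defines these MTU sizes (in bytes).
VALID_IB_MTUS = (256, 512, 1024, 2048, 4096)


def negotiate_mtu(local_max_mtu, remote_max_mtu):
    """Return the MTU both peers can use: the smaller of the two maxima."""
    for mtu in (local_max_mtu, remote_max_mtu):
        if mtu not in VALID_IB_MTUS:
            raise ValueError("unsupported MTU: %r" % (mtu,))
    return min(local_max_mtu, remote_max_mtu)


if __name__ == "__main__":
    # One peer supports up to 2048 bytes, the other only 1024.
    print(negotiate_mtu(2048, 1024))
```

Taking the minimum means neither side is asked to handle a packet larger than it advertised.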
[ini file:file not found]
The Open MPI OpenIB BTL component was unable to find or read an INI
file that was requested via the btl_openib_hca_param_files MCA
parameter. Please check this file and/or modify the
btl_openib_hca_param_files MCA parameter:

%s
[ini file:not in a section]
In parsing OpenIB BTL parameter file, values were found that were not
in a valid INI section. These values will be ignored. Please
re-check this file:

%s

At line %d, near the following text:

%s
[ini file:unexpected token]
In parsing OpenIB BTL parameter file, unexpected tokens were found
(this may cause significant portions of the INI file to be ignored).
Please re-check this file:

%s

At line %d, near the following text:

%s
[ini file:expected equals]
In parsing OpenIB BTL parameter file, unexpected tokens were found
(this may cause significant portions of the INI file to be ignored).
An equals sign ("=") was expected but was not found. Please re-check
this file:

%s

At line %d, near the following text:

%s
[ini file:expected newline]
In parsing OpenIB BTL parameter file, unexpected tokens were found
(this may cause significant portions of the INI file to be ignored).
A newline was expected but was not found. Please re-check this file:

%s

At line %d, near the following text:

%s
[ini file:unknown field]
In parsing OpenIB BTL parameter file, an unrecognized field name was
found. Please re-check this file:

%s

At line %d, the field named:

%s

This field, and any other unrecognized fields, will be skipped.
[no hca params found]
WARNING: No HCA parameters were found for the HCA that Open MPI
detected:

Hostname: %s
HCA vendor ID: 0x%04x
HCA vendor part ID: %d

Default HCA parameters will be used, which may result in lower
performance. You can edit any of the files specified by the
btl_openib_hca_param_files MCA parameter to set values for your HCA.

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_no_hca_params_found to 0.
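The lookup behind this warning can be sketched as follows. This is an illustration only, under several assumptions: the component's real parser is a flex scanner, so Python's `configparser` is a stand-in; the section name, the sample vendor and part IDs, the `DEFAULT_MTU` fallback, and the helper names `load_hca_params` and `lookup_mtu` are all hypothetical. Only the idea of keying values by vendor ID and vendor part ID, and falling back to defaults with an optional warning, comes from the text above.

```python
import configparser

# Hypothetical INI content; the field names mirror the identifiers mentioned
# in the warning text, and the values are illustrative only.
SAMPLE_INI = """\
[Example HCA]
vendor_id = 0x02c9
vendor_part_id = 25208
mtu = 1024
"""

DEFAULT_MTU = 2048  # hypothetical fallback, not a documented default


def load_hca_params(text):
    """Parse INI text into a {(vendor_id, vendor_part_id): mtu} table."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    table = {}
    for name in parser.sections():
        section = parser[name]
        # Base 0 lets int() accept both hex ("0x02c9") and decimal values.
        key = (int(section["vendor_id"], 0), int(section["vendor_part_id"], 0))
        table[key] = int(section["mtu"], 0)
    return table


def lookup_mtu(table, vendor_id, vendor_part_id, warn=True):
    """Return the cached MTU for an HCA, or the default if none was found."""
    key = (vendor_id, vendor_part_id)
    if key in table:
        return table[key]
    if warn:  # plays the role of btl_openib_warn_no_hca_params_found
        print("WARNING: no HCA parameters found for vendor ID 0x%04x, "
              "vendor part ID %d" % (vendor_id, vendor_part_id))
    return DEFAULT_MTU


if __name__ == "__main__":
    params = load_hca_params(SAMPLE_INI)
    print(lookup_mtu(params, 0x02c9, 25208))
    print(lookup_mtu(params, 0x1234, 1))
```

A missing entry is not treated as an error here: the caller still gets a usable value, which matches the non-fatal behavior the warning describes.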
[init-fail-no-mem]
The OpenIB BTL failed to initialize while trying to allocate some
locked memory. This typically indicates that the memlock limits
are set too low. For most HPC installations, the memlock limits
should be set to "unlimited". The failure occurred here:

Host: %s
OMPI source: %s:%d
Function: %s()
Device: %s
Memlock limit: %s

You may need to consult with your system administrator to get this
problem fixed. This FAQ entry on the Open MPI web site may also be
helpful:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
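To see which memlock limits a process actually runs with, a small check like the following can help. It is a sketch that assumes a Linux-like system where Python's standard `resource` module exposes `RLIMIT_MEMLOCK`; the helper name `describe_memlock_limit` is hypothetical and is not part of Open MPI.

```python
import resource


def describe_memlock_limit():
    """Return the (soft, hard) locked-memory limits as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

    def fmt(value):
        # RLIM_INFINITY is how the OS reports an "unlimited" setting.
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return "%d bytes" % value

    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    soft, hard = describe_memlock_limit()
    print("memlock soft limit:", soft)
    print("memlock hard limit:", hard)
```

Running this under the same launcher as the MPI job is more informative than running it in an interactive shell, since the two environments may be started with different limits.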
[init-fail-create-q]
The OpenIB BTL failed to initialize while trying to create an internal
queue. This typically indicates a failed OpenFabrics installation or
faulty hardware. The failure occurred here:

Host: %s
OMPI source: %s:%d
Function: %s()
Error: %s (errno=%d)
Device: %s

You may need to consult with your system administrator to get this
problem fixed.
[btl_openib:retry-exceeded]
The InfiniBand retry count between two MPI processes has been
exceeded. "Retry count" is defined in the InfiniBand spec 1.2
(section 12.7.38):

The total number of times that the sender wishes the receiver to
retry timeout, packet sequence, etc. errors before posting a
completion error.

This error typically means that there is something awry within the
InfiniBand fabric itself. You should note the hosts on which this
error has occurred; it has been observed that rebooting or removing a
particular host from the job can sometimes resolve this issue.

Two MCA parameters can be used to control Open MPI's behavior with
respect to the retry count:

* btl_openib_ib_retry_count - The number of times the sender will
  attempt to retry (defaults to 7, the maximum value).

* btl_openib_ib_timeout - The local ACK timeout parameter (defaults
  to 10). The actual timeout value used is calculated as:

  4.096 microseconds * (2^btl_openib_ib_timeout)

See the InfiniBand spec 1.2 (section 12.7.34) for more details.
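As a worked example of the timeout formula above, the following sketch computes the actual local ACK timeout for a given btl_openib_ib_timeout value. The function name `ack_timeout_us` is hypothetical; the arithmetic is simply the formula quoted in this message.

```python
def ack_timeout_us(ib_timeout):
    """Return the local ACK timeout in microseconds: 4.096 * 2**ib_timeout."""
    if not isinstance(ib_timeout, int) or ib_timeout < 0:
        raise ValueError("btl_openib_ib_timeout must be a non-negative integer")
    return 4.096 * (2 ** ib_timeout)


if __name__ == "__main__":
    # Show how quickly the timeout grows around the default value of 10.
    for value in (10, 14, 20):
        micros = ack_timeout_us(value)
        print("btl_openib_ib_timeout=%d -> %.3f us (%.3f ms)"
              % (value, micros, micros / 1000.0))
```

With the default of 10, the formula gives 4.096 * 1024 = 4194.304 microseconds, or roughly 4.2 milliseconds. Each increment of the parameter doubles the timeout.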
[no active ports found]
WARNING: At least one IB HCA was found on host '%s', but no active
ports were detected. This is most certainly not what you wanted.
Check your cables and subnet manager (SM) configuration.
[error in hca init]
WARNING: There were errors during IB HCA initialization on host '%s'.
[default subnet prefix]
WARNING: There is more than one active port on host '%s', but the
default subnet GID prefix was detected on more than one of these
ports. If these ports are connected to different physical IB
networks, this configuration will fail in Open MPI. This version of
Open MPI requires every physically separate IB subnet that is used
between connected MPI processes to have a different subnet ID value.

Please see this FAQ entry for more details:

http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid

NOTE: You can turn off this warning by setting the MCA parameter
btl_openib_warn_default_gid_prefix to 0.