openmpi/README
Jeff Squyres 387dacef66 Clarify BProc notes
This commit was SVN r6936.
2005-08-19 14:30:38 +00:00


Copyright (c) 2004-2005 The Trustees of Indiana University.
All rights reserved.
Copyright (c) 2004-2005 The Trustees of the University of Tennessee.
All rights reserved.
Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
University of Stuttgart. All rights reserved.
Copyright (c) 2004-2005 The Regents of the University of California.
All rights reserved.
$COPYRIGHT$
Additional copyrights may follow
$HEADER$
This is a preliminary README file. It will be scrubbed formally
before release.
The best way to report bugs, send comments, or ask questions is to
subscribe to the user's or developer's mailing list (for user-level
and developer-level questions, respectively; when in doubt, use the
user's list):
users@open-mpi.org
devel@open-mpi.org
Because of spam, only subscribers are allowed to post to these lists
(ensure that you subscribe with and post from exactly the same e-mail
address -- joe@example.com is considered different than
joe@mycomputer.example.com!). Visit these pages to subscribe to the
lists:
http://www.open-mpi.org/mailman/listinfo.cgi/users
http://www.open-mpi.org/mailman/listinfo.cgi/devel
Thanks for your time.
===========================================================================
The following abbreviated list of release notes applies to this code
base as of this writing (8 Aug 2005):
- The Open MPI installation must be in your PATH on all nodes (and
potentially LD_LIBRARY_PATH, if libmpi is a shared library).
- LAM/MPI-like mpirun notation of "C" and "N" is not yet supported.
- Shared memory support will not function properly on machines that
have a weak memory consistency model. The default in this beta is to
disable shared memory support on all Power PC architectures, even
though some Power PC platforms have strong memory consistency
models. See the description of the --enable-ptl-sm configure flag,
below.
- Striping MPI messages across multiple networks is supported (and
happens automatically when multiple networks are available), but
needs performance tuning.
- The only run-time systems currently supported are:
- rsh / ssh
- Recent versions of BProc
- Complete user and system administrator documentation is missing
(this file comprises the majority of the current user
documentation).
- Missing MPI functionality:
- The Fortran 90 MPI API is disabled (it is not complete).
- MPI-2 one-sided functionality will not be included in the first
few releases of Open MPI.
- Systems that have been tested are:
- Linux, 32 bit, with gcc
- OS X (10.3), 32 bit, with gcc
- OS X (10.4), 32 bit, with gcc
- Other systems have been lightly (but not fully) tested:
- Other compilers on Linux, 32 bit
- 64 bit platforms (AMD, PPC64, Sparc); they "mostly work", but
there are still some known issues
- As of 9 June 2005, Open MPI will not function on OS X Tiger (10.4)
in 64 bit mode due to an ioctl() bug in OS X. Apple has been
notified of the problem.
- There are some cases where after running MPI applications, the
directory /tmp/openmpi-sessions-<username>@<hostname>* will exist
(but will likely be empty). It is safe to remove after the run is
complete.
- The MPI and run-time layers do not free all used memory properly
during MPI_FINALIZE.
- Running a single parallel application across nodes that differ in
endianness and/or datatype sizes is not supported in this beta.
- Threading support (both asynchronous progress and
MPI_THREAD_MULTIPLE) is included, but is only lightly tested.
===========================================================================
Building Open MPI
-----------------
Open MPI uses a traditional configure script paired with "make" to
build. A typical installation follows this pattern:
---------------------------------------------------------------------------
shell$ ./configure [...options...]
shell$ make all install
---------------------------------------------------------------------------
There are many available configure options (see "./configure --help"
for a full list); a summary of the more important ones follows:
--prefix=<directory>
Install Open MPI into the base directory named <directory>. Hence,
Open MPI will place its executables in <directory>/bin, its header
files in <directory>/include, its libraries in <directory>/lib, etc.
--with-btl-gm=<directory>
Specify the directory where the GM libraries and header files are
located. This enables GM support in Open MPI.
--with-btl-mx=<directory>
Specify the directory where the MX libraries and header files are
located. This enables MX support in Open MPI.
--with-btl-mvapi=<directory>
Specify the directory where the mVAPI libraries and header files are
located. This enables mVAPI support in Open MPI.
--with-btl-openib=<directory>
Specify the directory where the Open IB libraries and header files are
located. This enables Open IB support in Open MPI.
--with-mpi-param-check(=value)
"value" can be one of: always, never, runtime. If no value is
specified, or this option is not used, "always" is the default.
Using --without-mpi-param-check is equivalent to "never".
- always: the parameters of MPI functions are always checked for
errors
- never: the parameters of MPI functions are never checked for
errors
- runtime: whether the parameters of MPI functions are checked
depends on the value of the MCA parameter mpi_param_check
(default: yes).
--with-threads=value
Since thread support (both support for MPI_THREAD_MULTIPLE and
asynchronous progress) is only partially tested, it is disabled by
default. To enable threading, use "--with-threads=posix". This is
most useful when combined with --enable-mpi-threads and/or
--enable-progress-threads.
--enable-mpi-threads
Allows the MPI thread level MPI_THREAD_MULTIPLE. See
--with-threads; this is currently disabled by default.
--enable-progress-threads
Allows asynchronous progress in some transports. See
--with-threads; this is currently disabled by default.
--disable-shared
By default, libmpi is built as a shared library, and all components
are built as dynamic shared objects (DSOs). This switch disables
this default; it is really only useful when used with
--enable-static.
--enable-static
Build libmpi as a static library, and statically link in all
components.
There are other options available -- see "./configure --help".
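As an illustration, the following sketch combines several of the
options above into one invocation. The install prefix and the GM
directory are hypothetical placeholders, not defaults. The script
only runs configure when it finds one in the current directory;
otherwise it prints the command it would have run.

```shell
# Hypothetical locations; adjust both to the local system.
prefix="$HOME/openmpi"
gm_dir="/opt/gm"

# Collect the options as positional parameters so that paths
# containing spaces stay intact.
set -- --prefix="$prefix" \
       --with-btl-gm="$gm_dir" \
       --with-threads=posix \
       --enable-mpi-threads

if [ -x ./configure ]; then
    ./configure "$@" && make all install
else
    echo "would run: ./configure $*"
fi
```

Threading is only lightly tested, so drop the last two options for a
conservative build.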
Open MPI supports all the "make" targets that are provided by GNU
Automake, such as:
all - build the entire package
install - install the package
uninstall - remove all traces of the package from the $prefix
clean - clean out the build tree
Once Open MPI has been built and installed, it is safe to run "make
clean" and/or remove the entire build tree.
VPATH builds are fully supported.
Generally speaking, the only thing that users need to do to use Open
MPI is ensure that <prefix>/bin is in their PATH. Users may need to
ensure that this directory is set in their PATH in their shell setup
files (e.g., .bashrc, .cshrc) so that rsh/ssh-based logins will be
able to find the Open MPI executables.
Setting LD_LIBRARY_PATH is typically not necessary, but in some cases,
if libmpi.so cannot be found when MPI applications are run,
<prefix>/lib should be added to LD_LIBRARY_PATH.
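For Bourne-style shells, a setup-file fragment along these lines is
one way to do this. It is a sketch only: OMPI_PREFIX is a
hypothetical variable name, and /opt/openmpi is a placeholder for
whatever was passed to --prefix.

```shell
# Hypothetical install prefix; replace with the value given to --prefix.
OMPI_PREFIX="${OMPI_PREFIX:-/opt/openmpi}"

# Prepend <prefix>/bin to PATH unless it is already listed.
case ":$PATH:" in
    *":$OMPI_PREFIX/bin:"*) ;;
    *) PATH="$OMPI_PREFIX/bin:$PATH" ;;
esac
export PATH

# Only needed if libmpi.so cannot be found at run time.
case ":${LD_LIBRARY_PATH:-}:" in
    *":$OMPI_PREFIX/lib:"*) ;;
    *) LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
esac
export LD_LIBRARY_PATH
```

The case guards keep the variables from growing each time the setup
file is sourced.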
===========================================================================
Checking Your Open MPI Installation
-----------------------------------
The "ompi_info" command can be used to check the status of your Open
MPI installation (located in <prefix>/bin/ompi_info). Running it with
no arguments provides a summary of information about your Open MPI
installation.
The ompi_info command is especially helpful for determining which
components are installed and for listing the run-time settable
parameters (and their default values) that are available in each
component.
The following options may be helpful:
--all Show a *lot* of information about your Open MPI
installation.
--parsable Display all the information in an easily
grep/cut/awk/sed-able format.
--param <framework> <component>
A <framework> of "all" and a <component> of "all" will
show all parameters to all components. Otherwise, the
parameters of all the components in a specific framework,
or just the parameters of a specific component can be
displayed by using an appropriate <framework> and/or
<component> name.
Changing the values of these parameters is explained in the "The
Modular Component Architecture (MCA)" section, below.
===========================================================================
Compiling Open MPI Applications
-------------------------------
Open MPI provides "wrapper" compilers that should be used for
compiling MPI applications:
C: mpicc
C++: mpiCC (or mpic++ if your filesystem is case-insensitive)
Fortran 77: mpif77
Fortran 90: mpif90
For example:
shell$ mpicc hello_world_mpi.c -o hello_world_mpi -g
shell$
All the wrapper compilers do is add a variety of compiler and linker
flags to the command line and then invoke a back-end compiler. The
end result is an MPI executable that is properly linked to all the
relevant libraries.
===========================================================================
Running Open MPI Applications
-----------------------------
Open MPI supports both mpirun and mpiexec (they are actually the
same). For example:
shell$ mpirun -np 2 hello_world_mpi
or
shell$ mpiexec -np 1 hello_world_mpi : -np 1 hello_world_mpi
are equivalent. Many of mpiexec's switches (such as -host and -arch)
are not yet functional, although using them will not produce an
error.
Since rsh is probably the launcher that you will be using (if you are
outside of Los Alamos National Laboratory), you can also specify a
-hostfile parameter, indicating a standard mpirun-style hostfile (one
hostname per line):
shell$ mpirun -hostfile my_hostfile -np 2 hello_world_mpi
If you intend to run more than one process on a node, the hostfile can
use the "slots" attribute. If "slots" is not specified, a count of 1
is assumed. For example, using the following hostfile:
---------------------------------------------------------------------------
node1.example.com
node2.example.com
node3.example.com slots=2
node4.example.com slots=4
---------------------------------------------------------------------------
shell$ mpirun -hostfile my_hostfile -np 8 hello_world_mpi
will launch MPI_COMM_WORLD rank 0 on node1, rank 1 on node2, ranks 2
and 3 on node3, and ranks 4 through 7 on node4.
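The slot-filling order described above can be reproduced with a small
script. This is a sketch that only mimics the placement for
illustration; it does not call Open MPI, and map_ranks is a
hypothetical helper name.

```shell
# map_ranks HOSTFILE NP
# Prints "rank host" pairs, filling every slot of a host before moving
# to the next host. A host without slots= counts as one slot. Stops
# when NP ranks are placed or the slots run out.
map_ranks() {
    awk -v np="$2" '
        BEGIN { rank = 0 }
        NF == 0 { next }                      # skip blank lines
        {
            slots = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=[0-9]+$/) {
                    split($i, kv, "=")
                    slots = kv[2] + 0
                }
            for (s = 0; s < slots && rank < np; s++) {
                print rank, $1
                rank++
            }
        }
    ' "$1"
}

# Recreate the example hostfile and map 8 processes onto it.
hostfile=$(mktemp)
cat > "$hostfile" <<'EOF'
node1.example.com
node2.example.com
node3.example.com slots=2
node4.example.com slots=4
EOF
map_ranks "$hostfile" 8
rm -f "$hostfile"
```

Each output line pairs an MPI_COMM_WORLD rank with the host it would
land on under this filling order.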
Note that the values of component parameters can be changed on the
mpirun / mpiexec command line. This is explained in the section
below, "The Modular Component Architecture (MCA)".
===========================================================================
The Modular Component Architecture (MCA)
----------------------------------------
The MCA is the backbone of Open MPI -- most services and functionality
are implemented through MCA components. Here is a list of all the
component frameworks in Open MPI:
---------------------------------------------------------------------------
MPI component frameworks:
-------------------------
coll - MPI collective algorithms
io - MPI-2 I/O
pml - MPI point-to-point management layer
btl - MPI point-to-point byte transfer layer
topo - MPI topology routines
Back-end run-time environment component frameworks:
---------------------------------------------------
errmgr - RTE error manager
gpr - General purpose registry
iof - I/O forwarding
ns - Name server
oob - Out of band messaging
pls - Process launch system
ras - Resource allocation system
rds - Resource discovery system
rmaps - Resource mapping system
rmgr - Resource manager
rml - RTE message layer
soh - State of health monitor
Miscellaneous frameworks:
-------------------------
allocator - Memory allocator
mpool - Memory pooling
---------------------------------------------------------------------------
Each framework typically has one or more components that are used at
run-time. For example, the btl framework is used by MPI to send bytes
across underlying networks. The tcp btl sends messages across
TCP-based networks; the gm btl sends messages across GM Myrinet-based
networks.
Each component typically has some tunable parameters that can be
changed at run-time. Use the ompi_info command to check a component
to see what its tunable parameters are. For example:
shell$ ompi_info --param btl tcp
shows all the parameters (and default values) for the tcp btl
component.
These values can be overridden at run-time in several ways. At
run-time, the following locations are examined (in order) for new
values of parameters:
1. <prefix>/etc/openmpi-mca-params.conf
This file is intended to set any system-wide default MCA parameter
values -- it will apply, by default, to all users who use this Open
MPI installation. The default file that is installed contains many
comments explaining its format.
2. $HOME/.openmpi/mca-params.conf
If this file exists, it should be in the same format as
<prefix>/etc/openmpi-mca-params.conf. It is intended to provide
per-user default parameter values.
3. environment variables of the form OMPI_MCA_<name> set equal to a
<value>
Where <name> is the name of the parameter. For example, set the
variable named OMPI_MCA_btl_tcp_frag_size to the value 65536
(Bourne-style shells):
shell$ OMPI_MCA_btl_tcp_frag_size=65536
shell$ export OMPI_MCA_btl_tcp_frag_size
4. the mpirun command line: --mca <name> <value>
Where <name> is the name of the parameter. For example:
shell$ mpirun --mca btl_tcp_frag_size 65536 -np 2 hello_world_mpi
These locations are checked in the order listed, and a value found in
a later location overrides one found earlier. For example, a
parameter value passed on the mpirun command line will override an
environment variable; an environment variable will override the
per-user and system-wide defaults.
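The lookup order can be mimicked with a short shell function. This is
a sketch for illustration only and does not call Open MPI.
resolve_mca, SYS_CONF, and USER_CONF are hypothetical names, and the
"name = value" file syntax is an assumption here; the installed
default file documents the real format.

```shell
# resolve_mca NAME [COMMAND_LINE_VALUE]
# Prints the value that wins for parameter NAME; each later source
# overrides the earlier ones. SYS_CONF and USER_CONF are hypothetical
# variables standing in for the two parameter files.
resolve_mca() {
    name=$1
    value=
    # Refuse names that are not plain identifiers (keeps eval safe).
    case $name in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
    esac

    # 1 and 2: system-wide file, then per-user file.
    # Assumes "name = value" lines; the last match in a file wins.
    for conf in "${SYS_CONF:-/dev/null}" "${USER_CONF:-/dev/null}"; do
        [ -r "$conf" ] || continue
        found=$(sed -n "s/^[[:space:]]*$name[[:space:]]*=[[:space:]]*//p" "$conf" | tail -n 1)
        if [ -n "$found" ]; then value=$found; fi
    done

    # 3: environment variable OMPI_MCA_<name>.
    eval "found=\${OMPI_MCA_$name:-}"
    if [ -n "$found" ]; then value=$found; fi

    # 4: value passed on the command line (--mca <name> <value>).
    if [ "$#" -ge 2 ]; then value=$2; fi

    printf '%s\n' "$value"
}
```

For instance, "resolve_mca btl_tcp_frag_size 65536" prints 65536
regardless of what the files or the environment contain, because the
command line is consulted last.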
===========================================================================
Common Questions
----------------
1. How do I change the rsh/ssh launcher to use rsh?
The default remote shell agent for the rsh/ssh launcher is ssh, but
you can set an MCA parameter to change it to rsh (or use a specific
path for ssh, pass different parameters to rsh/ssh, etc.). The MCA
parameter name is pls_rsh_agent. You can use any of the methods
for setting MCA parameters described above; for example:
shell$ mpirun --mca pls_rsh_agent rsh -np 4 a.out
2. When I run "make", it looks very much like the build system is
going into a loop, and I see messages similar to:
Warning: File `Makefile.am' has modification time 3.6e+04 s in
the future
Open MPI uses an Automake-based build system, and is therefore
highly dependent upon filesystem timestamps. If building on a
networked file system, you *must* ensure that the time of the
machine that you are building on is tightly synchronized with the
time on your network fileserver (e.g., using ntp). If this is not
possible, you will need to build Open MPI on a non-networked
filesystem.
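One way to check for this kind of clock skew before building is to
look for files whose modification time lies ahead of the local clock.
The following is a sketch; find_future_files is a hypothetical helper
name, and it assumes the temporary file created by mktemp lives on a
local filesystem so that its timestamp reflects the build machine's
clock.

```shell
# find_future_files [DIR]
# Lists regular files under DIR (default: current directory) whose
# modification time is later than "now" on the build machine.
find_future_files() {
    stamp=$(mktemp) || return 1   # freshly created, so its mtime is "now"
    find "${1:-.}" -type f -newer "$stamp" -print
    rm -f "$stamp"
}
```

Running "find_future_files ." at the top of the source tree should
print nothing when the clocks agree.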
...we'll add more questions here as they are asked by real users.
===========================================================================
Got more questions?
-------------------
Found a bug? Got a question? Want to make a suggestion? Want to
contribute to Open MPI? Please let us know!
User-level questions and comments should generally be sent to the
user's mailing list (users@open-mpi.org). Because of spam, only
subscribers are allowed to post to this list (ensure that you
subscribe with and post from *exactly* the same e-mail address --
joe@example.com is considered different than
joe@mycomputer.example.com!). Visit this page to subscribe to the
user's list:
http://www.open-mpi.org/mailman/listinfo.cgi/users
Developer-level bug reports, questions, and comments should generally
be sent to the developer's mailing list (devel@open-mpi.org). Please
do not post the same question to both lists. As with the user's list,
only subscribers are allowed to post to the developer's list. Visit
the following web page to subscribe:
http://www.open-mpi.org/mailman/listinfo.cgi/devel
When submitting bug reports to either list, be sure to include the
following information in your mail (please compress!):
- the stdout and stderr from Open MPI's configure
- the top-level config.log file
- the stdout and stderr from building Open MPI
- the output from "ompi_info --all" (if possible)
For Bourne-type shells, here's one way to capture this information:
shell$ ./configure ... 2>&1 | tee config.out
[...lots of configure output...]
shell$ make 2>&1 | tee make.out
[...lots of make output...]
shell$ mkdir ompi-output
shell$ cp config.out config.log make.out ompi-output
shell$ ompi_info --all 2>&1 | tee ompi-output/ompi-info.out
shell$ tar cvf ompi-output.tar ompi-output
[...output from tar...]
shell$ gzip ompi-output.tar
For C shell-type shells, the procedure is only slightly different:
shell% ./configure ... |& tee config.out
[...lots of configure output...]
shell% make |& tee make.out
[...lots of make output...]
shell% mkdir ompi-output
shell% cp config.out config.log make.out ompi-output
shell% ompi_info --all |& tee ompi-output/ompi-info.out
shell% tar cvf ompi-output.tar ompi-output
[...output from tar...]
shell% gzip ompi-output.tar
In either case, attach the resulting ompi-output.tar.gz file to your
mail. This provides the Open MPI developers with a lot of information
about your installation and can greatly assist us in helping with your
problem.
Be sure to also include any other useful files (in the
ompi-output.tar.gz tarball), such as output showing specific errors.