43822c6ec6
is not an issue on Darwin, since Darwin doesn't support statically linking executables (there is no LibSystem.a) This commit was SVN r7032.
Copyright (c) 2004-2005 The Trustees of Indiana University.
                        All rights reserved.
Copyright (c) 2004-2005 The Trustees of the University of Tennessee.
                        All rights reserved.
Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
                        University of Stuttgart. All rights reserved.
Copyright (c) 2004-2005 The Regents of the University of California.
                        All rights reserved.
$COPYRIGHT$

Additional copyrights may follow

$HEADER$

This is a preliminary README file. It will be scrubbed formally
before release.

The best way to report bugs, send comments, or ask questions is to
sign up on the user's and/or developer's mailing list (for user-level
and developer-level questions; when in doubt, send to the user's
list):

    users@open-mpi.org
    devel@open-mpi.org

Because of spam, only subscribers are allowed to post to these lists
(ensure that you subscribe with and post from exactly the same e-mail
address -- joe@example.com is considered different than
joe@mycomputer.example.com!). Visit these pages to subscribe to the
lists:

    http://www.open-mpi.org/mailman/listinfo.cgi/users
    http://www.open-mpi.org/mailman/listinfo.cgi/devel

Thanks for your time.

===========================================================================

The following abbreviated list of release notes applies to this code
base as of this writing (8 Aug 2005):

- The Open MPI installation must be in your PATH on all nodes (and
  potentially LD_LIBRARY_PATH, if libmpi is a shared library).

- LAM/MPI-like mpirun notation of "C" and "N" is not yet supported.

- Shared memory support will not function properly on machines that
  have a weak memory consistency model. The default in this beta is to
  disable shared memory support on all Power PC architectures, even
  though some Power PC platforms have strong memory consistency
  models. See the description of the --enable-ptl-sm configure flag,
  below.

- Striping MPI messages across multiple networks is supported (and
  happens automatically when multiple networks are available), but
  needs performance tuning.

- The only run-time systems currently supported are:
  - rsh / ssh
  - Recent versions of BProc

- Complete user and system administrator documentation is missing
  (this file comprises the majority of the current user
  documentation).

- The Fortran 90 MPI API is disabled by default (we have only been
  able to get it to work with gfortran). You can enable it with
  configure options; see below.

- Missing MPI functionality:
  - MPI-2 one-sided functionality will not be included in the first
    few releases of Open MPI.

- Systems that have been tested are:
  - Linux, 32 bit, with gcc
  - OS X (10.3), 32 bit, with gcc
  - OS X (10.4), 32 bit, with gcc

- Other systems have been lightly (but not fully) tested:
  - Other compilers on Linux, 32 bit
  - 64 bit platforms (AMD, PPC64, Sparc); they "mostly work", but
    there are still some known issues

- There are some cases where after running MPI applications, the
  directory /tmp/openmpi-sessions-<username>@<hostname>* will exist
  (but will likely be empty). It is safe to remove after the run is
  complete.

- The MPI and run-time layers do not free all used memory properly
  during MPI_FINALIZE.

- Running on nodes with different endianness and/or different
  datatype sizes within a single parallel application is not
  supported in this beta.

- Threading support (both asynchronous progress and
  MPI_THREAD_MULTIPLE) is included, but is only lightly tested.

- On Linux, if either the malloc_hooks or malloc_interpose memory
  hooks are enabled, it will not be possible to link against a static
  libc.a. libmpi can still be built statically; it is only the final
  application link step that cannot be static. If applications must be
  statically linked, it is recommended you compile Open MPI with the
  --without-memory-manager configure option.

===========================================================================

Building Open MPI
-----------------

Open MPI uses a traditional configure script paired with "make" to
build. Typical installs can be of the pattern:

---------------------------------------------------------------------------
shell$ ./configure [...options...]
shell$ make all install
---------------------------------------------------------------------------

There are many available configure options (see "./configure --help"
for a full list); a summary of the more important ones follows:

--prefix=<directory>
  Install Open MPI into the base directory named <directory>. Hence,
  Open MPI will place its executables in <directory>/bin, its header
  files in <directory>/include, its libraries in <directory>/lib, etc.

--with-btl-gm=<directory>
  Specify the directory where the GM libraries and header files are
  located. This enables GM support in Open MPI.

--with-btl-mx=<directory>
  Specify the directory where the MX libraries and header files are
  located. This enables MX support in Open MPI.

--with-btl-mvapi=<directory>
  Specify the directory where the mVAPI libraries and header files are
  located. This enables mVAPI support in Open MPI.

--with-btl-openib=<directory>
  Specify the directory where the Open IB libraries and header files are
  located. This enables Open IB support in Open MPI.

--with-mpi-param-check(=value)
  "value" can be one of: always, never, runtime. If no value is
  specified, or this option is not used, "always" is the default.
  Using --without-mpi-param-check is equivalent to "never".
  - always: the parameters of MPI functions are always checked for
    errors
  - never: the parameters of MPI functions are never checked for
    errors
  - runtime: whether the parameters of MPI functions are checked
    depends on the value of the MCA parameter mpi_param_check
    (default: yes).

--with-threads=value
  Since thread support (both support for MPI_THREAD_MULTIPLE and
  asynchronous progress) is only partially tested, it is disabled by
  default. To enable threading, use "--with-threads=posix". This is
  most useful when combined with --enable-mpi-threads and/or
  --enable-progress-threads.

--enable-mpi-threads
  Allows the MPI thread level MPI_THREAD_MULTIPLE. See
  --with-threads; this is currently disabled by default.

--enable-progress-threads
  Allows asynchronous progress in some transports. See
  --with-threads; this is currently disabled by default.

--enable-f90
  Enable building the Fortran 90 MPI bindings (disabled by default).
  We have only been able to get these bindings to build with gfortran.
  See also the --with-f90-max-array-dim option.

--with-f90-max-array-dim=<DIM>
  The F90 MPI bindings are strictly typed, even including the number of
  dimensions for arrays for MPI choice buffer parameters. Open MPI
  generates these bindings at compile time with a maximum number of
  dimensions as specified by this parameter. The default value is 4.

--disable-shared
  By default, libmpi is built as a shared library, and all components
  are built as dynamic shared objects (DSOs). This switch disables
  this default; it is really only useful when used with
  --enable-static.

--enable-static
  Build libmpi as a static library, and statically link in all
  components.

There are other options available -- see "./configure --help".

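As an illustration of combining the options described above, the
following sketch configures a fully static build. The install prefix
($HOME/openmpi) is only an assumed example location, and the script
merely prints the command when it is not run from the top of an Open
MPI source tree.

```shell
#!/bin/sh
# Sketch only: the prefix below is a hypothetical example location.
PREFIX="${HOME}/openmpi"

# Static libmpi with all components linked in (see --enable-static and
# --disable-shared above).
CONFIGURE_ARGS="--prefix=${PREFIX} --enable-static --disable-shared"

if [ -x ./configure ]; then
    # Run from the top of the Open MPI source tree.
    ./configure ${CONFIGURE_ARGS} && make all install
else
    echo "would run: ./configure ${CONFIGURE_ARGS}"
fi
```

The argument list is left unquoted on purpose so that it splits into
separate options; a prefix containing spaces would need different
handling.
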
Open MPI supports all the "make" targets that are provided by GNU
Automake, such as:

all       - build the entire package
install   - install the package
uninstall - remove all traces of the package from the $prefix
clean     - clean out the build tree

Once Open MPI has been built and installed, it is safe to run "make
clean" and/or remove the entire build tree.

VPATH builds are fully supported.

Generally speaking, the only thing that users need to do to use Open
MPI is ensure that <prefix>/bin is in their PATH. Users may need to
ensure that this directory is set in their PATH in their shell setup
files (e.g., .bashrc, .cshrc) so that rsh/ssh-based logins will be
able to find the Open MPI executables.

Setting LD_LIBRARY_PATH is typically not necessary, but in some cases,
if libmpi.so cannot be found when MPI applications are run,
<prefix>/lib should be added to LD_LIBRARY_PATH.

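For Bourne-style shells, the PATH and LD_LIBRARY_PATH settings
described above might look like the following sketch. OMPI_PREFIX is a
name chosen here for illustration, and $HOME/openmpi is an assumed
install location; substitute the value that was given to --prefix.

```shell
# Sketch for a Bourne-style shell setup file such as .bashrc.
# OMPI_PREFIX is a hypothetical helper variable, not one Open MPI reads.
OMPI_PREFIX="${HOME}/openmpi"

# Make mpirun, mpicc, ompi_info, etc. visible to rsh/ssh-based logins.
PATH="${OMPI_PREFIX}/bin:${PATH}"

# Only needed if libmpi.so cannot be found at run time.  The ${VAR:+...}
# form avoids a trailing ":" when LD_LIBRARY_PATH was previously unset.
LD_LIBRARY_PATH="${OMPI_PREFIX}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

export PATH LD_LIBRARY_PATH
```
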
===========================================================================

Checking Your Open MPI Installation
-----------------------------------

The "ompi_info" command can be used to check the status of your Open
MPI installation (located in <prefix>/bin/ompi_info). Running it with
no arguments provides a summary of information about your Open MPI
installation.

Note that the ompi_info command is extremely helpful in determining
which components are installed as well as listing all the run-time
settable parameters that are available in each component (as well as
their default values).

The following options may be helpful:

--all       Show a *lot* of information about your Open MPI
            installation.
--parsable  Display all the information in an easily
            grep/cut/awk/sed-able format.
--param <framework> <component>
            A <framework> of "all" and a <component> of "all" will
            show all parameters to all components. Otherwise, the
            parameters of all the components in a specific framework,
            or just the parameters of a specific component can be
            displayed by using an appropriate <framework> and/or
            <component> name.

Changing the values of these parameters is explained in the section
"The Modular Component Architecture (MCA)", below.

===========================================================================

Compiling Open MPI Applications
-------------------------------

Open MPI provides "wrapper" compilers that should be used for
compiling MPI applications:

C:          mpicc
C++:        mpiCC (or mpic++ if your filesystem is case-insensitive)
Fortran 77: mpif77
Fortran 90: mpif90

For example:

  shell$ mpicc hello_world_mpi.c -o hello_world_mpi -g
  shell$

All the wrapper compilers do is add a variety of compiler and linker
flags to the command line and then invoke a back-end compiler. The
end result is an MPI executable that is properly linked to all the
relevant libraries.

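For readers who do not have a test program at hand, the sketch below
writes a minimal hello_world_mpi.c and compiles it with the mpicc
wrapper. The program body is an assumed example (it is not shipped
with this README); it uses only standard MPI-1 calls. The compile step
is skipped when mpicc is not in the PATH.

```shell
#!/bin/sh
# Write a minimal MPI program (illustrative contents, not part of Open MPI).
cat > hello_world_mpi.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes */
    printf("Hello, world: rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile with the wrapper compiler if it is available.
if command -v mpicc >/dev/null 2>&1; then
    mpicc hello_world_mpi.c -o hello_world_mpi -g
else
    echo "mpicc not found; is <prefix>/bin in your PATH?"
fi
```
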
===========================================================================

Running Open MPI Applications
-----------------------------

Open MPI supports both mpirun and mpiexec (they are actually the
same). For example:

  shell$ mpirun -np 2 hello_world_mpi

or

  shell$ mpiexec -np 1 hello_world_mpi : -np 1 hello_world_mpi

are equivalent. Many of mpiexec's switches (such as -host and -arch)
are not yet functional, although they will not error if you try to use
them.

Since rsh/ssh is probably the launcher that you will be using (if you
are outside of Los Alamos National Laboratory), you can also specify a
-hostfile parameter, indicating a standard mpirun-style hostfile (one
hostname per line):

  shell$ mpirun -hostfile my_hostfile -np 2 hello_world_mpi

If you intend to run more than one process on a node, the hostfile can
use the "slots" attribute. If "slots" is not specified, a count of 1
is assumed. For example, using the following hostfile:

---------------------------------------------------------------------------
node1.example.com
node2.example.com
node3.example.com slots=2
node4.example.com slots=4
---------------------------------------------------------------------------

  shell$ mpirun -hostfile my_hostfile -np 8 hello_world_mpi

will launch MPI_COMM_WORLD rank 0 on node1, rank 1 on node2, ranks 2
and 3 on node3, and ranks 4 through 7 on node4.

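The rank placement described above can be reproduced with a few lines
of awk. This is only a sketch of the slot-filling rule as stated in
this section (fill each host's slots in file order, defaulting to one
slot); it does not invoke mpirun and is not Open MPI's actual mapping
code.

```shell
#!/bin/sh
# Recreate the example hostfile from above.
cat > my_hostfile <<'EOF'
node1.example.com
node2.example.com
node3.example.com slots=2
node4.example.com slots=4
EOF

# Fill each host's slots in file order until np ranks have been placed.
placement=$(awk -v np=8 '
    BEGIN { rank = 0 }
    NF == 0 { next }                  # skip blank lines
    {
        slots = 1                     # default when "slots" is absent
        for (i = 2; i <= NF; i++)
            if ($i ~ /^slots=/) { split($i, kv, "="); slots = kv[2] + 0 }
        for (s = 0; s < slots && rank < np; s++)
            printf "rank %d -> %s\n", rank++, $1
    }' my_hostfile)

printf '%s\n' "$placement"
```

With the hostfile shown, the script prints one line per rank: ranks 0
and 1 on node1 and node2, ranks 2 and 3 on node3, and ranks 4 through
7 on node4.
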
Note that the values of component parameters can be changed on the
mpirun / mpiexec command line. This is explained in the section
below, "The Modular Component Architecture (MCA)".

===========================================================================

The Modular Component Architecture (MCA)
----------------------------------------

The MCA is the backbone of Open MPI -- most services and functionality
are implemented through MCA components. Here is a list of all the
component frameworks in Open MPI:

---------------------------------------------------------------------------
MPI component frameworks:
-------------------------

coll      - MPI collective algorithms
io        - MPI-2 I/O
pml       - MPI point-to-point management layer
bml       - BTL management layer
btl       - MPI point-to-point byte transfer layer
topo      - MPI topology routines

Back-end run-time environment component frameworks:
---------------------------------------------------

errmgr    - RTE error manager
gpr       - General purpose registry
iof       - I/O forwarding
ns        - Name server
oob       - Out of band messaging
pls       - Process launch system
ras       - Resource allocation system
rds       - Resource discovery system
rmaps     - Resource mapping system
rmgr      - Resource manager
rml       - RTE message layer
soh       - State of health monitor

Miscellaneous frameworks:
-------------------------

allocator - Memory allocator
mpool     - Memory pooling
paffinity - Processor affinity
timer     - High-resolution timers
memory    - Memory subsystem hooks
---------------------------------------------------------------------------

Each framework typically has one or more components that are used at
run-time. For example, the btl framework is used by MPI to send bytes
across underlying networks. The tcp btl, for example, sends messages
across TCP-based networks; the gm btl sends messages across GM
Myrinet-based networks.

Each component typically has some tunable parameters that can be
changed at run-time. Use the ompi_info command to check a component
to see what its tunable parameters are. For example:

  shell$ ompi_info --param btl tcp

shows all the parameters (and default values) for the tcp btl
component.

These values can be overridden at run-time in several ways. At
run-time, the following locations are examined (in order) for new
values of parameters:

1. <prefix>/etc/openmpi-mca-params.conf

   This file is intended to set any system-wide default MCA parameter
   values -- it will apply, by default, to all users who use this Open
   MPI installation. The default file that is installed contains many
   comments explaining its format.

2. $HOME/.openmpi/mca-params.conf

   If this file exists, it should be in the same format as
   <prefix>/etc/openmpi-mca-params.conf. It is intended to provide
   per-user default parameter values.

3. environment variables of the form OMPI_MCA_<name> set equal to a
   <value>

   Where <name> is the name of the parameter. For example, set the
   variable named OMPI_MCA_btl_tcp_frag_size to the value 65536
   (Bourne-style shells):

     shell$ OMPI_MCA_btl_tcp_frag_size=65536
     shell$ export OMPI_MCA_btl_tcp_frag_size

4. the mpirun command line: --mca <name> <value>

   Where <name> is the name of the parameter. For example:

     shell$ mpirun --mca btl_tcp_frag_size 65536 -np 2 hello_world_mpi

These locations are checked in order, and a value found in a later
location overrides one found in an earlier location. For example, a
parameter value passed on the mpirun command line will override an
environment variable; an environment variable will override the
system-wide defaults.

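The override order can be illustrated with a small stand-alone script.
This is a sketch that only mimics the documented lookup order; it is
not how Open MPI is implemented. The helper name lookup_param, the
temporary file locations, and the "name = value" line format assumed
for the parameter files are all illustrative assumptions (the
installed openmpi-mca-params.conf documents the real format).

```shell
#!/bin/sh
# Hypothetical helper mimicking the documented lookup order.
# usage: lookup_param <name> <system-file> <user-file> [<command-line value>]
lookup_param() {
    lp_name=$1 lp_value=""

    # 1. and 2.: system-wide file, then per-user file.
    for lp_file in "$2" "$3"; do
        [ -f "$lp_file" ] || continue
        lp_v=$(sed -n "s/^[[:space:]]*${lp_name}[[:space:]]*=[[:space:]]*//p" \
                   "$lp_file" | tail -n 1)
        if [ -n "$lp_v" ]; then lp_value=$lp_v; fi
    done

    # 3.: environment variable OMPI_MCA_<name>.
    eval "lp_v=\${OMPI_MCA_${lp_name}:-}"
    if [ -n "$lp_v" ]; then lp_value=$lp_v; fi

    # 4.: value given on the command line (--mca <name> <value>).
    if [ -n "${4:-}" ]; then lp_value=$4; fi

    printf '%s\n' "$lp_value"
}

# Demonstration with stand-in files in a temporary directory.
tmpdir=$(mktemp -d)
sys_conf="$tmpdir/openmpi-mca-params.conf"
user_conf="$tmpdir/mca-params.conf"
echo "btl_tcp_frag_size = 16384" > "$sys_conf"
echo "btl_tcp_frag_size = 32768" > "$user_conf"
unset OMPI_MCA_btl_tcp_frag_size

from_files=$(lookup_param btl_tcp_frag_size "$sys_conf" "$user_conf")
OMPI_MCA_btl_tcp_frag_size=65536
export OMPI_MCA_btl_tcp_frag_size
from_env=$(lookup_param btl_tcp_frag_size "$sys_conf" "$user_conf")
from_cli=$(lookup_param btl_tcp_frag_size "$sys_conf" "$user_conf" 131072)

echo "files only:        $from_files"
echo "with environment:  $from_env"
echo "with command line: $from_cli"

unset OMPI_MCA_btl_tcp_frag_size
rm -rf "$tmpdir"
```

Each later source replaces the value found so far, so the per-user
file wins over the system-wide file, the environment wins over both
files, and the command line wins over everything.
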
===========================================================================

Common Questions
----------------

1. How do I change the rsh/ssh launcher to use rsh?

   The default remote shell agent for the rsh/ssh launcher is ssh, but
   you can set an MCA parameter to change it to rsh (or use a specific
   path for ssh, pass different parameters to rsh/ssh, etc.). The MCA
   parameter name is pls_rsh_agent. You can use any of the methods
   for setting MCA parameters described above; for example:

     shell$ mpirun --mca pls_rsh_agent rsh -np 4 a.out

2. When I run "make", it looks very much like the build system is
   going into a loop, and I see messages similar to:

     Warning: File `Makefile.am' has modification time 3.6e+04 s in
     the future

   Open MPI uses an Automake-based build system, and is therefore
   highly dependent upon filesystem timestamps. If building on a
   networked file system, you *must* ensure that the time of the
   machine that you are building on is tightly synchronized with the
   time on your network fileserver (e.g., using ntp). If this is not
   possible, you will need to build Open MPI on a non-networked
   filesystem.

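One way to check for the clock skew described in question 2 is to look
for files whose modification time is later than the build machine's
current time. The sketch below does this with standard tools;
future_files is a hypothetical helper name, and the check assumes that
mktemp creates its file on a filesystem stamped by the local clock.

```shell
#!/bin/sh
# List regular files under a directory whose modification time is later
# than "now" as seen by the local machine.
future_files() {
    ff_ref=$(mktemp)        # created just now, so its mtime is "now"
    find "$1" -type f -newer "$ff_ref"
    rm -f "$ff_ref"
}

# Check the current directory (e.g., the top of the build tree) by
# default, or the directory given as the first argument.
future_files "${1:-.}"
```

Any file names printed are candidates for the "modification time ...
in the future" warning; no output suggests the timestamps in that tree
are not ahead of the local clock.
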
...we'll add more questions here as they are asked by real users.

===========================================================================

Got more questions?
-------------------

Found a bug? Got a question? Want to make a suggestion? Want to
contribute to Open MPI? Please let us know!

User-level questions and comments should generally be sent to the
user's mailing list (users@open-mpi.org). Because of spam, only
subscribers are allowed to post to this list (ensure that you
subscribe with and post from *exactly* the same e-mail address --
joe@example.com is considered different than
joe@mycomputer.example.com!). Visit this page to subscribe to the
user's list:

    http://www.open-mpi.org/mailman/listinfo.cgi/users

Developer-level bug reports, questions, and comments should generally
be sent to the developer's mailing list (devel@open-mpi.org). Please
do not post the same question to both lists. As with the user's list,
only subscribers are allowed to post to the developer's list. Visit
the following web page to subscribe:

    http://www.open-mpi.org/mailman/listinfo.cgi/devel

When submitting bug reports to either list, be sure to include the
following information in your mail (please compress!):

- the stdout and stderr from Open MPI's configure
- the top-level config.log file
- the stdout and stderr from building Open MPI
- the output from "ompi_info --all" (if possible)

For Bourne-type shells, here's one way to capture this information:

  shell$ ./configure ... 2>&1 | tee config.out
  [...lots of configure output...]
  shell$ make 2>&1 | tee make.out
  [...lots of make output...]
  shell$ mkdir ompi-output
  shell$ cp config.out config.log make.out ompi-output
  shell$ ompi_info --all 2>&1 | tee ompi-output/ompi-info.out
  shell$ tar cvf ompi-output.tar ompi-output
  [...output from tar...]
  shell$ gzip ompi-output.tar

For C shell-type shells, the procedure is only slightly different:

  shell% ./configure ... |& tee config.out
  [...lots of configure output...]
  shell% make |& tee make.out
  [...lots of make output...]
  shell% mkdir ompi-output
  shell% cp config.out config.log make.out ompi-output
  shell% ompi_info --all |& tee ompi-output/ompi-info.out
  shell% tar cvf ompi-output.tar ompi-output
  [...output from tar...]
  shell% gzip ompi-output.tar

In either case, attach the resulting ompi-output.tar.gz file to your
mail. This provides the Open MPI developers with a lot of information
about your installation and can greatly assist us in helping with your
problem.

Be sure to also include any other useful files (in the
ompi-output.tar.gz tarball), such as output showing specific errors.