% This commit was SVN r7999.
% -*- latex -*-
%
% Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
% University Research and Technology
% Corporation. All rights reserved.
% Copyright (c) 2004-2005 The University of Tennessee and The University
% of Tennessee Research Foundation. All rights
% reserved.
% Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
% University of Stuttgart. All rights reserved.
% Copyright (c) 2004-2005 The Regents of the University of California.
% All rights reserved.
% $COPYRIGHT$
%
% Additional copyrights may follow
%
% $HEADER$
%

\chapter{Open MPI Command Quick Reference}
\label{sec:commands}

This section is intended to provide a quick reference of the major
Open MPI commands. Each command also has its own manual page which
typically provides more detail than this document.

{\Huge JMS needs total overhaul}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamboot} Command}
\label{sec:commands-lamboot}

The \cmd{lamboot} command is used to start the Open MPI run-time
environment (RTE). \cmd{lamboot} is typically the first command used
before any other Open MPI command (notable exceptions are the wrapper
compilers, which do not require the Open MPI RTE, and \cmd{mpiexec}, which
can launch its own Open MPI universe). \cmd{lamboot} can use any of the
available \kind{boot} SSI modules; Section~\ref{sec:lam-ssi-boot}
details the requirements and operations of each of the \kind{boot} SSI
modules that are included in the Open MPI distribution.

Common arguments that are used with the \cmd{lamboot} command are:

\begin{itemize}
\item \cmdarg{-b}: When used with the \boot{rsh} boot module, the
  ``fast'' boot algorithm is used, which can noticeably speed up the
  execution time of \cmd{lamboot}. It can also be used where remote
  shell agents cannot provide output from remote nodes (e.g., in a
  Condor environment). Specifically, the ``fast'' algorithm assumes
  that the user's shell on the remote node is the same as the shell on
  the node where \cmd{lamboot} was invoked.

\item \cmdarg{-d}: Print debugging output. This will print a {\em
    lot} of output, and is typically only necessary if \cmd{lamboot}
  fails for an unknown reason. The output is forwarded to standard
  out as well as either \file{/tmp} or syslog facilities. The amount of
  data produced can fill these filesystems, leading to general system
  problems.

\item \cmdarg{-l}: Use local hostname resolution instead of
  centralized lookups. This is useful in environments where the same
  hostname may resolve to different IP addresses on different nodes
  (e.g., clusters based on Finite Neighborhood Networks\footnote{See
    \url{http://www.aggregate.org/} for more details.}).

\changebegin{7.1}

\item \cmdarg{-prefix $<$lam/install/path$>$}: Use the Open MPI
  installation located at $<$lam/install/path$>$, where
  $<$lam/install/path$>$ is the top-level directory where Open MPI is
  installed. This is typically used when a user has multiple Open MPI
  installations and wants to switch between them without changing
  dot files or the \envvar{PATH} environment variable.

  This option is not compatible with Open MPI versions prior to 7.1.

\changeend{7.1}

\item \cmdarg{-s}: Close the \file{stdout} and \file{stderr} of the
  locally-launched Open MPI daemon (they are normally left open). This is
  necessary when invoking \cmd{lamboot} via a remote agent such as
  \cmd{rsh} or \cmd{ssh}.

\item \cmdarg{-v}: Print verbose output. This is useful for showing
  \cmd{lamboot}'s progress. Unlike \cmdarg{-d},
  \cmdarg{-v} does not forward output to a file or syslog.

\item \cmdarg{-x}: Run the Open MPI RTE in fault-tolerant mode.

\item \cmdarg{$<$filename$>$}: The name of the boot schema file. Boot
  schemas, while they can be as simple as a list of hostnames, can
  contain additional information and are discussed in detail
  in Sections~\ref{sec:getting-started-hostfile}
  and~\ref{sec:lam-ssi-boot-schema},
  pages~\pageref{sec:getting-started-hostfile}
  and~\pageref{sec:lam-ssi-boot-schema}, respectively.
\end{itemize}

Booting the Open MPI RTE is where most users (particularly first-time
users) encounter problems. Each \kind{boot} module has its own
specific requirements and prerequisites for success. Although
\cmd{lamboot} typically prints detailed messages when errors occur,
users are strongly encouraged to read Section~\ref{sec:lam-ssi-boot}
for the details of the \kind{boot} module that they will be using.
Additionally, the \cmdarg{-d} switch should be used to examine exactly
what is happening to determine the actual source of the problem --
many problems with \cmd{lamboot} come from the operating system or the
user's shell setup, not from within Open MPI itself.

The most common \cmd{lamboot} example simply uses a hostfile to launch
across an \cmd{rsh}/\cmd{ssh}-based cluster of nodes (the
``\cmdarg{-ssi boot rsh}'' is not technically necessary here, but it
is specified to make this example correct in all environments):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamboot -v -ssi boot rsh hostfile

Open MPI 7.0/MPI 2 C++/ROMIO - Indiana University

n0<1234> ssi:boot:base:linear: booting n0 (node1.cluster.example.com)
n0<1234> ssi:boot:base:linear: booting n1 (node2.cluster.example.com)
n0<1234> ssi:boot:base:linear: booting n2 (node3.cluster.example.com)
n0<1234> ssi:boot:base:linear: booting n3 (node4.cluster.example.com)
n0<1234> ssi:boot:base:linear: finished
\end{lstlisting}
% Stupid emacs mode: $
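
A boot schema can be as simple as a list of hostnames, one per line.
The following is a minimal sketch only: the hostnames are the
illustrative ones from the example above, and the installation path
\file{/opt/lam-7.1} is purely hypothetical. It shows how the
\cmdarg{-d} and \cmdarg{-prefix} arguments described earlier might be
combined with a boot schema file:

\lstset{style=lam-shell}
\begin{lstlisting}
# hostfile: one node per line
node1.cluster.example.com
node2.cluster.example.com
\end{lstlisting}

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Re-run with full debugging output if a plain lamboot fails
shell$ lamboot -d -ssi boot rsh hostfile
# Boot a specific installation (hypothetical path) without changing PATH
shell$ lamboot -prefix /opt/lam-7.1 -ssi boot rsh hostfile
\end{lstlisting}
% Stupid emacs mode: $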

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Multiple Sessions on the Same Node}

In some cases (such as in batch-regulated environments), it is
desirable to allow multiple universes owned by the same user on the same
node. The \ienvvar{TMPDIR},
\ienvvar{LAM\_\-MPI\_\-SESSION\_\-PREFIX}, and
\ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} environment variables can be
used to effect this behavior. The main issue is the location of Open MPI's
session directory; each node in an Open MPI universe has a session directory
in a well-known location in the filesystem that identifies how to
contact the Open MPI daemon on that node. Multiple Open MPI universes can
simultaneously co-exist on the same node as long as they have
different session directories.

Open MPI recognizes several batch environments and automatically adapts the
session directory to be specific to a batch job. Hence, if the batch
scheduler allocates multiple jobs from the same user to the same node,
Open MPI will automatically do the ``right thing'' and ensure that the Open MPI
universes from each job will not collide.
%
Sections~\ref{sec:misc-batch} and~\ref{sec:misc-session-directory}
(starting on page~\pageref{sec:misc-batch}) discuss these issues in
detail.
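
Outside of a recognized batch environment, a distinct session
directory can be requested by hand. The following is only a sketch
for Bourne-like shells: it assumes that the session suffix variable is
spelled \envvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} in the environment,
and the suffix value \cmdarg{run1} is an arbitrary, hypothetical
label. Commands that need to contact this universe are assumed to
require the same value in their environment.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Assumed variable name; "run1" is an arbitrary label for this universe
shell$ LAM_MPI_SESSION_SUFFIX=run1
shell$ export LAM_MPI_SESSION_SUFFIX
shell$ lamboot hostfile
shell$ mpirun C my_mpi_program
shell$ lamhalt
\end{lstlisting}
% Stupid emacs mode: $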

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Avoiding Running on Specific Nodes}
\label{sec:commands-lamboot-no-schedule}
\index{no-schedule boot schema attribute@\cmd{no-schedule} boot
  schema attribute}

Once the Open MPI universe is booted, processes can be launched on any
node. The \cmd{mpirun}, \cmd{mpiexec}, and \cmd{lamexec} commands are
most commonly used to launch jobs in the universe, and are typically
used with the \cmdarg{N} and \cmdarg{C} nomenclature (see the description of
\cmd{mpirun} in Section~\ref{sec:commands-mpirun} for details on the
\cmdarg{N} and \cmdarg{C} nomenclature), which launch jobs on all schedulable
nodes and CPUs in the Open MPI universe, respectively. While finer-grained
controls are available through \cmd{mpirun} (etc.), it can be
convenient to simply mark some nodes as ``non-schedulable,'' and
therefore avoid having \cmd{mpirun} (etc.) launch executables on those
nodes when using the \cmdarg{N} and \cmdarg{C} nomenclature.

For example, it may be convenient to boot an Open MPI universe that includes
a controller node (e.g., a desktop workstation) and a set of worker
nodes. In this case, it is desirable to mark the desktop workstation
as ``non-schedulable'' so that Open MPI will not launch executables there
(by default). Consider the following boot schema:

\lstset{style=lam-shell}
\begin{lstlisting}
# Mark my_workstation as ``non-schedulable''
my_workstation.office.example.com schedule=no
# All the other nodes are, by default, schedulable
node1.cluster.example.com
node2.cluster.example.com
node3.cluster.example.com
node4.cluster.example.com
\end{lstlisting}

Booting with this schema allows the convenience of:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpirun C my_mpi_program
\end{lstlisting}
% stupid emacs mode: $

\noindent which will only run \cmd{my\_\-mpi\_\-program} on the four
cluster nodes (i.e., not the workstation).
%
Note that this behavior {\em only} applies to the \cmdarg{C} and \cmdarg{N}
designations; Open MPI will always allow execution on any node when using
the \cmdarg{nX} or \cmdarg{cX} notation:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpirun c0 C my_mpi_program
\end{lstlisting}
% stupid emacs mode: $

\noindent which will run \cmd{my\_\-mpi\_\-program} on all five nodes
in the Open MPI universe.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamcheckpoint} Command}
\label{sec:commands-lamcheckpoint}

\changebegin{7.1}

The \cmd{lamcheckpoint} command is provided to checkpoint an MPI
application. One of the arguments to \cmd{lamcheckpoint} is the name
of the checkpoint/restart module (either \crssi{blcr} or
\crssi{self}). Additional arguments to
\cmd{lamcheckpoint} depend on the selected checkpoint/restart module.
The name of the module can be specified by passing the \ssiparam{cr} SSI
parameter.

Common arguments that are used with the \cmd{lamcheckpoint} command
are:

\begin{itemize}
\item \cmdarg{-ssi}: Just like with \cmd{mpirun}, the \cmdarg{-ssi}
  flag can be used to pass key=value pairs to Open MPI. Indeed, it is
  required to pass at least one SSI parameter: \ssiparam{cr},
  indicating which \kind{cr} module to use for checkpointing.

\item \cmdarg{-pid}: Indicate the PID of \cmd{mpirun} to checkpoint.
\end{itemize}

\noindent Notes:

\begin{itemize}
\item If the \crssi{blcr} \kind{cr} module is selected, the name of
  the directory for storing the checkpoint files and the PID of
  \cmd{mpirun} should be passed as SSI parameters to
  \cmd{lamcheckpoint}.

\item If the \crssi{self} \kind{cr} module is selected, the PID of
  \cmd{mpirun} should be passed via the \cmdarg{-pid} parameter.
\end{itemize}

\changeend{7.1}

See Section~\ref{sec:mpi-ssi-cr} for more detail about the
checkpoint/restart capabilities of Open MPI, including details about
the \crssi{blcr} and \crssi{self} \kind{cr} modules.
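
As a rough sketch of the \crssi{self} case described in the notes
above, the following assumes that the \cmd{mpirun} to be checkpointed
is running with the (hypothetical) PID 12345. The \crssi{blcr} case is
not shown here because it needs additional module-specific SSI
parameters, whose names are given in Section~\ref{sec:mpi-ssi-cr}.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# 12345 is a placeholder for the PID of the running mpirun
shell$ lamcheckpoint -ssi cr self -pid 12345
\end{lstlisting}
% Stupid emacs: $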

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamclean} Command}
\label{sec:commands-lamclean}

The \cmd{lamclean} command is provided to clean up the Open MPI universe.
It is typically only necessary when MPI processes terminate ``badly,''
and potentially leave resources allocated in the Open MPI universe (such as
MPI-2 published names, processes, or shared memory). The
\cmd{lamclean} command will kill {\em all} processes running in the
Open MPI universe, and free {\em all} resources that were associated with
them (including unpublishing MPI-2 dynamically published names).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamexec} Command}
\label{sec:commands-lamexec}

The \cmd{lamexec} command is similar to \cmd{mpirun} but is used for
non-MPI programs. For example:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamexec N uptime
5:37pm up 21 days, 23:49, 5 users, load average: 0.31, 0.26, 0.25
5:37pm up 21 days, 23:49, 2 users, load average: 0.01, 0.00, 0.00
5:37pm up 21 days, 23:50, 3 users, load average: 0.01, 0.00, 0.00
5:37pm up 21 days, 23:50, 2 users, load average: 0.87, 0.81, 0.80
\end{lstlisting}
% Stupid emacs: $

Most of the parameters and options that are available to \cmd{mpirun}
are also available to \cmd{lamexec}. See the \cmd{mpirun} description
in Section~\ref{sec:commands-mpirun} for more details.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamgrow} Command}
\label{sec:commands-lamgrow}

The \cmd{lamgrow} command adds a single node to the Open MPI universe. It
must use the same \kind{boot} module that was used to initially boot
the Open MPI universe. \cmd{lamgrow} must be run from a node already in
the Open MPI universe. Common parameters include:

\begin{itemize}
\item \cmdarg{-v}: Verbose mode.

\item \cmdarg{-d}: Debug mode; enables a {\em lot} of diagnostic
  output.

\item \cmdarg{-n $<$nodeid$>$}: Assign the new host the node ID
  \cmdarg{nodeid}. \cmdarg{nodeid} must be an unused node ID. If
  \cmdarg{-n} is not specified, Open MPI will find the lowest node ID that
  is not being used.

\item \cmdarg{-no-schedule}: Has the same effect as putting ``{\tt
    no\_\-schedule=yes}'' in the boot schema. This means that the
  \cmdarg{C} and \cmdarg{N} expansion used in \cmd{mpirun} and \cmd{lamexec}
  will not include this node.

\item \cmdarg{-ssi $<$key$>$ $<$value$>$}: Pass in SSI parameter
  \cmdarg{key} with the value \cmdarg{value}.

\item \cmdarg{$<$hostname$>$}: The name of the host to expand the
  universe to.
\end{itemize}

For example, the following adds the node \host{blinky} to the existing
Open MPI universe using the \boot{rsh} boot module:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamgrow -ssi boot rsh blinky.cluster.example.com
\end{lstlisting}
% Stupid emacs: $
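
The options listed above can be combined. As a sketch only, the
following adds a second, hypothetical host (\host{inky}) with an
explicitly chosen node ID and keeps it out of the \cmdarg{C} and
\cmdarg{N} expansions; it assumes that node ID 7 is currently unused
in the universe:

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Node ID 7 and the hostname are illustrative placeholders
shell$ lamgrow -v -n 7 -no-schedule -ssi boot rsh inky.cluster.example.com
\end{lstlisting}
% Stupid emacs: $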

Note that \cmd{lamgrow} cannot grow an Open MPI universe that only contains
one node that has an IP address of 127.0.0.1 (e.g., if \cmd{lamboot}
was run with the default boot schema that only contains the name
\host{localhost}). In this case, \cmd{lamgrow} will print an error
and abort without adding the new node.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamhalt} Command}
\label{sec:commands-lamhalt}

The \cmd{lamhalt} command is used to shut down the Open MPI RTE.
Typically, \cmd{lamhalt} can simply be run with no command line
parameters and it will shut down the Open MPI RTE. Optionally, the
\cmdarg{-v} or \cmdarg{-d} arguments can be used to make \cmd{lamhalt}
be verbose or extremely verbose, respectively.

There are a small number of cases where \cmd{lamhalt} will fail. For
example, if an Open MPI daemon becomes unresponsive (e.g., the daemon was
killed), \cmd{lamhalt} may fail to shut down the entire Open MPI universe.
It will eventually time out and therefore complete in finite time, but
you may want to use the last-resort \cmd{lamwipe} command (see
Section~\ref{sec:commands-lamwipe}).
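
A typical shutdown sequence might look like the following sketch. The
\cmd{lamwipe} invocation is an assumption made for illustration: it
supposes that \cmd{lamwipe} accepts the same boot schema file that was
given to \cmd{lamboot}; Section~\ref{sec:commands-lamwipe} is the
authoritative description of its arguments.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Normal case: shut down the RTE, reporting progress
shell$ lamhalt -v
# Last resort only, if lamhalt fails (assumed usage; see the lamwipe section)
shell$ lamwipe -v hostfile
\end{lstlisting}
% Stupid emacs: $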

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{laminfo} Command}
\label{sec:commands-laminfo}

The \cmd{laminfo} command can be used to query the capabilities of the
Open MPI installation. Running \cmd{laminfo} with no parameters shows
a pretty-printed summary of information. Using the \cmdarg{-parsable}
command line switch shows the same summary information, but in a
format that should be relatively easy to parse with common Unix tools
such as \cmd{grep}, \cmd{cut}, \cmd{awk}, etc.

\cmd{laminfo} supports a variety of command line options to query for
specific information. The \cmdarg{-h} option shows a complete listing
of all options. Some of the most common options include:

\begin{itemize}
\item \cmdarg{-arch}: Show the architecture that Open MPI was configured
  for.

\item \cmdarg{-path}: Paired with a second argument, display various
  paths relevant to the Open MPI installation. Valid second arguments
  include:

  \begin{itemize}
  \item \cmdarg{prefix}: Main installation prefix
  \item \cmdarg{bindir}: Where the Open MPI executables are located
  \item \cmdarg{libdir}: Where the Open MPI libraries are located
  \item \cmdarg{incdir}: Where the Open MPI include files are located
  \item \cmdarg{pkglibdir}: Where dynamic SSI modules are
    installed\footnote{Dynamic SSI modules are not supported in
      Open MPI 7.0, but will be supported in future versions.}
  \item \cmdarg{sysconfdir}: Where the Open MPI help files are located
  \end{itemize}

\item \cmdarg{-version}: Paired with two additional arguments, display the
  version of either Open MPI or one or more SSI modules. The first
  argument identifies what to report the version of, and can be any of
  the following:

  \begin{itemize}
  \item \cmdarg{lam}: Version of Open MPI
  \item \cmdarg{boot}: Version of all boot modules
  \item \cmdarg{boot:module}: Version of a specific boot module
  \item \cmdarg{coll}: Version of all coll modules
  \item \cmdarg{coll:module}: Version of a specific coll module
  \item \cmdarg{cr}: Version of all cr modules
  \item \cmdarg{cr:module}: Version of a specific cr module
  \item \cmdarg{rpi}: Version of all rpi modules
  \item \cmdarg{rpi:module}: Version of a specific rpi module
  \end{itemize}

  The second argument specifies the scope of the version number to
  display -- whether to show the entire version number string, or just
  one component of it:

  \begin{itemize}
  \item \cmdarg{full}: Display the entire version number string
  \item \cmdarg{major}: Display the major version number
  \item \cmdarg{minor}: Display the minor version number
  \item \cmdarg{release}: Display the release version number
  \item \cmdarg{alpha}: Display the alpha version number
  \item \cmdarg{beta}: Display the beta version number
  \item \cmdarg{svn}: Display the SVN version number\footnote{The
      value will either be 0 (not built from SVN), 1 (built from a
      Subversion checkout) or a date encoded in the form YYYYMMDD
      (built from a nightly tarball on the given date).}

  \end{itemize}

\changebegin{7.1}

\item \cmdarg{-param}: Paired with two additional arguments, display
  the SSI parameters for a given type and/or module. The first
  argument can be any of the valid SSI types or the special name
  ``base,'' indicating the SSI framework itself. The second argument
  can be any valid module name.

  Additionally, either argument can be the wildcard ``any'' which
  will match any valid SSI type and/or module.

\changeend{7.1}
\end{itemize}

Multiple options can be combined to query several attributes at once:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ laminfo -parsable -arch -version lam major -version rpi:tcp full -param rpi tcp
version:lam:7
ssi:boot:rsh:version:ssi:1.0
ssi:boot:rsh:version:api:1.0
ssi:boot:rsh:version:module:7.0
arch:i686-pc-linux-gnu
ssi:rpi:tcp:param:rpi_tcp_short:65536
ssi:rpi:tcp:param:rpi_tcp_sockbuf:-1
ssi:rpi:tcp:param:rpi_tcp_priority:20
\end{lstlisting}
% Stupid emacs: $

Note that three version numbers are returned for the \rpi{tcp} module.
The first (\cmdarg{ssi}) indicates the overall SSI version that the
module conforms to, the second (\cmdarg{api}) indicates what version
of the \kind{rpi} API the module conforms to, and the last
(\cmdarg{module}) indicates the version of the module itself.
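
Because the \cmdarg{-parsable} output is colon-separated, single
values can be pulled out with standard Unix tools. The following is a
sketch that assumes the output format shown in the sample above
(field positions may differ in other versions); with that sample
output, the two pipelines would be expected to print
\cmdarg{i686-pc-linux-gnu} and \cmdarg{65536}, respectively.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Architecture string: second colon-separated field of the arch line
shell$ laminfo -parsable -arch | cut -d: -f2
# Value of one rpi tcp parameter: last colon-separated field
shell$ laminfo -parsable -param rpi tcp | grep rpi_tcp_short | awk -F: '{ print $NF }'
\end{lstlisting}
% Stupid emacs: $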

Running \cmd{laminfo} with no arguments provides a wealth of
information about your Open MPI installation (we ask for this output
when reporting problems to the Open MPI general user's mailing list --
see Section \ref{troubleshooting:mailing-lists} on page
\pageref{troubleshooting:mailing-lists}). Most of the output fields
are self-explanatory; two that are worth explaining are:

\begin{itemize}
\item Debug support: This indicates whether your Open MPI installation was
  configured with the \confflag{with-debug} option. It is generally
  only used by the Open MPI Team for development and maintenance of Open MPI
  itself; it does {\em not} indicate whether users' MPI applications
  can be debugged (specifically: users' MPI applications can {\em
    always} be debugged, regardless of this setting). This option
  defaults to ``no''; users are discouraged from using this option.
  See the Install Guide for more information about
  \confflag{with-debug}.

\item Purify clean: This indicates whether your Open MPI installation was
  configured with the \confflag{with-purify} option. This option is
  necessary to prevent a number of false positives when using
  memory-checking debuggers such as Purify, Valgrind, and bcheck. It
  is off by default because it can cause slight performance
  degradation in MPI applications. See the Install Guide for more
  information about \confflag{with-purify}.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamnodes} Command}
\label{sec:commands-lamnodes}

Open MPI was specifically designed to abstract away hostnames once
\cmd{lamboot} has completed successfully. However, for various
reasons (usually related to system-administration concerns, and/or for
creating human-readable reports), it can be desirable to retrieve the
hostnames of Open MPI nodes long after \icmd{lamboot}.

The command \cmd{lamnodes} can be used for this purpose. It accepts
both the \cmdarg{N} and \cmdarg{C} syntax from \cmd{mpirun}, and will return the
corresponding names of the specified nodes. For example:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamnodes N
\end{lstlisting}
% Stupid emacs: $

\noindent will return the node that each CPU is located on, the
hostname of that node, the total number of CPUs on each, and any flags
that are set on that node. Specific nodes can also be queried:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamnodes n0,3
\end{lstlisting}
% Stupid emacs: $

\noindent will return the node, hostname, number of CPUs, and flags on
n0 and n3.

Command line arguments can be used to customize the output of
\cmd{lamnodes}. These include:

\begin{itemize}
\item \cmdarg{-c}: Suppress printing CPU counts
\item \cmdarg{-i}: Print IP addresses instead of IP names
\item \cmdarg{-n}: Suppress printing Open MPI node IDs
\end{itemize}
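
As a sketch, these arguments can be combined with the node
specifications shown earlier. The commands below assume that the
options are given before the node specification; no output is shown
because the exact layout depends on the installation.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# IP addresses of all nodes
shell$ lamnodes -i N
# n0 and n3 only, without CPU counts or node IDs
shell$ lamnodes -c -n n0,3
\end{lstlisting}
% Stupid emacs: $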

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamrestart} Command}

The \cmd{lamrestart} command can be used to restart a previously-checkpointed
MPI application. The arguments to \cmd{lamrestart} depend on the
selected checkpoint/restart module. Regardless of the
checkpoint/restart module used, invoking \cmd{lamrestart} results in a
new \cmd{mpirun} being launched.

The SSI parameter \ssiparam{cr} must be used to specify which
checkpoint/restart module should be used to restart the application.
Currently, only two values are possible: \ssiparam{blcr} and
\ssiparam{self}.

\begin{itemize}
\item If the \crssi{blcr} module is selected, the SSI parameter
  \issiparam{cr\_\-blcr\_\-context\_\-file} should be used to pass in
  the filename of the context file that was created during a previous
  successful checkpoint. For example:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamrestart -ssi cr blcr -ssi cr_blcr_context_file filename
\end{lstlisting}
% Stupid emacs: $

\item If the \crssi{self} module is selected, the SSI parameter
  \issiparam{cr\_\-restart\_\-args} must be passed with the arguments
  to be passed to \cmd{mpirun} to restart the application. For
  example:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamrestart -ssi cr self -ssi cr_restart_args "args to mpirun"
\end{lstlisting}
% Stupid emacs: $
\end{itemize}

See Section~\ref{sec:mpi-ssi-cr} for more detail about the
checkpoint/restart capabilities of Open MPI, including details about
the \crssi{blcr} and \crssi{self} \kind{cr} modules.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The \icmd{lamshrink} Command}
\label{sec:commands-lamshrink}

The \cmd{lamshrink} command is used to remove a node from an Open MPI
universe:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamshrink n3
\end{lstlisting}
% Stupid emacs: $

\noindent removes node n3 from the Open MPI universe. Note that nodes
with IDs greater than 3 will {\em not} have their IDs reduced by one -- n3
simply becomes an empty slot in the Open MPI universe. \cmd{mpirun} and
\cmd{lamexec} will still function correctly, even when used with \cmdarg{C}
and \cmdarg{N} notation -- they will simply skip n3 since there is no
longer an operational node in that slot.

Note that the \cmd{lamgrow} command can optionally be used to fill the
empty slot with a new node.
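
For instance, the slot vacated above could be refilled using the
\cmdarg{-n} option of \cmd{lamgrow} described in
Section~\ref{sec:commands-lamgrow}. This is a sketch only; the
replacement hostname is hypothetical, and the \boot{rsh} boot module
is assumed to be the one that booted the universe:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ lamshrink n3
# Reuse node ID 3 for a (hypothetical) replacement host
shell$ lamgrow -n 3 -ssi boot rsh node5.cluster.example.com
\end{lstlisting}
% Stupid emacs: $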

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{mpicc}, \icmd{mpiCC} / \icmd{mpic++}, and
  \icmd{mpif77} Commands}
\label{sec:commands-wrappers}
\index{wrapper compilers}

Compiling MPI applications can be a complicated process because the
list of compiler and linker flags required to successfully compile and
link an Open MPI application can not only be quite long, but can also change
depending on the particular configuration that Open MPI was installed with.
For example, if Open MPI includes native support for Myrinet hardware, the
\cmdarg{-lgm} flag needs to be used when linking MPI executables.

To hide all this complexity, ``wrapper'' compilers are provided that
handle all of this automatically. They are called ``wrapper''
compilers because all they do is add relevant compiler and linker
flags to the command line before invoking the real back-end compiler
to actually perform the compile/link. Most command line arguments are
passed straight through to the back-end compiler without modification.

Therefore, to compile an MPI application, use the wrapper compilers
exactly as you would use the real compiler. For example:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpicc -O -c main.c
shell$ mpicc -O -c foo.c
shell$ mpicc -O -c bar.c
shell$ mpicc -O -o main main.o foo.o bar.o
\end{lstlisting}

This compiles three C source code files and links them together into a
single executable. No additional \cmdarg{-I}, \cmdarg{-L}, or
\cmdarg{-l} arguments are required.

The main exceptions, i.e., the flags that are not passed straight
through to the back-end compiler, are:

\begin{itemize}
\item \cmdarg{-showme}: Used to show what the wrapper compiler would
  have executed. This is useful to see the full compile/link line
  that would have been executed. For example (your output may differ from
  what is shown below, depending on your installed Open MPI
  configuration):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpicc -O -c main.c -showme
gcc -I/usr/local/lam/include -pthread -O -c main.c
\end{lstlisting}
% Stupid emacs mode: $
\lstset{style=lam-cmdline}
\begin{lstlisting}
# The output line shown below is word wrapped in order to fit nicely in the document margins
shell$ mpicc -O -o main main.o foo.o bar.o -showme
gcc -I/usr/local/lam/include -pthread -O -o main main.o foo.o bar.o \
-L/usr/local/lam/lib -llammpio -lpmpi -llamf77mpi -lmpi -llam -lutil \
-pthread
\end{lstlisting}
% Stupid emacs mode: $

\changebegin{7.1}

  Two notable sub-flags are:

  \begin{itemize}
  \item \cmdarg{-showme:compile}: Show only the compile flags,
    suitable for substitution into \envvar{CFLAGS}.

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpicc -O -c main.c -showme:compile
-I/usr/local/lam/include -pthread
\end{lstlisting}
% Stupid emacs mode: $

  \item \cmdarg{-showme:link}: Show only the linker flags (which are
    actually \envvar{LDFLAGS} and \envvar{LIBS} mixed together),
    suitable for substitution into \envvar{LIBS}.

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpicc -O -o main main.o foo.o bar.o -showme:link
-L/usr/local/lam/lib -llammpio -lpmpi -llamf77mpi -lmpi -llam -lutil -pthread
\end{lstlisting}
% Stupid emacs mode: $

  \end{itemize}

\changeend{7.1}

\item \cmdarg{-lpmpi}: When compiling a user MPI application, the
  \cmdarg{-lpmpi} argument is used to indicate that MPI profiling
  support should be included. The wrapper compiler may alter the
  exact placement of this argument to ensure that proper linker
  dependency semantics are preserved.
\end{itemize}
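
The \cmdarg{-showme:compile} and \cmdarg{-showme:link} sub-flags can
be used to feed the Open MPI flags to a build system that insists on
invoking the back-end compiler itself. The following is a sketch for
Bourne-like shells; it assumes that \cmd{gcc} is the back-end compiler
(as in the sample output above) and mirrors the invocation forms shown
in the list above. Using the wrapper compilers directly remains the
simpler option whenever that is possible.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Capture the flags that the wrapper compiler would have added
shell$ CFLAGS=`mpicc -c main.c -showme:compile`
shell$ LIBS=`mpicc -o main main.o -showme:link`
# Invoke the (assumed) back-end compiler by hand with those flags
shell$ gcc $CFLAGS -O -c main.c
shell$ gcc -O -o main main.o $LIBS
\end{lstlisting}
% Stupid emacs mode: $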
|
|
|
|
\changebegin{7.1}
Neither the compiler nor linker flags can be overridden at run-time.
The back-end compiler, however, can be. Environment variables can be
used for this purpose:

\begin{itemize}
\item \ienvvar{Open MPIMPICC} (deprecated name: \idepenvvar{Open MPIHCC}):
  Overrides the default C compiler in the \cmd{mpicc} wrapper
  compiler.

\item \ienvvar{Open MPIMPICXX} (deprecated name: \idepenvvar{Open MPIHCP}):
  Overrides the default C++ compiler in the \cmd{mpiCC}/\cmd{mpic++}
  wrapper compiler.

\item \ienvvar{Open MPIMPIF77} (deprecated name: \idepenvvar{Open MPIHF77}):
  Overrides the default Fortran 77 compiler in the \cmd{mpif77}
  wrapper compiler.
\end{itemize}

For example (for Bourne-like shells):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ Open MPIMPICC=cc
shell$ export Open MPIMPICC
shell$ mpicc my_application.c -o my_application
\end{lstlisting}
% Stupid emacs mode: $

For csh-like shells:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell% setenv Open MPIMPICC cc
shell% mpicc my_application.c -o my_application
\end{lstlisting}

All this being said, it is {\em strongly} recommended to use the
wrapper compilers -- and their default underlying compilers -- for all
compiling and linking of MPI applications. If Open MPI was configured
and compiled with one compiler and user applications are then compiled
with a different underlying compiler, the result can be failures to
compile, failures to link, segmentation faults, and other
unpredictable behavior at run-time.

Finally, note that the wrapper compilers add the Open MPI-specific
flags only when a command-line argument that does not begin with a
dash (``-'') is present. In the following example, neither invocation
has such an argument, so the back-end compiler's own output appears
unchanged:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpicc
gcc: no input files
shell$ mpicc --version
gcc (GCC) 3.2.2 (Mandrake Linux 9.1 3.2.2-3mdk)
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
\end{lstlisting}

\changeend{7.1}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Deprecated Names}

Previous versions of Open MPI used the names \idepcmd{hcc},
\idepcmd{hcp}, and \idepcmd{hf77} for the wrapper compilers. While
these command names still work (they are simply symbolic links to the
real wrapper compilers \cmd{mpicc}, \cmd{mpiCC}/\cmd{mpic++}, and
\cmd{mpif77}, respectively), their use is deprecated.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{mpiexec} Command}
\label{sec:commands-mpiexec}

The \cmd{mpiexec} command is used to launch MPI programs. It is
similar to, but slightly different from, \cmd{mpirun}.\footnote{There
  are two methods to launch MPI executables because the MPI-2
  standard suggests the use of \cmd{mpiexec} and provides
  standardized command line arguments. Hence, even though
  Open MPI already had an \cmd{mpirun} command to launch MPI executables,
  \cmd{mpiexec} was added to comply with the standard.} Although
\cmd{mpiexec} is simply a wrapper around other Open MPI commands (including
\cmd{lamboot}, \cmd{mpirun}, and \cmd{lamhalt}), it ties their
functionality together and provides a unified interface for launching
MPI processes.
%
Specifically, \cmd{mpiexec} provides, through command line flags
alone, two features that require multiple steps with other Open MPI
commands: launching MPMD MPI processes and launching MPI processes
when there is no existing Open MPI universe.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{General Syntax}

The general form of \cmd{mpiexec} commands is:

\lstset{style=lam-cmdline}
\begin{lstlisting}
mpiexec [global_args] local_args1 [: local_args2 [...]]
\end{lstlisting}

Global arguments are applied to all MPI processes that are launched.
They must be specified before any local arguments. Common global
arguments include:

\begin{itemize}
\item \cmdarg{-boot}: Boot the Open MPI RTE before launching the MPI
  processes.

\item \cmdarg{-boot-args $<$args$>$}: Pass \cmdarg{$<$args$>$} to the
  back-end \cmd{lamboot}. Implies \cmdarg{-boot}.

\item \cmdarg{-machinefile $<$filename$>$}: Specify {\tt
    $<$filename$>$} as the boot schema to use when invoking the
  back-end \cmd{lamboot}. Implies \cmdarg{-boot}.

\changebegin{7.1}
\item \cmdarg{-prefix $<$lam/install/path$>$}: Use the Open MPI
  installation specified in $<$lam/install/path$>$, where
  $<$lam/install/path$>$ is the top-level directory in which Open MPI
  is installed. This is typically used when a user has multiple
  Open MPI installations and wants to switch between them without
  changing the dot files or the \envvar{PATH} environment variable.
  This option is not compatible with Open MPI versions prior to 7.1.
\changeend{7.1}

\item \cmdarg{-ssi $<$key$>$ $<$value$>$}: Pass the SSI {\tt
    $<$key$>$} and {\tt $<$value$>$} arguments to the back-end
  \cmd{mpirun} command.
\end{itemize}

Local arguments are specific to an individual MPI process that will be
launched. They are specified along with the executable that will be
launched. Common local arguments include:

\begin{itemize}
\item \cmdarg{-n $<$numprocs$>$}: Launch {\tt $<$numprocs$>$} copies
  of this executable.

\item \cmdarg{-arch $<$architecture$>$}: Launch the executable on
  nodes in the Open MPI universe that match this architecture. An
  architecture is determined to be a match if the {\tt
    $<$architecture$>$} matches any subset of the GNU Autoconf
  architecture string on each of the target nodes (the \cmd{laminfo}
  command shows the GNU Autoconf configure string).

\item \cmdarg{$<$other arguments$>$}: When \cmd{mpiexec} first
  encounters an argument that it does not recognize, the remainder of
  the arguments will be passed to the back-end \cmd{mpirun} to
  actually start the process.
\end{itemize}

The following example launches four copies of the
\cmd{my\_\-mpi\_\-program} executable in the Open MPI universe, using
default scheduling patterns:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpiexec -n 4 my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

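Global and local arguments can be combined on one command line, with
the global arguments first. As a sketch, assuming that the
\rpi{usysv} RPI module is available in the installation, the
following passes an SSI parameter (a global argument) ahead of the
\cmdarg{-n} local argument:

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Global argument (-ssi rpi usysv) first, then the local argument (-n 4)
shell$ mpiexec -ssi rpi usysv -n 4 my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $
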
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Launching MPMD Processes}

The ``\cmdarg{:}'' separator can be used to launch multiple
executables in the same MPI job. Specifically, all of the processes
will share a common \mpiconst{MPI\_\-COMM\_\-WORLD}. For example, the
following launches a single \cmd{manager} process as well as a
\cmd{worker} process for every CPU in the Open MPI universe:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpiexec -n 1 manager : C worker
\end{lstlisting}
% Stupid emacs mode: $

Paired with the \cmdarg{-arch} flag, this can be especially helpful in
heterogeneous environments:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpiexec -arch solaris sol_program : -arch linux linux_program
\end{lstlisting}
% Stupid emacs mode: $

Even only ``slightly heterogeneous'' environments can run into
problems with shared libraries, different compilers, etc. The
\cmdarg{-arch} flag can be used to differentiate between different
versions of the same operating system:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpiexec -arch solaris2.8 sol2.8_program : -arch solaris2.9 sol2.9_program
\end{lstlisting}
% Stupid emacs mode: $

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Launching MPI Processes with No Established Open MPI Universe}

The \cmdarg{-boot}, \cmdarg{-boot-args}, and \cmdarg{-machinefile} global
arguments can be used to launch the Open MPI RTE, run the MPI process(es),
and then take down the Open MPI RTE. This conveniently wraps up several
Open MPI commands and provides ``one-shot'' execution of MPI processes.
For example:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpiexec -machinefile hostfile C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

Some boot SSI modules do not require a hostfile; specifying the
\cmdarg{-boot} argument is sufficient in these cases:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpiexec -boot C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

When \cmd{mpiexec} is used to boot the Open MPI RTE, it will do its best to
take down the Open MPI RTE even if errors occur, either during the boot
itself, or if an MPI process aborts (or the user hits Control-C).

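For comparison, the one-shot \cmdarg{-machinefile} example above
corresponds roughly to the following manual sequence. This is a
sketch that assumes the same \file{hostfile} boot schema and that no
Open MPI universe is running beforehand; unlike \cmd{mpiexec}, it
does not take down the RTE automatically if a step fails.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Boot the RTE, run the MPI job, then take the RTE down again
shell$ lamboot hostfile
shell$ mpirun C my_mpi_program
shell$ lamhalt
\end{lstlisting}
% Stupid emacs mode: $
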
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{mpimsg} Command (Deprecated)}
\label{sec:commands-mpimsg}

The \cmd{mpimsg} command is deprecated. It is only useful in a small
number of cases (specifically, when the \rpi{lamd} RPI module is
used), and may disappear in future Open MPI releases.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{mpirun} Command}
\label{sec:commands-mpirun}

The \cmd{mpirun} command is the main mechanism to launch MPI processes
in parallel.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Simple Examples}

Although \cmd{mpirun} supports many different modes of execution, most
users will likely only need to use a few of its capabilities. It is
common to launch either one process per node or one process per CPU in
the Open MPI universe (CPU counts are established in the boot schema). The
following two examples show these two cases:

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Launch one copy of my_mpi_program on every schedulable node in the Open MPI universe
shell$ mpirun N my_mpi_program
\end{lstlisting}
% stupid emacs mode: $

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Launch one copy of my_mpi_program on every schedulable CPU in the Open MPI universe
shell$ mpirun C my_mpi_program
\end{lstlisting}
% stupid emacs mode: $

The specific number of processes that are launched can be controlled
with the \cmdarg{-np} switch:

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Launch four my_mpi_program processes
shell$ mpirun -np 4 my_mpi_program
\end{lstlisting}
% stupid emacs mode: $

The \cmdarg{-ssi} switch can be used to pass tunable parameters to
MPI processes:

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Use the usysv RPI module
shell$ mpirun -ssi rpi usysv C my_mpi_program
\end{lstlisting}
% stupid emacs mode: $

The available modules and their associated parameters are discussed in
detail in Chapter~\ref{sec:mpi-ssi}.

Arbitrary user arguments can also be passed to the user program.
\cmd{mpirun} will attempt to parse all options (looking for Open MPI
options) until it finds a \cmdarg{--}. All arguments following
\cmdarg{--} are directly passed to the MPI application.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Pass three command line arguments to every instance of my_mpi_program
shell$ mpirun -ssi rpi usysv C my_mpi_program arg1 arg2 arg3
# Pass three command line arguments, escaped from parsing
shell$ mpirun -ssi rpi usysv C my_mpi_program -- arg1 arg2 arg3
\end{lstlisting}
% stupid emacs mode: $

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Controlling Where Processes Are Launched}

\cmd{mpirun} allows for fine-grained control of where to schedule
launched processes. Note that Open MPI uses the term ``schedule''
extensively to indicate which nodes processes are launched on. Open MPI
does {\em not} influence operating system semantics for prioritizing
processes or binding processes to specific CPUs. The boot schema file
can be used to indicate how many CPUs are on a node, but this is only
used for scheduling purposes. For a fuller description of CPU counts in
boot schemas, see Sections~\ref{sec:getting-started-hostfile}
and~\ref{sec:lam-ssi-boot-schema} on
pages~\pageref{sec:getting-started-hostfile}
and~\pageref{sec:lam-ssi-boot-schema}, respectively.

Open MPI offers two main scheduling nomenclatures: by node and by CPU. For
example, \cmdarg{N} means ``all schedulable nodes in the universe''
(``schedulable'' is defined in
Section~\ref{sec:commands-lamboot-no-schedule}). Similarly,
\cmdarg{C} means ``all schedulable CPUs in the universe.''

More fine-grained control is also possible -- nodes and CPUs can be
individually identified, or identified by ranges. The syntax for
these concepts is \cmdarg{n$<$range$>$} and \cmdarg{c$<$range$>$},
respectively. \cmdarg{$<$range$>$} can specify one or more elements
by listing integers separated by commas and dashes. For example:

\begin{itemize}
\item \cmdarg{n3}: The node with an ID of 3.

\item \cmdarg{c2}: The CPU with an ID of 2.

\item \cmdarg{n2,4}: The nodes with IDs of 2 and 4.

\item \cmdarg{c2,4-7}: The CPUs with IDs of 2, 4, 5, 6, and 7. Note
  that some of these CPUs may be on the same node(s).
\end{itemize}

Integers can range from 0 to the highest numbered node/CPU. Note that
these nomenclatures can be mixed and matched on the \cmd{mpirun}
command line:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpirun n0 C manager-worker
\end{lstlisting}
% Stupid emacs mode: $

\noindent will launch the \cmd{manager-worker} program on \cmdarg{n0}
as well as on every schedulable CPU in the universe (yes, this means
that \cmdarg{n0} will likely be over-subscribed).

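As a further illustration of the range syntax, the following sketch
assumes a universe with at least four nodes and at least eight CPUs.
The first command launches one copy of \cmd{my\_\-mpi\_\-program} on
each of nodes 0 through 3; the second launches one copy on each of
CPUs 2, 4, 5, 6, and 7.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# One process on each of n0, n1, n2, and n3
shell$ mpirun n0-3 my_mpi_program
# One process on each of c2, c4, c5, c6, and c7
shell$ mpirun c2,4-7 my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $
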
When running on SMP nodes, it is preferable to use the
\cmdarg{C}/\cmdarg{c$<$range$>$} nomenclature (with appropriate CPU
counts in the boot schema) to the \cmdarg{N}/\cmdarg{n$<$range$>$}
nomenclature because of how Open MPI will order ranks in \mcw. For
example, consider an Open MPI universe of two four-way SMPs -- \cmdarg{n0}
and \cmdarg{n1} both have a CPU count of 4. Using the following:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpirun C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

\noindent will launch eight copies of \cmd{my\_\-mpi\_\-program}, four
on each node. Open MPI will place as many adjoining \mcw\ ranks on the
same node as possible: \mcw\ ranks 0-3 will be scheduled on
\cmdarg{n0} and \mcw\ ranks 4-7 will be scheduled on \cmdarg{n1}.
Specifically, \cmdarg{C} schedules processes starting with \cmdarg{c0}
and incrementing the CPU index number.

Note that unless otherwise specified, Open MPI schedules processes by CPU
(vs.\ scheduling by node). For example, using \cmd{mpirun}'s
\cmdarg{-np} switch to specify an absolute number of processes
schedules on a per-CPU basis.

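The difference between the two scheduling modes can be sketched with
the same universe of two four-way SMPs. This example assumes that CPU
indices are assigned node by node, so that \cmdarg{c0} through
\cmdarg{c3} are on \cmdarg{n0}; the placement noted in the comments
follows from the rules described above rather than from a recorded
run.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Scheduled by CPU: ranks 0-3 are expected to land on n0, leaving n1 idle
shell$ mpirun -np 4 my_mpi_program
# Scheduled by node: exactly one process on each of n0 and n1
shell$ mpirun N my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $
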
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Per-Process Controls}

\cmd{mpirun} allows for arbitrary, per-process controls such as
launching MPMD jobs, passing different command line arguments to
different \mcw\ ranks, etc. This is accomplished by creating a text
file called an application schema that lists, one per line, the
location, relevant flags, user executable, and command line arguments
for each process. For example (lines beginning with ``\#'' are
comments):

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Start the manager on c0 with a specific set of command line options
c0 manager manager_arg1 manager_arg2 manager_arg3
# Start the workers on all available CPUs with different arguments
C worker worker_arg1 worker_arg2 worker_arg3
\end{lstlisting}

Note that the \cmdarg{-ssi} switch is {\em not} permissible in
application schema files; \cmdarg{-ssi} flags are considered to be
global to the entire MPI job, not specified per-process. Application
schemas are described in more detail in the \file{appschema(5)} manual
page.

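An application schema is used by naming the file on the \cmd{mpirun}
command line. The following is a sketch only: it assumes that the
example schema above has been saved in a file with the hypothetical
name \file{my\_\-appschema}, and that \cmd{mpirun} accepts the schema
file name in place of the usual location and program arguments. The
\file{appschema(5)} manual page remains the authoritative reference.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Launch the manager and workers listed in the application schema file
shell$ mpirun my_appschema
\end{lstlisting}
% Stupid emacs mode: $
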
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Ability to Pass Environment Variables}

All environment variables with names that begin with
\envvar{Open MPI\_\-MPI\_} are automatically passed to remote nodes (unless
disabled via the \cmdarg{-nx} option to \cmd{mpirun}). Additionally,
the \cmdarg{-x} option enables exporting of specific environment
variables to the remote nodes:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ Open MPI_MPI_FOO="green eggs and ham"
shell$ export Open MPI_MPI_FOO
shell$ mpirun C -x DISPLAY,SEUSS=author samIam
\end{lstlisting}
% Stupid emacs mode: $

This will launch the \cmd{samIam} application on all available CPUs.
The \envvar{Open MPI\_\-MPI\_\-FOO}, \envvar{DISPLAY}, and \envvar{SEUSS}
environment variables will be created in each process environment
before the \cmd{samIam} program is invoked.

Note that the parser for the \cmdarg{-x} option is currently not very
sophisticated -- it cannot even handle quoted values when defining new
environment variables. Users are advised to set variables in the
environment prior to invoking \cmd{mpirun}, and only use \cmdarg{-x} to
export the variables to the remote nodes (not to define new
variables), if possible.

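Following that advice, a value containing spaces can be set and
quoted by the shell first, and then exported by name only. This
sketch assumes a Bourne-like shell and reuses the \envvar{SEUSS}
variable and the \cmd{samIam} program from the example above:

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Let the shell handle the quoting, then export the variable by name
shell$ SEUSS="green eggs and ham"
shell$ export SEUSS
shell$ mpirun C -x SEUSS samIam
\end{lstlisting}
% Stupid emacs mode: $
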
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Current Working Directory Behavior}

The \cmdarg{-wd} option to \cmd{mpirun} specifies an arbitrary
working directory for the launched processes. It can also
be used in application schema files to specify working directories on
specific nodes and/or for specific applications.

If the \cmdarg{-wd} option appears both in an application schema file
and on the command line, the schema file directory will override the
command line value. \cmdarg{-wd} is mutually exclusive with \cmdarg{-D}.

If neither \cmdarg{-wd} nor \cmdarg{-D} is specified, the local node
will send the present working directory name from the \cmd{mpirun}
process to each of the remote nodes. The remote nodes will then try
to change to that directory. If they fail (e.g., if the directory
does not exist on that node), they will start from the user's home
directory.

All directory changing occurs before the user's program is invoked; it
does not wait until \mpifunc{MPI\_\-INIT} is called.

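As a sketch of the command line form, the following runs every
process from one explicitly chosen directory. The directory name
\file{/tmp/my\_\-run\_\-dir} is hypothetical and is assumed to exist
on every node in the universe.

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Start every process in the same explicitly named working directory
shell$ mpirun -wd /tmp/my_run_dir C my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $
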
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{mpitask} Command}
\label{sec:commands-mpitask}

The \cmd{mpitask} command shows a list of the processes running in the
Open MPI universe and a snapshot of their current MPI activity. It is
usually invoked with no command line parameters, thereby showing
summary details of all processes currently running.
%
Since \cmd{mpitask} only provides a snapshot view, it is not advisable
to use \cmd{mpitask} as a high-resolution debugger (see
Chapter~\ref{sec:debugging}, page~\pageref{sec:debugging}, for more
details on debugging MPI programs). Instead, \cmd{mpitask} can be
used to provide answers to high-level questions such as ``Where is my
program hung?'' and ``Is my program making progress?''

The following example shows an MPI program running on four nodes,
sending a message of 524,288 integers around in a ring pattern.
Process 0 is running (i.e., not in an MPI function), while the other
three are blocked in \mpifunc{MPI\_\-RECV}.

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ mpitask
TASK (G/L)    FUNCTION      PEER|ROOT  TAG    COMM   COUNT   DATATYPE
0 ring        <running>
1/1 ring      Recv          0/0        201    WORLD  524288  INT
2/2 ring      Recv          1/1        201    WORLD  524288  INT
3/3 ring      Recv          2/2        201    WORLD  524288  INT
\end{lstlisting}
% Stupid emacs mode: $

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{recon} Command}
\label{sec:commands-recon}

The \cmd{recon} command is a quick test to see if the user's
environment is set up properly to boot the Open MPI RTE. It takes most of
the same parameters as the \cmd{lamboot} command.

Although it does not boot the RTE, and does not definitively guarantee
that \cmd{lamboot} will succeed, it is a good tool for testing while
setting up first-time Open MPI users. \cmd{recon} will display a
message when it has completed indicating whether it succeeded or
failed.

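Because \cmd{recon} takes most of the same parameters as
\cmd{lamboot}, a typical invocation mirrors the intended
\cmd{lamboot} command. As a sketch, assuming a boot schema file named
\file{hostfile} (the same name used in the \cmd{mpiexec} examples):

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Check the environment for the nodes listed in the boot schema
shell$ recon hostfile
\end{lstlisting}
% Stupid emacs mode: $
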
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{tping} Command}
\label{sec:commands-tping}

The \cmd{tping} command can be used to verify the functionality of an
Open MPI universe. It is used to send a ping message between the Open MPI
daemons that constitute the Open MPI RTE.

It commonly takes two arguments: the set of nodes to ping (expressed
in \cmdarg{N} notation) and how many times to ping them. Similar to the
Unix \cmd{ping} command, if the number of times to ping is not
specified, \cmd{tping} will continue until it is stopped (usually by
the user hitting Control-C). The following example pings all nodes in
the Open MPI universe three times:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ tping N -c 3
1 byte from 3 remote nodes and 1 local node: 0.002 secs
1 byte from 3 remote nodes and 1 local node: 0.001 secs
1 byte from 3 remote nodes and 1 local node: 0.001 secs

3 messages, 3 bytes (0.003K), 0.005 secs (1.250K/sec)
roundtrip min/avg/max: 0.001/0.002/0.002
\end{lstlisting}
% Stupid emacs mode: $

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{The \icmd{lamwipe} Command}
\label{sec:commands-lamwipe}

\changebegin{7.1}
The \cmd{lamwipe} command used to be called \idepcmd{wipe}. The name
\idepcmd{wipe} is now deprecated; although it still works in
this version of Open MPI, it will be removed in future versions. All
users are encouraged to start using \cmd{lamwipe} instead.
\changeend{7.1}

The \cmd{lamwipe} command is used as a ``last resort'' command, and is
typically only necessary if \cmd{lamhalt} fails. This usually only
occurs in error conditions, such as if a node fails. The
\cmd{lamwipe} command takes most of the same parameters as the
\cmd{lamboot} command -- it launches a process on each node in the
boot schema to kill the Open MPI RTE on that node. Hence, it should be
used with the same (or an equivalent) boot schema file as was used
with \cmd{lamboot}.

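
As a sketch, assuming that the universe was originally booted with a
boot schema file named \file{hostfile}, the same file is given to
\cmd{lamwipe}:

\lstset{style=lam-cmdline}
\begin{lstlisting}
# Kill the RTE on every node listed in the original boot schema
shell$ lamwipe hostfile
\end{lstlisting}
% Stupid emacs mode: $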