% -*- latex -*-
%
% Copyright (c) 2004-2005 The Trustees of Indiana University.
%                         All rights reserved.
% Copyright (c) 2004-2005 The Trustees of the University of Tennessee.
%                         All rights reserved.
% Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
%                         University of Stuttgart.  All rights reserved.
% Copyright (c) 2004-2005 The Regents of the University of California.
%                         All rights reserved.
% $COPYRIGHT$
%
% Additional copyrights may follow
%
% $HEADER$
%

\chapter{Miscellaneous}
\label{sec:misc}

This chapter covers a variety of topics that don't conveniently fit
into other chapters.

{\Huge JMS Needs a lot of overhauling}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Singleton MPI Processes}

It is possible to run an MPI process without the \cmd{mpirun} or
\cmd{mpiexec} commands -- simply run the program as one would
normally launch a serial program:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

Doing so will create an \mcw with a single process.  This process can
either run by itself, or spawn or connect to other MPI processes and
become part of a larger MPI job using the MPI-2 dynamic process
functions.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI-2 I/O Support}
\index{ROMIO}
\index{MPI-2 I/O support|see {ROMIO}}
\index{I/O support|see {ROMIO}}

MPI-2 I/O support is provided through the ROMIO
package~\cite{thak99a,thak99b}.  ROMIO has been fully integrated into
Open MPI.  As such, the non-blocking MPI-2 I/O functions return
genuine \mpitype{MPI\_\-Request} objects, which can be tested and
completed with the standard \mpifunc{MPI\_\-TEST} and
\mpifunc{MPI\_\-WAIT} families of functions.

ROMIO includes its own documentation and listings of known issues and
limitations.  See the \file{README} file in the ROMIO directory in
the Open MPI distribution.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Fortran Process Names}
\index{fortran process names}
\cmdindex{mpitask}{fortran process names}

Since Fortran does not portably provide the executable name of the
process (in the way that C programs receive an array of {\tt argv}),
the \icmd{mpitask} command lists the name ``LAM MPI Fortran program''
by default for MPI programs that used the Fortran binding for
\mpifunc{MPI\_\-INIT} or \mpifunc{MPI\_\-INIT\_\-THREAD}.

The environment variable \ienvvar{LAM\_\-MPI\_\-PROCESS\_\-NAME} can
be used to override this behavior.
%
Setting this environment variable before invoking \icmd{mpirun} will
cause \cmd{mpitask} to list that name instead of the default title.
%
This environment variable only works for processes that invoke the
Fortran binding for \mpifunc{MPI\_\-INIT} or
\mpifunc{MPI\_\-INIT\_\-THREAD}.
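For example, under a Bourne-style shell, the following commands cause
\cmd{mpitask} to list the (hypothetical) name
\cmd{my\_\-fortran\_\-app} instead of the default title:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ LAM_MPI_PROCESS_NAME=my_fortran_app
shell$ export LAM_MPI_PROCESS_NAME
shell$ mpirun -np 4 my_fortran_app
\end{lstlisting}
% Stupid emacs mode: $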
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI Thread Support}
\label{sec:misc-threads}
\index{threads and MPI}
\index{MPI and threads|see {threads and MPI}}

\def\mtsingle{\mpiconst{MPI\_\-THREAD\_\-SINGLE}}
\def\mtfunneled{\mpiconst{MPI\_\-THREAD\_\-FUNNELED}}
\def\mtserial{\mpiconst{MPI\_\-THREAD\_\-SERIALIZED}}
\def\mtmultiple{\mpiconst{MPI\_\-THREAD\_\-MULTIPLE}}
\def\mpiinit{\mpifunc{MPI\_\-INIT}}
\def\mpiinitthread{\mpifunc{MPI\_\-INIT\_\-THREAD}}

Open MPI currently implements support for \mtsingle, \mtfunneled, and
\mtserial.  The constant \mtmultiple\ is provided, although Open MPI
will never return \mtmultiple\ in the \funcarg{provided} argument to
\mpiinitthread.

Open MPI makes no distinction between \mtsingle\ and \mtfunneled.
When \mtserial\ is used, a global lock ensures that only one thread
is inside any MPI function at any time.

\subsection{Thread Level}

Selecting the thread level for an MPI job is best described in terms
of the two parameters passed to \mpiinitthread:
\funcarg{requested} and \funcarg{provided}.  \funcarg{requested} is
the thread level that the user application requests, while
\funcarg{provided} is the thread level that Open MPI will run the
application with.

\begin{itemize}
\item If \mpiinit\ is used to initialize the job,
  \funcarg{requested} will implicitly be \mtsingle.  However, if the
  \ienvvar{LAM\_\-MPI\_\-THREAD\_\-LEVEL} environment variable is set
  to one of the values in Table~\ref{tbl:mpi-env-thread-level}, the
  corresponding thread level will be used for \funcarg{requested}.

\item If \mpiinitthread\ is used to initialize the job, the
  \funcarg{requested} thread level is the first thread level that the
  job will attempt to use.  There is currently no way to specify
  lower or upper bounds on the thread level that Open MPI will use.

  The resulting thread level is largely determined by the SSI modules
  that will be used in an MPI job; each module must be able to
  support the target thread level.  A complex algorithm is used to
  attempt to find a thread level that is acceptable to all SSI
  modules.  Generally, the algorithm starts at \funcarg{requested}
  and works backwards towards \mtsingle\ looking for an acceptable
  level.  However, any module may {\em increase} the thread level
  under test if it requires it.  At the end of this process, if an
  acceptable thread level is not found, the MPI job will abort.
\end{itemize}

\begin{table}[htbp]
  \centering
  \begin{tabular}{|c|l|}
    \hline
    Value & \multicolumn{1}{|c|}{Meaning} \\
    \hline
    \hline
    undefined & \mtsingle \\
    0 & \mtsingle \\
    1 & \mtfunneled \\
    2 & \mtserial \\
    3 & \mtmultiple \\
    \hline
  \end{tabular}
  \caption{Valid values for the
    \envvar{LAM\_\-MPI\_\-THREAD\_\-LEVEL} environment variable.}
  \label{tbl:mpi-env-thread-level}
\end{table}

Also note that certain SSI modules require higher thread support
levels than others.  For example, any checkpoint/restart SSI module
will require a minimum of \mtserial, and will attempt to adjust the
thread level upwards as necessary (if that CR module will be used
during the job).

Hence, using \mpiinit\ to initialize an MPI job does not imply that
the provided thread level will be \mtsingle.
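For example, the following minimal sketch (using only standard MPI
calls) requests \mtserial\ and checks the thread level that was
actually provided:

\begin{lstlisting}
/* Minimal sketch: request MPI_THREAD_SERIALIZED and check the
   thread level actually provided by the MPI implementation. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
    if (provided < MPI_THREAD_SERIALIZED) {
        /* A lower level was provided; fall back to a
           single-threaded code path. */
        printf("Provided thread level: %d\n", provided);
    }

    /* ... application code ... */

    MPI_Finalize();
    return 0;
}
\end{lstlisting}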
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI-2 Name Publishing}
\index{published names}
\index{dynamic name publishing|see {published names}}
\index{name publishing|see {published names}}

Open MPI supports the MPI-2 functions
\mpifunc{MPI\_\-PUBLISH\_\-NAME} and
\mpifunc{MPI\_\-UNPUBLISH\_\-NAME} for publishing and unpublishing
names, respectively.  Published names are stored within the Open MPI
daemons, and are therefore persistent, even when the MPI process that
published them dies.

As such, it is important for correct MPI programs to unpublish their
names before they terminate.  However, if stale names are left in the
Open MPI universe when an MPI process terminates, the \icmd{lamclean}
command can be used to clean {\em all} names from the Open MPI RTE.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Batch Queuing System Support}
\label{sec:misc-batch}
\index{batch queue systems}
\index{Portable Batch System|see {batch queue systems}}
\index{PBS|see {batch queue systems}}
\index{PBS Pro|see {batch queue systems}}
\index{OpenPBS|see {batch queue systems}}
\index{Load Sharing Facility|see {batch queue systems}}
\index{LSF|see {batch queue systems}}
\index{Clubmask|see {batch queue systems}}

Open MPI is now aware of some batch queuing systems.  Support is
currently included for PBS, LSF, and Clubmask-based systems.  There
is also a generic mechanism that allows users of other batch queue
systems to take advantage of this functionality.

\begin{itemize}
\item When running under a supported batch queue system, Open MPI
  will take precautions to isolate itself from other instances of
  Open MPI in concurrent batch jobs.  That is, multiple Open MPI
  instances from the same user can exist on the same machine when
  executing in batch.  This allows a user to submit as many Open MPI
  jobs as necessary, and even if they end up running on the same
  nodes, a \cmd{lamclean} in one job will not kill MPI applications
  in another job.

\item This behavior is {\em only} exhibited under a batch
  environment.  Manually setting the environment variable
  \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} on the node where
  \icmd{lamboot} is run achieves the same ends (see the example
  after this list).

\item Other batch systems can easily be supported -- let the Open MPI
  Team know if you'd like to see support for others included.
\end{itemize}
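For example, the following Bourne-shell commands manually set a
session suffix before booting the run-time environment (the suffix
\cmd{job1} and the boot schema file \file{hostfile} are arbitrary
placeholders):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ LAM_MPI_SESSION_SUFFIX=job1
shell$ export LAM_MPI_SESSION_SUFFIX
shell$ lamboot hostfile
\end{lstlisting}
% Stupid emacs mode: $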
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Location of Open MPI's Session Directory}
\label{sec:misc-session-directory}
\index{session directory}

By default, Open MPI will create a temporary per-user session
directory in the following location:

\centerline{\file{<tmpdir>/lam-<username>@<hostname>[-<suffix>]}}

\noindent Each of the components is described below:

\begin{description}
\item[\file{<tmpdir>}]: Open MPI will set the prefix used for the
  session directory based on the following search order:

  \begin{enumerate}
  \item The value of the \ienvvar{LAM\_\-MPI\_\-SESSION\_\-PREFIX}
    environment variable
  \item The value of the \ienvvar{TMPDIR} environment variable
  \item \file{/tmp/}
  \end{enumerate}

  It is important to note that (unlike
  \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX}), the environment
  variables for determining \file{<tmpdir>} must be set on each node
  (although they do not necessarily have to be the same value).
  \file{<tmpdir>} must exist before \icmd{lamboot} is run, or
  \icmd{lamboot} will fail.

\item[\file{<username>}]: The user's name on that host.

\item[\file{<hostname>}]: The hostname.

\item[\file{<suffix>}]: Open MPI will set the suffix (if any) used
  for the session directory based on the following search order:

  \begin{enumerate}
  \item The value of the \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX}
    environment variable.
  \item If running under a supported batch system, a unique session
    ID (based on information from the batch system) will be used.
  \end{enumerate}
\end{description}

\ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} and the batch information
only need to be available on the node from which \icmd{lamboot} is
run.  \icmd{lamboot} will propagate the information to the other
nodes.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Signal Catching}
\index{signals}

Open MPI catches the signals SEGV, BUS, FPE, and ILL.  The signal
handler terminates the application.  This is useful in batch jobs to
help ensure that \icmd{mpirun} returns if an application process
dies.  To disable the catching of signals, use the \cmdarg{-nsigs}
option to \icmd{mpirun}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI Attributes}

\begin{discuss}
Need to have discussion of built-in attributes here, such as
MPI\_\-UNIVERSE\_\-SIZE, etc.  Should specifically mention that
MPI\_\-UNIVERSE\_\-SIZE is fixed at \mpifunc{MPI\_\-INIT} time (at
least it is as of this writing -- who knows what it will be when we
release 7.1? :-).

This whole section is for 7.1.
\end{discuss}
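Until that discussion is written, the following minimal sketch shows
how an application can query the built-in
\mpiconst{MPI\_\-UNIVERSE\_\-SIZE} attribute using only standard
MPI-2 calls:

\begin{lstlisting}
/* Minimal sketch: query the built-in MPI_UNIVERSE_SIZE attribute on
   MPI_COMM_WORLD.  The attribute may not be set in all environments,
   hence the flag check. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int *universe_size;
    int flag;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE,
                      &universe_size, &flag);
    if (flag) {
        printf("MPI_UNIVERSE_SIZE = %d\n", *universe_size);
    }
    MPI_Finalize();
    return 0;
}
\end{lstlisting}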