% -*- latex -*-
%
% Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
%                         University Research and Technology
%                         Corporation.  All rights reserved.
% Copyright (c) 2004-2005 The University of Tennessee and The University
%                         of Tennessee Research Foundation.  All rights
%                         reserved.
% Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
%                         University of Stuttgart.  All rights reserved.
% Copyright (c) 2004-2005 The Regents of the University of California.
%                         All rights reserved.
% $COPYRIGHT$
%
% Additional copyrights may follow
%
% $HEADER$
%

\chapter{Miscellaneous}
\label{sec:misc}

This chapter covers a variety of topics that do not conveniently fit
into other chapters.

{\Huge JMS Needs a lot of overhauling}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Singleton MPI Processes}

It is possible to run an MPI process without the \cmd{mpirun} or
\cmd{mpiexec} commands -- simply run the program as one would normally
launch a serial program:

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ my_mpi_program
\end{lstlisting}
% Stupid emacs mode: $

Doing so will create an \mcw with a single process.  This process can
either run by itself, or spawn or connect to other MPI processes and
become part of a larger MPI job using the MPI-2 dynamic function
calls.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI-2 I/O Support}
\index{ROMIO}
\index{MPI-2 I/O support|see {ROMIO}}
\index{I/O support|see {ROMIO}}

MPI-2 I/O support is provided through the ROMIO
package~\cite{thak99a,thak99b}.  ROMIO has been fully integrated into
Open MPI.  As such, \mpitype{MPI\_\-Request} objects can be used .....

ROMIO includes its own documentation and listings of known issues and
limitations.  See the \file{README} file in the ROMIO directory in
the Open MPI distribution.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Fortran Process Names}
\index{fortran process names}
\cmdindex{mpitask}{fortran process names}

Since Fortran does not portably provide the executable name of the
process (in the way that C programs receive an array of {\tt argv}),
the \icmd{mpitask} command lists the name ``Open MPI Fortran
program'' by default for MPI programs that used the Fortran binding
for \mpifunc{MPI\_\-INIT} or \mpifunc{MPI\_\-INIT\_\-THREAD}.

The environment variable \ienvvar{LAM\_\-MPI\_\-PROCESS\_\-NAME} can
be used to override this behavior.
%
Setting this environment variable before invoking \icmd{mpirun} will
cause \cmd{mpitask} to list that name instead of the default title.
%
This environment variable only works for processes that invoke the
Fortran binding for \mpifunc{MPI\_\-INIT} or
\mpifunc{MPI\_\-INIT\_\-THREAD}.
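For example, the following Bourne-style shell commands are a minimal
sketch of this mechanism (the program name \cmd{my\_fortran\_program}
and the label ``ocean-model'' are hypothetical):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ LAM_MPI_PROCESS_NAME=ocean-model
shell$ export LAM_MPI_PROCESS_NAME
shell$ mpirun -np 4 my_fortran_program
\end{lstlisting}
% Stupid emacs mode: $

While this job is running, \cmd{mpitask} will list each of its
processes under the name ``ocean-model'' rather than the default
title.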
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI Thread Support}
\label{sec:misc-threads}
\index{threads and MPI}
\index{MPI and threads|see {threads and MPI}}

\def\mtsingle{\mpiconst{MPI\_\-THREAD\_\-SINGLE}}
\def\mtfunneled{\mpiconst{MPI\_\-THREAD\_\-FUNNELED}}
\def\mtserial{\mpiconst{MPI\_\-THREAD\_\-SERIALIZED}}
\def\mtmultiple{\mpiconst{MPI\_\-THREAD\_\-MULTIPLE}}
\def\mpiinit{\mpifunc{MPI\_\-INIT}}
\def\mpiinitthread{\mpifunc{MPI\_\-INIT\_\-THREAD}}

Open MPI currently implements support for \mtsingle, \mtfunneled, and
\mtserial.  The constant \mtmultiple\ is provided, although Open MPI
will never return \mtmultiple\ in the \funcarg{provided} argument to
\mpiinitthread.

Open MPI makes no distinction between \mtsingle\ and \mtfunneled.
When \mtserial\ is used, a global lock ensures that only one thread
is inside any MPI function at any time.

\subsection{Thread Level}

Selecting the thread level for an MPI job is best described in terms
of the two parameters passed to \mpiinitthread:
\funcarg{requested} and \funcarg{provided}.  \funcarg{requested} is
the thread level that the user application requests, while
\funcarg{provided} is the thread level that Open MPI will run the
application with.

\begin{itemize}
\item If \mpiinit\ is used to initialize the job,
  \funcarg{requested} will implicitly be \mtsingle.  However, if the
  \ienvvar{LAM\_\-MPI\_\-THREAD\_\-LEVEL} environment variable is set
  to one of the values in Table~\ref{tbl:mpi-env-thread-level}, the
  corresponding thread level will be used for \funcarg{requested}.

\item If \mpiinitthread\ is used to initialize the job, the
  \funcarg{requested} thread level is the first thread level that the
  job will attempt to use.  There is currently no way to specify
  lower or upper bounds to the thread level that Open MPI will use.

  The resulting thread level is largely determined by the SSI modules
  that will be used in an MPI job; each module must be able to
  support the target thread level.  A complex algorithm is used to
  attempt to find a thread level that is acceptable to all SSI
  modules.  Generally, the algorithm starts at \funcarg{requested}
  and works backwards towards \mtsingle\ looking for an acceptable
  level.  However, any module may {\em increase} the thread level
  under test if it requires it.  At the end of this process, if an
  acceptable thread level is not found, the MPI job will abort.
\end{itemize}

\begin{table}[htbp]
  \centering
  \begin{tabular}{|c|l|}
    \hline
    Value & \multicolumn{1}{|c|}{Meaning} \\
    \hline
    \hline
    undefined & \mtsingle \\
    0 & \mtsingle \\
    1 & \mtfunneled \\
    2 & \mtserial \\
    3 & \mtmultiple \\
    \hline
  \end{tabular}
  \caption{Valid values for the
    \envvar{LAM\_\-MPI\_\-THREAD\_\-LEVEL} environment variable.}
  \label{tbl:mpi-env-thread-level}
\end{table}

Also note that certain SSI modules require higher thread support
levels than others.  For example, any checkpoint/restart SSI module
will require a minimum of \mtserial, and will attempt to adjust the
thread level upwards as necessary (if that CR module will be used
during the job).  Hence, using \mpiinit\ to initialize an MPI job
does not imply that the provided thread level will be \mtsingle.
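As a concrete illustration, the following minimal C sketch requests
\mtserial\ and then checks the \funcarg{provided} result (the printed
message is arbitrary):

\begin{lstlisting}
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
  int provided;

  /* Request MPI_THREAD_SERIALIZED; the level actually granted is
     returned in "provided" and may differ from the request */
  MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);

  if (MPI_THREAD_SERIALIZED != provided) {
    printf("Did not get MPI_THREAD_SERIALIZED (got %d)\n", provided);
  }

  MPI_Finalize();
  return 0;
}
\end{lstlisting}

Since Open MPI never provides \mtmultiple, a program that genuinely
requires it must check \funcarg{provided} in this manner and abort or
adapt accordingly.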
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI-2 Name Publishing}
\index{published names}
\index{dynamic name publishing|see {published names}}
\index{name publishing|see {published names}}

Open MPI supports the MPI-2 functions \mpifunc{MPI\_\-PUBLISH\_\-NAME}
and \mpifunc{MPI\_\-UNPUBLISH\_\-NAME} for publishing and
unpublishing names, respectively.  Published names are stored within
the Open MPI daemons, and are therefore persistent, even when the MPI
process that published them dies.

As such, it is important for correct MPI programs to unpublish their
names before they terminate.  However, if stale names are left in the
Open MPI universe when an MPI process terminates, the \icmd{lamclean}
command can be used to clean {\em all} names from the Open MPI RTE.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Batch Queuing System Support}
\label{sec:misc-batch}
\index{batch queue systems}
\index{Portable Batch System|see {batch queue systems}}
\index{PBS|see {batch queue systems}}
\index{PBS Pro|see {batch queue systems}}
\index{OpenPBS|see {batch queue systems}}
\index{Load Sharing Facility|see {batch queue systems}}
\index{LSF|see {batch queue systems}}
\index{Clubmask|see {batch queue systems}}

Open MPI is now aware of some batch queuing systems.  Support is
currently included for PBS, LSF, and Clubmask-based systems.  There
is also generic functionality that allows users of other batch queue
systems to take advantage of this support.

\begin{itemize}
\item When running under a supported batch queue system, Open MPI
  will take precautions to isolate itself from other instances of
  Open MPI in concurrent batch jobs.  That is, multiple Open MPI
  instances from the same user can exist on the same machine when
  executing in batch.  This allows a user to submit as many Open MPI
  jobs as necessary, and even if they end up running on the same
  nodes, a \cmd{lamclean} in one job will not kill MPI applications
  in another job.

\item This behavior is {\em only} exhibited under a batch
  environment.  Other batch systems can easily be supported -- let
  the Open MPI Team know if you'd like to see support for others
  included.  Manually setting the environment variable
  \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} on the node where
  \icmd{lamboot} is run achieves the same ends; see the example
  below.
\end{itemize}
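For example, the following Bourne-style shell commands are a sketch
of manually isolating an Open MPI session (the suffix value
\cmd{job1234} and the boot schema file \file{my\_hostfile} are
hypothetical -- any suffix that is unique among concurrent jobs will
do):

\lstset{style=lam-cmdline}
\begin{lstlisting}
shell$ LAM_MPI_SESSION_SUFFIX=job1234
shell$ export LAM_MPI_SESSION_SUFFIX
shell$ lamboot my_hostfile
\end{lstlisting}
% Stupid emacs mode: $

The resulting session directory (described in the next section) will
include this suffix, keeping this instance separate from any other
Open MPI instance on the same nodes.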
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Location of Open MPI's Session Directory}
\label{sec:misc-session-directory}
\index{session directory}

By default, Open MPI will create a temporary per-user session
directory in the following directory:

\centerline{\file{<prefix>/lam-<user>@<hostname>[-<suffix>]}}

\noindent Each of the components is described below:

\begin{description}
\item[\file{<prefix>}]: Open MPI will set the prefix used for the
  session directory based on the following search order:

  \begin{enumerate}
  \item The value of the \ienvvar{LAM\_\-MPI\_\-SESSION\_\-PREFIX}
    environment variable
  \item The value of the \ienvvar{TMPDIR} environment variable
  \item \file{/tmp/}
  \end{enumerate}

  It is important to note that (unlike
  \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX}), the environment
  variables for determining \file{<prefix>} must be set on each node
  (although they do not necessarily have to be the same value).
  \file{<prefix>} must exist before \icmd{lamboot} is run, or
  \icmd{lamboot} will fail.

\item[\file{<user>}]: The user's name on that host.

\item[\file{<hostname>}]: The hostname.

\item[\file{<suffix>}]: Open MPI will set the suffix (if any) used
  for the session directory based on the following search order:

  \begin{enumerate}
  \item The value of the \ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX}
    environment variable.
  \item If running under a supported batch system, a unique session
    ID (based on information from the batch system) will be used.
  \end{enumerate}
\end{description}

\ienvvar{LAM\_\-MPI\_\-SESSION\_\-SUFFIX} and the batch information
only need to be available on the node from which \icmd{lamboot} is
run.  \icmd{lamboot} will propagate the information to the other
nodes.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Signal Catching}
\index{signals}

Open MPI catches the signals SIGSEGV, SIGBUS, SIGFPE, and SIGILL.
The signal handler terminates the application.  This is useful in
batch jobs to help ensure that \icmd{mpirun} returns if an
application process dies.  To disable the catching of signals, use
the \cmdarg{-nsigs} option to \icmd{mpirun}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{MPI Attributes}

\begin{discuss}
  Need to have discussion of built-in attributes here, such as
  MPI\_\-UNIVERSE\_\-SIZE, etc.  Should specifically mention that
  MPI\_\-UNIVERSE\_\-SIZE is fixed at \mpifunc{MPI\_\-INIT} time (at
  least it is as of this writing -- who knows what it will be when we
  release 7.1?  :-).

  This whole section is for 7.1.
\end{discuss}
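In the meantime, the following minimal C sketch shows how a program
can query the built-in \mpiconst{MPI\_\-UNIVERSE\_\-SIZE} attribute.
This is standard MPI-2 attribute usage, not anything specific to Open
MPI:

\begin{lstlisting}
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
  int *universe_size;
  int flag;

  MPI_Init(&argc, &argv);

  /* MPI_UNIVERSE_SIZE is a predefined attribute on MPI_COMM_WORLD;
     for predefined attributes, the value returned is a pointer to
     an int, and "flag" indicates whether the attribute was set */
  MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE,
                    &universe_size, &flag);
  if (flag) {
    printf("MPI_UNIVERSE_SIZE is %d\n", *universe_size);
  }

  MPI_Finalize();
  return 0;
}
\end{lstlisting}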