% -*- latex -*-
%
% Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
% University Research and Technology
% Corporation. All rights reserved.
% Copyright (c) 2004-2005 The University of Tennessee and The University
% of Tennessee Research Foundation. All rights
% reserved.
% Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
% University of Stuttgart. All rights reserved.
% Copyright (c) 2004-2005 The Regents of the University of California.
% All rights reserved.
% $COPYRIGHT$
%
% Additional copyrights may follow
%
% $HEADER$
%
\chapter{Debugging Parallel Programs}
\label{sec:debug}
\label{sec:debugging}
\index{debuggers|(}
{\Huge JMS this section is not bad, but needs a little revising (see
  notes below)}

Open MPI supports multiple methods of debugging parallel programs.
The following notes and observations generally apply to debugging in
parallel:
\begin{itemize}
\item Note that most debuggers require that the MPI application was
  compiled with debugging support enabled.  This typically entails
  adding \cmdarg{-g} to the compile and link lines when building your
  MPI application (see the example after this list).

\item Unless you specifically need it, it is not recommended to
  compile Open MPI itself with \cmdarg{-g}.  Leaving Open MPI without
  debugging symbols allows you to treat MPI function calls as atomic
  instructions; the debugger will not step into Open MPI's internals.
\item Even when debugging in parallel, it is possible that not all MPI
processes will execute exactly the same code. For example, ``if''
statements that are based upon a communicator's rank of the calling
process, or other location-specific information may cause different
execution paths in each MPI process.
\end{itemize}
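
For example, a debug build of a C MPI application typically looks
something like the following (using Open MPI's \cmd{mpicc} wrapper
compiler; the file and program names shown here are hypothetical):

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpicc -g -c my_mpi_program.c
shell$ mpicc -g -o my_mpi_program my_mpi_program.o
\end{lstlisting}
% Stupid emacs mode: $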
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Naming MPI Objects}
Open MPI supports the MPI-2 functions {\sf
MPI\_\-$<$type$>$\_\-SET\_\-NAME} and {\sf
MPI\_\-$<$type$>$\_\-GET\_\-NAME}, where {\sf $<$type$>$} can be:
{\sf COMM}, {\sf WIN}, or {\sf TYPE}. Hence, you can associate
relevant text names with communicators, windows, and datatypes (e.g.,
``6x13x12 molecule datatype'', ``Local group reduction
intracommunicator'', ``Spawned worker intercommunicator''). The use
of these functions is strongly encouraged while debugging MPI
applications.  Since they are constant-time, one-time setup calls,
using these functions is unlikely to impact performance, and they
should be safe to use in production environments, too.

The rationale for using these functions is to allow Open MPI (and
supported debuggers, profilers, and other MPI diagnostic tools) to
display accurate information about MPI communicators, windows, and
datatypes. For example, whenever a communicator name is available,
Open MPI will use it in relevant error messages; when names are not
available, communicators (and windows and types) are identified by
index number, which -- depending on the application -- may vary
between successive runs. The TotalView parallel debugger will also
show communicator names (if available) when displaying the message
queues.
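
For example, a minimal sketch of naming a duplicated communicator from
C (the communicator and the name used here are purely illustrative):

\begin{lstlisting}
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Comm comm;
    MPI_Init(&argc, &argv);

    /* Duplicate MPI_COMM_WORLD and give the new communicator a
       descriptive name that debuggers and Open MPI error messages
       can display */
    MPI_Comm_dup(MPI_COMM_WORLD, &comm);
    MPI_Comm_set_name(comm, "Local group reduction intracommunicator");

    /* ... application code using comm ... */

    MPI_Comm_free(&comm);
    MPI_Finalize();
    return 0;
}
\end{lstlisting}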
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{TotalView Parallel Debugger}
\label{sec:debug-totalview}
TotalView is a commercial debugger from Etnus that supports debugging
MPI programs in parallel. That is, with supported MPI
implementations, the TotalView debugger can automatically attach to
one or more MPI processes in a parallel application.

Open MPI now supports basic debugging functionality with the TotalView
debugger. Specifically, Open MPI supports TotalView attaching to one
or more MPI processes, as well as viewing the MPI message queues in
supported RPI modules.

This section provides some general tips and suggested use of TotalView
with Open MPI.  It is {\em not} intended to replace the TotalView
documentation in any way.  {\bf Be sure to consult the TotalView
  documentation for more information and details than are provided
  here.}

Note: TotalView is a licensed product provided by Etnus.  You need to
have TotalView installed properly before you can use it with Open
MPI.\footnote{Refer to \url{http://www.etnus.com/} for more
  information about TotalView.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Attaching TotalView to MPI Processes}
\index{TotalView parallel debugger}
\index{debuggers!TotalView}
Open MPI does not need to be configured or compiled in any special way
to allow TotalView to attach to MPI processes.

You can attach TotalView to MPI processes started by \icmd{mpirun} /
\icmd{mpiexec} in the following ways:
\begin{enumerate}
\item Use the \cmdarg{-tv} convenience argument when running
\cmd{mpirun} or \cmd{mpiexec} (this is the preferred method):
\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv [...other mpirun arguments...]
\end{lstlisting}
% Stupid emacs mode: $
For example:
\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv C my_mpi_program arg1 arg2 arg3
\end{lstlisting}
% Stupid emacs mode: $
\item Directly launch \cmd{mpirun} in TotalView (you {\em cannot}
launch \cmd{mpiexec} in TotalView):
\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ totalview mpirun -a [...mpirun arguments...]
\end{lstlisting}
% Stupid emacs mode: $
For example:
\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ totalview mpirun -a C my_mpi_program arg1 arg2 arg3
\end{lstlisting}
% Stupid emacs mode: $
Note the \cmdarg{-a} argument after \cmd{mpirun}. This is necessary
to tell TotalView that arguments following ``\cmdarg{-a}'' belong to
\cmd{mpirun} and not TotalView.
Also note that the \cmdarg{-tv} convenience argument to \cmd{mpirun}
simply executes ``\cmd{totalview mpirun -a ...}''; so both methods
are essentially identical.
\end{enumerate}
TotalView can either attach to all MPI processes in
\mpiconst{MPI\_\-COMM\_\-WORLD} or a subset of them. The controls for
``partial attach'' are in TotalView, not Open MPI. In TotalView 6.0.0
(analogous methods may work for earlier versions of TotalView -- see
the TotalView documentation for more details), you need to set the
parallel launch preference to ``ask.'' In the root window menu:
\begin{enumerate}
\item Select File $\rightarrow$ Preferences
\item Select the Parallel tab
\item In the ``When a job goes parallel'' box, select ``Ask what to do''
\item Click on OK
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Suggested Use}
Since TotalView support is started with the \cmd{mpirun} command,
TotalView will, by default, start by debugging \cmd{mpirun} itself.
While this may seem to be an annoying drawback, there are actually
good reasons for this:
\begin{itemize}
\item While debugging the parallel program, if you need to re-run the
program, you can simply re-run the application from within TotalView
itself. There is no need to exit the debugger to run your parallel
application again.
\item TotalView can be configured to automatically skip displaying the
\cmd{mpirun} code. Specifically, instead of displaying the
\cmd{mpirun} code and enabling it for debugging, TotalView will
recognize the command named \cmd{mpirun} and start executing it
immediately upon load. See below for details.
\end{itemize}
\noindent There are two ways to start debugging the MPI application:
\begin{enumerate}
\item The preferred method is to have a \ifile{\$HOME/.tvdrc} file
that tells TotalView to skip past the \cmd{mpirun} code and
automatically start the parallel program. Create or edit your
\ifile{\$HOME/.tvdrc} file to include the following:
\lstset{style=ompi-shell}
\begin{lstlisting}
# Set a variable to say what the MPI "starter" program is
set starter_program mpirun

# Check if the newly loaded image is the starter program
# and start it immediately if it is.
proc auto_run_starter {loaded_id} {
    global starter_program
    set executable_name [TV::symbol get $loaded_id full_pathname]
    set file_component [file tail $executable_name]
    if {[string compare $file_component $starter_program] == 0} {
        puts "Automatically starting $file_component"
        dgo
    }
}

# Append this function to TotalView's image load callbacks so that
# TotalView runs the starter program automatically.
dlappend TV::image_load_callbacks auto_run_starter
\end{lstlisting}
% Stupid emacs mode: $
Note that when using this method, \cmd{mpirun} is actually running in
the debugger while you are debugging your parallel application, even
though it may not be obvious. Hence, when the MPI job completes,
you'll be returned to viewing \cmd{mpirun} in the debugger. {\em This
is normal} -- all MPI processes have exited; the only process that
remains is \cmd{mpirun}. If you click ``Go'' again, \cmd{mpirun} will
launch the MPI job again.
\item Do not create the \file{\$HOME/.tvdrc} file with the ``auto
run'' functionality described in the previous item, but instead
simply click the ``go'' button when TotalView launches. This runs
the \cmd{mpirun} command with the command line arguments, which will
eventually launch the MPI programs and allow attachment to the MPI
processes.
\end{enumerate}
When TotalView initially attaches to an MPI process, you will see the
code for \mpifunc{MPI\_\-INIT} or one of its sub-functions (which will
likely be assembly code, unless Open MPI itself was compiled with debugging
information).
%
You probably want to skip past the rest of \mpifunc{MPI\_\-INIT}.  In
the Stack Trace window, click on the function that called
\mpifunc{MPI\_\-INIT} (e.g., \func{main}) and set a breakpoint on the
line following the call to \mpifunc{MPI\_\-INIT}.  Then click ``Go''.
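
For example, in a minimal sketch like the following, the breakpoint
would go on the line containing the first statement after
\mpifunc{MPI\_\-INIT} returns:

\begin{lstlisting}
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    /* Set the breakpoint on the next line -- the first statement
       after MPI_Init returns */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* ... rest of the application ... */
    MPI_Finalize();
    return 0;
}
\end{lstlisting}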
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Limitations}
The following limitations are currently imposed when debugging Open MPI
jobs in TotalView:
\begin{enumerate}
\item Cannot attach to scripts: You cannot attach TotalView to MPI
processes if they were launched by scripts instead of \cmd{mpirun}.
Specifically, the following won't work:
\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv C script_to_launch_foo
\end{lstlisting}
% Stupid emacs mode: $
But this will:
\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv C foo
\end{lstlisting}
% Stupid emacs mode: $
For the same reason, since \cmd{mpiexec} is a script, you cannot
launch \cmd{mpiexec} itself under TotalView, even though the
\cmdarg{-tv} switch works with \cmd{mpiexec} (because it eventually
invokes \cmd{mpirun}).
\item TotalView needs to launch the TotalView server on all remote
nodes in order to attach to remote processes.
The command that TotalView uses to launch remote executables might
be different than what Open MPI uses. You may have to set this
command explicitly and independently of Open MPI.
%
For example, if your local environment has \cmd{rsh} disabled and
only allows \cmd{ssh}, then you likely need to set the TotalView
remote server launch command to ``\cmd{ssh}''.  You can set this
internally in TotalView or with the \ienvvar{TVDSVRLAUNCHCMD}
environment variable (see the example after this list, and consult
the TotalView documentation for more information).
\item The TotalView license must be available on all nodes where you
  expect to attach the debugger.
Consult with your system administrator to ensure that this is set up
properly. You may need to edit your ``dot'' files (e.g.,
\file{.profile}, \file{.bashrc}, \file{.cshrc}, etc.) to ensure that
relevant environment variable settings exist on all nodes.
\item It is always a good idea to let \cmd{mpirun} finish before you
rerun or exit TotalView.
\item TotalView will not be able to attach to MPI programs when you
  execute \cmd{mpirun} with the \cmdarg{-s} option.

  This is because TotalView will not get the source code of your
  program on nodes other than the source node.  We advise you to
  either use a common filesystem or copy the source code and
  executable to all nodes when using TotalView with Open MPI, so
  that you can avoid the use of \cmd{mpirun}'s \cmdarg{-s} flag.
\end{enumerate}
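
For example, a sketch of setting the \ienvvar{TVDSVRLAUNCHCMD}
environment variable to ``\cmd{ssh}'' before invoking \cmd{mpirun}
(the shell prompts shown are illustrative):

\lstset{style=ompi-cmdline}
\begin{lstlisting}
# Bourne-style shells (e.g., sh, bash, ksh)
shell$ export TVDSVRLAUNCHCMD=ssh
# C-style shells (e.g., csh, tcsh)
shell$ setenv TVDSVRLAUNCHCMD ssh
\end{lstlisting}
% Stupid emacs mode: $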
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Message Queue Debugging}
The TotalView debugger can show the send, receive, and unexpected
message queues for many parallel applications.  Note the following:
\begin{itemize}
\item The MPI-2 function for naming communicators
(\mpifunc{MPI\_\-COMM\_\-SET\_\-NAME}) is strongly recommended when
using the message queue debugging functionality. For example,
\mpiconst{MPI\_\-COMM\_\-WORLD} and \mpiconst{MPI\_\-COMM\_\-SELF}
are automatically named by Open MPI. Naming communicators makes it
significantly easier to identify communicators of interest in the
debugger.
{\Huge JMS is this true?}
Any communicator that is not named will be displayed as ``{\tt
--unnamed--}''.
\item Message queue debugging of applications is not currently
  supported for 64-bit executables.  If you attempt to use the
  message queue debugging functionality on a 64-bit executable,
  TotalView will display a warning before disabling the message
  queue options.
\item Open MPI does not currently provide debugging support for
dynamic processes (e.g., \mpifunc{MPI\_\-COMM\_\-SPAWN}).
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Serial Debuggers}
\label{sec:debug-serial}
\index{serial debuggers}
\index{debuggers!serial}
Open MPI also allows the use of one or more serial debuggers when debugging
a parallel program.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Launching Debuggers}
\index{debuggers!launching}
Open MPI allows the arbitrary execution of any executable in an MPI
context as long as an MPI executable is eventually launched. For
example, it is common to \icmd{mpirun} a debugger (or a script that
launches a debugger on some nodes, and directly runs the application
on other nodes) since the debugger will eventually launch the MPI
process.
{\Huge JMS may need some minor revamping}
However, one must be careful when running programs on remote nodes
that expect the use of \file{stdin} -- \file{stdin} on remote nodes is
redirected to \file{/dev/null}.  For example, it is advantageous to
export the \ienvvar{DISPLAY} environment variable and run a shell
script that invokes an \cmd{xterm} with ``\cmd{gdb}'' (for example)
running in it on each node:
\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun C -x DISPLAY xterm-gdb.csh
\end{lstlisting}
% Stupid emacs mode: $
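
The \cmd{xterm-gdb.csh} script itself is not part of Open MPI; a
minimal sketch of what it might contain is shown below (it assumes a
\cmd{gdb} new enough to pass the program's arguments through):

\lstset{style=ompi-shell}
\begin{lstlisting}
#!/bin/csh -f

# Hypothetical xterm-gdb.csh: open an xterm on this node (using the
# DISPLAY exported by mpirun) and run the target program under gdb.
# Assumes a gdb recent enough to support the --args option.
xterm -e gdb --args $*
\end{lstlisting}
% Stupid emacs mode: $
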
Additionally, it may be desirable to only run the debugger on certain
ranks in \mcw. For example, with parallel jobs that include tens or
hundreds of MPI processes, it is really only feasible to attach
debuggers to a small number of processes. In this case, a script may
be helpful to launch debuggers for some ranks in \mcw and directly
launch the application in others.
{\Huge JMS needs revising}
The environment variable \ienvvar{OMPI\_\-COMM\_\-WORLD\_\-RANK},
which Open MPI places in the environment of each process before the
target application is executed, can be helpful in this situation.
Since it is set before the application starts, it is visible to shell
scripts as well as to the target MPI application.  It is erroneous to
alter the value of this variable.

Consider the following script:
\lstset{style=ompi-shell}
\begin{lstlisting}
#!/bin/csh -f

# Which debugger to run
set debugger=gdb

# On MPI_COMM_WORLD rank 0, launch the process in the debugger.
# Elsewhere, just launch the process directly.
if ("$OMPI_COMM_WORLD_RANK" == "0") then
    echo Launching $debugger on MPI_COMM_WORLD rank $OMPI_COMM_WORLD_RANK
    $debugger $*
else
    echo Launching MPI executable on MPI_COMM_WORLD rank $OMPI_COMM_WORLD_RANK
    $*
endif

# All done
exit 0
\end{lstlisting}
% Stupid emacs mode: $
This script can be executed via \cmd{mpirun} to launch a debugger on
\mpiconst{MPI\_\-COMM\_\-WORLD} rank 0, and directly launch the MPI
process in all other cases.
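
For example, if the script above were saved as \cmd{debug-rank0.csh}
(a hypothetical name) and made executable on all nodes, it could be
used as follows:

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun C debug-rank0.csh my_mpi_program arg1 arg2
\end{lstlisting}
% Stupid emacs mode: $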
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Attaching Debuggers}
\index{debuggers!attaching}
In some cases, it is not possible or desirable to start debugging a
parallel application immediately. For example, it may only be
desirable to attach to certain MPI processes whose identity may not be
known until run-time.
In this case, the technique of attaching to a running process can be
used (this functionality is supported by many serial debuggers).
Specifically, determine which MPI process you want to attach to. Then
login to the node where it is running, and use the debugger's
``attach'' functionality to latch on to the running process.
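
For example, using \cmd{gdb} as the serial debugger on a Linux-style
system, the steps might look like the following (the node name,
program name, and process ID shown are all hypothetical):

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ ssh node3
shell$ ps -C my_mpi_program -o pid,args
shell$ gdb -p 12345
\end{lstlisting}
% Stupid emacs mode: $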
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Memory-Checking Debuggers}
\label{sec:debug-mem}
\index{debuggers!memory-checking}
Memory-checking debuggers are an invaluable tool when debugging
software (even parallel software). They can provide detailed reports
about memory leaks, bad memory accesses, duplicate/bad memory
management calls, etc. Some memory-checking debuggers include (but
are not limited to): the Solaris Forte debugger (including the
\cmd{bcheck} command-line memory checker), the Purify software
package, and the Valgrind software package.

Open MPI can be used with memory-checking debuggers.  However, Open
MPI should be compiled with special support for such debuggers.  This
is because, in an attempt to optimize performance, many structures
used internally by Open MPI do not always have every memory location
initialized.  For example, Open MPI's internal \type{struct nmsg} is
one of the underlying message constructs used to pass data between
Open MPI processes.  But since the \type{struct nmsg} is used in so
many places, it is a generalized structure and contains fields that
are not used in every situation.

By default, Open MPI only initializes relevant struct members before
using a structure. Using a structure may involve sending the entire
structure (including uninitialized members) to a remote host. This is
not a problem for Open MPI; the remote host will also ignore the
irrelevant struct members (depending on the specific function being
invoked). More to the point -- Open MPI was designed this way to
avoid setting variables that will not be used; this is a slight
optimization in run-time performance. Memory-checking debuggers,
however, will flag this behavior with ``read from uninitialized''
warnings.

The \confflag{enable-mem-debug} option to Open MPI's \cmd{configure}
script forces Open MPI to zero out {\em all} memory before it is used.
This eliminates the ``read from uninitialized'' warnings that
memory-checking debuggers will otherwise report from deep inside Open
MPI.  This option can only be specified when Open MPI is configured;
it is not possible to enable or disable this behavior at run-time.
Since this option incurs a slight run-time performance penalty, it is
not the default.
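
For example, a sketch of configuring and building Open MPI with this
support enabled (all other \cmd{configure} arguments elided):

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ ./configure --enable-mem-debug [...other configure arguments...]
shell$ make all install
\end{lstlisting}
% Stupid emacs mode: $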
% Close out the debuggers index entry
\index{debuggers|)}