% -*- latex -*-
%
% Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
%                         University Research and Technology
%                         Corporation.  All rights reserved.
% Copyright (c) 2004-2005 The University of Tennessee and The University
%                         of Tennessee Research Foundation.  All rights
%                         reserved.
% Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
%                         University of Stuttgart.  All rights reserved.
% Copyright (c) 2004-2005 The Regents of the University of California.
%                         All rights reserved.
% $COPYRIGHT$
%
% Additional copyrights may follow
%
% $HEADER$
%

\chapter{Debugging Parallel Programs}
\label{sec:debug}
\label{sec:debugging}

\index{debuggers|(}

Open MPI supports multiple methods of debugging parallel programs.
The following notes and observations generally apply to debugging in
parallel:

\begin{itemize}
\item Most debuggers require that MPI applications be compiled with
  debugging support enabled.  This typically entails adding
  \cmdarg{-g} to the compile and link lines when building your MPI
  application.

\item Unless you specifically need to debug inside Open MPI itself,
  compiling Open MPI with \cmdarg{-g} is not recommended; leaving it
  out allows you to treat MPI function calls as atomic instructions.

\item Even when debugging in parallel, it is possible that not all
  MPI processes will execute exactly the same code.  For example,
  ``if'' statements that are based upon the rank of the calling
  process in a communicator, or upon other location-specific
  information, may cause different execution paths in each MPI
  process.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Naming MPI Objects}

Open MPI supports the MPI-2 functions {\sf
  MPI\_\-$<$type$>$\_\-SET\_\-NAME} and {\sf
  MPI\_\-$<$type$>$\_\-GET\_\-NAME}, where {\sf $<$type$>$} can be:
{\sf COMM}, {\sf WIN}, or {\sf TYPE}.  Hence, you can associate
relevant text names with communicators, windows, and datatypes
(e.g., ``6x13x12 molecule datatype'', ``Local group reduction
intracommunicator'', ``Spawned worker intercommunicator'').  The use
of these functions is strongly encouraged while debugging MPI
applications.  Since they are constant-time, one-time setup
functions, they are unlikely to impact performance and are generally
safe to use in production environments, too.

The rationale for using these functions is to allow Open MPI (and
supported debuggers, profilers, and other MPI diagnostic tools) to
display accurate information about MPI communicators, windows, and
datatypes.  For example, whenever a communicator name is available,
Open MPI will use it in relevant error messages; when names are not
available, communicators (and windows and types) are identified by
index number, which -- depending on the application -- may vary
between successive runs.  The TotalView parallel debugger will also
show communicator names (if available) when displaying the message
queues.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{TotalView Parallel Debugger}
\label{sec:debug-totalview}

TotalView is a commercial debugger from Etnus that supports
debugging MPI programs in parallel.  That is, with supported MPI
implementations, the TotalView debugger can automatically attach to
one or more MPI processes in a parallel application.
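As a brief illustration of the naming functions described in the
previous section (and of the kind of names that TotalView can
display), the following minimal C sketch duplicates
\mpiconst{MPI\_\-COMM\_\-WORLD} and names the duplicate; the
specific name used here is only an example, and the call to
\mpifunc{MPI\_\-COMM\_\-DUP} is purely illustrative:

\lstset{language=C}
\begin{lstlisting}
#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Comm work_comm;

    MPI_Init(&argc, &argv);

    /* Duplicate MPI_COMM_WORLD and give the duplicate a descriptive
       name; debuggers and Open MPI error messages can then refer to
       it by name instead of by index number. */
    MPI_Comm_dup(MPI_COMM_WORLD, &work_comm);
    MPI_Comm_set_name(work_comm, "Local group reduction intracommunicator");

    /* ... application code using work_comm ... */

    MPI_Comm_free(&work_comm);
    MPI_Finalize();
    return 0;
}
\end{lstlisting}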
Open MPI now supports basic debugging functionality with the
TotalView debugger.  Specifically, Open MPI supports TotalView
attaching to one or more MPI processes, as well as viewing the MPI
message queues in supported modules.

This section provides some general tips and suggested use of
TotalView with Open MPI.  It is {\em not} intended to replace the
TotalView documentation in any way.  {\bf Be sure to consult the
TotalView documentation for more information and details than are
provided here.}

Note: TotalView is a licensed product provided by Etnus.  You need
to have TotalView installed properly before you can use it with Open
MPI.\footnote{Refer to \url{http://www.etnus.com/} for more
information about TotalView.}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Attaching TotalView to MPI Processes}
\index{TotalView parallel debugger}
\index{debuggers!TotalView}

Open MPI does not need to be configured or compiled in any special
way to allow TotalView to attach to MPI processes.

You can attach TotalView to MPI processes started by \icmd{mpirun} /
\icmd{mpiexec} in the following ways:

\begin{enumerate}
\item Use the \cmdarg{-tv} convenience argument when running
  \cmd{mpirun} or \cmd{mpiexec} (this is the preferred method):

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv [...other mpirun arguments...]
\end{lstlisting}
% Stupid emacs mode: $

  For example:

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv C my_mpi_program arg1 arg2 arg3
\end{lstlisting}
% Stupid emacs mode: $

\item Directly launch \cmd{mpirun} in TotalView (you {\em cannot}
  launch \cmd{mpiexec} in TotalView):

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ totalview mpirun -a [...mpirun arguments...]
\end{lstlisting}
% Stupid emacs mode: $

  For example:

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ totalview mpirun -a C my_mpi_program arg1 arg2 arg3
\end{lstlisting}
% Stupid emacs mode: $

  Note the \cmdarg{-a} argument after \cmd{mpirun}.  This is
  necessary to tell TotalView that arguments following
  ``\cmdarg{-a}'' belong to \cmd{mpirun} and not TotalView.

  Also note that the \cmdarg{-tv} convenience argument to
  \cmd{mpirun} simply executes ``\cmd{totalview mpirun -a ...}'', so
  both methods are essentially identical.
\end{enumerate}

TotalView can either attach to all MPI processes in
\mpiconst{MPI\_\-COMM\_\-WORLD} or to a subset of them.  The
controls for ``partial attach'' are in TotalView, not Open MPI.  In
TotalView 6.0.0 (analogous methods may work for earlier versions of
TotalView -- see the TotalView documentation for more details), you
need to set the parallel launch preference to ``ask.''  In the root
window menu:

\begin{enumerate}
\item Select File $\rightarrow$ Preferences

\item Select the Parallel tab

\item In the ``When a job goes parallel'' box, select ``Ask what to
  do''

\item Click on OK
\end{enumerate}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Suggested Use}

Since TotalView support is started with the \cmd{mpirun} command,
TotalView will, by default, start by debugging \cmd{mpirun} itself.
While this may seem to be an annoying drawback, there are actually
good reasons for this:

\begin{itemize}
\item While debugging the parallel program, if you need to re-run
  the program, you can simply re-run the application from within
  TotalView itself.  There is no need to exit the debugger to run
  your parallel application again.
\item TotalView can be configured to automatically skip displaying
  the \cmd{mpirun} code.  Specifically, instead of displaying the
  \cmd{mpirun} code and enabling it for debugging, TotalView will
  recognize the command named \cmd{mpirun} and start executing it
  immediately upon load.  See below for details.
\end{itemize}

\noindent There are two ways to start debugging the MPI application:

\begin{enumerate}
\item The preferred method is to have a \ifile{\$HOME/.tvdrc} file
  that tells TotalView to skip past the \cmd{mpirun} code and
  automatically start the parallel program.  Create or edit your
  \ifile{\$HOME/.tvdrc} file to include the following:

\lstset{style=ompi-shell}
\begin{lstlisting}
# Set a variable to say what the MPI "starter" program is
set starter_program mpirun

# Check if the newly loaded image is the starter program
# and start it immediately if it is.
proc auto_run_starter {loaded_id} {
    global starter_program
    set executable_name [TV::symbol get $loaded_id full_pathname]
    set file_component [file tail $executable_name]

    if {[string compare $file_component $starter_program] == 0} {
        puts "Automatically starting $file_component"
        dgo
    }
}

# Append this function to TotalView's image load callbacks so that
# TotalView runs this program automatically.
dlappend TV::image_load_callbacks auto_run_starter
\end{lstlisting}
% Stupid emacs mode: $

  Note that when using this method, \cmd{mpirun} is actually running
  in the debugger while you are debugging your parallel application,
  even though it may not be obvious.  Hence, when the MPI job
  completes, you'll be returned to viewing \cmd{mpirun} in the
  debugger.  {\em This is normal} -- all MPI processes have exited;
  the only process that remains is \cmd{mpirun}.  If you click
  ``Go'' again, \cmd{mpirun} will launch the MPI job again.

\item Do not create the \file{\$HOME/.tvdrc} file with the ``auto
  run'' functionality described in the previous item, but instead
  simply click the ``Go'' button when TotalView launches.  This runs
  the \cmd{mpirun} command with its command-line arguments, which
  will eventually launch the MPI programs and allow attachment to
  the MPI processes.
\end{enumerate}

When TotalView initially attaches to an MPI process, you will see
the code for \mpifunc{MPI\_\-INIT} or one of its sub-functions
(which will likely be assembly code, unless Open MPI itself was
compiled with debugging information).
%
You probably want to skip past the rest of \mpifunc{MPI\_\-INIT}.
In the Stack Trace window, click on the function that called
\mpifunc{MPI\_\-INIT} (e.g., \func{main}) and set a breakpoint on
the line following the call to \mpifunc{MPI\_\-INIT}.  Then click
``Go''.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Limitations}

The following limitations are currently imposed when debugging Open
MPI jobs in TotalView:

\begin{enumerate}
\item Cannot attach to scripts: You cannot attach TotalView to MPI
  processes if they were launched by scripts instead of
  \cmd{mpirun}.  Specifically, the following won't work:

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv C script_to_launch_foo
\end{lstlisting}
% Stupid emacs mode: $

  But this will:

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun -tv C foo
\end{lstlisting}
% Stupid emacs mode: $

  For the same reason, since \cmd{mpiexec} is a script, you cannot
  launch \cmd{mpiexec} itself under TotalView, although the
  \cmdarg{-tv} switch does work with \cmd{mpiexec} (because it
  eventually invokes \cmd{mpirun}).
\item TotalView needs to launch the TotalView server on all remote
  nodes in order to attach to remote processes.

  The command that TotalView uses to launch remote executables may
  be different from the one that Open MPI uses.  You may have to set
  this command explicitly and independently of Open MPI.
  %
  For example, if your local environment has \cmd{rsh} disabled and
  only allows \cmd{ssh}, then you likely need to set the TotalView
  remote server launch command to ``\cmd{ssh}''.  You can set this
  internally in TotalView or with the \ienvvar{TVDSVRLAUNCHCMD}
  environment variable (see the TotalView documentation for more
  information on this).

\item The TotalView license must be available on all nodes where you
  expect to attach the debugger.

  Consult with your system administrator to ensure that this is set
  up properly.  You may need to edit your ``dot'' files (e.g.,
  \file{.profile}, \file{.bashrc}, \file{.cshrc}, etc.) to ensure
  that relevant environment variable settings exist on all nodes.

\item It is always a good idea to let \cmd{mpirun} finish before you
  rerun or exit TotalView.

\item TotalView will not be able to attach to MPI programs when you
  execute \cmd{mpirun} with the \cmdarg{-s} option.

  This is because TotalView will not get the source code of your
  program on nodes other than the source node.  We advise you to
  either use a common filesystem or copy the source code and
  executable to all nodes when using TotalView with Open MPI, so
  that you can avoid the use of \cmd{mpirun}'s \cmdarg{-s} flag.
\end{enumerate}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Message Queue Debugging}

The TotalView debugger can show the sending, receiving, and
unexpected message queues for many parallel applications.  Note the
following:

\begin{itemize}
\item The MPI-2 function for naming communicators
  (\mpifunc{MPI\_\-COMM\_\-SET\_\-NAME}) is strongly recommended
  when using the message queue debugging functionality.  For
  example, \mpiconst{MPI\_\-COMM\_\-WORLD} and
  \mpiconst{MPI\_\-COMM\_\-SELF} are automatically named by Open
  MPI.  Naming communicators makes it significantly easier to
  identify communicators of interest in the debugger.

  Any communicator that is not named will be displayed as
  ``{\tt --unnamed--}''.

\item Message queue debugging of applications is not currently
  supported for 64-bit executables.  If you attempt to use the
  message queue debugging functionality on a 64-bit executable,
  TotalView will display a warning before disabling the message
  queue options.

\item Open MPI does not currently provide debugging support for
  dynamic processes (e.g., \mpifunc{MPI\_\-COMM\_\-SPAWN}).
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Serial Debuggers}
\label{sec:debug-serial}
\index{serial debuggers}
\index{debuggers!serial}

Open MPI also allows the use of one or more serial debuggers when
debugging a parallel program.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Launching Debuggers}
\index{debuggers!launching}

Open MPI allows the arbitrary execution of any executable in an MPI
context as long as an MPI executable is eventually launched.  For
example, it is common to \icmd{mpirun} a debugger (or a script that
launches a debugger on some nodes, and directly runs the application
on other nodes) since the debugger will eventually launch the MPI
process.
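When deciding which processes to debug (or, as described later in
this section, which processes to attach a debugger to), it can be
helpful if the application itself reports where each of its
processes is running.  The following is a minimal C sketch of that
idea; the output format is arbitrary and nothing here is required by
Open MPI:

\lstset{language=C}
\begin{lstlisting}
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;
    char hostname[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    gethostname(hostname, sizeof(hostname));

    /* Report where this process is running so that a debugger can
       be pointed (or attached) to the right process on the right
       node. */
    printf("MPI_COMM_WORLD rank %d is PID %d on host %s\n",
           rank, (int) getpid(), hostname);
    fflush(stdout);

    /* ... application code ... */

    MPI_Finalize();
    return 0;
}
\end{lstlisting}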
Note, however, that one must be careful when running programs on
remote nodes that expect the use of \file{stdin} -- \file{stdin} on
remote nodes is redirected to \file{/dev/null}.  Since an
interactive debugger such as \cmd{gdb} needs \file{stdin}, a common
approach is to export the \ienvvar{DISPLAY} environment variable and
run a shell script that invokes an \cmd{xterm} with ``\cmd{gdb}''
(for example) running in it on each node.  For example:

\lstset{style=ompi-cmdline}
\begin{lstlisting}
shell$ mpirun C -x DISPLAY xterm-gdb.csh
\end{lstlisting}
% Stupid emacs mode: $

Additionally, it may be desirable to only run the debugger on
certain ranks in \mcw.  For example, with parallel jobs that include
tens or hundreds of MPI processes, it is only feasible to attach
debuggers to a small number of processes.  In this case, a script
may be helpful to launch debuggers for some ranks in \mcw and to
directly launch the application in others.

The environment variable \ienvvar{OMPI\_\-COMM\_\-WORLD\_\-RANK} can
be helpful in this situation.  Open MPI places this variable in the
environment before the target application is executed.  Hence, it is
visible to shell scripts as well as to the target MPI application.
It is erroneous to alter the value of this variable.

Consider the following script:

\lstset{style=ompi-shell}
\begin{lstlisting}
#!/bin/csh -f

# Which debugger to run
set debugger=gdb

# On MPI_COMM_WORLD rank 0, launch the process in the debugger.
# Elsewhere, just launch the process directly.
if ("$OMPI_COMM_WORLD_RANK" == "0") then
    echo Launching $debugger on MPI_COMM_WORLD rank $OMPI_COMM_WORLD_RANK
    $debugger $*
else
    echo Launching MPI executable on MPI_COMM_WORLD rank $OMPI_COMM_WORLD_RANK
    $*
endif

# All done
exit 0
\end{lstlisting}
% Stupid emacs mode: $

This script can be executed via \cmd{mpirun} to launch a debugger on
\mpiconst{MPI\_\-COMM\_\-WORLD} rank 0, and to directly launch the
MPI process in all other cases.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\subsection{Attaching Debuggers}
\index{debuggers!attaching}

In some cases, it is not possible or desirable to start debugging a
parallel application immediately.  For example, it may only be
desirable to attach to certain MPI processes whose identity may not
be known until run-time.

In this case, the technique of attaching to a running process can be
used (this functionality is supported by many serial debuggers).
Specifically, determine which MPI process you want to attach to.
Then login to the node where it is running, and use the debugger's
``attach'' functionality to latch on to the running process.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Memory-Checking Debuggers}
\label{sec:debug-mem}
\index{debuggers!memory-checking}

Memory-checking debuggers are an invaluable tool when debugging
software (even parallel software).  They can provide detailed
reports about memory leaks, bad memory accesses, duplicate/bad
memory management calls, etc.  Some memory-checking debuggers
include (but are not limited to): the Solaris Forte debugger
(including the \cmd{bcheck} command-line memory checker), the Purify
software package, and the Valgrind software package.

Open MPI can be used with memory-checking debuggers.  However, Open
MPI should be compiled with special support for such debuggers.
This is because, in an attempt to optimize performance, many
structures used internally by Open MPI do not always have all of
their memory initialized.  For example, consider one of Open MPI's
internal message structures used to pass data between processes:
because it is used in so many places, it is a generalized structure
and contains fields that are not used in every situation.

By default, Open MPI only initializes the relevant struct members
before using a structure.  Using a structure may involve sending the
entire structure (including uninitialized members) to a remote host.
This is not a problem for Open MPI; the remote host will also ignore
the irrelevant struct members (depending on the specific function
being invoked).  More to the point -- Open MPI was designed this way
to avoid setting variables that will not be used; this is a slight
optimization in run-time performance.  Memory-checking debuggers,
however, will flag this behavior with ``read from uninitialized''
warnings.

The \confflag{enable-mem-debug} option to Open MPI's \cmd{configure}
script forces Open MPI to zero out {\em all} memory before it is
used.  This eliminates the ``read from uninitialized'' warnings that
memory-checking debuggers would otherwise report from deep inside
Open MPI.  This option can only be specified when Open MPI is
configured; it is not possible to enable or disable this behavior at
run-time.  Since this option incurs a slight run-time performance
penalty, it is not enabled by default.

% Close out the debuggers index entry
\index{debuggers|)}
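To illustrate the kind of warning described above, the following
stand-alone C sketch (this is not Open MPI code; the structure and
field names are purely hypothetical) writes a partially initialized
structure to a file descriptor.  A memory-checking debugger such as
Valgrind will flag the write because some of the bytes being sent
were never initialized, even though the reader never examines those
fields:

\lstset{language=C}
\begin{lstlisting}
#include <unistd.h>

/* A hypothetical, generalized message header; only some fields are
   relevant for any given message type. */
struct msg_header {
    int type;
    int src;
    int tag;
    int payload_len;
    int flags;        /* not used for this message type */
    int reserved[4];  /* never touched */
};

int main(void)
{
    struct msg_header hdr;   /* deliberately not zeroed out */

    hdr.type        = 1;
    hdr.src         = 0;
    hdr.tag         = 42;
    hdr.payload_len = 0;
    /* hdr.flags and hdr.reserved are intentionally left
       uninitialized, mimicking the optimization described above. */

    /* Sending the whole structure also sends the uninitialized
       bytes; a memory-checking debugger will report that the write
       buffer contains uninitialized data. */
    write(STDOUT_FILENO, &hdr, sizeof(hdr));

    return 0;
}
\end{lstlisting}

Configuring Open MPI with \confflag{enable-mem-debug} silences such
warnings when they originate deep inside the Open MPI library;
warnings that originate in application code, as in this sketch, must
still be fixed in the application itself.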