Removed all of the LAM stuff.
This needs to be gone over a few more times before it is allowed to see
daylight, but has come a long way.  Some sections may be off more than a
little, but the general idea is there.  Need to audit to make sure we don't
call the ORTE VHNP's daemons :)

This commit was SVN r9078.
This commit is contained in:
parent 2938545220
commit 02c999776b
@@ -218,665 +218,277 @@ line if an application schema is specified.
.\" Description Section
.\" **************************
.SH DESCRIPTION
.
One invocation of \fImpirun\fP starts an MPI application running under Open
MPI.  If the application is simply SPMD, the application can be specified on
the \fImpirun\fP command line.
If the application is MIMD, comprising multiple programs, an application
schema is required in a separate file.
See appschema(5) for a description of the application schema syntax.
It essentially contains multiple \fImpirun\fP command lines, less the command
name itself.  The ability to specify different options for different
instantiations of a program is another reason to use an application schema.
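.PP
For instance, a simple SPMD application could be started with a single
command line (using the same \fIa.out\fP as in the examples below):
.sp
.RS
\fBshell$\fP mpirun -np 4 a.out
.RE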
.
.
.
.SS Location Nomenclature
.
As described above, \fImpirun\fP can specify arbitrary locations in the current
Open MPI universe.
Locations can be specified either by CPU or by node.
.B Note:
Open MPI does not bind processes to CPUs -- specifying a location "by CPU" is
really a convenience mechanism for SMPs that ultimately maps down to a specific
node.
.PP
Specifying locations by node will launch one copy of an executable per
specified node.
Using the \fI--bynode\fP option tells Open MPI to use all available nodes.
Using the \fI--byslot\fP option tells Open MPI to use all slots on an available
node before allocating resources on the next available node.
For example:
.
.TP 4
mpirun --bynode -np 4 a.out
Runs one copy of the executable
.I a.out
on all available nodes in the Open MPI universe.  MPI_COMM_WORLD rank 0
will be on node0, rank 1 will be on node1, etc., regardless of how many slots
are available on each of the nodes.
.
.
.TP
mpirun --byslot -np 4 a.out
Runs one copy of the executable
.I a.out
on each slot on a given node before running the executable on other available
nodes.
.
.
.
.SS Application Schema or Executable Program?
.
To distinguish the two different forms, \fImpirun\fP
looks on the command line for the \fI--app\fP option.  If
it is specified, then the file named on the command line is
assumed to be an application schema.  If it is not
specified, then the file is assumed to be an executable program.
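.PP
As an illustration (the file name and program names here are hypothetical),
a schema describing one "manager" program and four "worker" programs would
contain one \fImpirun\fP command line per program, less the command name
itself:
.sp
.RS
.nf
-np 1 manager
-np 4 worker
.fi
.RE
.sp
Such a file could then be launched with:
.sp
.RS
\fBshell$\fP mpirun --app my_appfile
.RE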
.
.
.
.SS Locating Files
.
Open MPI looks for an executable program by searching the directories in
the user's PATH environment variable as defined on the source node(s).
This behavior is consistent with logging into the source node and
executing the program from the shell.  On remote nodes, the "." path
is the home directory.
.PP
Open MPI looks for an application schema in the local directory.
.
.
.
.SS Standard I/O
.
Open MPI directs UNIX standard input to /dev/null on all remote nodes.  On
the local node that invoked \fImpirun\fP, standard input is inherited from
\fImpirun\fP.
.PP
Open MPI directs UNIX standard output and error to the Open RTE daemon on all
remote nodes.  Open MPI ships all captured output/error to the node that
invoked \fImpirun\fP and prints it on the standard output/error of
\fImpirun\fP.
Local processes inherit the standard output/error of \fImpirun\fP and transfer
to it directly.
.PP
Thus it is possible to redirect standard I/O for Open MPI applications by
using the typical shell redirection procedure on \fImpirun\fP.
.sp
.RS
\fBshell$\fP mpirun -np 2 my_app < my_input > my_output
.RE
.sp
Note that in this example \fIonly\fP the local node (i.e., the node where
mpirun was invoked from) will receive the stream from \fImy_input\fP on
stdin.  The stdin on all the other nodes will be tied to /dev/null.
However, the stdout from all nodes will be collected into the
\fImy_output\fP file.
.
.
.
.SS Process Termination / Signal Handling
.
During the run of an MPI application, if any rank dies abnormally
(either exiting before invoking \fIMPI_FINALIZE\fP, or dying as the result of a
signal), \fImpirun\fP will print out an error message and kill the rest of the
MPI application.
.PP
By default, Open MPI only installs a signal handler for one signal in
user programs (SIGUSR2).  Therefore, it is safe for users to install
their own signal handlers in Open MPI programs.
.PP
User signal handlers should probably avoid trying to clean up MPI state
(Open MPI is, currently, neither thread-safe nor async-signal-safe).
For example, if a seg fault occurs in \fIMPI_SEND\fP (perhaps because a bad
buffer was passed in) and a user signal handler is invoked, if this user
handler attempts to invoke \fIMPI_FINALIZE\fP, Bad Things could happen since
Open MPI was already "in" MPI when the error occurred.  Since \fImpirun\fP
will notice that the process died due to a signal, it is not necessary for
the user to clean up MPI state; it is safest to clean up only non-MPI state.
.
.
.
.SS Current Working Directory
.
The default behavior of mpirun has changed with respect to the
directory that processes will be started in.
.PP
The \fI\-wd\fP option to mpirun allows the user to change to an arbitrary
directory before their program is invoked.  It can also be used in application
schema files to specify working directories on specific nodes and/or
for specific applications.
.PP
If the \fI\-wd\fP option appears both in a schema file and on the command line,
the schema file directory will override the command line value.
.PP
The \fI\-D\fP option will change the current working directory to the directory
where the executable resides.  It cannot be used in application schema files.
.PP
If \fI\-wd\fP is not specified, the local node will send the directory name
where mpirun was invoked from to each of the remote nodes.  The remote nodes
will then try to change to that directory.  If they fail (e.g., if the
directory does not exist on that node), they will start from the
user's home directory.
.PP
All directory changing occurs before the user's program is invoked; it
does not wait until \fIMPI_INIT\fP is called.
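.PP
For example, to start each process in a scratch directory (the directory
name here is only an illustration):
.sp
.RS
\fBshell$\fP mpirun -wd /work/output -np 2 a.out
.RE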
.
.
.
.SS Process Environment
.
Processes in the MPI application inherit their environment from the
Open RTE daemon upon the node on which they are running.  The environment
is typically inherited from the user's shell.  On remote nodes, the exact
environment is determined by the boot MCA module used.  The rsh boot module,
for example, uses either rsh or ssh to launch the Open RTE daemon on remote
nodes, and typically executes one or more of the user's shell-setup files
before launching the Open RTE daemon.  When running dynamically linked
applications which require the LD_LIBRARY_PATH environment variable to be
set, care must be taken to ensure that it is correctly set when booting
Open MPI.
.
.
.
.SS Exported Environment Variables
.
All environment variables that are named in the form OMPI_* will automatically
be exported to new processes on the local and remote nodes.
The \fI\-x\fP option to \fImpirun\fP can be used to export specific environment
variables to the new processes.  While the syntax of the \fI\-x\fP
option allows the definition of new variables, note that the parser
for this option is currently not very sophisticated -- it does not even
understand quoted values.  Users are advised to set variables in the
environment and use \fI\-x\fP to export them, not to define them.
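.PP
For example, setting a variable in the environment and then exporting it by
name (the value shown is only an illustration):
.sp
.RS
.nf
\fBshell$\fP export DISPLAY=my_workstation:0
\fBshell$\fP mpirun -x DISPLAY -np 2 a.out
.fi
.RE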
.
.
.
.SS MCA (Modular Component Architecture)
The
.I -mca
switch allows the passing of parameters to various MCA modules.
.\" Open MPI's MCA modules are described in detail in ompimca(7).
MCA modules have direct impact on MPI programs because they allow tunable
parameters to be set at run time (such as which BTL communication device driver
to use, what parameters to pass to that BTL, etc.).
.PP
The \fI-mca\fP switch takes two arguments: \fI<key>\fP and \fI<value>\fP.
The \fI<key>\fP argument generally specifies which MCA module will receive the
value.  For example, the \fI<key>\fP "btl" is used to select which BTL is to
be used for transporting MPI messages.  The \fI<value>\fP argument is the
value that is passed.
For example:
.
.TP 4
mpirun -mca btl tcp,self -np 1 foo
Tells Open MPI to use the "tcp" and "self" BTLs, and to run a single copy of
"foo" on an allocated node.
.
.TP
mpirun -mca btl self -np 1 foo
Tells Open MPI to use the "self" BTL, and to run a single copy of
"foo" on an allocated node.
.\" And so on. Open MPI's BTL MCA modules are described in lamssi_rpi(7).
.PP
The \fI-mca\fP switch can be used multiple times to specify different
\fI<key>\fP and/or \fI<value>\fP arguments.  If the same \fI<key>\fP is
specified more than once, the \fI<value>\fPs are concatenated with a comma
(",") separating them.
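.PP
For example, the following two command lines are equivalent:
.sp
.RS
.nf
\fBshell$\fP mpirun -mca btl tcp -mca btl self -np 1 foo
\fBshell$\fP mpirun -mca btl tcp,self -np 1 foo
.fi
.RE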
.PP
.B Note:
The \fI-mca\fP switch is simply a shortcut for setting environment variables.
The same effect may be accomplished by setting corresponding environment
variables before running \fImpirun\fP.
The form of the environment variables that Open MPI sets is:
.sp
.RS
OMPI_MCA_<key>=<value>
.RE
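.PP
For example, assuming a Bourne-style shell, the BTL selection shown above
could also be expressed by setting the corresponding environment variable:
.sp
.RS
.nf
\fBshell$\fP OMPI_MCA_btl=tcp,self
\fBshell$\fP export OMPI_MCA_btl
\fBshell$\fP mpirun -np 1 foo
.fi
.RE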
.PP
Note that the \fI-mca\fP switch overrides any previously set environment
variables.  Also note that unknown \fI<key>\fP arguments are still set as
environment variables -- they are not checked (by \fImpirun\fP) for
correctness.  Illegal or incorrect \fI<value>\fP arguments may or may not be
reported -- it depends on the specific MCA module.
.
.\" **************************
.\" Examples Section
.\" **************************
.SH EXAMPLES
Be sure to also see the examples in the "Location Nomenclature" section, above.
.
.TP 4
mpirun -np 1 prog1
Load and execute prog1 on one node.  Search the user's $PATH for the
executable file on that node.
.
.
.TP
mpirun -np 8 --byslot prog1
Run 8 copies of prog1 wherever Open MPI wants to run them.
.
.
.TP
mpirun -np 4 -mca btl ib,tcp,self prog1
Run 4 copies of prog1 using the "ib", "tcp", and "self" BTLs for the transport
of MPI messages.
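.
.
.TP
mpirun -wd /work/output -x DISPLAY -np 4 my_application
Run 4 copies of my_application (the program name is a placeholder), changing
to the /work/output directory and exporting the DISPLAY environment variable
before each process is started (perhaps my_application will invoke an X
application to display output).  This simply combines the \fI\-wd\fP and
\fI\-x\fP options described above.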
.
.\" **************************
.\" Diagnostics Section
.\" **************************
.
.\" .SH DIAGNOSTICS
.\".TP 4
.\"Error Msg:
.\"Description
.
.\" **************************
|
.\" **************************
|
||||||
.\" Return Value Section
|
.\" Return Value Section
|
||||||
.\" **************************
|
.\" **************************
|
||||||
.
|
.
|
||||||
.SH RETURN VALUE
|
.SH RETURN VALUE
|
||||||
.I mpirun
|
.
|
||||||
returns 0 if all ranks started by
|
\fImpirun\fP returns 0 if all ranks started by \fImpirun\fP exit after calling
|
||||||
.I mpirun
|
MPI_FINALIZE. A non-zero value is returned if an internal error occurred in
|
||||||
exit after calling MPI_FINALIZE. A non-zero value is returned if an
|
mpirun, or one or more ranks exited before calling MPI_FINALIZE. If an
|
||||||
internal error occurred in mpirun, or one or more ranks exited before
|
internal error occurred in mpirun, the corresponding error code is returned.
|
||||||
calling MPI_FINALIZE. If an internal error occurred in mpirun, the
|
In the event that one or more ranks exit before calling MPI_FINALIZE, the
|
||||||
corresponding error code is returned. In the event that one or more ranks
|
return value of the rank of the process that \fImpirun\fP first notices died
|
||||||
exit before calling MPI_FINALIZE, the return value of the rank of the
|
before calling MPI_FINALIZE will be returned. Note that, in general, this will
|
||||||
process that
|
be the first rank that died but is not guaranteed to be so.
|
||||||
.I mpirun
|
|
||||||
first notices died before calling MPI_FINALIZE will be returned. Note
|
|
||||||
that, in general, this will be the first rank that died but is not
|
|
||||||
guaranteed to be so.
|
|
||||||
.PP
|
.PP
|
||||||
However, note that if the
|
However, note that if the \fI-nw\fP switch is used, the return value from
|
||||||
.I \-nw
|
mpirun does not indicate the exit status of the ranks.
|
||||||
switch is used, the return value from mpirun does not indicate the exit status
|
|
||||||
of the ranks.
|
|
||||||
.
.\" **************************
.\" See Also Section
.\" **************************
.
.SH SEE ALSO
orted(1)