openmpi/orte/tools/orte-restart/orte-restart.1
Josh Hursey dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00

.\"
.\" Copyright (c) 2004-2007 The Trustees of Indiana University and Indiana
.\" University Research and Technology
.\" Corporation. All rights reserved.
.\"
.\" Man page for OMPI's ompi-restart command
.\"
.\" .TH name section center-footer left-footer center-header
.TH OMPI-RESTART 1 "March 2007" "Open MPI" "OPEN MPI COMMANDS"
.\" **************************
.\" Name Section
.\" **************************
.SH NAME
.
ompi-restart, orte-restart \- Restart a previously checkpointed parallel job
using the Open PAL Checkpoint/Restart Service (CRS)
.
.PP
.
\fBNOTE:\fP \fIompi-restart\fP and \fIorte-restart\fP are exact synonyms
for each other. Using either name results in identical behavior.
.
.\" **************************
.\" Synopsis Section
.\" **************************
.SH SYNOPSIS
.
.B ompi-restart
[ options ]
.B <GLOBAL SNAPSHOT HANDLE>
.
.\" **************************
.\" Options Section
.\" **************************
.SH OPTIONS
.
\fIompi-restart\fR attempts to restart a previously checkpointed parallel
job from the global snapshot handle reference returned by \fIompi-checkpoint\fP.
.
.TP 10
.B <GLOBAL SNAPSHOT HANDLE>
The global snapshot handle reference returned by \fIompi-checkpoint\fP, used to
restart the job. It must be the last argument to this command.
.
.
.TP
.B -h | --help
Display help for this command.
.
.
.TP
.B -p | --preload
Preload the checkpoint files on the remote systems before restarting the
application. Disabled by default.
.
.
.TP
.B --fork
Fork a new process in which the application is restarted. By default, the
restarted process replaces \fIompi-restart\fR.
.
.
.TP
.B -s | --seq
The sequence number of the checkpoint to restart from. By default, or when
-1 is given, the most recent sequence number is used.
.
.
.TP
.B -hostfile | --hostfile
The hostfile from which to restart the application. Useful in unscheduled
environments. Behaves identically to the --machinefile option.
.
.
.TP
.B -machinefile | --machinefile
The machinefile from which to restart the application. Useful in unscheduled
environments. Behaves identically to the --hostfile option.
.
.
.TP
.B -v | --verbose
Enable verbose output for debugging.
.
.
.TP
.B -gmca | --gmca \fR<key> <value>\fP
Pass global MCA parameters that are applicable to all contexts. \fI<key>\fP is
the parameter name; \fI<value>\fP is the parameter value.
.
.
.TP
.B -mca | --mca \fR<key> <value>\fP
Pass arguments to various MCA modules. \fI<key>\fP is the parameter name;
\fI<value>\fP is the parameter value.
.
.
.\" **************************
.\" Description Section
.\" **************************
.SH DESCRIPTION
.
.PP
\fIompi-restart\fR restarts a previously running parallel job from a global
snapshot. It can be invoked multiple times, as long as the invocations do not
overlap.
.
.
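.\" **************************
.\" Examples Section
.\" **************************
.SH EXAMPLES
.
.PP
The invocations below are illustrative sketches that use only the options
documented above. The snapshot handle \fIompi_global_snapshot_1234.ckpt\fP and
the hostfile name \fImy_hostfile\fP are hypothetical placeholders; substitute
the handle actually reported by \fIompi-checkpoint\fP and a hostfile that
exists on your system.
.
.PP
Restart from the most recent checkpoint in the snapshot:
.PP
.RS
.nf
ompi-restart ompi_global_snapshot_1234.ckpt
.fi
.RE
.
.PP
Restart from a specific sequence number with verbose output, assuming a
checkpoint with sequence number 2 exists in the snapshot:
.PP
.RS
.nf
ompi-restart -v -s 2 ompi_global_snapshot_1234.ckpt
.fi
.RE
.
.PP
Restart on the hosts listed in a hostfile, preloading the checkpoint files
on the remote systems first:
.PP
.RS
.nf
ompi-restart --preload --hostfile my_hostfile ompi_global_snapshot_1234.ckpt
.fi
.RE
.
.PP
In each case the global snapshot handle is given as the last argument, as
required.
.
.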
.\" **************************
.\" See Also Section
.\" **************************
.
.SH SEE ALSO
orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)
.