This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158. More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912. The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
.\"
.\" Copyright (c) 2004-2007 The Trustees of Indiana University and Indiana
.\" University Research and Technology
.\" Corporation. All rights reserved.
.\"
.\" Man page for OMPI's ompi-restart command
.\"
.\" .TH name section center-footer left-footer center-header
.TH OMPI-RESTART 1 "March 2007" "Open MPI" "OPEN MPI COMMANDS"
.\" **************************
.\" Name Section
.\" **************************
.SH NAME
.
ompi-restart, orte-restart \- Restart a previously checkpointed parallel job
using the Open PAL Checkpoint/Restart Service (CRS)
.
.PP
.
\fBNOTE:\fP \fIompi-restart\fP and \fIorte-restart\fP are exact synonyms
for each other. Using either name results in identical behavior.
.
.\" **************************
.\" Synopsis Section
.\" **************************
.SH SYNOPSIS
.
.B ompi-restart
[ options ]
.B <GLOBAL SNAPSHOT HANDLE>
.
.\" **************************
.\" Options Section
.\" **************************
.SH OPTIONS
.
\fIompi-restart\fR attempts to restart a previously checkpointed parallel
job from the global snapshot handle reference returned by \fIompi-checkpoint\fP.
.
.TP 10
.B <GLOBAL SNAPSHOT HANDLE>
The global snapshot handle reference returned by \fIompi-checkpoint\fP, used to
restart the job. This must be the last argument to this command.
.
.
.TP
.B -h | --help
Display help for this command.
.
.
.TP
.B -p | --preload
Preload the checkpoint files on the remote systems before restarting the
application. Disabled by default.
.
.
.TP
.B --fork
Fork off a new process, which becomes the restarted process. By default, the
restarted process replaces \fIompi-restart\fR.
.
.
.TP
.B -s | --seq
The sequence number of the checkpoint to restart from. By default, the most
recent sequence number is used (specified by -1).
.
.
.TP
.B -hostfile | --hostfile
The hostfile from which to restart the application. Useful in unscheduled
environments. (Same behavior as the --machinefile option.)
.
.
.TP
.B -machinefile | --machinefile
The machinefile from which to restart the application. Useful in unscheduled
environments. (Same behavior as the --hostfile option.)
.
.
.TP
.B -v | --verbose
Enable verbose output for debugging.
.
.
.TP
.B -gmca | --gmca \fR<key> <value>\fP
Pass global MCA parameters that are applicable to all contexts. \fI<key>\fP is
the parameter name; \fI<value>\fP is the parameter value.
.
.
.TP
.B -mca | --mca \fR<key> <value>\fP
Send arguments to various MCA modules. \fI<key>\fP is the parameter name;
\fI<value>\fP is the parameter value.
.
.
.\" **************************
.\" Description Section
.\" **************************
.SH DESCRIPTION
.
.PP
\fIompi-restart\fR can be invoked multiple times, provided the invocations do
not overlap. This allows the user to restart a previously running parallel job
from its checkpoint.
.
.
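.\" **************************
.\" Examples Section
.\" **************************
.SH EXAMPLES
.
.PP
The invocations below are illustrative sketches only, built from the options
listed above. The handle \fIompi_global_snapshot_1234.ckpt\fR and the file
\fImy_hostfile\fR are hypothetical names; substitute the global snapshot handle
reference actually returned by \fIompi-checkpoint\fR. The examples also assume
that \fB--seq\fR takes the sequence number and \fB--hostfile\fR takes a file
name as the following argument.
.
.PP
Restart from the most recent checkpoint of the job:
.
.PP
.nf
    ompi-restart ompi_global_snapshot_1234.ckpt
.fi
.
.PP
Restart from a specific checkpoint, assuming one with sequence number 1 exists:
.
.PP
.nf
    ompi-restart -s 1 ompi_global_snapshot_1234.ckpt
.fi
.
.PP
Restart on the hosts listed in a hostfile, with verbose output:
.
.PP
.nf
    ompi-restart -v --hostfile my_hostfile ompi_global_snapshot_1234.ckpt
.fi
.
.PP
In each case the global snapshot handle is the last argument, as required.
.
.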
.\" **************************
.\" See Also Section
.\" **************************
.
.SH SEE ALSO
orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)
.