e12ca48cd9
http://www.open-mpi.org/community/lists/devel/2010/07/8240.php

Documentation:
  http://osl.iu.edu/research/ft/

Major Changes:
--------------
 * Added C/R-enabled Debugging support.
   Enabled with the --enable-crdebug flag. See the following website for more
   information: http://osl.iu.edu/research/ft/crdebug/
 * Added Stable Storage (SStore) framework for checkpoint storage
   * 'central' component does a direct-to-central-storage save
   * 'stage' component stages checkpoints to central storage while the
     application continues execution.
     * 'stage' supports offline compression of checkpoints before moving
       (sstore_stage_compress)
     * 'stage' supports local caching of checkpoints to improve automatic
       recovery (sstore_stage_caching)
 * Added Compression (compress) framework to support offline compression of
   checkpoints
 * Added two new ErrMgr recovery policies:
   * {{{crmig}}} C/R Process Migration
   * {{{autor}}} C/R Automatic Recovery
 * Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}}
   ErrMgr component
 * Added CR MPI Ext functions (enable them with the {{{--enable-mpi-ext=cr}}}
   configure option); see the sketch after this list for an illustration:
   * {{{OMPI_CR_Checkpoint}}} (Fixes trac:2342)
   * {{{OMPI_CR_Restart}}}
   * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules)
   * {{{OMPI_CR_INC_register_callback}}} (Fixes trac:2192)
   * {{{OMPI_CR_Quiesce_start}}}
   * {{{OMPI_CR_Quiesce_checkpoint}}}
   * {{{OMPI_CR_Quiesce_end}}}
   * {{{OMPI_CR_self_register_checkpoint_callback}}}
   * {{{OMPI_CR_self_register_restart_callback}}}
   * {{{OMPI_CR_self_register_continue_callback}}}
 * The ErrMgr predicted_fault() interface has been changed to take an
   opal_list_t of ErrMgr-defined types. This will allow us to better support
   a wider range of fault prediction services in the future.
 * Added a progress meter to:
   * FileM rsh (filem_rsh_process_meter)
   * SnapC full (snapc_full_progress_meter)
   * SStore stage (sstore_stage_progress_meter)
 * Added 2 new command line options to ompi-restart:
   * --showme : Display the full command line that would have been exec'ed.
   * --mpirun_opts : Command line options to pass directly to mpirun.
     (Fixes trac:2413)
 * Deprecated some MCA params:
   * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir
   * snapc_base_global_snapshot_dir deprecated, use
     sstore_base_global_snapshot_dir
   * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared
   * snapc_base_store_in_place deprecated, replaced with different components
     of SStore
   * snapc_base_global_snapshot_ref deprecated, use
     sstore_base_global_snapshot_ref
   * snapc_base_establish_global_snapshot_dir deprecated, never well
     supported
   * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem

Minor Changes:
--------------
 * Fixes trac:1924 : {{{ompi-restart}}} now recognizes path-prefixed
   checkpoint handles and does the right thing.
 * Fixes trac:2097 : {{{ompi-info}}} should now report all available CRS
   components.
 * Fixes trac:2161 : Manual checkpoint movement. A user can 'mv' a checkpoint
   directory from the original location to another and still restart from it.
 * Fixes trac:2208 : Honor various TMPDIR variables instead of forcing
   {{{/tmp}}}.
 * Moved {{{ompi_cr_continue_like_restart}}} to
   {{{orte_cr_continue_like_restart}}} to be more flexible in where this
   should be set.
 * The opal_crs_base_metadata_write* functions have been moved to SStore to
   support a wider range of metadata handling functionality.
 * Cleaned up the CRS framework and components to work with the SStore
   framework.
 * Cleaned up the SnapC framework and components to work with the SStore
   framework (cleans up these code paths considerably).
 * Added a 'quiesce' hook to CRCP for a future enhancement.
 * We now require a BLCR version that supports {{{cr_request_file()}}} or
   {{{cr_request_checkpoint()}}} in order to make the code more maintainable.
   Note that {{{cr_request_file}}} has been deprecated since 0.7.0, so we
   prefer to use {{{cr_request_checkpoint()}}}.
 * Added optional application-level INC callbacks (registered through the CR
   MPI Ext interface).
 * Increased the {{{opal_cr_thread_sleep_wait}}} parameter to 1000
   microseconds to make the C/R thread less aggressive.
 * {{{opal-restart}}} now looks for cache directories before falling back on
   stable storage when asked.
 * {{{opal-restart}}} also supports local decompression before restarting.
 * {{{orte-checkpoint}}} now uses the SStore framework to work with the
   metadata.
 * {{{orte-restart}}} now uses the SStore framework to work with the
   metadata.
 * Removed the {{{orte-restart}}} preload option. This was removed since the
   user only needs to select the 'stage' component in order to support this
   functionality.
 * Since the '-am' parameter is saved in the metadata, {{{ompi-restart}}} no
   longer hard codes {{{-am ft-enable-cr}}}.
 * Fixed the {{{hnp}}} ErrMgr so that if a previous component in the stack
   has 'fixed' the problem, then it is skipped.
 * Make sure to decrement the number of 'num_local_procs' in the orted when
   one goes away.
 * odls now checks the SStore framework to see if it needs to load any
   checkpoint files before launching (to support 'stage'). This separates the
   SStore logic from the --preload-[binary|files] options.
 * Added unique IDs to the named pipes established between the orted and the
   app in SnapC. This is to better support migration and automatic recovery
   activities.
 * Improved the checks for the 'already checkpointing' error path.
 * Added a recovery output timer to show how long it takes to restart a job.
 * Do a better job of cleaning up the old session directory on restart.
 * Added a local module to the autor and crmig ErrMgr components. These small
   modules prevent the 'orted' component from attempting a local recovery
   (which does not work for MPI apps at the moment).
 * Added a fix for bounding the checkpointable region between MPI_Init and
   MPI_Finalize.

This commit was SVN r23587.

The following Trac tickets were found above:
  Ticket 1924 --> https://svn.open-mpi.org/trac/ompi/ticket/1924
  Ticket 2097 --> https://svn.open-mpi.org/trac/ompi/ticket/2097
  Ticket 2161 --> https://svn.open-mpi.org/trac/ompi/ticket/2161
  Ticket 2192 --> https://svn.open-mpi.org/trac/ompi/ticket/2192
  Ticket 2208 --> https://svn.open-mpi.org/trac/ompi/ticket/2208
  Ticket 2342 --> https://svn.open-mpi.org/trac/ompi/ticket/2342
  Ticket 2413 --> https://svn.open-mpi.org/trac/ompi/ticket/2413
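For illustration, here is a minimal sketch of how an application might use
the self-registration callbacks listed above. The function names come from
this commit; the callback and registration prototypes shown are assumptions
(the real declarations live in the header installed when Open MPI is
configured with {{{--enable-mpi-ext=cr}}}), so treat this as a sketch rather
than a definitive example:

{{{
#include <stdio.h>
#include <mpi.h>

/* Assumed prototypes -- consult the installed CR MPI Ext header for the
 * real signatures. */
extern int OMPI_CR_self_register_checkpoint_callback(int (*fn)(void));
extern int OMPI_CR_self_register_continue_callback(int (*fn)(void));
extern int OMPI_CR_self_register_restart_callback(int (*fn)(void));

/* Flush application state before the checkpoint is taken. */
static int my_checkpoint_cb(void) { printf("checkpointing\n"); return 0; }
/* Resume normal operation after a successful checkpoint. */
static int my_continue_cb(void)   { printf("continuing\n");    return 0; }
/* Reopen files, rebuild caches, etc. after a restart. */
static int my_restart_cb(void)    { printf("restarted\n");     return 0; }

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    OMPI_CR_self_register_checkpoint_callback(my_checkpoint_cb);
    OMPI_CR_self_register_continue_callback(my_continue_cb);
    OMPI_CR_self_register_restart_callback(my_restart_cb);
    /* ... application work; the job can now be checkpointed externally ... */
    MPI_Finalize();
    return 0;
}
}}}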
/*
 * Copyright (c) 2004-2010 The Trustees of Indiana University and Indiana
 *                         University Research and Technology
 *                         Corporation.  All rights reserved.
 * Copyright (c) 2004-2009 The University of Tennessee and The University
 *                         of Tennessee Research Foundation.  All rights
 *                         reserved.
 * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
 *                         University of Stuttgart.  All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 *                         All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 *
 */
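
/*
 * "env" ESS (Environment-Specific Services) component module.  This
 * component completes process startup from information the launching
 * daemon placed in the environment (jobid, vpid, node rank) and answers
 * lookup queries from the locally cached nid/pid maps.
 */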

#include "orte_config.h"
#include "orte/constants.h"

#include <sys/types.h>
#include <stdio.h>
#ifdef HAVE_FCNTL_H
#include <fcntl.h>
#endif
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
#include <stdlib.h>

#include "opal/event/event.h"
#include "opal/runtime/opal.h"
#include "opal/mca/paffinity/paffinity.h"

#include "orte/util/show_help.h"
#include "opal/mca/mca.h"
#include "opal/mca/base/base.h"
#include "opal/mca/base/mca_base_param.h"
#include "opal/util/output.h"
#include "opal/util/opal_sos.h"
#include "opal/util/malloc.h"
#include "opal/util/argv.h"

#include "orte/mca/rml/base/base.h"
#include "orte/mca/rml/rml_types.h"
#include "orte/mca/routed/base/base.h"
#include "orte/mca/routed/routed.h"
#include "orte/mca/errmgr/base/base.h"
#include "orte/mca/grpcomm/base/base.h"
#include "orte/mca/iof/base/base.h"
#include "orte/mca/ess/base/base.h"
#include "orte/mca/ess/ess.h"
#include "orte/mca/ras/base/base.h"
#include "orte/mca/plm/base/base.h"

#include "orte/mca/rmaps/base/base.h"
#if OPAL_ENABLE_FT_CR == 1
#include "orte/mca/snapc/base/base.h"
#endif
#include "orte/mca/filem/base/base.h"
#include "orte/util/proc_info.h"
#include "orte/util/session_dir.h"
#include "orte/util/name_fns.h"
#include "orte/util/nidmap.h"

#include "orte/runtime/runtime.h"
#include "orte/runtime/orte_wait.h"
#include "orte/runtime/orte_globals.h"

#include "orte/runtime/orte_cr.h"
#include "orte/mca/ess/ess.h"
#include "orte/mca/ess/base/base.h"
#include "orte/mca/ess/env/ess_env.h"

static int env_set_name(void);

static int rte_init(void);
static int rte_finalize(void);
static uint8_t proc_get_locality(orte_process_name_t *proc);
static orte_vpid_t proc_get_daemon(orte_process_name_t *proc);
static char* proc_get_hostname(orte_process_name_t *proc);
static orte_local_rank_t proc_get_local_rank(orte_process_name_t *proc);
static orte_node_rank_t proc_get_node_rank(orte_process_name_t *proc);
static int update_pidmap(opal_byte_object_t *bo);
static int update_nidmap(opal_byte_object_t *bo);

#if OPAL_ENABLE_FT_CR == 1
static int rte_ft_event(int state);
#endif

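/* Function table for this component, handed to the ESS framework; entries
 * map the generic orte_ess_base_module_t interface to the local
 * implementations above. */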
orte_ess_base_module_t orte_ess_env_module = {
    rte_init,
    rte_finalize,
    orte_ess_base_app_abort,
    proc_get_locality,
    proc_get_daemon,
    proc_get_hostname,
    proc_get_local_rank,
    proc_get_node_rank,
    update_pidmap,
    update_nidmap,
    orte_ess_base_query_sys_info,
#if OPAL_ENABLE_FT_CR == 1
    rte_ft_event
#else
    NULL
#endif
};

/*
 * Local variables
 */
static orte_node_rank_t my_node_rank=ORTE_NODE_RANK_INVALID;

static int rte_init(void)
{
    int ret;
    char *error = NULL;
    char **hosts = NULL;
    char *nodelist;

    /* run the prolog */
    if (ORTE_SUCCESS != (ret = orte_ess_base_std_prolog())) {
        error = "orte_ess_base_std_prolog";
        goto error;
    }

    /* Start by getting a unique name from the environment */
    env_set_name();

    /* if I am a daemon, complete my setup using the
     * default procedure
     */
    if (ORTE_PROC_IS_DAEMON) {
        /* get the list of nodes used for this job */
        nodelist = getenv("OMPI_MCA_orte_nodelist");

        if (NULL != nodelist) {
            /* split the node list into an argv array */
            hosts = opal_argv_split(nodelist, ',');
        }
        if (ORTE_SUCCESS != (ret = orte_ess_base_orted_setup(hosts))) {
            ORTE_ERROR_LOG(ret);
            error = "orte_ess_base_orted_setup";
            goto error;
        }
        opal_argv_free(hosts);
        return ORTE_SUCCESS;
    }

    if (ORTE_PROC_IS_TOOL) {
        /* otherwise, if I am a tool proc, use that procedure */
        if (ORTE_SUCCESS != (ret = orte_ess_base_tool_setup())) {
            ORTE_ERROR_LOG(ret);
            error = "orte_ess_base_tool_setup";
            goto error;
        }
        /* as a tool, I don't need a nidmap - so just return now */
        return ORTE_SUCCESS;
    }

    /* otherwise, I must be an application process - use
     * the default procedure to finish my setup
     */
    if (ORTE_SUCCESS != (ret = orte_ess_base_app_setup())) {
        ORTE_ERROR_LOG(ret);
        error = "orte_ess_base_app_setup";
        goto error;
    }

    /* if one was provided, build my nidmap */
    if (ORTE_SUCCESS != (ret = orte_util_nidmap_init(orte_process_info.sync_buf))) {
        ORTE_ERROR_LOG(ret);
        error = "orte_util_nidmap_init";
        goto error;
    }

    return ORTE_SUCCESS;

error:
    orte_show_help("help-orte-runtime.txt",
                   "orte_init:startup:internal-failure",
                   true, error, ORTE_ERROR_NAME(ret), ret);

    return ret;
}

static int rte_finalize(void)
{
    int ret;

    /* if I am a daemon, finalize using the default procedure */
    if (ORTE_PROC_IS_DAEMON) {
        if (ORTE_SUCCESS != (ret = orte_ess_base_orted_finalize())) {
            ORTE_ERROR_LOG(ret);
        }
    } else if (ORTE_PROC_IS_TOOL) {
        /* otherwise, if I am a tool proc, use that procedure */
        if (ORTE_SUCCESS != (ret = orte_ess_base_tool_finalize())) {
            ORTE_ERROR_LOG(ret);
        }
        /* as a tool, I didn't create a nidmap - so just return now */
        return ret;
    } else {
        /* otherwise, I must be an application process
         * use the default procedure to finish
         */
        if (ORTE_SUCCESS != (ret = orte_ess_base_app_finalize())) {
            ORTE_ERROR_LOG(ret);
        }
    }

    /* deconstruct the nidmap and jobmap arrays */
    orte_util_nidmap_finalize();

    return ret;
}

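/*
 * Lookup helpers: each resolves a question about an arbitrary process name
 * (locality, hosting daemon, hostname, ranks) from the locally cached
 * nid/pid maps, without any communication.
 */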
static uint8_t proc_get_locality(orte_process_name_t *proc)
{
    orte_nid_t *nid;

    if (NULL == (nid = orte_util_lookup_nid(proc))) {
        ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
        return OPAL_PROC_NON_LOCAL;
    }

    if (nid->daemon == ORTE_PROC_MY_DAEMON->vpid) {
        OPAL_OUTPUT_VERBOSE((2, orte_ess_base_output,
                             "%s ess:env: proc %s on LOCAL NODE",
                             ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                             ORTE_NAME_PRINT(proc)));
        return (OPAL_PROC_ON_NODE | OPAL_PROC_ON_CU | OPAL_PROC_ON_CLUSTER);
    }

    OPAL_OUTPUT_VERBOSE((2, orte_ess_base_output,
                         "%s ess:env: proc %s is REMOTE",
                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                         ORTE_NAME_PRINT(proc)));

    return OPAL_PROC_NON_LOCAL;
}

static orte_vpid_t proc_get_daemon(orte_process_name_t *proc)
{
    orte_nid_t *nid;

    if( ORTE_JOBID_IS_DAEMON(proc->jobid) ) {
        return proc->vpid;
    }

    if (NULL == (nid = orte_util_lookup_nid(proc))) {
        return ORTE_VPID_INVALID;
    }

    OPAL_OUTPUT_VERBOSE((2, orte_ess_base_output,
                         "%s ess:env: proc %s is hosted by daemon %s",
                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                         ORTE_NAME_PRINT(proc),
                         ORTE_VPID_PRINT(nid->daemon)));

    return nid->daemon;
}

static char* proc_get_hostname(orte_process_name_t *proc)
{
    orte_nid_t *nid;

    if (NULL == (nid = orte_util_lookup_nid(proc))) {
        ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
        return NULL;
    }

    OPAL_OUTPUT_VERBOSE((2, orte_ess_base_output,
                         "%s ess:env: proc %s is on host %s",
                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                         ORTE_NAME_PRINT(proc),
                         nid->name));

    return nid->name;
}

static orte_local_rank_t proc_get_local_rank(orte_process_name_t *proc)
{
    orte_pmap_t *pmap;

    if (NULL == (pmap = orte_util_lookup_pmap(proc))) {
        ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
        return ORTE_LOCAL_RANK_INVALID;
    }

    OPAL_OUTPUT_VERBOSE((2, orte_ess_base_output,
                         "%s ess:env: proc %s has local rank %d",
                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                         ORTE_NAME_PRINT(proc),
                         (int)pmap->local_rank));

    return pmap->local_rank;
}

static orte_node_rank_t proc_get_node_rank(orte_process_name_t *proc)
{
    orte_pmap_t *pmap;

    /* is this me? */
    if (proc->jobid == ORTE_PROC_MY_NAME->jobid &&
        proc->vpid == ORTE_PROC_MY_NAME->vpid) {
        /* yes it is - reply with my rank. This is necessary
         * because the pidmap will not have arrived when I
         * am starting up, and if we use static ports, then
         * I need to know my node rank during init
         */
        return my_node_rank;
    }

    if (NULL == (pmap = orte_util_lookup_pmap(proc))) {
        return ORTE_NODE_RANK_INVALID;
    }

    OPAL_OUTPUT_VERBOSE((2, orte_ess_base_output,
                         "%s ess:env: proc %s has node rank %d",
                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                         ORTE_NAME_PRINT(proc),
                         (int)pmap->node_rank));

    return pmap->node_rank;
}

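/*
 * The pidmap and nidmap are delivered by the runtime as packed byte
 * objects; these two entry points decode them so the lookup helpers above
 * always work from current data.
 */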
static int update_pidmap(opal_byte_object_t *bo)
{
    int ret;

    OPAL_OUTPUT_VERBOSE((2, orte_ess_base_output,
                         "%s ess:env: updating pidmap",
                         ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));

    /* build the pmap */
    if (ORTE_SUCCESS != (ret = orte_util_decode_pidmap(bo))) {
        ORTE_ERROR_LOG(ret);
    }

    return ret;
}

static int update_nidmap(opal_byte_object_t *bo)
{
    int rc;
    /* decode the nidmap - the util will know what to do */
    if (ORTE_SUCCESS != (rc = orte_util_decode_nodemap(bo))) {
        ORTE_ERROR_LOG(rc);
    }
    return rc;
}

static int env_set_name(void)
{
    char *tmp;
    int rc;
    orte_jobid_t jobid;
    orte_vpid_t vpid;

    mca_base_param_reg_string_name("orte", "ess_jobid", "Process jobid",
                                   true, false, NULL, &tmp);
    if (NULL == tmp) {
        ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
        return ORTE_ERR_NOT_FOUND;
    }
    if (ORTE_SUCCESS != (rc = orte_util_convert_string_to_jobid(&jobid, tmp))) {
        ORTE_ERROR_LOG(rc);
        return(rc);
    }
    free(tmp);

    mca_base_param_reg_string_name("orte", "ess_vpid", "Process vpid",
                                   true, false, NULL, &tmp);
    if (NULL == tmp) {
        ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
        return ORTE_ERR_NOT_FOUND;
    }
    if (ORTE_SUCCESS != (rc = orte_util_convert_string_to_vpid(&vpid, tmp))) {
        ORTE_ERROR_LOG(rc);
        return(rc);
    }
    free(tmp);

    ORTE_PROC_MY_NAME->jobid = jobid;
    ORTE_PROC_MY_NAME->vpid = vpid;

    OPAL_OUTPUT_VERBOSE((1, orte_ess_base_output,
                         "ess:env set name to %s", ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));

    /* get my node rank in case we are using static ports - this won't
     * be present for daemons, so don't error out if we don't have it
     */
    mca_base_param_reg_string_name("orte", "ess_node_rank", "Process node rank",
                                   true, false, NULL, &tmp);
    if (NULL != tmp) {
        my_node_rank = strtol(tmp, NULL, 10);
        /* the returned param string is our copy - release it, as we
         * do for the jobid/vpid strings above */
        free(tmp);
    }

    /* get the non-name common environmental variables */
    if (ORTE_SUCCESS != (rc = orte_ess_env_get())) {
        ORTE_ERROR_LOG(rc);
        return rc;
    }

    return ORTE_SUCCESS;
}

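/*
 * Fault tolerance event handler: invoked with an OPAL_CRS_* state so the
 * ORTE frameworks can quiesce before a checkpoint (CHECKPOINT), resume
 * afterwards (CONTINUE), or rebuild names, routing, communications, and
 * session directories after being restored (RESTART).
 */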
#if OPAL_ENABLE_FT_CR == 1
static int rte_ft_event(int state)
{
    int ret, exit_status = ORTE_SUCCESS;
    orte_proc_type_t svtype;

    /******** Checkpoint Prep ********/
    if(OPAL_CRS_CHECKPOINT == state) {
        /*
         * Notify SnapC
         */
        if( ORTE_SUCCESS != (ret = orte_snapc.ft_event(OPAL_CRS_CHECKPOINT))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Notify Routed
         */
        if( ORTE_SUCCESS != (ret = orte_routed.ft_event(OPAL_CRS_CHECKPOINT))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Notify RML -> OOB
         */
        if( ORTE_SUCCESS != (ret = orte_rml.ft_event(OPAL_CRS_CHECKPOINT))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }
    }
    /******** Continue Recovery ********/
    else if (OPAL_CRS_CONTINUE == state ) {
        OPAL_OUTPUT_VERBOSE((1, orte_ess_base_output,
                             "ess:env ft_event(%2d) - %s is Continuing",
                             state, ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));

        /*
         * Notify RML -> OOB
         */
        if( ORTE_SUCCESS != (ret = orte_rml.ft_event(OPAL_CRS_CONTINUE))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Notify Routed
         */
        if( ORTE_SUCCESS != (ret = orte_routed.ft_event(OPAL_CRS_CONTINUE))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Notify SnapC
         */
        if( ORTE_SUCCESS != (ret = orte_snapc.ft_event(OPAL_CRS_CONTINUE))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        if( orte_cr_continue_like_restart ) {
            /*
             * Barrier to ensure all processes have been successfully
             * restarted before we try to remove some restart-only files.
             */
            if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier())) {
                opal_output(0, "ess:env: ft_event(%2d): Failed in orte_grpcomm.barrier (%d)",
                            state, ret);
                return ret;
            }

            if( orte_cr_flush_restart_files ) {
                OPAL_OUTPUT_VERBOSE((1, orte_ess_base_output,
                                     "ess:env ft_event(%2d): %s "
                                     "Cleanup restart files...",
                                     state, ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));
                opal_crs_base_cleanup_flush();
            }
        }
    }
    /******** Restart Recovery ********/
    else if (OPAL_CRS_RESTART == state ) {
        OPAL_OUTPUT_VERBOSE((1, orte_ess_base_output,
                             "ess:env ft_event(%2d) - %s is Restarting",
                             state, ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));

        /*
         * This should follow the ess init() function
         */

        /*
         * Clear nidmap and jmap
         */
        orte_util_nidmap_finalize();

        /*
         * - Reset Contact information
         */
        if( ORTE_SUCCESS != (ret = env_set_name() ) ) {
            exit_status = ret;
        }

        /*
         * Notify RML -> OOB
         */
        if( ORTE_SUCCESS != (ret = orte_rml.ft_event(OPAL_CRS_RESTART))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Restart the routed framework
         * JJH: Lie to the finalize function so it does not try to contact the daemon.
         */
        svtype = orte_process_info.proc_type;
        orte_process_info.proc_type = ORTE_PROC_TOOL;
        if (ORTE_SUCCESS != (ret = orte_routed.finalize()) ) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }
        orte_process_info.proc_type = svtype;
        if (ORTE_SUCCESS != (ret = orte_routed.initialize()) ) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Group Comm - Clean out stale data
         */
        orte_grpcomm.finalize();
        if (ORTE_SUCCESS != (ret = orte_grpcomm.init())) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }
        if (ORTE_SUCCESS != (ret = orte_grpcomm.purge_proc_attrs())) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Restart the PLM - Does nothing at the moment, but included for completeness
         */
        if (ORTE_SUCCESS != (ret = orte_plm.finalize())) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        if (ORTE_SUCCESS != (ret = orte_plm.init())) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * RML - Enable communications
         */
        if (ORTE_SUCCESS != (ret = orte_rml.enable_comm())) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Notify Routed
         */
        if( ORTE_SUCCESS != (ret = orte_routed.ft_event(OPAL_CRS_RESTART))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /* if one was provided, build my nidmap */
        if (ORTE_SUCCESS != (ret = orte_util_nidmap_init(orte_process_info.sync_buf))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }

        /*
         * Barrier to ensure all processes have been successfully restarted
         * before we try to remove some restart-only files.
         */
        if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier())) {
            opal_output(0, "ess:env ft_event(%2d): Failed in orte_grpcomm.barrier (%d)",
                        state, ret);
            return ret;
        }
        if( orte_cr_flush_restart_files ) {
            OPAL_OUTPUT_VERBOSE((1, orte_ess_base_output,
                                 "ess:env ft_event(%2d): %s "
                                 "Cleanup restart files...",
                                 state, ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));

            opal_crs_base_cleanup_flush();
        }

        /*
         * Session directory re-init
         */
        if (orte_create_session_dirs) {
            if (ORTE_SUCCESS != (ret = orte_session_dir(true,
                                                        orte_process_info.tmpdir_base,
                                                        orte_process_info.nodename,
                                                        NULL, /* Batch ID -- Not used */
                                                        ORTE_PROC_MY_NAME))) {
                exit_status = ret;
            }

            opal_output_set_output_file_info(orte_process_info.proc_session_dir,
                                             "output-", NULL, NULL);
        }

        /*
         * Notify SnapC
         */
        if( ORTE_SUCCESS != (ret = orte_snapc.ft_event(OPAL_CRS_RESTART))) {
            ORTE_ERROR_LOG(ret);
            exit_status = ret;
            goto cleanup;
        }
    }
    else if (OPAL_CRS_TERM == state ) {
        /* Nothing */
    }
    else {
        /* Error state = Nothing */
    }

cleanup:

    return exit_status;
}
#endif