openmpi/contrib/amca-param-sets/ft-enable-cr
Josh Hursey 88aa45dd52 Commit to bring online OpenIB, MX, and shared memory support for Open MPI's checkpoint/restart functionality. Some tuning is still needed, but basic functionality is in place.
There is still a problem with OpenIB and threads (external to C/R functionality). It has been reported in Ticket #1539.

Additionally:
* Fix a file cleanup bug in CRS Base.
* Fix a possible deadlock in the TCP ft_event function
* Add a mca_base_param_deregister() function to MCA base
* Add whole process checkpoint timers
* Add support for BTL: OpenIB, MX, Shared Memory
* Add support for Mpool: rdma, sm
* Sundry bounds checking and cleanup in some scattered functions

This commit was SVN r19756.
2008-10-16 15:09:00 +00:00

53 lines
1.1 KiB
Plaintext

#
# An Aggregate MCA Parameter Set to enable checkpoint/restart capabilities
# for a job.
#
# Usage:
# shell$ mpirun -am ft-enable-cr ./app
#
#
# OPAL Parameters
# - Turn off OPAL-only checkpointing
# - Select only checkpoint-ready components
# - Enable additional FT infrastructure
# - Auto-select the OPAL CRS component (value left empty)
# - If available, use the FT thread (default)
#
opal_cr_allow_opal_only=0
mca_base_component_distill_checkpoint_ready=1
ft_cr_enabled=1
crs=
opal_cr_use_thread=1
#
# ORTE Parameters
# - Wrap the RML
# - Use the 'full' Snapshot Coordinator
#
rml_wrapper=ftrm
snapc=full
#filem=rsh
#
# OMPI Parameters
# - Wrap the PML
# - Use the bookmark-exchange, fully coordinated checkpoint/restart coordination protocol
#
pml_wrapper=crcpw
crcp=bkmrk
#
# Temporary fix: force the event engine to use poll so that it behaves well with BLCR
#
opal_event_include=poll
#
# We currently support only the following option settings for the OpenIB BTL.
# Future development will attempt to eliminate many of these restrictions.
#
btl_openib_want_fork_support=1
btl_openib_use_async_event_thread=0
btl_openib_use_eager_rdma=0
btl_openib_cpc_include=oob
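
The parameter set above is a flat list of `name=value` lines with `#` comments, so it can be inspected with a few lines of script. The sketch below is illustrative only and is not part of Open MPI: the helper names `parse_amca` and `to_mca_args` are hypothetical, the sample text is a shortened excerpt of the file, and it assumes that each line corresponds to one `-mca name value` pair on the `mpirun` command line (with an empty value passed as an empty string).

```python
import shlex


def parse_amca(text):
    """Parse 'name=value' lines, skipping blank lines and '#' comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only; an empty value (e.g. 'crs=') is kept as "".
        name, _, value = line.partition("=")
        params[name.strip()] = value.strip()
    return params


def to_mca_args(params):
    """Expand parsed parameters into a flat list of '-mca name value' arguments."""
    args = []
    for name, value in params.items():
        args += ["-mca", name, value]
    return args


# Shortened excerpt of the parameter set, used only as sample input.
SAMPLE = """\
# OPAL Parameters
ft_cr_enabled=1
crs=
#filem=rsh
snapc=full
"""

params = parse_amca(SAMPLE)
command = ["mpirun", *to_mca_args(params), "./app"]
# shlex.quote makes the empty 'crs' value visible as '' in the printed command.
print(" ".join(shlex.quote(part) for part in command))
```

Commented-out entries such as `#filem=rsh` are skipped, which matches how they are disabled in the file itself. For normal use, `mpirun -am ft-enable-cr ./app` remains the simpler form.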