openmpi

Josh Hursey 8f45fcb429 More fixes for the C/R support. Fixes a couple bugs with the migration and autor features. The C/R functionality should be fully working now. * Fix the checkpoint-restart-checkpoint case, which would previously reject the checkpoint of the newly restarted process. Re-enabling checkpointing once the application has fully restarted fixes this issue (make sure to set is_app_checkpointable to true on restart confirmation). * In the case of an invalid checkpoint, do not try to access the SStore datastore, as it will be using a dummy handler and return NULL strings. mpirun was segfaulting in the error case because it was trying to convert the seq_num from a string to an integer. * Make sure to initialize the timer event in the Automatic Recovery section of the HNP errmgr, per the libevent update. The missing initialization caused a segfault when attempting to recover a failed process. * If ompi-checkpoint loses connection to the HNP/mpirun, the TCP socket will fail and call the ErrMgr update_state function. This commit adds a dummy function {{{orte_errmgr_base_update_state()}}} that will prevent the ompi-checkpoint command from segfaulting in this error scenario. This commit was SVN r24306.		2011-01-26 14:56:35 +00:00
..
db	Update the rmcast callback function API to return message sequence number. Update orte_mcast test to stress the system.	2010-11-07 23:29:52 +00:00
debugger	removed c99 test code	2011-01-25 23:02:35 +00:00
errmgr	More fixes for the C/R support. Fixes a couple bugs with the migration and autor features. The C/R functionality should be fully working now.	2011-01-26 14:56:35 +00:00
ess	Convert the bad dos line endings to unix style for all windows related files.	2010-12-02 12:08:08 +00:00
filem	Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac.	2010-10-24 18:35:54 +00:00
grpcomm	Convert the bad dos line endings to unix style for all windows related files.	2010-12-02 12:08:08 +00:00
iof	Convert the bad dos line endings to unix style for all windows related files.	2010-12-02 12:08:08 +00:00
notifier	Few fault tolerance updates related to the CIFTS project (http://www.mcs.anl.gov/research/cifts/)	2011-01-13 20:13:49 +00:00
odls	corrected a couple places in orte where it said cpu_model when it should have been cpu_type.	2011-01-11 19:56:26 +00:00
oob	Few fault tolerance updates related to the CIFTS project (http://www.mcs.anl.gov/research/cifts/)	2011-01-13 20:13:49 +00:00
plm	Convert the bad dos line endings to unix style for all windows related files.	2010-12-02 12:08:08 +00:00
ras	Convert the bad dos line endings to unix style for all windows related files.	2010-12-02 12:08:08 +00:00
rmaps	Convert the bad dos line endings to unix style for all windows related files.	2010-12-02 12:08:08 +00:00
rmcast	Update the multicast subsystem - ported from Cisco branch	2011-01-13 01:54:05 +00:00
rml	Update the multicast subsystem - ported from Cisco branch	2011-01-13 01:54:05 +00:00
routed	Convert the bad dos line endings to unix style for all windows related files.	2010-12-02 12:08:08 +00:00
sensor	Update the rmcast callback function API to return message sequence number. Update orte_mcast test to stress the system.	2010-11-07 23:29:52 +00:00
snapc	More fixes for the C/R support. Fixes a couple bugs with the migration and autor features. The C/R functionality should be fully working now.	2011-01-26 14:56:35 +00:00
sstore	Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac.	2010-10-24 18:35:54 +00:00