To enable the epochs and the resilient orte code, use the configure flag:
--enable-resilient-orte
This will define both:
ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE
This commit was SVN r25093.
still unstable. Reverted errmgr modules back to the original errmgr (with the
updates since the resilient code was brought into the trunk).
This commit was SVN r24958.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.
Please note that configure requirements on components HAVE CHANGED. For example. a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.
This commit was SVN r23764.
The composite functionality was becoming difficult to maintain, so we removed it for now which simplifies the framework design considerably.
Since the 'crmig' and 'autor' components were -very- similar to the 'hnp' component, this commit also merges them together. By moving the 'crmig' and 'autor' to a separate file under the 'hnp' component we are able to isolate the C/R logic to a large extent, thus being only minimally hooked into the previous 'hnp' component.
So other than some name changes, the functionality is all still in place. I will update the C/R documentation later this morning.
This commit was SVN r23628.
This required modification of the errmgr.update_state API so the pid could be passed in to the function that could update the proper data record(s). All calls to that API have been updated as well, but I obviously couldn't test them all.
Thanks to Dong Ahn (LLNL) for catching this problem!
Also fixed debugger daemon cospawn, both for initial launch and attach-while-running modes. Tested and verified on rsh and slurm.
This commit was SVN r23300.
he errmgr framework so that it can decide how to respond - which for now at least is just to check for lifeline and abort if so.
Add a new error constant to indicate that the error is "unrecoverable" so the oob can know it needs to abort.
This commit was SVN r23112.