Ralph Castain
|
08e17b72cf
|
Break a circular logic loop in the cm routed module.
This commit was SVN r21708.
|
2009-07-17 18:07:35 +00:00 |
|
Ralph Castain
|
90a2db25e9
|
Modify the errmgr callback function so it passes the proc that failed instead of only the jobid.
Update the cm routed module to detect and pass orted failures.
This commit was SVN r21682.
|
2009-07-15 11:43:33 +00:00 |
|
Ralph Castain
|
b97f885c00
|
Restore the original API to terminate individual processes instead of the entire job. The API was originally removed because we did not know at the time how to take advantage of it. Some of us are now working on proactive resilience methods that move procs before a node fails, so this API is now required. Modify the odls, plm, and orted functions to support this new functionality.
Continue work on the resilient mapper, completing support for fault groups.
This commit was SVN r21639.
|
2009-07-13 02:29:17 +00:00 |
|
Ralph Castain
|
4a31c65126
|
Add initial support for CMs
This commit was SVN r21334.
|
2009-05-30 20:55:55 +00:00 |
|