openmpi/orte/mca/odls
Ralph Castain 555bbf0c02 Fix the iof race conditions wrt proc termination. This comprises two parts:
1. modify the iof to track when a proc actually closes all of its open iof output pipes. When this occurs, notify the odls that the proc's iof is complete. This is done via a zero-time event so that we can step out of the read event before processing the notification.

2. in the odls, modify the waitpid callback so it only flags that it was called. Add a function to receive the iof-complete notification, and a function that checks for both iof complete and waitpid callback before declaring a proc fully terminated. This ensures that we read and deliver -all- of the IO prior to declaring the job complete.
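
The two-condition check described in item 2 can be sketched as follows. This is a minimal illustration, not the actual ORTE code: `child_t`, `on_waitpid`, `on_iof_complete`, and `check_proc_complete` are hypothetical names, and the real odls keeps this state in its own child-tracking structures and delivers the iof notification through a zero-time event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-child record; field names are illustrative only. */
typedef struct {
    int  pid;
    bool iof_complete;   /* all of the proc's output pipes have closed */
    bool waitpid_fired;  /* the waitpid callback has been called */
    bool terminated;     /* set exactly once, when both flags are true */
    int  exit_status;
} child_t;

/* Declare the proc fully terminated only when BOTH events have been seen.
 * Returns true on the single call that makes the transition. */
static bool check_proc_complete(child_t *c)
{
    assert(c != NULL);
    if (c->terminated) {
        return false;               /* already reported */
    }
    if (c->iof_complete && c->waitpid_fired) {
        c->terminated = true;
        return true;
    }
    return false;                   /* still waiting on the other event */
}

/* The waitpid callback only records that it ran, then runs the check. */
static bool on_waitpid(child_t *c, int status)
{
    assert(c != NULL);
    c->exit_status = status;
    c->waitpid_fired = true;
    return check_proc_complete(c);
}

/* Called when the iof reports that every output pipe has closed. */
static bool on_iof_complete(child_t *c)
{
    assert(c != NULL);
    c->iof_complete = true;
    return check_proc_complete(c);
}
```

Whichever event arrives second triggers the transition, so a proc that exits before its remaining output has been read is not reported as terminated until the iof side has drained its pipes.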

Also modified the odls call to orte_iof.close (and the component's implementation) so it only closes stdin, leaving the other io channels alone. This fixes the other half of the known problem.

This should fix the ticket on this subject, but I'll wait to close it pending further testing in the trunk.

This commit was SVN r19991.
2008-11-12 23:32:01 +00:00
Name | Last commit | Date
base | Fix the iof race conditions wrt proc termination. This is comprised of two sections: | 2008-11-12 23:32:01 +00:00
bproc | Remove linking components against ORTE and OPAL libs. This was | 2008-11-08 00:56:57 +00:00
default | Include Ralph's suggestions, i.e. keep the hnp and orted management in sync. | 2008-11-01 00:39:46 +00:00
process | Roll in the revamped IOF subsystem. Per the devel mailing list email, this is a complete rewrite of the iof framework designed to simplify the code for maintainability, and to support features we had planned to do, but were too difficult to implement in the old code. Specifically, the new code: | 2008-10-18 00:00:49 +00:00
Makefile.am | Here is the major MAD-cure commit. I have written plenty about it, so I refer you here to those messages for a description of everything that was done. | 2006-09-14 21:29:51 +00:00
odls_types.h | Fix the iof race conditions wrt proc termination. This is comprised of two sections: | 2008-11-12 23:32:01 +00:00
odls.h | Fixes trac:1392, #1400 | 2008-07-28 22:40:57 +00:00