1
1
openmpi/orte/mca
Jeff Squyres 3cf7dddd47 Fixes trac:635.
Ralph identified the problem, I tracked down ''where'' the fd was
being closed, and Brian figured out ''why'' (and the fix).

What was happening is that a remote process was closing its
stdout/stderr and therefore sending a 0-byte IOF message to mpirun.
mpirun, in turn, closed the iof endpoint associated with that stream
(i.e., stdout/stderr).  IOF does this to handle the case where
mpirun's stdin is closed -- this therefore causes the stdin on all the
ORTE-started processes to have their stdin's closed as well.

So the workaround here is to check that if we get a 0-byte IOF message
on a sink (indicating a remote closure), and if that sink is the
special stdout or stderr stream, don't actually close anything in the
local process.

This commit was SVN r12691.

The following Trac tickets were found above:
  Ticket 635 --> https://svn.open-mpi.org/trac/ompi/ticket/635
2006-11-28 21:42:49 +00:00
..
errmgr Bring over the update to terminate orteds that are generated by a dynamic spawn such as comm_spawn. This introduces the concept of a job "family" - i.e., jobs that have a parent/child relationship. Comm_spawn'ed jobs have a parent (the one that spawned them). We track that relationship throughout the lineage - i.e., if a comm_spawned job in turn calls comm_spawn, then it has a parent (the one that spawned it) and a "root" job (the original job that started things). 2006-11-14 19:34:59 +00:00
gpr First stage in the move to a faster startup. Change the ORTE stage gate xcast into a binary tree broadcast (away from a linear broadcast). Also, removed the timing report in the gpr_proxy component that printed out the number of bytes in the compound command message as the answer was "not much" - reduces the clutter in the data. 2006-11-28 00:06:25 +00:00
iof Fixes trac:635. 2006-11-28 21:42:49 +00:00
ns - Add missing mutex lock 2006-11-22 13:37:58 +00:00
odls Correctly setup the sched_yield when launching processes via the orteds. This still doesn't adjust the yield schedule "on-the-fly" as more procs are dynamically added to a node - it just sets it when they are first launched. 2006-11-28 08:27:20 +00:00
oob First stage in the move to a faster startup. Change the ORTE stage gate xcast into a binary tree broadcast (away from a linear broadcast). Also, removed the timing report in the gpr_proxy component that printed out the number of bytes in the compound command message as the answer was "not much" - reduces the clutter in the data. 2006-11-28 00:06:25 +00:00
pls Complete the functions to match the expected prototype. 2006-11-28 00:44:30 +00:00
ras Fix a potential bug in the registry where it didn't fully check a segment's name when searching for it. Will have to verify that this doesn't break other things. 2006-11-23 04:17:37 +00:00
rds Bring over the update to terminate orteds that are generated by a dynamic spawn such as comm_spawn. This introduces the concept of a job "family" - i.e., jobs that have a parent/child relationship. Comm_spawn'ed jobs have a parent (the one that spawned them). We track that relationship throughout the lineage - i.e., if a comm_spawned job in turn calls comm_spawn, then it has a parent (the one that spawned it) and a "root" job (the original job that started things). 2006-11-14 19:34:59 +00:00
rmaps Modify the reuse daemons procedure so we only generate the add_local_procs message once. Revise the display-map-at-launch option so the RMAPS framework takes responsibility for implementation of that option. 2006-11-17 19:06:10 +00:00
rmgr First stage in the move to a faster startup. Change the ORTE stage gate xcast into a binary tree broadcast (away from a linear broadcast). Also, removed the timing report in the gpr_proxy component that printed out the number of bytes in the compound command message as the answer was "not much" - reduces the clutter in the data. 2006-11-28 00:06:25 +00:00
rml First stage in the move to a faster startup. Change the ORTE stage gate xcast into a binary tree broadcast (away from a linear broadcast). Also, removed the timing report in the gpr_proxy component that printed out the number of bytes in the compound command message as the answer was "not much" - reduces the clutter in the data. 2006-11-28 00:06:25 +00:00
schema Move the name of the bproc common segment to the central schema location - avoids conflicts when bproc 3 components try to build 2006-11-22 20:23:17 +00:00
sds cleanup unused variables 2006-11-21 20:51:39 +00:00
smr Fix a potential bug in the registry where it didn't fully check a segment's name when searching for it. Will have to verify that this doesn't break other things. 2006-11-23 04:17:37 +00:00