openmpi/orte/mca
Ralph Castain ec5fe78876 When in the unity message routing mode, we have to update the RML contact info in the parent procs so that they know how to talk to the children. Ideally, this would be done in the MPI layer since that layer knows which procs are actively involved in the comm_spawn. However, it isn't being done there, which causes comm_spawn to fail, so do it explicitly in the RTE.
Note that this means ALL procs in the parent job are updated, even though they may not be participating in the comm_spawn. This doesn't really hurt anything - just unnecessary.

Comm_spawn still has a problem when a child process shares a node with a parent, so this doesn't fix everything. It only ensures that all procs know how to talk to each other.

This commit was SVN r16460.
2007-10-16 16:09:41 +00:00
errmgr Fix invalid MCA 'base' names so they appear in ompi_info. 2007-08-18 03:05:45 +00:00
filem Change some default MCA parameters: 2007-10-15 15:21:17 +00:00
gpr These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC. 2007-10-05 19:48:23 +00:00
grpcomm Improve diagnostic output messages when errors are hit 2007-10-16 14:51:52 +00:00
iof Fix invalid MCA 'base' names so they appear in ompi_info. 2007-08-18 03:05:45 +00:00
ns Squeeeeeeze the launch message. This is the message sent to the daemons that provides all the data required for launching their local procs. In reorganizing the ODLS framework, I discovered that we were sending a significant amount of unnecessary and repeated data. This commit resolves this by: 2007-10-11 15:57:26 +00:00
odls If the HNP is acting as the orted for local launch then the gpr_replica 2007-10-11 19:47:04 +00:00
oob These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC. 2007-10-05 19:48:23 +00:00
pls When we can detect that a daemon has failed, then we would like to terminate the system without having it lock up. The "hang" is currently caused by the system attempting to send messages to the daemons (specifically, ordering them to kill their local procs and then terminate). Unfortunately, without some idea of which daemon has died, the system hangs while attempting to send a message to someone who is no longer alive. 2007-10-15 18:00:30 +00:00
ras Squeeeeeeze the launch message. This is the message sent to the daemons that provides all the data required for launching their local procs. In reorganizing the ODLS framework, I discovered that we were sending a significant amount of unnecessary and repeated data. This commit resolves this by: 2007-10-11 15:57:26 +00:00
rds These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC. 2007-10-05 19:48:23 +00:00
rmaps Fix invalid MCA 'base' names so they appear in ompi_info. 2007-08-18 03:05:45 +00:00
rmgr These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC. 2007-10-05 19:48:23 +00:00
rml When in the unity message routing mode, we have to update the RML contact info in the parent procs so that they know how to talk to the children. Ideally, this would be done in the MPI layer since that layer knows which procs are actively involved in the comm_spawn. However, it isn't being done there, which causes comm_spawn to fail, so do it explicitly in the RTE. 2007-10-16 16:09:41 +00:00
routed When in the unity message routing mode, we have to update the RML contact info in the parent procs so that they know how to talk to the children. Ideally, this would be done in the MPI layer since that layer knows which procs are actively involved in the comm_spawn. However, it isn't being done there, which causes comm_spawn to fail, so do it explicitly in the RTE. 2007-10-16 16:09:41 +00:00
schema Fix invalid MCA 'base' names so they appear in ompi_info. 2007-08-18 03:05:45 +00:00
sds Squeeeeeeze the launch message. This is the message sent to the daemons that provides all the data required for launching their local procs. In reorganizing the ODLS framework, I discovered that we were sending a significant amount of unnecessary and repeated data. This commit resolves this by: 2007-10-11 15:57:26 +00:00
smr revert my stupidity.. 2007-10-09 19:01:20 +00:00
snapc If we are going to pretend to do filem, then we should always pretend. 2007-10-15 20:04:35 +00:00