openmpi/orte/mca/oob/tcp
Ralph Castain 60d931217f Modify the routed framework to allow greater control/flexibility over response to lost routes and initial wireup of jobs as required by several soon-to-come new modules.
Specifically, add two new APIs:

1. lost_route: allows the OOB to report that a connection has failed, thereby giving the routed module an opportunity to respond appropriately to its topology. Creating the API also allows each routed component to hold its own definition of "lifeline" - in some cases, this may be a single connection, but in others it may be multiple connections. Some modules may choose to re-route messaging if the lifeline or any other connection is lost, while others may choose to abort the job.

Both the tree and unity modules retain the current behavior and abort the job if the lifeline connection is lost, while ignoring other lost connections.

2. get_wireup_info: returns (in a provided buffer) the info required to wire up connections for the specified job. Some routed modules do not need to return any info because they can wire up via alternative means, while others need to exchange data with their peers. If info is inserted into the buffer, the plm_base_launch_apps function will xcast the contents to the specified job.

The commit also removes the "lifeline" entry from the orte_process_info struct (and the associated ORTE_PROC_MY_LIFELINE definition) as the lifeline info is now contained within the respective routed module.
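
As an illustration only, the two entry points described above could be sketched roughly as follows. This is not the actual ORTE code: every type and function name here (sketch_name_t, sketch_buffer_t, sketch_routed_module_t, sketch_lost_route, sketch_get_wireup_info, and the placeholder contact string) is hypothetical and chosen for the example. The lost-route handler mirrors the behavior described for the tree and unity modules (abort on loss of the lifeline, ignore anything else), and the lifeline is held inside the module rather than in a process-wide struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are NOT the real ORTE types. */
typedef struct { unsigned jobid; unsigned vpid; } sketch_name_t;
typedef struct { char bytes[64]; size_t len; } sketch_buffer_t;

/* What the module asks the caller to do after a connection is lost. */
typedef enum { SKETCH_IGNORE = 0, SKETCH_ABORT_JOB = 1 } sketch_action_t;

/* A module exposing the two entry points named in the commit message. */
typedef struct {
    sketch_action_t (*lost_route)(const sketch_name_t *peer);
    int (*get_wireup_info)(unsigned jobid, sketch_buffer_t *buf);
} sketch_routed_module_t;

/* The lifeline is private to the module (assumed here to be a single peer). */
static sketch_name_t sketch_lifeline = { 0u, 0u };

static bool sketch_same_name(const sketch_name_t *a, const sketch_name_t *b)
{
    return a->jobid == b->jobid && a->vpid == b->vpid;
}

/* Abort only when the lost connection is the lifeline; ignore other losses. */
static sketch_action_t sketch_lost_route(const sketch_name_t *peer)
{
    assert(peer != NULL);
    return sketch_same_name(peer, &sketch_lifeline) ? SKETCH_ABORT_JOB
                                                    : SKETCH_IGNORE;
}

/* Place wire-up data in the caller's buffer. A module that needs no
 * exchange would leave len at 0, and the caller would then skip the xcast. */
static int sketch_get_wireup_info(unsigned jobid, sketch_buffer_t *buf)
{
    static const char contact[] = "contact-info-placeholder";
    assert(buf != NULL);
    (void)jobid; /* unused in this sketch */
    memcpy(buf->bytes, contact, sizeof(contact));
    buf->len = sizeof(contact);
    return 0;
}

static const sketch_routed_module_t sketch_module = {
    sketch_lost_route,
    sketch_get_wireup_info
};
```

In this sketch a caller in the OOB layer would invoke sketch_module.lost_route(&peer) when a connection fails and act on the returned value, while a launcher would call sketch_module.get_wireup_info(jobid, &buf) and broadcast the buffer only if buf.len is nonzero.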

This commit was SVN r17969.
2008-03-26 01:00:24 +00:00
configure.m4 Update the copyright notices for IU and UTK. 2005-11-05 19:57:48 +00:00
configure.params Remove unneeded PARAM_INIT_FILE variable in configure.params files used by 2007-01-08 03:44:22 +00:00
Makefile.am Per long threads on the mailing list and much confusion discussion 2007-12-15 13:32:02 +00:00
oob_tcp_addr.c Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
oob_tcp_addr.h Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
oob_tcp_hdr.h Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
oob_tcp_msg.c Remove the orte_proc_table. Migrate all users of it to the opal_hash_table and a new name hash function in orte. 2008-03-05 22:44:35 +00:00
oob_tcp_msg.h Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
oob_tcp_peer.c Modify the routed framework to allow greater control/flexibility over response to lost routes and initial wireup of jobs as required by several soon-to-come new modules. 2008-03-26 01:00:24 +00:00
oob_tcp_peer.h Remove the orte_proc_table. Migrate all users of it to the opal_hash_table and a new name hash function in orte. 2008-03-05 22:44:35 +00:00
oob_tcp_ping.c Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
oob_tcp_recv.c Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
oob_tcp_send.c Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
oob_tcp.c Add some debugging output to tell us what interfaces were considered and used by OOB 2008-03-21 15:35:40 +00:00
oob_tcp.h Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00