60d931217f
Specifically, add two new APIs:

1. lost_route: allows the OOB to report that a connection has failed, giving the routed module an opportunity to respond in a way appropriate to its topology. Creating the API also allows each routed component to hold its own definition of "lifeline": in some cases this may be a single connection, in others multiple connections. Some modules may choose to re-route messaging if the lifeline or any other connection is lost, while others may choose to abort the job. Both the tree and unity modules retain the current behavior: they abort the job if the lifeline connection is lost and ignore other lost connections.

2. get_wireup_info: returns (in a provided buffer) the info required to wire up connections for the specified job. Some routed modules do not need to return any info because they can wire up via alternative means, while others need to exchange data with their peers. If info is inserted into the buffer, the plm_base_launch_apps function will xcast the contents to the specified job.

The commit also removes the "lifeline" entry from the orte_process_info struct (and the associated ORTE_PROC_MY_LIFELINE definition), as the lifeline info is now contained within the respective routed module.

This commit was SVN r17969.
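The division of labor described in the commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ORTE code: the types `proc_name_t`, `wire_buffer_t`, and `routed_module_t`, the return codes, the function signatures, and the helper `needs_xcast` are all hypothetical stand-ins. Only the names `lost_route` and `get_wireup_info` come from the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real ORTE types; not the actual definitions. */
typedef struct { unsigned jobid; unsigned vpid; } proc_name_t;
typedef struct { char data[128]; size_t len; } wire_buffer_t;

/* Assumed return codes for this sketch only. */
enum { ROUTE_IGNORED = 0, ROUTE_ABORT_JOB = 1 };

/* The two entry points named in the commit message, with assumed signatures. */
typedef struct {
    int (*lost_route)(const proc_name_t *peer);
    int (*get_wireup_info)(unsigned jobid, wire_buffer_t *buf);
} routed_module_t;

/* The lifeline is private to the module rather than stored in a global struct. */
static proc_name_t my_lifeline = { 0, 0 };

static int same_proc(const proc_name_t *a, const proc_name_t *b)
{
    return a->jobid == b->jobid && a->vpid == b->vpid;
}

/* tree/unity-style policy: abort only when the lifeline itself is lost. */
static int abort_on_lifeline_lost(const proc_name_t *peer)
{
    assert(peer != NULL);
    return same_proc(peer, &my_lifeline) ? ROUTE_ABORT_JOB : ROUTE_IGNORED;
}

/* A module that wires up by other means leaves the buffer empty. */
static int no_wireup_info(unsigned jobid, wire_buffer_t *buf)
{
    (void)jobid;
    assert(buf != NULL);
    buf->len = 0;
    return 0;
}

/* A module that must exchange data packs something for its peers. */
static int pack_wireup_info(unsigned jobid, wire_buffer_t *buf)
{
    assert(buf != NULL);
    int n = snprintf(buf->data, sizeof buf->data,
                     "job=%u;contact=placeholder", jobid);
    if (n < 0 || (size_t)n >= sizeof buf->data) {
        buf->len = 0;
        return -1;
    }
    buf->len = (size_t)n;
    return 0;
}

/* Launcher-side decision: xcast only when the module inserted something. */
static int needs_xcast(const routed_module_t *mod, unsigned jobid,
                       wire_buffer_t *buf)
{
    return mod->get_wireup_info(jobid, buf) == 0 && buf->len > 0;
}

static const routed_module_t silent_module = {
    abort_on_lifeline_lost, no_wireup_info
};
static const routed_module_t exchanging_module = {
    abort_on_lifeline_lost, pack_wireup_info
};
```

In this sketch, a caller playing the role of the launcher would invoke `needs_xcast` with a module and a job id and broadcast `buf` only when it returns nonzero, while the transport layer would call `lost_route` with the failed peer and act on the returned code.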

| File |
|---|
| configure.m4 |
| configure.params |
| Makefile.am |
| oob_tcp_addr.c |
| oob_tcp_addr.h |
| oob_tcp_hdr.h |
| oob_tcp_msg.c |
| oob_tcp_msg.h |
| oob_tcp_peer.c |
| oob_tcp_peer.h |
| oob_tcp_ping.c |
| oob_tcp_recv.c |
| oob_tcp_send.c |
| oob_tcp.c |
| oob_tcp.h |