openmpi/orte/runtime
Ralph Castain f54fda489e This is a first step towards supporting fully-routed OOB communications:

1. remove direct routed module (hooray!)

2. add radix tree routed module (binomial remains default)

3. remove duplicate data storage - orteds were storing nidmap and pidmap data in odls, everyone else in ess

4. add ess APIs to update nidmap, add new pidmap - used only by orteds for MPI-2 support

5. modify code to eliminate multiple calls to orte_routed.update_route that recreated info already in ess pidmap. Add ess API to lookup that info instead. Modify routed modules to utilize that capability

6. set up new ability to shut down orteds without sending back an "ack" message to mpirun - not utilized yet, will require some changes to plm terminate_orteds functions in managed environments (coming soon)

Initial tests indicate that fully routing comm via defined routing trees may not actually have a significant cost for operations like IB QP setup. More tests required to confirm.

This will require an autogen...

This commit was SVN r19866.
2008-10-31 21:10:00 +00:00
data_type_support Roll in the revamped IOF subsystem. Per the devel mailing list email, this is a complete rewrite of the iof framework designed to simplify the code for maintainability, and to support features we had planned to do, but were too difficult to implement in the old code. Specifically, the new code: 2008-10-18 00:00:49 +00:00
help-orte-runtime.txt It is okay for us to init the ORTE mca params multiple times. Indeed, it is absolutely required by orterun as the first time has to be done prior to parsing the command line, which means that the mca values haven't been parsed yet! 2008-06-24 17:50:56 +00:00
Makefile.am Follow-on to r19457. Rather than have #ifs in the middle of functions 2008-09-01 17:15:01 +00:00
orte_cr.c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. 2008-06-09 14:53:58 +00:00
orte_cr.h Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
orte_data_server.c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. 2008-06-09 14:53:58 +00:00
orte_data_server.h Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
orte_finalize.c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. 2008-06-09 14:53:58 +00:00
orte_globals.c This is a first step towards supporting fully-routed OOB communications: 2008-10-31 21:10:00 +00:00
orte_globals.h This is a first step towards supporting fully-routed OOB communications: 2008-10-31 21:10:00 +00:00
orte_init.c Move the lock initialization back to orte_init so that the finalize lock 2008-06-25 03:18:37 +00:00
orte_locks.c Some minor updates to the locking system changes. Remove obsolete locks. Ensure the trigger event objects do not get deconstructed until the very end to avoid possible problems due to race conditions. Route all orted abnormal term tests through the trigger. 2008-08-06 11:31:06 +00:00
orte_locks.h Some minor updates to the locking system changes. Remove obsolete locks. Ensure the trigger event objects do not get deconstructed until the very end to avoid possible problems due to race conditions. Route all orted abnormal term tests through the trigger. 2008-08-06 11:31:06 +00:00
orte_mca_params.c Roll in the revamped IOF subsystem. Per the devel mailing list email, this is a complete rewrite of the iof framework designed to simplify the code for maintainability, and to support features we had planned to do, but were too difficult to implement in the old code. Specifically, the new code: 2008-10-18 00:00:49 +00:00
orte_wait.c Cleanup the plm failed-to-start problem a little - ensure that the event is always defined so we don't have to check when trying to trigger it, thus avoiding potential race conditions. 2008-10-16 14:58:32 +00:00
orte_wait.h Fix a problem in the plm "failed to start" code observed by Jeff. When we are unable to launch to a specific node because it doesn't exist or is down, the system would hang and/or segv. The reason for the hang was that we were "firing" the orted exit trigger prior to its timer event being defined - thus "locking" that one-shot and preventing it from firing when we actually were ready to use it. 2008-10-16 14:21:37 +00:00
runtime_internals.h Follow-on to r19457. Rather than have #ifs in the middle of functions 2008-09-01 17:15:01 +00:00
runtime.h Complete implementation of the --without-rte-support configure option. Working with Brian, this has been tested on RedStorm. 2008-06-18 03:15:56 +00:00