1
1
openmpi/orte/orted
Ralph Castain f54fda489e This is a first step towards supporting fully-routed OOB communications:
1. remove direct routed module (hooray!)

2. add radix tree routed module (binomial remains default)

3. remove duplicate data storage - orteds were storing nidmap and pidmap data in odls, everyone else in ess

4. add ess APIs to update nidmap, add new pidmap - used only by orteds for MPI-2 support

5. modify code to eliminate multiple calls to orte_routed.update_route that recreated info already in ess pidmap. Add ess API to lookup that info instead. Modify routed modules to utilize that capability

6. setup new ability to shutdown orteds without sending back an "ack" message to mpirun - not utilized yet, will require some changes to plm terminate_orteds functions in managed environments (coming soon)

Initial tests indicating that fully routing comm via defined routing trees may not actually have a significant cost for operations like IB QP setup. More tests required to confirm.

This will require an autogen...

This commit was SVN r19866.
2008-10-31 21:10:00 +00:00
..
help-orted.txt Bring in an updated launch system for the orteds. This commit restores the ability to execute singletons and singleton comm_spawn, both in single node and multi-node environments. 2007-07-12 19:53:18 +00:00
Makefile.am Complete implementation of the --without-rte-support configure option. Working with Brian, this has been tested on RedStorm. 2008-06-18 03:15:56 +00:00
orted_comm.c This is a first step towards supporting fully-routed OOB communications: 2008-10-31 21:10:00 +00:00
orted_main.c Fix a problem in the plm "failed to start" code observed by Jeff. When we are unable to launch to a specific node because it doesn't exist or is down, the system would hang and/or segv. The reason for the hang was that we were "firing" the orted exit trigger prior to its timer event being defined - thus "locking" that one-shot and preventing it from firing when we actually were ready to use it. 2008-10-16 14:21:37 +00:00
orted.h Cleanup recursions in ORTE caused by processing recv'd messages that can cause the system to take action resulting in receipt of another message. 2008-02-28 19:58:32 +00:00