1
1
openmpi/orte/orted
Ralph Castain 6166278e18 Improve the scalability of the modex operation and fix a bug reported by Tim P
The bug was a race condition in the barrier operation that caused the barrier in MPI_Finalize to fail on very short programs.

Scalaiblity was improved by using the daemons to aggregate modex and barrier messages before sending them to the rank=0 proc. Improvement is proportional to ppn, of course, but there really wasn't a scaling problem at low ppn anyway. This modification also paves the way for better allgather operations since now all the data for each node is sitting at the daemon level, and the daemons are now aware that a collective operation on the OOB is underway (so they -can- participate in a collective of their own to support it).

Also added better diagnostics to map out the timing associated with MPI_Init - turned on by -mca orte_timing 1.

This commit was SVN r17988.
2008-03-27 15:17:53 +00:00
..
help-orted.txt Bring in an updated launch system for the orteds. This commit restores the ability to execute singletons and singleton comm_spawn, both in single node and multi-node environments. 2007-07-12 19:53:18 +00:00
Makefile.am Resolve a problem where the orte daemon comm functions were being accessed by mpirun while still retaining occasional reference to the orted_globals. Remove all dependence on orted_globals from the comm functions. Move those functions back into their own file to make it easier to maintain the separation. Ensure that mpirun ignores any "exit" commands being sent to daemons as it will exit on its own. 2007-07-23 18:36:33 +00:00
orted_comm.c Improve the scalability of the modex operation and fix a bug reported by Tim P 2008-03-27 15:17:53 +00:00
orted_main.c Fix #1253: default libevent to use select/poll and only use the other 2008-03-25 17:18:17 +00:00
orted.h Cleanup recursions in ORTE caused by processing recv'd messages that can cause the system to take action resulting in receipt of another message. 2008-02-28 19:58:32 +00:00