openmpi

Ralph Castain 6166278e18 Improve the scalability of the modex operation and fix a bug reported by Tim P

The bug was a race condition in the barrier operation that caused the barrier in MPI_Finalize to fail on very short programs.

Scalability was improved by using the daemons to aggregate modex and barrier messages before sending them to the rank=0 proc. Improvement is proportional to ppn, of course, but there really wasn't a scaling problem at low ppn anyway. This modification also paves the way for better allgather operations since now all the data for each node is sitting at the daemon level, and the daemons are now aware that a collective operation on the OOB is underway (so they -can- participate in a collective of their own to support it).
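To make the "proportional to ppn" claim concrete, here is a rough counting model, not ORTE code. It assumes one daemon per node, one message per proc per collective, and (for simplicity) counts the rank=0 proc's own contribution as a message. The function names are hypothetical.

```python
def messages_to_rank0(num_nodes: int, ppn: int, aggregate: bool) -> int:
    """Count messages arriving at the rank=0 proc in one modex or barrier.

    Simplified model (assumption): one daemon per node, every proc
    contributes exactly one message per collective.
    """
    if num_nodes < 1 or ppn < 1:
        raise ValueError("num_nodes and ppn must both be >= 1")
    if aggregate:
        # Each daemon bundles the messages of its local procs into one.
        return num_nodes
    # Every proc sends its own message directly to rank=0.
    return num_nodes * ppn


def reduction_factor(num_nodes: int, ppn: int) -> float:
    """Ratio of direct messages to daemon-aggregated messages."""
    direct = messages_to_rank0(num_nodes, ppn, aggregate=False)
    bundled = messages_to_rank0(num_nodes, ppn, aggregate=True)
    return direct / bundled


if __name__ == "__main__":
    for ppn in (1, 2, 8):
        print(ppn, reduction_factor(num_nodes=64, ppn=ppn))
```

Under this model the reduction factor equals ppn regardless of node count, which matches the remark that there is nothing to gain at low ppn.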

Also added better diagnostics to map out the timing associated with MPI_Init - turned on by -mca orte_timing 1.

This commit was SVN r17988.

2008-03-27 15:17:53 +00:00

base

Cleanup recursions in ORTE caused by processing recv'd messages that can cause the system to take action resulting in receipt of another message.

2008-02-28 19:58:32 +00:00

ftrm

Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately.

2008-02-28 01:57:57 +00:00

oob

Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately.

2008-02-28 01:57:57 +00:00

Makefile.am

Clean up a couple of configure things: