openmpi/orte/mca
Ralph Castain ca0c806662 Resolve the problem of binding in inverted topologies - check the relative depth of the map and bind objects in the topology, and let that determine whether we bind downward or upwards.
cmr=v1.7.5:reviewer=jsquyres:subject=Resolve the problem of binding in inverted topologies

This commit was SVN r30643.
2014-02-09 05:30:17 +00:00
..
dfs Fix longstanding issue with our multi-project support. Rather than using 2014-01-07 22:11:15 +00:00
errmgr Fix suicide operation when MPI app loses connection to its local daemon. In that scenario, we correctly callback up to the MPI layer notifying it of the lost connection. However, when the MPI layer calls back down to tell the RTE to abort, it is passing back a flag indicating we should report that error to our local daemon - which is dead. This leads to an infinite loop. Break it by checking the flag indicating an abnormal term was ordered by the RTE and thus don't attempt to send the message. 2014-01-29 16:56:54 +00:00
ess In case there are stale session directories around, do a purge of the relevant session directory tree when an orted, HNP, or singleton starts. This won't help in the case of direct-launched apps, but it's the best we can do. 2014-02-09 02:10:31 +00:00
filem Protect array against crossing boundaries 2014-01-17 21:36:20 +00:00
grpcomm Per the RFC discussed here: 2014-02-05 14:39:27 +00:00
iof Cleanup some potential memory overruns 2014-01-19 16:31:26 +00:00
odls Use unique collective ids for the checkpoint/restart code 2014-02-04 14:03:05 +00:00
oob Upgrade the security framework to avoid multiple hits against the global security server. Add support for future case where mpirun assigns a global security credential for a given run, though we need to work out how to handle connect-accept from other mpirun's in that case. Remove a bunch of duplicate code in the OOB by consolidating the connection handshake code. 2014-02-04 14:47:04 +00:00
plm Per the RFC discussed here: 2014-02-05 14:39:27 +00:00
ras Paul Hargrove has pointed out that some big SMP systems (e.g., from SGI) configure Torque differently - instead of listing each node name once/slot in the nodefile, they list the node only once and set an envar to indicate the number of procs/node being allocated. Add an MCA param users can set to indicate we are in such an environment, and then use the envar to set the slots. Error out if the mode flag is given, but (a) we don't find the PBS_PPN envar, or (b) we find a node actually listed more than once in the PBS_Nodefile. 2014-02-05 15:51:17 +00:00
rmaps Resolve the problem of binding in inverted topologies - check the relative depth of the map and bind objects in the topology, and let that determine whether we bind downward or upwards. 2014-02-09 05:30:17 +00:00
rml Track the origin of a message so it can be passed across transports 2014-01-26 21:09:26 +00:00
routed Fix longstanding issue with our multi-project support. Rather than using 2014-01-07 22:11:15 +00:00
sensor Orcm sends heartbeats to its daemons, but ORTE needs to continue sending them to the HNP 2014-01-31 01:56:01 +00:00
snapc Use unique collective ids for the checkpoint/restart code 2014-02-04 14:03:05 +00:00
sstore SNAPC/CRCP/SSTORE: remove compiler warnings 2014-01-29 20:52:00 +00:00
state Fix longstanding issue with our multi-project support. Rather than using 2014-01-07 22:11:15 +00:00
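
The Torque SMP handling described in the ras entry above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: the function name `smp_slots_from_ppn` and the `SMP_*` return codes are hypothetical, and the caller is assumed to have already read the node names from the PBS_Nodefile and fetched the PBS_PPN envar (for example with `getenv("PBS_PPN")`). It only mirrors the two error conditions the commit message names: PBS_PPN missing, or a node listed more than once.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical return codes for this sketch; these are not ORTE constants. */
enum {
    SMP_OK = 0,
    SMP_ERR_NO_PPN = -1,         /* (a) PBS_PPN envar not found */
    SMP_ERR_BAD_PPN = -2,        /* PBS_PPN is not a positive integer */
    SMP_ERR_DUPLICATE_NODE = -3  /* (b) node listed more than once */
};

/*
 * nodes:     node names as read from the PBS_Nodefile, one per line
 * num_nodes: number of entries in nodes
 * ppn:       value of the PBS_PPN envar, or NULL if it is unset
 * slots:     on success, receives the slot count to use for every node
 */
int smp_slots_from_ppn(const char *const *nodes, size_t num_nodes,
                       const char *ppn, int *slots)
{
    if (ppn == NULL) {
        return SMP_ERR_NO_PPN;
    }

    /* Parse the envar strictly: whole string must be a positive int. */
    char *end = NULL;
    errno = 0;
    long value = strtol(ppn, &end, 10);
    if (errno != 0 || end == ppn || *end != '\0' ||
        value <= 0 || value > INT_MAX) {
        return SMP_ERR_BAD_PPN;
    }

    /* In this mode each node may appear only once in the nodefile. */
    for (size_t i = 0; i < num_nodes; i++) {
        for (size_t j = 0; j < i; j++) {
            if (strcmp(nodes[i], nodes[j]) == 0) {
                return SMP_ERR_DUPLICATE_NODE;
            }
        }
    }

    *slots = (int)value;
    return SMP_OK;
}
```

The quadratic duplicate check keeps the sketch short; a real allocator with many nodes would more likely track names as it reads the file.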