openmpi/orte/mca
Dave Goodell 5f3b81e291 oob: delete events when destroying a peer
Without this patch running ring_c with the usnic BTL under valgrind will
cause the orteds to segfault.

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Ralph Castain <rhc@open-mpi.org>

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31161.
2014-03-19 22:15:49 +00:00
..
dfs Fix longstanding issue with our multi-project support. Rather than using 2014-01-07 22:11:15 +00:00
errmgr Deal with the corner case where we encounter an error when attempting to launch a daemon. In this case, we will order abnormal termination before daemons callback to us, and thus any attempt to send them a "die" message will fail. Ensure that mpirun at least exits cleanly in this scenario, thereby allowing the remote daemons that did get launched to commit suicide when comm fails. 2014-03-14 15:32:30 +00:00
ess Set the locality for remote procs even after a comm_spawn. Ensure we store our own local cpuset upon launch so it will be shared during comm_join. 2014-03-18 14:51:07 +00:00
filem Protect array against crossing boundaries 2014-01-17 21:36:20 +00:00
grpcomm Fully fix the PMI2 warning - turned out to be larger than originally thought due to the way the function was being handled across multiple files. Properly resolve the problem by not compiling the file if PMI2 is not desired, and then appropriately setting the visibility of the function within the module 2014-03-17 17:36:37 +00:00
iof Cleanup some potential memory overruns 2014-01-19 16:31:26 +00:00
odls More corrections w.r.t. process groups 2014-03-18 21:31:01 +00:00
oob oob: delete events when destroying a peer 2014-03-19 22:15:49 +00:00
plm Surrender to the tyranny of C++ and give up on enum for node states, as nice as that would be, in favor of retaining memory footprint constraints. 2014-03-19 16:15:24 +00:00
ras Paul Hargrove has pointed out that some big SMP systems (e.g., from SGI) configure Torque differently - instead of listing each node name once/slot in the nodefile, they list the node only once and set an envar to indicate the number of procs/node being allocated. Add an MCA param users can set to indicate we are in such an environment, and then use the envar to set the slots. Error out if the mode flag is given, but (a) we don't find the PBS_PPN envar, or (b) we find a node actually listed more than once in the PBS_Nodefile. 2014-02-05 15:51:17 +00:00
rmaps When pretty-printing binding info, we need to pass the topology down to the routine as the mapper isn't always working with the local topology - otherwise, we get an erroneous help message. Thanks to Tetsuya Mishima for reporting it 2014-03-10 15:53:07 +00:00
rml use the newly created JOB_STATE_FT_* events 2014-03-12 12:37:14 +00:00
routed Deal with the corner case where we encounter an error when attempting to launch a daemon. In this case, we will order abnormal termination before daemons callback to us, and thus any attempt to send them a "die" message will fail. Ensure that mpirun at least exits cleanly in this scenario, thereby allowing the remote daemons that did get launched to commit suicide when comm fails. 2014-03-14 15:32:30 +00:00
sensor Do a little cleanup - only resusage needs the node/proc info, so remove it from the sensor base 2014-03-17 21:26:46 +00:00
snapc Use unique collective ids for the checkpoint/restart code 2014-02-04 14:03:05 +00:00
sstore fix "warning: 'sstore_stage_select' defined but not used" 2014-03-06 16:53:27 +00:00
state Continue to resolve priority issues. Cleanup the case of forced termination in mpirun during launch processing by ensuring we can respond to socket closures, and ensuring that the remote daemons correctly close their sockets when terminating. 2014-03-13 04:02:24 +00:00