openmpi/orte
Ralph Castain 0735d6f1c2 This commit fixes ticket #1414
Clean up the logic in the odls for when processes terminate. It turns out that we were going through the kill_proc logic only once, instead of looping over all local children, when we ordered a daemon to kill its local procs. This went unnoticed for some time because on most systems the local procs were terminated anyway when the daemon terminated, due to the parent/child relationship.

Solaris is apparently different - the children are not automatically terminated when the parent dies. As a result, it acts as a detector for this bug.

Mucho thanks to Rolf V. for his help in debugging - and to IM for letting me follow his gdb progress in quasi real-time!

This commit was SVN r19044.
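
The bug described in the commit message above can be illustrated with a minimal sketch. This is not the actual odls code: `child_t`, `kill_local_procs_once`, `kill_local_procs_all`, and `count_alive` are hypothetical names, and `kill_proc` here only flips a flag where real code would signal the process and wait for it. The point is the shape of the error: leaving after the first child versus visiting every local child.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a daemon's record of one local child. */
typedef struct {
    int  pid;
    bool alive;
} child_t;

/* Hypothetical per-process kill step. Real code would send signals
 * and wait for the process to exit; this sketch only marks it dead. */
static void kill_proc(child_t *child)
{
    child->alive = false;
}

/* Buggy shape: the kill logic runs for the first live child only,
 * then the function returns and the remaining children survive. */
static size_t kill_local_procs_once(child_t *kids, size_t n)
{
    assert(kids != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (kids[i].alive) {
            kill_proc(&kids[i]);
            return 1; /* leaves early: the bug */
        }
    }
    return 0;
}

/* Fixed shape: loop over all local children and kill each live one. */
static size_t kill_local_procs_all(child_t *kids, size_t n)
{
    size_t killed = 0;
    assert(kids != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (kids[i].alive) {
            kill_proc(&kids[i]);
            killed++;
        }
    }
    return killed;
}

/* Count the children still marked alive. */
static size_t count_alive(const child_t *kids, size_t n)
{
    size_t alive = 0;
    for (size_t i = 0; i < n; i++) {
        if (kids[i].alive) {
            alive++;
        }
    }
    return alive;
}
```

With three live children, the first variant leaves two of them running. On a system that terminates children along with the parent daemon, those leftovers would go unnoticed; on one that does not, as the commit reports for Solaris, they would remain as orphans.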
2008-07-26 02:54:43 +00:00
..
etc Many thanks to Ralf W. for finding a subtle bug in these Makefile.am's 2008-06-04 01:28:03 +00:00
include Fold in the revised modex scheme. Move the ompi_proc_t modex portions to the RTE level since the daemons already have that info. Provide each process with the equivalent of a "nidmap" - both a map of what nodes are in the job, and a map of which node each process is on. This enables the use of static ports, though that hasn't been turned "on" in this commit. 2008-04-30 19:49:53 +00:00
mca This commit fixes ticket #1414 2008-07-26 02:54:43 +00:00
orted Upgrade the ability of orterun to deal with cmd line MCA params that are passed to the orteds. Help reduce the size of the cmd line by eliminating duplicates where possible, and alert to duplicate entries that can cause problems. 2008-07-08 22:36:39 +00:00
runtime Ensure that we only launch procs on the HNP if that node is actually included in the allocation. 2008-07-25 17:13:22 +00:00
test Repair the MPI-2 dynamic operations. This includes: 2008-07-03 17:53:37 +00:00
tools Fix a couple problems with orte-clean. Also add a new 2008-07-22 17:41:06 +00:00
util Update the manpages for comm_spawn(_multiple) - add man page to explain host/hostfile behavior 2008-07-21 17:58:12 +00:00
Doxyfile Fix the broken Doxyfile so people can generate what little code base documentation we have :-) 2006-04-13 12:52:17 +00:00
Makefile.am Remove the orte_proc_table. Migrate all users of it to the opal_hash_table and a new name hash function in orte. 2008-03-05 22:44:35 +00:00