openmpi/orte
Josh Hursey 2f20a38c98 This is a fix for bug Ticket #27
The rmaps round_robin component got stuck in an infinite loop when the
user specified a host and then oversubscribed it. Instead of returning
an error, we looped forever.

For example:
 $ cat hostfile
  A slots=2 max-slots=2
  B slots=2 max-slots=2
 $ mpirun -np 3 --hostfile hostfile --host B
  <hang>

The loop would not terminate because hosts A and B are both in the
'nodes' structure, as both are allocated to the job. However, after
allocating 2 slots to host B, we remove it from the node list, leaving
a 'nodes' structure with just A in it. Since we can't use host A
(the user asked for host B only), we keep looping here, waiting for a
usable node that never appears.

This patch detects that situation: if rmaps starts a second pass over
the list without having found a usable node during the first pass, then
there are no nodes left to use. That is a resource allocation error, and
we now return it to the user.
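
The termination check described above can be sketched as follows. This is a minimal illustration, not the ORTE code: the node_t record, the map_round_robin function, and the MAP_* return codes are hypothetical names invented for this example, and the real component works on its own list structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node record for illustration; not the ORTE data structure. */
typedef struct {
    const char *name;
    int slots_in_use;
    int max_slots;
    bool usable;        /* false if the user excluded it, e.g. via --host */
} node_t;

/* Hypothetical return codes. */
#define MAP_SUCCESS             0
#define MAP_ERR_OUT_OF_RESOURCE (-1)

/*
 * Assign num_procs processes to nodes in round-robin order.
 * assigned[p] receives the index of the node chosen for process p.
 * Returns an error once a full pass over the list finds no usable node,
 * instead of cycling forever.
 */
int map_round_robin(node_t *nodes, size_t num_nodes,
                    int num_procs, int *assigned)
{
    size_t cur = 0;

    for (int p = 0; p < num_procs; ++p) {
        size_t tried = 0;

        /* Skip excluded or full nodes, but for at most one full pass. */
        while (tried < num_nodes &&
               (!nodes[cur].usable ||
                nodes[cur].slots_in_use >= nodes[cur].max_slots)) {
            cur = (cur + 1) % num_nodes;
            ++tried;
        }

        /* Every node was examined and rejected: nothing left to use. */
        if (tried == num_nodes) {
            return MAP_ERR_OUT_OF_RESOURCE;
        }

        assigned[p] = (int)cur;
        nodes[cur].slots_in_use++;
        cur = (cur + 1) % num_nodes;
    }
    return MAP_SUCCESS;
}
```

With the hostfile from the example (A excluded, B limited to 2 slots), a request for 3 processes returns MAP_ERR_OUT_OF_RESOURCE in this sketch rather than hanging, while a request for 2 processes places both on B.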

This patch should be moved to all of the release branches.

This commit was SVN r10131.
2006-05-31 03:42:01 +00:00
..
Name | Last commit | Date
class | Next step in the project split, mainly source code re-arranging | 2006-02-12 01:33:29 +00:00
dss | Fix a bunch of warnings the Sun compilers find: | 2006-04-20 15:35:58 +00:00
etc | Next step in the project split, mainly source code re-arranging | 2006-02-12 01:33:29 +00:00
include | Add some finer error checking that should help debug some recent problems with dynamic spawns. | 2006-03-23 15:31:43 +00:00
mca | This is a fix for bug Ticket #27 | 2006-05-31 03:42:01 +00:00
runtime | I'm confused ... Error string as well as the goto label had the same name ... | 2006-02-14 17:49:14 +00:00
tools | Additions to the tm, slurm, and rsh pls modules to handle the --prefix | 2006-05-16 14:14:12 +00:00
util | Add some finer error checking that should help debug some recent problems with dynamic spawns. | 2006-03-23 15:31:43 +00:00
Doxyfile | Fix the broken Doxyfile so people can generate what little code base documentation we have :-) | 2006-04-13 12:52:17 +00:00
Makefile.am | * Fix a small bug George noticed - if you change the prefix (or any of the | 2006-03-12 04:35:01 +00:00