openmpi/orte/mca/rmaps
Josh Hursey 2f20a38c98 This is a fix for bug Ticket #27
We were stuck in an infinite loop inside the rmaps round_robin
component when the user specified a host and then oversubscribed it.
Instead of returning an error, we looped forever.

For example:
 $ cat hostfile
  A slots=2 max-slots=2
  B slots=2 max-slots=2
 $ mpirun -np 3 --hostfile hostfile --host B
  <hang>

The loop would not terminate because both host A and B are in the
'nodes' structure, as they are both allocated to the job. However,
after allocating 2 slots to host B, we remove it from the node list,
leaving us with a 'nodes' structure with just A in it. Since we can't
use host A, we keep looping in search of a usable node that never appears.

This patch checks for this situation: if rmaps loops over the list a
second time without having found a node during the first pass, then we
know that there are no nodes left to use. That is a resource allocation
error, and we should return it to the user.
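
The check described above can be sketched roughly as follows. This is a
simplified illustration, not the actual round_robin component code: the
sketch_node_t type, the sketch_map_round_robin function, and the
SKETCH_* return codes are hypothetical names invented for this example,
and the node list is modelled as a plain array with an in_list flag
instead of the real list structure.

```c
#include <stddef.h>

/* Hypothetical return codes for this sketch (not the ORTE constants). */
#define SKETCH_SUCCESS              0
#define SKETCH_ERR_OUT_OF_RESOURCE -1

/* Hypothetical, simplified node record; not the real ORTE node type. */
typedef struct {
    const char *name;
    int slots_in_use;  /* processes already mapped to this node */
    int max_slots;     /* hard upper bound (max-slots in the hostfile) */
    int usable;        /* nonzero if the user allowed this host (--host) */
    int in_list;       /* nonzero while the node is still a candidate */
} sketch_node_t;

/*
 * Map num_procs processes over the nodes in round-robin order.
 * 'misses' counts consecutive nodes visited without a placement; once it
 * reaches num_nodes we have made a full pass without finding a node, so
 * we return an error instead of looping forever.
 */
int sketch_map_round_robin(sketch_node_t *nodes, size_t num_nodes,
                           int num_procs)
{
    size_t cur = 0;
    size_t misses = 0;
    int mapped = 0;

    if (num_nodes == 0) {
        return (num_procs > 0) ? SKETCH_ERR_OUT_OF_RESOURCE : SKETCH_SUCCESS;
    }

    while (mapped < num_procs) {
        sketch_node_t *n = &nodes[cur];

        if (n->in_list && n->usable && n->slots_in_use < n->max_slots) {
            n->slots_in_use++;
            mapped++;
            misses = 0;
            if (n->slots_in_use == n->max_slots) {
                n->in_list = 0;  /* node is full: drop it from the list */
            }
        } else {
            misses++;
            if (misses >= num_nodes) {
                /* Full pass with no usable node: nothing left to map to. */
                return SKETCH_ERR_OUT_OF_RESOURCE;
            }
        }
        cur = (cur + 1) % num_nodes;
    }
    return SKETCH_SUCCESS;
}
```

With nodes A (not usable) and B (usable, max_slots of 2), a request for
3 processes places two on B, removes B from the list, and then returns
the error after one full pass instead of hanging.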

This patch should be moved to all of the release branches.

This commit was SVN r10131.
2006-05-31 03:42:01 +00:00
..
base Next step in the project split, mainly source code re-arranging 2006-02-12 01:33:29 +00:00
round_robin This is a fix for bug Ticket #27 2006-05-31 03:42:01 +00:00
Makefile.am Fix a bunch of install locations for header files 2005-12-08 00:54:44 +00:00
rmaps_types.h Initial population of orte tree 2005-07-02 13:42:54 +00:00
rmaps.h Next step in the project split, mainly source code re-arranging 2006-02-12 01:33:29 +00:00