Boris Karasev 52e81ee4b1 rmaps: fixed the ordering of mpirun target nodes
Fixed the desync of job nodelists between mpirun and the orted
daemons. The issue was observed when launching via RSH, because the user
can list the nodes in an arbitrary order with respect to HNP placement.
The mpirun process propagates the daemons' nodelist order to the nodes.
The problem was that the HNP itself assembled the nodelist based on
the user-provided order. As a result, the rank assignment was calculated
differently on orted and mpirun.

Consider the following example:
* User launches mpirun on node cn2.
* Hostlist is cn1,cn2,cn3,cn4; ppn=1
* mpirun passes the hostlist cn[2:2,1,3-4]@0(4) to the orteds
As a result, mpirun will assign rank 0 to cn1 while orted will assign
rank 0 to cn2 (because orted sees cn2 as the first element in the node
list).
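
The mismatch in the example above can be reproduced with a small toy model.
This is only an illustrative sketch, not ORTE code: the helper names
`assign_ranks` and `hnp_first` are hypothetical, and it assumes ranks are
handed out by simply walking the node list in order with ppn slots per node.

```python
def assign_ranks(nodelist, ppn=1):
    """Toy mapper: give out ranks by walking the node list in order."""
    ranks = {}
    rank = 0
    for node in nodelist:
        for _ in range(ppn):
            ranks[rank] = node
            rank += 1
    return ranks


def hnp_first(nodelist, hnp):
    """Hypothetical helper: move the HNP's node to the front of the list."""
    return [hnp] + [n for n in nodelist if n != hnp]


# Order given by the user on the command line.
user_order = ["cn1", "cn2", "cn3", "cn4"]
# Order the daemons receive when mpirun runs on cn2: cn2, cn1, cn3, cn4.
daemon_order = hnp_first(user_order, "cn2")

mpirun_view = assign_ranks(user_order)   # what mpirun computes
orted_view = assign_ranks(daemon_order)  # what each orted computes

# Ranks whose node differs between the two views.
mismatched = [r for r in mpirun_view if mpirun_view[r] != orted_view[r]]
print(mismatched)
```

In this model ranks 0 and 1 land on different nodes in the two views, which
is the desync described above; both sides agree once they walk the same
node order.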

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-02-01 17:16:05 +02:00