Update the TM launcher so it provides an appropriate error message when encountering an invalid launch_id. This is a first step towards fixing ticket #1016, but needs to be followed by a more complete solution.
This commit was SVN r14578.
Parent: 740af39d0a
Commit: 2683c85085
@@ -16,6 +16,22 @@
 #
 # $HEADER$
 #
+[tm-bad-launchid]
+The TM (PBS / Torque) process starter cannot spawn the specified
+application on a remote node due to an invalid launch_id.
+
+Node name: %s
+Launch id: %d
+
+This is most likely due to use of the "--hostfile" option to the
+command line. At this time, Open MPI/OpenRTE do not support this
+method of operation. Instead, the system expects to directly read
+information regarding the nodes to be used from the environment.
+
+Removing "--hostfile" from the command line will likely allow the
+application to be launched. This will be fixed in a future release
+to support the use of "--hostfile" on the command line.
+#
 [multiple-prefixes]
 Multiple different --prefix options were specified to mpirun for the
 same node. This is a fatal error for the TM (PBS / Torque) process
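For illustration: the `%s` and `%d` placeholders in the `tm-bad-launchid` topic are filled printf-style from the trailing arguments of the `opal_show_help` call in the second hunk (node name, then launch id). The sketch below mimics that substitution with plain `snprintf`, under the assumption that only the two placeholder lines matter. The macro and `render_bad_launchid` are hypothetical names, not part of Open MPI, and the real help-file lookup is not reproduced.

```c
#include <stddef.h>
#include <stdio.h>

/* Abbreviated copy of the two placeholder lines from the help topic.
 * Hypothetical macro name; the real text lives in the help file. */
#define TM_BAD_LAUNCHID_TEMPLATE "Node name: %s\nLaunch id: %d\n"

/* Hypothetical helper: write the filled-in template into buf.
 * Returns what snprintf returns (the length the full text needs,
 * excluding the terminating NUL). */
static int render_bad_launchid(char *buf, size_t len,
                               const char *nodename, int launch_id)
{
    return snprintf(buf, len, TM_BAD_LAUNCHID_TEMPLATE, nodename, launch_id);
}
```

With a node named `node01` and a launch id of `-1`, the buffer would hold the two lines `Node name: node01` and `Launch id: -1`.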
@@ -164,6 +164,22 @@ static int pls_tm_launch_job(orte_jobid_t jobid)
         goto cleanup;
     }
 
+    /* Iterate through each of the nodes and check to see if we have
+     * a valid launch_id (must be > 0). If not, then error out as
+     * we cannot do anything
+     */
+    for (item = opal_list_get_first(&map->nodes);
+         item != opal_list_get_end(&map->nodes);
+         item = opal_list_get_next(item)) {
+        orte_mapped_node_t* node = (orte_mapped_node_t*)item;
+
+        if (node->launch_id < 0) {
+            opal_show_help("help-pls-tm.txt", "tm-bad-launchid",
+                           true, node->nodename, node->launch_id);
+            goto cleanup;
+        }
+    }
+
     /* if the user requested that we re-use daemons,
      * launch the procs on any existing, re-usable daemons
      */
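The added loop can be read on its own as "find the first node with a negative launch_id and bail out". Note that the comment in the hunk says the id "must be > 0", while the condition `launch_id < 0` rejects only negative values, so an id of 0 passes the check as written. The sketch below follows the code rather than the comment. It is a minimal stand-alone illustration: `mapped_node_t`, `find_bad_launch_id`, and `check_launch_ids` are hypothetical stand-ins, a plain singly linked list replaces `opal_list_t`, and `fprintf` replaces `opal_show_help`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for orte_mapped_node_t, reduced to the
 * two fields the check reads plus a simple next pointer. */
typedef struct mapped_node {
    struct mapped_node *next;
    const char *nodename;
    int launch_id;
} mapped_node_t;

/* Return the first node whose launch_id is negative,
 * or NULL when every node passes the check. */
static const mapped_node_t *find_bad_launch_id(const mapped_node_t *head)
{
    for (const mapped_node_t *n = head; n != NULL; n = n->next) {
        if (n->launch_id < 0) {
            return n;
        }
    }
    return NULL;
}

/* Report the first offending node on stderr and return -1,
 * or return 0 when all launch ids are non-negative. */
static int check_launch_ids(const mapped_node_t *head)
{
    const mapped_node_t *bad = find_bad_launch_id(head);
    if (bad != NULL) {
        fprintf(stderr, "invalid launch_id %d on node %s\n",
                bad->launch_id, bad->nodename);
        return -1;
    }
    return 0;
}
```

Stopping at the first bad node mirrors the `goto cleanup` in the hunk: the launch is abandoned as soon as one invalid id is seen, so later nodes are not reported.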