1
1

orte/pmix: Do not set orted exit status to one from proc abort

The fact that application proc called Abort (read failed) doesn't
mean that ORTE subsystem has failed - vice versa it does it's work
to gracefuly exit the whole application.

orted exiting with non-zero status creates a problem for at least
plm/slurm environments where orteds are launched via `srun` with
"--kill-on-bad-exit" flag. If one of orteds has exited with non-
zero status slurm will immediately kill all other orteds. As the
result we see a lot of leftover in the `/tmp` directory.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
Этот коммит содержится в:
Artem Polyakov 2017-04-12 22:23:20 +07:00
родитель 539f71d0cc
Коммит 4af7a0827f

@ -14,7 +14,7 @@
* Copyright (c) 2009 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2011 Oak Ridge National Labs. All rights reserved.
* Copyright (c) 2013-2017 Intel, Inc. All rights reserved.
* Copyright (c) 2014 Mellanox Technologies, Inc.
* Copyright (c) 2014-2017 Mellanox Technologies, Inc.
* All rights reserved.
* Copyright (c) 2014 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
@ -190,8 +190,6 @@ static void _client_abort(int sd, short args, void *cbdata)
ORTE_ACTIVATE_PROC_STATE(&p->name, ORTE_PROC_STATE_CALLED_ABORT);
}
ORTE_UPDATE_EXIT_STATUS(cd->status);
/* release the caller */
if (NULL != cd->cbfunc) {
cd->cbfunc(OPAL_SUCCESS, cd->cbdata);