openmpi/orte/orted/pmix
Artem Polyakov 4af7a0827f orte/pmix: Do not set orted exit status to one from proc abort
The fact that an application proc called Abort (read: failed) doesn't
mean that the ORTE subsystem has failed. On the contrary, it is doing
its work to gracefully exit the whole application.

An orted exiting with non-zero status creates a problem for at least
plm/slurm environments, where orteds are launched via `srun` with the
"--kill-on-bad-exit" flag. If one of the orteds exits with non-zero
status, slurm immediately kills all the other orteds. As a result, we
see a lot of leftovers in the `/tmp` directory.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-04-13 01:37:36 +07:00
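
The idea in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual ORTE code: the `daemon_state_t` type and the functions `daemon_state_init`, `on_proc_abort`, and `on_daemon_failure` are hypothetical names invented for this example. It assumes the daemon tracks the application's abort status separately from its own exit status, so that an application abort alone never makes the daemon exit non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; none of these names come from ORTE. */
typedef struct {
    bool app_aborted;        /* true once an application proc called abort */
    int  app_abort_status;   /* status the application proc passed to abort */
    int  daemon_exit_status; /* status the daemon returns to its launcher */
} daemon_state_t;

/* Start with a clean state: no abort seen, daemon exit status 0. */
void daemon_state_init(daemon_state_t *st)
{
    assert(st != NULL);
    st->app_aborted = false;
    st->app_abort_status = 0;
    st->daemon_exit_status = 0;
}

/* An application proc aborted: record it, but leave the daemon's own
 * exit status untouched, because the daemon itself has not failed. */
void on_proc_abort(daemon_state_t *st, int status)
{
    assert(st != NULL);
    st->app_aborted = true;
    st->app_abort_status = status;
}

/* The daemon itself hit an error: only this path changes the status
 * that the launcher (for example srun) will observe. */
void on_daemon_failure(daemon_state_t *st, int status)
{
    assert(st != NULL);
    st->daemon_exit_status = status;
}
```

With this split, a launcher that reacts to non-zero daemon exit codes would only be triggered by `on_daemon_failure`, leaving the remaining daemons time to clean up after an application abort.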
Makefile.am Integrate PMIx 1.0 with OMPI. 2015-08-29 16:04:10 -07:00
pmix_server_dyn.c Provide further (hopefully) helpful messages about the hotel size 2017-04-05 04:27:32 -07:00
pmix_server_fence.c Provide further (hopefully) helpful messages about the hotel size 2017-04-05 04:27:32 -07:00
pmix_server_gen.c orte/pmix: Do not set orted exit status to one from proc abort 2017-04-13 01:37:36 +07:00
pmix_server_internal.h Resolve the direct modex race condition. The request hotel was running out of rooms, thereby returning an error upon checkin - and we had missed error_logging a couple of those places. Hence no error message and things just hung. 2017-04-04 21:32:44 -07:00
pmix_server_pub.c Provide further (hopefully) helpful messages about the hotel size 2017-04-05 04:27:32 -07:00
pmix_server_register_fns.c Fix comm_spawn by registering nspace info only when needed - either when we have local procs, or when job-level info is required by connecting jobs 2017-02-14 19:47:56 -08:00
pmix_server.c Provide further (hopefully) helpful messages about the hotel size 2017-04-05 04:27:32 -07:00
pmix_server.h Fix comm_spawn by registering nspace info only when needed - either when we have local procs, or when job-level info is required by connecting jobs 2017-02-14 19:47:56 -08:00