openmpi/orte/mca
Orivej Desh 667fe3f3f3 Fix oob_tcp tcp_component_close segfault with active listeners
oob_tcp in non-HNP mode shares libevent event_base with oob_base [1].
orte_oob_base_close calls:
(1) oob_tcp component_shutdown, then
(2) opal_progress_thread_finalize, then
(3) oob_tcp tcp_component_close [2].
opal_progress_thread_finalize calls tracker_destructor [3], which frees the
event_base [4]. If any oob_tcp event listeners are still active at that point,
tcp_component_close crashes when it tries to delete them [5] [6].

This change moves oob_tcp event listener cleanup from component_close to
component_shutdown so that it happens before the event_base is freed.
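
The ordering problem can be illustrated with a small, self-contained C model.
This is only a sketch: mock_event_base_t, mock_listener_t, listener_delete,
progress_thread_finalize and simulate_oob_close are hypothetical stand-ins
invented for illustration, not Open MPI or libevent functions, and the "freed"
flag merely simulates a released event_base so the unsafe access can be counted
instead of crashing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the Open MPI or libevent API. */
typedef struct {
    bool freed;          /* set once the base has been "released" */
    int  active_events;  /* listener events still registered on the base */
} mock_event_base_t;

typedef struct {
    mock_event_base_t *base;  /* base shared with the progress thread */
    bool registered;          /* true while the listener event is active */
} mock_listener_t;

/* Delete a listener event. Returns false when the base is already freed,
 * which models the use-after-free that the real code would hit. */
static bool listener_delete(mock_listener_t *l)
{
    if (!l->registered) {
        return true;               /* nothing to do */
    }
    if (l->base->freed) {
        return false;              /* real code: touches freed memory */
    }
    l->base->active_events--;
    l->registered = false;
    return true;
}

/* Models step (2): the progress thread teardown releases the shared base. */
static void progress_thread_finalize(mock_event_base_t *b)
{
    b->freed = true;
}

/* Run the three-step close sequence from the commit message.
 * cleanup_in_shutdown == true  models the behavior after this change,
 * cleanup_in_shutdown == false models the behavior before it.
 * Returns the number of unsafe listener deletions. */
int simulate_oob_close(bool cleanup_in_shutdown)
{
    mock_event_base_t base = { false, 1 };
    mock_listener_t listener = { &base, true };
    int unsafe = 0;

    /* (1) component_shutdown */
    if (cleanup_in_shutdown && !listener_delete(&listener)) {
        unsafe++;
    }

    /* (2) opal_progress_thread_finalize */
    progress_thread_finalize(&base);

    /* (3) tcp_component_close */
    if (!cleanup_in_shutdown && !listener_delete(&listener)) {
        unsafe++;
    }

    return unsafe;
}
```

In this model, simulate_oob_close(false) reports one unsafe deletion because
the listener is removed only after the base is gone, while
simulate_oob_close(true) reports none because cleanup runs in step (1).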

[1] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/tcp/oob_tcp_listener.c#L160
[2] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/base/oob_base_frame.c#L95
[3] https://github.com/open-mpi/ompi/blob/v4.0.1/opal/runtime/opal_progress_threads.c#L232
[4] https://github.com/open-mpi/ompi/blob/v4.0.1/opal/runtime/opal_progress_threads.c#L65
[5] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/tcp/oob_tcp_component.c#L192
[6] https://github.com/open-mpi/ompi/blob/v4.0.1/orte/mca/oob/tcp/oob_tcp_listener.c#L955

Signed-off-by: Orivej Desh <orivej@gmx.fr>
(cherry picked from commit 78b7e342bd)
2019-07-08 15:14:43 -07:00
Name | Last commit | Date
common | pmix/cray: fix disable-dlopen problem | 2016-11-21 13:45:10 -06:00
errmgr | Remove the stale orte-dvm code | 2018-10-30 07:54:35 -07:00
ess | Merge pull request #6550 from rhc54/cmr402/clnup | 2019-04-09 10:13:15 -05:00
filem | mca: Dynamic components link against project lib | 2017-08-24 11:56:16 -04:00
grpcomm | Update orte | 2018-01-25 08:53:43 -08:00
iof | Update ORTE to support PMIx v3 | 2018-03-02 02:00:31 -08:00
odls | Remove stale ORTE code | 2019-03-31 11:26:18 -07:00
oob | Fix oob_tcp tcp_component_close segfault with active listeners | 2019-07-08 15:14:43 -07:00
plm | plm_slurm_module: adjust for new SLURM CLI options | 2019-05-16 09:13:28 -07:00
ras | orte: only set the ORTE_NODE_ALIAS attribute when required | 2018-04-25 11:43:46 +09:00
regx | regx/base: fix an integer overflow | 2019-06-06 14:37:33 +09:00
rmaps | Correct parsing of ppr directives | 2019-01-29 11:34:44 -07:00
rml | rml/ofi: remove | 2019-02-19 10:27:47 -07:00
routed | Fix tree spawn at scale | 2019-06-04 09:49:01 -07:00
rtc | orte-rmaps-base: update out-of-slots show_help message | 2018-11-08 16:03:28 -05:00
schizo | orterun: use consistent CLI option name for --bind-to | 2018-06-21 08:22:00 -07:00
snapc | pmix: added check for pmix fence status | 2018-08-17 21:33:50 +06:00
sstore | sstore/stage: fix parameter handling in sstore_stage_local_compress_waitpid_cb() | 2018-01-04 09:33:46 +09:00
state | Fix cross-mpirun connect/accept operations | 2019-03-01 08:41:23 -08:00
Makefile.am | Purge whitespace from the repo | 2015-06-23 20:59:57 -07:00
mca.h | Purge whitespace from the repo | 2015-06-23 20:59:57 -07:00