openmpi/opal/mca
Nathan Hjelm 707d35deeb btl/uct: fix deadlock in connection code
This commit fixes a deadlock that can occur when using a TL that
supports the connect-to-endpoint model. The deadlock occurred while
processing incoming connection requests, which was done from an
active-message callback. For some reason (unknown at this time)
this callback sometimes hung. To avoid the issue, the connection
active-message is now saved for later processing.
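
The deferral described above can be sketched roughly as follows. This is
a minimal, single-threaded illustration and not the btl/uct code: the
names pending_conn_t, conn_am_callback, process_connection, and
progress_pending_connections are hypothetical, as is the 64-byte payload
limit. The callback only copies and queues the request; the work happens
later from the normal progress path. A real implementation would also
need locking around the queue.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for a saved connection request (illustration only). */
typedef struct pending_conn {
    struct pending_conn *next;
    size_t length;
    unsigned char payload[64];
} pending_conn_t;

static pending_conn_t *pending_head = NULL;
static pending_conn_t **pending_tail = &pending_head;
static int processed_count = 0;

/* Runs inside the active-message callback: copy the message and queue it.
 * No connection handling is done here, so the callback returns quickly. */
static int conn_am_callback(const void *data, size_t length)
{
    pending_conn_t *item;

    if (length > sizeof(item->payload)) {
        return -1;
    }
    item = malloc(sizeof(*item));
    if (NULL == item) {
        return -1;
    }
    item->next = NULL;
    item->length = length;
    memcpy(item->payload, data, length);

    /* Append to the tail to keep requests in arrival order. */
    *pending_tail = item;
    pending_tail = &item->next;
    return 0;
}

/* Stand-in for the real connection handling; here it only counts calls. */
static void process_connection(const pending_conn_t *item)
{
    (void) item;
    ++processed_count;
}

/* Runs from the regular progress loop, outside of any callback.
 * Returns the number of saved requests that were handled. */
static int progress_pending_connections(void)
{
    int count = 0;

    while (NULL != pending_head) {
        pending_conn_t *item = pending_head;

        pending_head = item->next;
        if (NULL == pending_head) {
            pending_tail = &pending_head;
        }
        process_connection(item);
        free(item);
        ++count;
    }
    return count;
}
```

The design choice is to keep the callback as small as possible: it does
no work that could block or re-enter the transport, and the progress
function drains the queue in arrival order.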

At the same time I cleaned up the connection code to eliminate
duplicate messages when possible.

This commit also fixes some bugs in the active-message send path:

 - Correctly set all fragment fields in prepare_src.

 - Fix bug when using buffered-send. We were not reading the return
   code correctly (which is in bytes). This resulted in a message
   getting sent multiple times.

 - Don't try to progress sends from the btl_send function when in an
   active-message callback. It could lead to deep recursion and an
   eventual crash if we get a trace like
   send->progress->am_complete->ob1_callback->send->am_complete...
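
The last two items can be illustrated with a small sketch. It assumes a
buffered-send primitive that returns the number of bytes packed on
success and a negative value on error. One plausible way to get the
duplicate-send bug is to compare that value against a zero success code,
so that a positive byte count looks like a failure and triggers a retry.
All names here (module_state_t, callback_depth, bcopy_send_ok,
send_fragment) are hypothetical and are not taken from the btl/uct
source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical per-module state (illustration only). */
typedef struct {
    int callback_depth; /* > 0 while inside an active-message callback */
    int progress_calls; /* how many times sends were progressed */
} module_state_t;

/* With a bytes-or-negative-error convention, any non-negative value
 * means the message was handed to the transport. */
static bool bcopy_send_ok(ssize_t rc)
{
    return rc >= 0;
}

/* Stand-in for progressing outstanding sends; here it only counts calls. */
static void progress_sends(module_state_t *module)
{
    ++module->progress_calls;
}

/* bcopy_rc is the value a buffered send would have returned.
 * Returns 1 if the fragment was sent, 0 if the caller should queue it. */
static int send_fragment(module_state_t *module, ssize_t bcopy_rc)
{
    if (bcopy_send_ok(bcopy_rc)) {
        return 1; /* sent exactly once; do not retry */
    }

    /* Only progress when not inside an active-message callback, which
     * avoids send->progress->am_complete->callback->send recursion. */
    if (0 == module->callback_depth) {
        progress_sends(module);
    }
    return 0;
}
```

In this sketch a caller inside a callback simply gets 0 back and is
expected to queue the fragment, leaving progress to the outer loop.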

Closes #5820
Closes #5821

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-16 18:28:47 -06:00
..
allocator mca: Dynamic components link against project lib 2017-08-24 11:56:16 -04:00
backtrace stacktrace: Add flexibility in stacktrace output 2017-01-26 11:55:32 -06:00
base Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
btl btl/uct: fix deadlock in connection code 2018-10-16 18:28:47 -06:00
common Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
compress Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
crs Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
dl Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
event Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
hwloc Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
if Merge pull request #5786 from jsquyres/pr/string-madness 2018-10-04 16:12:46 -05:00
installdirs Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
memchecker mca: Dynamic components link against project lib 2017-08-24 11:56:16 -04:00
memcpy Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
memory opal: Disable memory patcher component on MacOS 2018-10-02 13:35:15 -04:00
mpool Handle asprintf errors with opal_asprintf wrapper 2018-10-08 16:43:53 -07:00
patcher misc: compiler warning fixes 2018-09-15 06:04:13 -07:00
pmix pmix/ext3x: fix minor typos 2018-10-10 13:18:36 +09:00
pstat opal: convert from strncpy() -> opal_string_copy() 2018-09-27 11:56:18 -07:00
rcache opal: convert from strncpy() -> opal_string_copy() 2018-09-27 11:56:18 -07:00
reachable opal: convert from strncpy() -> opal_string_copy() 2018-09-27 11:56:18 -07:00
shmem opal: convert from strncpy() -> opal_string_copy() 2018-09-27 11:56:18 -07:00
timer Get x86 TSC frequency from bogomips 2017-07-12 17:31:25 -03:00
Makefile.am Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
mca.h mca/base: enforce max string lengths 2018-09-05 08:42:00 -07:00