openmpi/opal/mca/rcache
Nathan Hjelm 14b6f4931f rcache/grdma: fix potential deadlock
This commit fixes a potential deadlock that can occur between the
memory hooks and region registration. The deadlock is caused by a
hold-and-wait error between two locks: the VMA lock, which protects
internal rcache/grdma structures, and the reader/writer lock in the
interval tree.

In the case of the memory hooks, a reader lock is obtained on the
interval tree, then the VMA lock is obtained to remove the
registration from the LRU. In the case of LRU evictions, the VMA
lock is obtained, then the writer lock on the interval tree is
obtained. Because the two paths take the locks in opposite order,
each can hold one lock while waiting for the other. This leads to
the deadlock.

To fix the issue, the code that evicts from the LRU has been
updated to only invalidate the registration while the VMA lock
is held, then remove the registration from the VMA after the
lock is released. This should completely eliminate the above
deadlock.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-02-10 17:15:36 -07:00
..
base         Remove warnings identified by clang.                2018-04-14 17:14:12 -04:00
gpusm        mca: Dynamic components link against project lib    2017-08-24 11:56:16 -04:00
grdma        rcache/grdma: fix potential deadlock                2020-02-10 17:15:36 -07:00
rgpusm       mca: Dynamic components link against project lib    2017-08-24 11:56:16 -04:00
udreg        opal: convert from strncpy() -> opal_string_copy()  2018-09-27 11:56:18 -07:00
Makefile.am  Purge whitespace from the repo                      2015-06-23 20:59:57 -07:00
rcache.h     opal: add types for atomic variables                2018-09-14 10:48:55 -06:00