openmpi/opal/mca/rcache/grdma
Nathan Hjelm 14b6f4931f rcache/grdma: fix potential deadlock
This commit fixes a potential deadlock that can occur between the
memory hooks and region registration. The deadlock is caused by a
hold-and-wait error between two locks: the VMA lock used to protect
internal rcache/grdma structures and the reader/writer lock in the
interval tree.

In the case of the memory hooks, a reader lock is obtained on the
interval tree, then the VMA lock is obtained to remove the
registration from the LRU. In the case of LRU evictions, the VMA
lock is obtained, then the writer lock on the interval tree is
obtained. Because the two paths take the locks in opposite order,
each can end up holding the lock the other is waiting for.
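
The lock-order inversion described above can be modeled in a few lines of
C. This is only an illustrative sketch: the names lock_id, lock_order,
orders_conflict, memory_hook_path and lru_evict_path are hypothetical and
do not appear in rcache/grdma. Each path is reduced to the order in which
it takes its two locks, and the check reports whether two paths can block
each other.

```c
#include <stdbool.h>

/* Hypothetical lock identifiers; not the names used in rcache/grdma. */
enum lock_id { LOCK_TREE, LOCK_VMA };

/* The order in which one code path acquires its two locks. */
struct lock_order {
    enum lock_id first;
    enum lock_id second;
};

/* Memory hooks: reader lock on the interval tree, then the VMA lock. */
static const struct lock_order memory_hook_path = { LOCK_TREE, LOCK_VMA };

/* LRU eviction before the fix: VMA lock, then writer lock on the tree. */
static const struct lock_order lru_evict_path = { LOCK_VMA, LOCK_TREE };

/* Two paths can deadlock when each one holds the lock the other wants next. */
static bool orders_conflict(struct lock_order a, struct lock_order b)
{
    return a.first != a.second &&
           a.first == b.second &&
           a.second == b.first;
}
```

Calling orders_conflict(memory_hook_path, lru_evict_path) reports a
conflict, while two paths that take the locks in the same order do not.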

To fix the issue, the code that evicts from the LRU has been
updated to only invalidate the registration while the VMA lock
is held, then remove the registration from the VMA after the
lock is released. This should completely eliminate the above
deadlock.
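
The two-phase eviction described above might look roughly like the
following. This is a minimal sketch, not the code from
rcache_grdma_module.c: struct reg, evict_lru, MAX_REGS and the lock
helpers are hypothetical, and the locks are replaced by plain flags so
that the ordering rule can be checked with assertions in a
single-threaded run. The sketch assumes that "the VMA" in the message
refers to the interval tree that stores the registrations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REGS 8

/* Hypothetical, simplified registration; not the real grdma structure. */
struct reg {
    bool in_lru;   /* linked on the LRU list */
    bool invalid;  /* marked for removal */
    bool in_tree;  /* still present in the interval tree */
};

/* Flags standing in for the real locks so the ordering can be asserted. */
static bool vma_lock_held;
static bool tree_write_held;

static void vma_lock(void)   { assert(!vma_lock_held); vma_lock_held = true; }
static void vma_unlock(void) { assert(vma_lock_held);  vma_lock_held = false; }

static void tree_write_lock(void)
{
    /* Rule behind the fix: never wait on the tree while holding the VMA lock. */
    assert(!vma_lock_held);
    assert(!tree_write_held);
    tree_write_held = true;
}

static void tree_write_unlock(void)
{
    assert(tree_write_held);
    tree_write_held = false;
}

/* Evict up to MAX_REGS LRU entries; returns how many left the tree. */
static size_t evict_lru(struct reg *regs, size_t n)
{
    struct reg *pending[MAX_REGS];
    size_t count = 0;

    /* Phase 1: under the VMA lock, only invalidate and unlink from the LRU. */
    vma_lock();
    for (size_t i = 0; i < n && count < MAX_REGS; ++i) {
        if (regs[i].in_lru) {
            regs[i].in_lru = false;
            regs[i].invalid = true;
            pending[count++] = &regs[i];
        }
    }
    vma_unlock();

    /* Phase 2: with the VMA lock released, take the tree lock and remove. */
    tree_write_lock();
    for (size_t i = 0; i < count; ++i) {
        pending[i]->in_tree = false;
    }
    tree_write_unlock();

    return count;
}
```

Because the eviction path no longer holds the VMA lock while it waits for
the tree lock, it cannot block a memory hook that holds the tree lock and
is waiting for the VMA lock.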

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-02-10 17:15:36 -07:00
Makefile.am               mca: Dynamic components link against project lib          2017-08-24 11:56:16 -04:00
owner.txt                 opal: rework mpool and rcache frameworks                  2016-03-14 10:50:41 -06:00
rcache_grdma_component.c  rcache/grdma: fix typo                                    2016-03-16 18:30:44 -06:00
rcache_grdma_module.c     rcache/grdma: fix potential deadlock                      2020-02-10 17:15:36 -07:00
rcache_grdma.h            rcache/base: update VMA tree to use opal_interval_tree_t  2018-02-26 13:35:56 -07:00