This commit fixes a race condition discovered by @artpol84. The race
happens when a signalling thread decrements the sync count to 0 then
goes to sleep. If the waiting thread runs and detects the count == 0
before going to sleep on the condition variable it will destroy the
condition variable while the signalling thread is potentially still
processing the completion. The fix is to add a non-atomic member to
the sync structure that indicates another process is handling
completion. Since the member will only be set to false by the
initiating thread and the completing thread the variable does not need
to be protected. When destoying a condition variable the waiting
thread needs to wait until the singalling thread is finished.
Thanks to @artpol84 for tracking this down.
Fixes#1813
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds another check to the low-priority callback
conditional that short-circuits the atomic-add if there are no
low-priority callbacks. This should improve performance in the common
case.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The OPAL_ENABLE_MULTI_THREADS macro is always defined as 1. This was
causing us to always use the multi-thread path for synchronization
objects. The code has been updated to use the opal_using_threads()
function. When MPI_THREAD_MULTIPLE support is disabled at build time
(2.x only) this function is a macro evaluating to false so the
compiler will optimize out the MT-path in this case. The
OPAL_ATOMIC_ADD_32 macro has been removed and replaced by the existing
OPAL_THREAD_ADD32 macro.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The way the gni btl is currently coded,
it will run completely out of gas on KNL at
123 processes/node. Since there are bound to be
those who try to run a MPI process/hyperthread
on KNL nodes, the fma sharing mode needs to be requested.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This commit fixes a compile error on 32-bit platforms. The
low-priority call counter was always using 64-bit atomics which will
not work if 64-bit atomic math is not available. Updated to use 32-bit
instead.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Add PMIx 2.0
Remove PMIx 1.1.4
Cleanup copying of component
Add missing file
Touchup a typo in the Makefile.am
Update the pmix ext114 component
Minor cleanups and resync to master
Update to latest PMIx 2.x
Update to the PMIx event notification branch latest changes
Per discussion on https://github.com/open-mpi/ompi/pull/1767 (and some
subsequent phone calls and off-issue email discussions), the PSM
library is hijacking signal handlers by default. Specifically: unless
the environment variables `IPATH_NO_BACKTRACE=1` (for PSM / Intel
TrueScale) is set, the library constructor for this library will
hijack various signal handlers for the purpose of invoking its own
error reporting mechanisms.
This may be a bit *surprising*, but is not a *problem*, per se. The
real problem is that older versions of at least the PSM library do not
unregister these signal handlers upon being unloaded from memory.
Hence, a segv can actually result in a double segv (i.e., the original
segv and then another segv when the now-non-existent signal handler is
invoked).
This PSM signal hijacking subverts Open MPI's own signal reporting
mechanism, which may be a bit surprising for some users (particularly
those who do not have Intel TrueScale). As such, we disable it by
default so that Open MPI's own error-reporting mechanisms are used.
Additionally, there is a typo in the library destructor for the PSM2
library that may cause problems in the unloading of its signal
handlers. This problem can be avoided by setting `HFI_NO_BACKTRACE=1`
(for PSM2 / Intel OmniPath).
This is further compounded by the fact that the PSM / PSM2 libraries
can be loaded by the OFI MTL and the usNIC BTL (because they are
loaded by libfabric), even when there is no Intel networking hardware
present. Having the PSM/PSM2 libraries behave this way when no Intel
hardware is present is clearly undesirable (and is likely to be fixed
in future releases of the PSM/PSM2 libraries).
This commit sets the following two environment variables to disable
this behavior from the PSM/PSM2 libraries (if they are not already
set):
* IPATH_NO_BACKTRACE=1
* HFI_NO_BACKTRACE=1
If the user has set these variables before invoking Open MPI, we will
not override their values (i.e., their preferences will be honored).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit adds the opal_atomic_swap_32 and opal_atomic_swap_64
functions. This should improve the performance of btl/vader.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Before dynamic add_procs the openib_btl_size_queues was called exactly
once for non-dynamic jobs. Now the function is called on each new
connection so the calculation was wrong. Re-wrote the function to
correctly calculate the CQ size and only attempt to adjust the CQ if
the requested size has changed. This fixes a bug when using the openib
btl on psm2 hardware that is caused by the time needed to resize a
CQ. The overhead was causing udcm to timeout and fail.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit improves the CMA detection when the installed glibc doesn't
have support for CMA. In this case we need to verify that the syscall
numbers in opal/include/opal/sys/cma.h are valid for the architecture.
This verification is done by attempting to use CMA while including the
internal header.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Compiler implementations are free to include support for atomics that
use locks. Unfortunately lock-free and lock atomics do not mix. Older
versions of llvm on OS X use locks to provide
__atomic_compare_exchange on 128-bit values but are lock-free on
64-bit values. This screws up our lifo implementation which mixes
64-bit and 128-bit atomics on the same values to improve
performance. This commit adds a configure-time check if 128-bit
atomics are lock free. If they are not then the 128-bit __atomic CAS
is disabled and we check for the __sync version as a fallback.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
clang 7.0 with the picky option on is extremely verbose, and complains
about almost everything. Trying to make him happy, at least regarding
the datatype engine.