The current VMA cache implementation backing rcache/grdma can run into
a deadlock situation in multi-threaded code when madvise is hooked and
the c library uses locks. In this case we may run into the following
situation:
Thread 1:
...
free () <- Holding libc lock
madvice_hook ()
vma_iteration () <- Blocked waiting for vma lock
Thread 2:
...
vma_insert () <- Holding vma lock
vma_item_new ()
malloc () <- Blocked waiting for libc lock
To fix this problem we chose to remove the madvise () hook but that
fix is causing issue #3685. This commit aims to greatly reduce the
chance that the deadlock will be hit by putting vma items into a free
list. This moves the allocation outside the vma lock. In general there
are a relatively small number of vma items so the default is to
allocate 2048 vma items. This default is configurable but it is likely
the number is too large not too small.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Every modern compiler supports either inline assembly or builtin atomic
operations. Because of this it is time to delete all the code associated
with pre-built atomics.
This commit also clean out the DEC and XLC asm checks. Neither check
does anything and the XLC compiler supports GCC ASM.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Ignore errors caused by remote side having exited when closing CUDA IPC mappings.
openmpi/ompi#3244
Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com>
Update to support passing of HWLOC shmem topology to client procs
Update use of distance API per @bgoglin
Have the openib component lookup its object in the distance matrix
Bring usnic up-to-date
Restore binding for hwloc2
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
All symbols that need to be accessed from a MCA component must be marked
explicitly as visible using PMIX_EXPORT. This patch allows current trunk
to almost work on OsX. More on the devel mailing list.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
passed to make it all flow thru the opal/pmix "put/get" operations. Update the PMIx code to latest master to pickup some required behaviors.
Remove the no-longer-required get_contact_info and set_contact_info from the RML layer.
Add an MCA param to allow the ofi/rml component to route messages if desired. This is mainly for experimentation at this point as we aren't sure if routing wi
ll be beneficial at large scales. Leave it "off" by default.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
As part of the process for addressing removal of CR/FT related
code from master (and hence from the 3.0.0 release), it was agreed
at the OMPI devel F2F on 7/13/17 that we'd break this in to two
pieces:
1) remove the configure arguments (fewer changes)
2) remove all the CR/FT code, etc. in a subsequent bigger commit
that may not make it in to 3.0.0 in time.
By doing 1), the available configure options would not change
in a subsequent 3.0.x release if we end up not being able to do 2)
before 3.0.0 is released.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
If a user explicitly asks for the "sm" BTL, print a show_help message
saying that the SM BTL is dead, and the user should be using "vader".
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>