This commit changes the opal_fifo_push code to use
opal_update_counted_pointer to set the head. This fixes a data race
that occurs because the read of the fifo head in opal_fifo_pop
requires two instructions. This combined with the non-atomic update in
opal_fifo_push can lead to an ABA issue that puts the fifo in an
inconsistant state.
There are other ways this problem could be fixed. One way would be to
introduce an opal_atomic_read_128 implementation. On x86_64 this would
have to use the cmpxchg16b instruction. Since this instruction would
have to be in the pop path (and always executed) it would be slower
than the fix in this commit.
Closesopen-mpi/ompi#1460.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1356358: Null pointer dereferences (REVERSE_INULL):
flist->fl_mpool can no longer be NULL. Removed the conditional.
Fix CID 1356357: Resource leaks (RESOURCE_LEAK):
Added the call to free the hints array.
Fix CID 1356356: Resource leaks (RESOURCE_LEAK):
This is a false error but it is safe to call close (-1) so just always
call close.
Fix CID 1356354: Control flow issues (MISSING_BREAK):
Fix CID 1356353: Control flow issues (MISSING_BREAK):
Add comments that indicate the fall-through is intentional.
Fix CID 1356351: Null pointer dereferences (FORWARD_NULL):
Fix potential SEGV if the page_size key is malformed.
Fix CID 1356350: Error handling issues (CHECKED_RETURN):
Add (void) to indicate that we do not care about the return code of
sscanf in this case.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit removes an erroneous else statement from the OSX built-in
atomics check. The else branch sets the built-in atomics support to
BUILTIN_NO if either opal_cv_asm_builtin is not BUILTIN_NO or OSX
atomics support is disabled.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
+ Added an mca parameter to allow connecting processes from different
subnets. Its current default value is 'false' - don't allow, to keep the
current flow the way it is now.
+ rmdacm: when calling ibv_query_gid, use the gid index from
btl_openib_gid_index.
After 11 years, it's probably ok to say that we're no longer in "early
development" -- disable the "build a debug version of Open MPI by
default if we find a .git directory" behavior.
However, we are keeping the "use compiler picky flags if we find a
.git directory" behavior. That's useful behavior for developers, and
has no effect on performance.
This means we need not check for jemalloc in the configure script for
this component. Removing this.
In some machines having the TLS option on can cause errors in
opening this component. --disable-tls while configuring jemalloc.
Please look for instructions for installing jemalloc as a static
library linked directly into memkind in CONTRIBUTING file
github.com/memkind/memkindw
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:
- Before this change a significant portion of the rcache
functionality lived in mpool components. This meant that it was
impossible to add a new memory pool to use with rdma networks
(ugni, openib, etc) without duplicating the functionality of an
existing mpool component. All the registration functionality has
been removed from the mpool and placed in the rcache framework.
- All registration cache mpools components (udreg, grdma, gpusm,
rgpusm) have been changed to rcache components. rcaches are
allocated and released in the same way mpool components were.
- It is now valid to pass NULL as the resources argument when
creating an rcache. At this time the gpusm and rgpusm components
support this. All other rcache components require non-NULL
resources.
- A new mpool component has been added: hugepage. This component
supports huge page allocations on linux.
- Memory pools are now allocated using "hints". Each mpool component
is queried with the hints and returns a priority. The current hints
supported are NULL (uses posix_memalign/malloc), page_size=x (huge
page mpool), and mpool=x.
- The sm mpool has been moved to common/sm. This reflects that the sm
mpool is specialized and not meant for any general
allocations. This mpool may be moved back into the mpool framework
if there is any objection.
- The opal_free_list_init arguments have been updated. The unused0
argument is not used to pass in the registration cache module. The
mpool registration flags are now rcache registration flags.
- All components have been updated to make use of the new framework
interfaces.
As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Add primitive magic number and version checking in the connectivity
checker protocol. These checks doesn't *guarantee* to we won't get
false PINGs and ACKs, but they do significantly reduce the possibility
of interpretating random incoming fragments as PINGs or ACKs.