Commit graph

240 Commits

Author SHA1 Message Date
Nathan Hjelm
eaa98af52c opal/free_list: fix race condition
There was a race condition in opal_free_list_get. Code throughout the
Open MPI codebase was assuming that a NULL return from this function
was due to an out-of-memory condition. In some cases this can lead to
a fatal condition (MPI_Irecv and MPI_Isend in pml/ob1 for
example). Before this commit opal_free_list_get_mt looked like this:

```c
static inline opal_free_list_item_t *opal_free_list_get_mt (opal_free_list_t *flist)
{
    opal_free_list_item_t *item =
        (opal_free_list_item_t*) opal_lifo_pop_atomic (&flist->super);

    if (OPAL_UNLIKELY(NULL == item)) {
        opal_mutex_lock (&flist->fl_lock);
        opal_free_list_grow_st (flist, flist->fl_num_per_alloc);
        opal_mutex_unlock (&flist->fl_lock);
        item = (opal_free_list_item_t *) opal_lifo_pop_atomic (&flist->super);
    }

    return item;
}
```

The problem is that in a multithreaded environment it *is* possible for the
free list to be grown successfully but the thread calling
opal_free_list_get_mt to be left without an item. This happens if,
between the calls to opal_lifo_push_atomic in opal_free_list_grow_st
and the call to opal_lifo_pop_atomic, other threads pop all the items
added to the free list.

This commit fixes the issue by ensuring the thread that successfully
grew the free list **always** gets a free list item.
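
A minimal sketch of that idea, not the actual Open MPI patch: the grow routine takes an extra output parameter and reserves one freshly allocated item for the caller before the remaining items become visible to other threads. All names here (free_list_t, free_list_grow, item_out) are hypothetical, and a spinlock-protected stack stands in for the lock-free LIFO.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct item { struct item *next; } item_t;

typedef struct {
    atomic_flag stack_lock;   /* spinlock standing in for the lock-free LIFO */
    atomic_flag grow_lock;    /* plays the role of fl_lock */
    item_t *head;
    size_t num_per_alloc;
} free_list_t;

static void spin_lock(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire)) { /* spin */ }
}

static void spin_unlock(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

static void free_list_init(free_list_t *fl, size_t num_per_alloc)
{
    assert(num_per_alloc > 0);
    atomic_flag_clear(&fl->stack_lock);
    atomic_flag_clear(&fl->grow_lock);
    fl->head = NULL;
    fl->num_per_alloc = num_per_alloc;
}

static void stack_push(free_list_t *fl, item_t *it)
{
    spin_lock(&fl->stack_lock);
    it->next = fl->head;
    fl->head = it;
    spin_unlock(&fl->stack_lock);
}

static item_t *stack_pop(free_list_t *fl)
{
    spin_lock(&fl->stack_lock);
    item_t *it = fl->head;
    if (NULL != it) {
        fl->head = it->next;
    }
    spin_unlock(&fl->stack_lock);
    return it;
}

/* Allocate n items. The first one is handed to the caller through item_out
 * and is never pushed, so no other thread can take it. Returns -1 only if
 * not even one item could be allocated. */
static int free_list_grow(free_list_t *fl, size_t n, item_t **item_out)
{
    for (size_t i = 0; i < n; ++i) {
        item_t *it = malloc(sizeof *it);
        if (NULL == it) {
            return (i > 0) ? 0 : -1;
        }
        it->next = NULL;
        if (NULL != item_out && NULL == *item_out) {
            *item_out = it;          /* reserved for the growing thread */
        } else {
            stack_push(fl, it);      /* visible to everyone */
        }
    }
    return 0;
}

static item_t *free_list_get(free_list_t *fl)
{
    item_t *item = stack_pop(fl);
    if (NULL == item) {
        spin_lock(&fl->grow_lock);
        (void) free_list_grow(fl, fl->num_per_alloc, &item);
        spin_unlock(&fl->grow_lock);
    }
    return item;   /* NULL now means the allocation itself failed */
}
```

With this shape there is no second pop after the grow, so the window in which other threads could drain the list no longer affects the caller.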

Fixes #2921

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 5c770a7bec)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-16 15:28:20 -06:00
Nathan Hjelm
8b090103e2 opal/fifo: fix 128-bit atomic fifo on Power9
This commit updates the atomic fifo code to fix a consistency issue
observed on Power9 systems when builtin atomics are used. The cause
was two things: 1) a missing write memory barrier in fifo push, and 2)
a read ordering issue when reading the fifo head non-atomically. This
commit fixes both issues and appears to correct the inconsistency.
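
As a generic illustration of why the write barrier matters (this is not the opal_fifo code), the C11 sketch below publishes a node only after a release fence, and the reader issues an acquire fence before touching the payload. The names publish and consume are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct node {
    int payload;
    struct node *next;
} node_t;

/* Shared slot; static storage means it starts out as NULL. */
static _Atomic(node_t *) published;

static void publish(node_t *n, int value)
{
    n->payload = value;
    n->next = NULL;
    /* Write barrier: the stores above must be visible before the pointer is. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&published, n, memory_order_relaxed);
}

static int consume(int *value_out)
{
    node_t *n = atomic_load_explicit(&published, memory_order_relaxed);
    if (NULL == n) {
        return 0;
    }
    /* Read barrier: do not read the payload ahead of the pointer load. */
    atomic_thread_fence(memory_order_acquire);
    *value_out = n->payload;
    return 1;
}
```

On weakly ordered machines such as Power, dropping either fence would allow a reader to see the pointer but a stale payload.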

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-07-10 15:37:11 -06:00
Nathan Hjelm
f8dbf62879 opal/asm: change ll/sc atomics to macros
This commit fixes a hang that occurs with debug builds of Open MPI on
aarch64 and power/powerpc systems. When the ll/sc atomics are inline
functions the compiler emits load/store instructions for the function
arguments with -O0. These extra load/store instructions can cause the ll
reservation to be cancelled, causing live-lock.

Note that we did attempt to fix this with always_inline but the extra
instructions are still emitted by the compiler (gcc). There may be
another fix but this has been tested and is working well.

References #3697. Close when applied to v3.0.x and v3.1.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-05-31 19:45:19 -06:00
bosilca
4ebed21b6d
Merge pull request #4670 from ggouaillardet/topic/opal_bitmap
opal/bitmap: fix opal_bitmap_set_bit()
2018-05-29 10:50:21 -04:00
Nathan Hjelm
7f761d8434 opal_free_list: use lifo atomic functions in opal_free_list_wait_mt
This commit fixes a multi-threading bug when using the thread-safe
free list functions. opal_free_list_wait_mt() was using the
conditional version of opal_lifo_pop() and not the thread-safe call.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-26 10:16:42 -06:00
Nathan Hjelm
7163fc98a0 opal/class: add a new class: opal_interval_tree_t
This commit adds a new class to opal: opal_interval_tree_t. This is a
thread-safe implementation of a 1-dimensional interval tree. The data
structure is intended to provide a faster implementation of the
registration cache VMA tree.

The thread safety is provided by a relativistic red-black tree
implementation. This structure provides support for multiple readers
and a single writer. There is one caveat: an item may appear in the tree
twice while the tree is being updated. Care needs to be taken to avoid
issues associated with this "feature". I don't anticipate a problem
with the current VMA tree usage.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-26 13:35:56 -07:00
Gilles Gouaillardet
9121eb4ff9 opal/lifo: fix an ABA problem in opal_lifo_pop_atomic
that was introduced in open-mpi/ompi@11bb8b09a0

Fixes open-mpi/ompi#4784

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-02-09 14:48:54 +09:00
Gilles Gouaillardet
125169f057 opal/bitmap: fix opal_bitmap_set_bit()
Correctly reallocate the bitmap when needed
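
A minimal sketch of a set-bit routine that grows its storage on demand, assuming a byte-array bitmap. This is not the opal_bitmap implementation, and bitmap_t, bitmap_set_bit and bitmap_is_set are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char *bits;
    size_t size_bytes;
} bitmap_t;

/* Set a bit, growing the array if the bit lies past the current end. */
static int bitmap_set_bit(bitmap_t *bm, size_t bit)
{
    size_t byte = bit / CHAR_BIT;

    if (byte >= bm->size_bytes) {
        size_t new_size = (0 == bm->size_bytes) ? 1 : bm->size_bytes;
        while (new_size <= byte) {
            new_size *= 2;
        }
        unsigned char *tmp = realloc(bm->bits, new_size);
        if (NULL == tmp) {
            return -1;   /* the old buffer is still valid and unchanged */
        }
        /* Newly added bytes must start out cleared. */
        memset(tmp + bm->size_bytes, 0, new_size - bm->size_bytes);
        bm->bits = tmp;
        bm->size_bytes = new_size;
    }

    bm->bits[byte] |= (unsigned char) (1u << (bit % CHAR_BIT));
    return 0;
}

static int bitmap_is_set(const bitmap_t *bm, size_t bit)
{
    size_t byte = bit / CHAR_BIT;
    if (byte >= bm->size_bytes) {
        return 0;
    }
    return (bm->bits[byte] >> (bit % CHAR_BIT)) & 1;
}
```

The two details that are easy to get wrong are the size comparison (the new size must exceed the target byte index) and zeroing only the newly added tail.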

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-27 14:56:43 +09:00
Nathan Hjelm
7893248c5a opal/asm: add fetch-and-op atomics
This commit adds support for fetch-and-op atomics. This is needed
because AND and OR are irreversible operations, so there needs to be a
way to get the old value atomically. These are also the only semantics
supported by C11 (there is no atomic_op_fetch, just
atomic_fetch_op). The old op-and-fetch atomics have been defined in
terms of fetch-and-op.
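
To illustrate the relationship using C11 atomics directly (the wrapper names below are hypothetical and are not the opal_atomic_* functions), op-and-fetch can be derived from fetch-and-op by reapplying the operation to the returned old value:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* fetch-and-op: returns the value held BEFORE the operation. */
static inline uint32_t fetch_or_32(_Atomic uint32_t *addr, uint32_t value)
{
    return atomic_fetch_or(addr, value);
}

static inline uint32_t fetch_add_32(_Atomic uint32_t *addr, uint32_t value)
{
    return atomic_fetch_add(addr, value);
}

/* op-and-fetch: returns the value held AFTER the operation, computed by
 * applying the same operation to the old value. */
static inline uint32_t or_fetch_32(_Atomic uint32_t *addr, uint32_t value)
{
    return fetch_or_32(addr, value) | value;
}

static inline uint32_t add_fetch_32(_Atomic uint32_t *addr, uint32_t value)
{
    return fetch_add_32(addr, value) + value;
}
```

The reverse derivation works for addition (subtract the operand from the new value) but not for OR or AND, since the new value does not determine which bits were already set.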

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:23 -07:00
Nathan Hjelm
1282e98a01 opal/asm: rename existing arithmetic atomic functions
This commit renames the arithmetic atomic operations in opal to
indicate that they return the new value not the old value. This naming
differentiates these routines from new functions that return the old
value.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:22 -07:00
Nathan Hjelm
11bb8b09a0 opal/class: use new compare-and-swap functions
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-29 12:56:32 -07:00
Nathan Hjelm
84f63d0aca opal/asm: add opal_atomic_compare_exchange_strong functions
This commit adds a new set of compare-and-exchange functions. These
functions have a signature similar to the functions found in C11. The
old cmpset functions are now deprecated and defined in terms of the
new compare-and-exchange functions. All asm backends have been
updated.
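
A sketch of the two signatures side by side, written against C11 <stdatomic.h> rather than the opal asm backends. The wrapper names are hypothetical; only atomic_compare_exchange_strong is a real C11 call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* C11-style compare-and-exchange: on failure, *expected is overwritten with
 * the value currently stored at addr, so the caller can retry without
 * issuing a separate load. */
static inline bool compare_exchange_strong_32(_Atomic int32_t *addr,
                                              int32_t *expected,
                                              int32_t desired)
{
    return atomic_compare_exchange_strong(addr, expected, desired);
}

/* Old-style cmpset: takes the expected value by value and reports only
 * success or failure. It can be defined in terms of the function above. */
static inline bool bool_cmpset_32(_Atomic int32_t *addr,
                                  int32_t oldval, int32_t newval)
{
    return compare_exchange_strong_32(addr, &oldval, newval);
}
```

The pointer-to-expected form is what makes retry loops cheaper: a failed exchange already returns the fresh value.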

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-29 12:45:44 -07:00
Nathan Hjelm
3ff34af355 opal: rename opal_atomic_cmpset* to opal_atomic_bool_cmpset*
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preparation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-10-31 12:47:23 -06:00
Nathan Hjelm
76320a8ba5 opal: rename opal_atomic_init to opal_atomic_lock_init
This function is used to initialize an opal atomic lock. The old name
was confusing.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 14:15:11 -06:00
Nathan Hjelm
db973437e1 opal: fix coverity issues
Fixes coverity CIDs 1412984, and 1412983.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-23 08:15:34 -06:00
Nathan Hjelm
ffd8ee2dfd opal: use opal_list_t convenience macros
This commit cleans up code in opal to use OPAL_LIST_FOREACH(_SAFE),
OPAL_LIST_DESTRUCT, and OPAL_LIST_RELEASE.
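
For readers unfamiliar with this style of macro, here is a generic sketch of a foreach and a removal-safe foreach over a singly linked list. These are hypothetical macros written for illustration; they are not the OPAL_LIST_* definitions, which operate on opal_list_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node {
    struct node *next;
    int value;
} node_t;

/* Plain iteration: the body must not free or unlink the current node. */
#define LIST_FOREACH(it, head) \
    for ((it) = (head); NULL != (it); (it) = (it)->next)

/* Safe iteration: the successor is saved before the body runs, so the body
 * may free the current node. */
#define LIST_FOREACH_SAFE(it, nxt, head)                                  \
    for ((it) = (head), (nxt) = (NULL != (it)) ? (it)->next : NULL;       \
         NULL != (it);                                                    \
         (it) = (nxt), (nxt) = (NULL != (it)) ? (it)->next : NULL)

static node_t *list_prepend(node_t *head, int value)
{
    node_t *n = malloc(sizeof *n);
    assert(NULL != n);
    n->value = value;
    n->next = head;
    return n;
}

static int list_sum(node_t *head)
{
    node_t *it;
    int sum = 0;
    LIST_FOREACH(it, head) {
        sum += it->value;
    }
    return sum;
}

/* Frees every node and returns how many were freed. */
static size_t list_destroy(node_t *head)
{
    node_t *it, *nxt;
    size_t count = 0;
    LIST_FOREACH_SAFE(it, nxt, head) {
        free(it);
        ++count;
    }
    return count;
}
```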

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-20 12:37:12 -06:00
George Bosilca
ba46b35515
Don't assume a size for constants with UL and ULL.
According to Section 6.4.4.1 of the C standard, we do not need to prepend a type
to a constant to get the right size. The compiler will infer the type
according to the number of bits in the constant.
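
The rule being relied on is that an unsuffixed decimal constant takes the first of int, long, and long long that can represent its value. A small sketch using C11 _Generic makes this observable; the helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Evaluates to 1 if the expression has type int, 0 otherwise. */
#define IS_TYPE_INT(x) _Generic((x), int: 1, default: 0)

/* A small constant is an int. */
static int small_is_int(void)
{
    return IS_TYPE_INT(1);
}

/* 2^32 does not fit in a 32-bit int, so the compiler picks a wider type
 * (long or long long, depending on the platform) without any suffix. */
static size_t big_width_bits(void)
{
    return sizeof(4294967296) * CHAR_BIT;
}

/* The suffixed and unsuffixed spellings denote the same value. */
static int same_value(void)
{
    return 4294967296 == 4294967296LL;
}
```

This inference covers the constant itself, not expressions built from small constants: `1 << 40` still shifts an int.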

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-06-05 22:07:53 -04:00
Gilles Gouaillardet
3c6631ff6c opal: fix FIND_FIRST_ZERO macro for opal_pointer_array internal handling
Thanks George for the patch.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-05-10 14:57:44 +09:00
bosilca
d7ebcca93f Add volatile to the pointer in the list_item structure. (#3468)
This change has the side effect of improving the performance of all
atomic data structures (in addition to making the code correct under a
certain interpretation of the volatile usage).
This commit fixes #3450.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-05-09 10:12:20 -04:00
bosilca
872cf44c28 Improve the opal_pointer_array & more (#3369)
* Complete rewrite of opal_pointer_array
Instead of a cache oblivious linear search use a bits array
to speed up the management of the free space. As a result we
slightly increase the memory used by the structure, but we get a
significant boost in performance.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Do not register datatypes in the f2c translation table.
The registration is now done higher up, in the Fortran layer, by
forcing a call to MPI_Type_c2f.
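
The bit-array approach from the pointer array rewrite can be sketched as follows: one bit per slot records whether the slot is occupied, so finding free space means scanning words for one that is not all ones. This is a fixed-size illustration with hypothetical names (ptr_array_t, find_first_zero), not the opal_pointer_array code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 128

typedef struct {
    void *slots[SLOTS];
    uint64_t used[SLOTS / 64];   /* bit i set => slots[i] is occupied */
} ptr_array_t;

/* Position of the lowest zero bit. The caller guarantees that the word is
 * not all ones. */
static int find_first_zero(uint64_t word)
{
    int pos = 0;
    while (word & 1u) {
        word >>= 1;
        ++pos;
    }
    return pos;
}

/* Store ptr in the lowest free slot; returns its index or -1 if full. */
static int ptr_array_add(ptr_array_t *pa, void *ptr)
{
    for (size_t w = 0; w < SLOTS / 64; ++w) {
        if (UINT64_MAX != pa->used[w]) {
            int bit = find_first_zero(pa->used[w]);
            size_t index = w * 64 + (size_t) bit;
            pa->used[w] |= UINT64_C(1) << bit;
            pa->slots[index] = ptr;
            return (int) index;
        }
    }
    return -1;
}

static void ptr_array_remove(ptr_array_t *pa, int index)
{
    assert(index >= 0 && index < SLOTS);
    pa->slots[index] = NULL;
    pa->used[index / 64] &= ~(UINT64_C(1) << (index % 64));
}
```

The cost is one extra bit per slot, which matches the commit's note about slightly higher memory use in exchange for faster free-space management.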

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-04-18 21:41:26 -04:00
Clement Foyer
f371cc0a43 Fix minor typo
Return value in comment about opal_list_item_compare_fn_t typedef when a < b is indicated to be 11 instead of -1.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-02-23 16:10:32 +01:00
Gilles Gouaillardet
23a8f764bd opal: add the OPAL_HASH_TABLE_FOREACH macro
this is a convenience macro similar to the OPAL_LIST_FOREACH macro,
that can be used to iterate on all the key/value pairs of an opal_hash_table_t
2016-10-08 16:58:20 +09:00
Gilles Gouaillardet
014f917462 opal: fix comment in OPAL_LIST_FOREACH macro. no code change.
2016-10-08 16:58:19 +09:00
George Bosilca
fd57f5bccd Remove some of the clang warnings.
2016-08-20 14:21:42 -04:00
Nathan Hjelm
a8c3699484 Fix performance regression caused by enabling opal thread support
This commit adds opal_using_threads() protection around the atomic
operation in OBJ_RETAIN/OBJ_RELEASE. This resolves the performance
issues seen when running psm with MPI_THREAD_SINGLE.

To avoid issues with header dependencies opal_using_threads() has been
moved to a new header (thread_usage.h). The OPAL_THREAD_ADD* and
OPAL_THREAD_CMPSET* macros have also been relocated to this header.
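
The pattern described here, sketched with C11 atomics and hypothetical names: a global flag decides whether the reference count update uses an atomic read-modify-write or a plain load and store. The using_threads variable stands in for opal_using_threads(); it is not the real function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for opal_using_threads(). */
static bool using_threads = false;

typedef struct {
    _Atomic int32_t refcount;
} object_t;

/* Add delta and return the new value. The atomic read-modify-write is only
 * paid for when more than one thread may touch the counter. */
static inline int32_t thread_add_fetch_32(_Atomic int32_t *addr, int32_t delta)
{
    if (using_threads) {
        return atomic_fetch_add(addr, delta) + delta;
    }
    /* Single-threaded path: separate relaxed load and store. */
    int32_t v = atomic_load_explicit(addr, memory_order_relaxed) + delta;
    atomic_store_explicit(addr, v, memory_order_relaxed);
    return v;
}

static inline void obj_retain(object_t *o)
{
    (void) thread_add_fetch_32(&o->refcount, 1);
}

/* Returns true when the last reference was dropped. */
static inline bool obj_release(object_t *o)
{
    return 0 == thread_add_fetch_32(&o->refcount, -1);
}
```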

This commit is cherry-picked off a fix that was submitted for the v1.8
release series but never applied to master. This fixes part of the
problem reported by @nysal in #1902.

(cherry picked from commit open-mpi/ompi-release@ce91307918)

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-07-28 07:01:27 -06:00
Jeff Squyres
017f242b1b opal: remove some unused variables / compiler warnings
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-03-26 03:50:57 -07:00
Gilles Gouaillardet
4b91237464 opal/class/opal_lifo: use a standard syntax to initialize a local variable
opal_lifo.h is now included from C++, and g++ does not seem to accept C99 initializer.
2016-03-23 09:46:46 +09:00
Nathan Hjelm
4d4fa28f75 opal: fix coverity issues
Fix CID 1345825 (1 of 1): Dereference before null check (REVERSE_INULL):

ib_proc should not be NULL in this case. Removed the check and added a
check for NULL after OBJ_NEW.

CID 1269821 (1 of 1): Dereference null return value (NULL_RETURNS):

I labeled this one as a false positive (which it is) but the code in
question could stand to be cleaned up.

Fix CID 1356424 (1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):

While trying to silence another Coverity issue another was
flagged. Protect the close of fd with if (fd >= 0).

CID 70772 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70773 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70774 (1 of 1): Dereference null return value (NULL_RETURNS):

None of these are errors and are intentional but now that we have a
list release function use that to make these go away. The cleanup is
similar to CID 1269821.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 15:56:08 -06:00
Gilles Gouaillardet
013aec894b opal/class/opal_lifo: rename a local variable initially called new
this file is now indirectly included from C++, and new is a reserved C++ keyword
2016-03-18 22:15:44 +09:00
Nathan Hjelm
c749d7d977 Merge pull request #1470 from hjelmn/fifo_fix
opal/fifo: use atomics to set fifo head in opal_fifo_push
2016-03-17 22:30:31 -06:00
Nathan Hjelm
dc000213ea opal/fifo: use atomics to set fifo head in opal_fifo_push
This commit changes the opal_fifo_push code to use
opal_update_counted_pointer to set the head. This fixes a data race
that occurs because the read of the fifo head in opal_fifo_pop
requires two instructions. This combined with the non-atomic update in
opal_fifo_push can lead to an ABA issue that puts the fifo in an
inconsistent state.
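
To show what a counted pointer buys, here is a simplified sketch, assuming a LIFO rather than a FIFO and a 32-bit index plus 32-bit counter packed into one 64-bit word instead of a 128-bit pointer and counter pair. Every successful update bumps the counter, so a compare-and-swap based on a stale view of the head fails even if the same node is back at the top. All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 8
#define NIL UINT32_MAX

typedef struct {
    int value;
    _Atomic uint32_t next;   /* index of the next node, or NIL */
} node_t;

typedef struct {
    node_t pool[POOL_SIZE];
    _Atomic uint64_t head;   /* low 32 bits: index, high 32 bits: counter */
} tagged_stack_t;

static uint64_t pack(uint32_t index, uint32_t counter)
{
    return ((uint64_t) counter << 32) | index;
}

static uint32_t index_of(uint64_t h)   { return (uint32_t) h; }
static uint32_t counter_of(uint64_t h) { return (uint32_t) (h >> 32); }

static void stack_init(tagged_stack_t *s)
{
    atomic_init(&s->head, pack(NIL, 0));
}

static void stack_push(tagged_stack_t *s, uint32_t idx)
{
    assert(idx < POOL_SIZE);
    uint64_t old = atomic_load(&s->head);
    do {
        atomic_store(&s->pool[idx].next, index_of(old));
    } while (!atomic_compare_exchange_weak(&s->head, &old,
                                           pack(idx, counter_of(old) + 1)));
}

/* Returns the popped index, or NIL if the stack is empty. */
static uint32_t stack_pop(tagged_stack_t *s)
{
    uint64_t old = atomic_load(&s->head);
    uint32_t idx, next;
    do {
        idx = index_of(old);
        if (NIL == idx) {
            return NIL;
        }
        /* next may be stale if idx was popped and pushed again in the
         * meantime; the counter has changed in that case, so the
         * compare-and-swap below fails and the loop retries. */
        next = atomic_load(&s->pool[idx].next);
    } while (!atomic_compare_exchange_weak(&s->head, &old,
                                           pack(next, counter_of(old) + 1)));
    return idx;
}
```

The important property is that index and counter are always read and written as one unit, which is the same reason the commit sets the fifo head through a single counted-pointer update.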

There are other ways this problem could be fixed. One way would be to
introduce an opal_atomic_read_128 implementation. On x86_64 this would
have to use the cmpxchg16b instruction. Since this instruction would
have to be in the pop path (and always executed) it would be slower
than the fix in this commit.

Closes open-mpi/ompi#1460.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 13:21:27 -06:00
Nathan Hjelm
852cc8cfbc opal: fix various coverity errors
Fix CID 1356358:  Null pointer dereferences  (REVERSE_INULL):

flist->fl_mpool can no longer be NULL. Removed the conditional.

Fix CID 1356357:  Resource leaks  (RESOURCE_LEAK):

Added the call to free the hints array.

Fix CID 1356356:  Resource leaks  (RESOURCE_LEAK):

This is a false error but it is safe to call close (-1) so just always
call close.

Fix CID 1356354:  Control flow issues  (MISSING_BREAK):
Fix CID 1356353:  Control flow issues  (MISSING_BREAK):

Add comments that indicate the fall-through is intentional.

Fix CID 1356351:  Null pointer dereferences  (FORWARD_NULL):

Fix potential SEGV if the page_size key is malformed.

Fix CID 1356350:  Error handling issues  (CHECKED_RETURN):

Add (void) to indicate that we do not care about the return code of
sscanf in this case.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 10:05:57 -06:00
Nathan Hjelm
d4afb16f5a opal: rework mpool and rcache frameworks
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:

 - Before this change a significant portion of the rcache
   functionality lived in mpool components. This meant that it was
   impossible to add a new memory pool to use with rdma networks
   (ugni, openib, etc) without duplicating the functionality of an
   existing mpool component. All the registration functionality has
   been removed from the mpool and placed in the rcache framework.

 - All registration cache mpool components (udreg, grdma, gpusm,
   rgpusm) have been changed to rcache components. rcaches are
   allocated and released in the same way mpool components were.

 - It is now valid to pass NULL as the resources argument when
   creating an rcache. At this time the gpusm and rgpusm components
   support this. All other rcache components require non-NULL
   resources.

 - A new mpool component has been added: hugepage. This component
   supports huge page allocations on Linux.

 - Memory pools are now allocated using "hints". Each mpool component
   is queried with the hints and returns a priority. The current hints
   supported are NULL (uses posix_memalign/malloc), page_size=x (huge
   page mpool), and mpool=x.

 - The sm mpool has been moved to common/sm. This reflects that the sm
   mpool is specialized and not meant for any general
   allocations. This mpool may be moved back into the mpool framework
   if there is any objection.

 - The opal_free_list_init arguments have been updated. The unused0
   argument is now used to pass in the registration cache module. The
   mpool registration flags are now rcache registration flags.

 - All components have been updated to make use of the new framework
   interfaces.

As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Jeff Squyres
270cc11156 opal hotel: only delete events that have not yet fired
The eviction callback, for convenience (and to avoid code
duplication), used to call opal_hotel_checkout().  However,
opal_hotel_checkout() deletes the eviction event -- which is fine to
do when opal_hotel_checkout() is invoked by the application.  But when
it's invoked by the same event that it's deleting, it can cause Bad
Things to happen.

For simplicity, instead of invoking opal_hotel_checkout() from the
eviction callback, just duplicate the checkout logic into the eviction
callback function (and skip the delete-the-evict-event part).

For good measure, put a comment in all three places where the checkout
logic occurs (because it's inlined): don't change this logic without
changing all 3 places.

Finally, also add a line in the docs for opal_hotel_init() warning
users against calling opal_hotel_checkout() from their eviction
callback.
2016-01-13 10:59:06 -08:00
Nathan Hjelm
2c02294389 opal_free_list: fix strange size check
OPAL free lists can be initialized with a fragment size that differs
from the size of objects from a class. This allows the free list code
to support OPAL objects that have flexible array members.

Unfortunately the free list code will throw out the desired length in
some cases. The code in question was committed in
open-mpi/ompi@90fb58de. The side effects of this are varied and can
cause segmentation faults, assert failures, hangs, etc. This commit
adds a check to ensure the requested size is at least as large as the
class size and makes opal_free_list allocations always honor the
requested fragment size (as long as it is larger than the class
size).
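
A sketch of the size rule for objects with flexible array members, using hypothetical names (frag_t, frag_alloc). Here a request smaller than the class size is rejected, and a valid request is always honored in full; how the actual commit reports an undersized request is not stated above, so the NULL return is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    size_t len;              /* usable bytes in data[] */
    unsigned char data[];    /* flexible array member */
} frag_t;

/* Returns the number of bytes to allocate, or 0 if the requested fragment
 * size cannot even hold the fixed part of the class. */
static size_t frag_alloc_size(size_t class_size, size_t requested)
{
    if (requested < class_size) {
        return 0;
    }
    return requested;   /* never silently fall back to class_size */
}

static frag_t *frag_alloc(size_t requested)
{
    size_t size = frag_alloc_size(sizeof(frag_t), requested);
    if (0 == size) {
        return NULL;
    }
    frag_t *f = calloc(1, size);
    if (NULL != f) {
        f->len = size - sizeof(frag_t);
    }
    return f;
}
```

If the allocator dropped the requested length and used only sizeof(frag_t), any write into data[] would run past the allocation, which is consistent with the varied failures listed above.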

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-09 19:47:55 -07:00
Ralph Castain
a7045352e2 Modify the OPAL_LIST_RELEASE and OPAL_LIST_DESTRUCT macros to release the objects only when the list object is refcounted down to 1, which will then reach zero when destructed/released at the end of the macro
2015-10-27 16:42:46 -07:00
Ralph Castain
02cdd046bd Revert " Releasing the list items when list destructor is called"
This reverts commit 7579ae3086.
2015-10-27 15:24:48 -07:00
rhc54
0bc51375f3 Merge pull request #1004 from rppendya/rppendya_list_release
Releasing the list items when list destructor is called
2015-10-21 14:34:19 -07:00
Nathan Hjelm
039c7dbcd6 opal/mutex: add static mutex initializers
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-14 16:08:41 -06:00
Raghavendra Pendyala
7579ae3086 Releasing the list items when list destructor is called
2015-10-09 10:49:44 -07:00
Nathan Hjelm
3c34f6f25c Merge pull request #517 from hjelmn/class_fix
opal/class: enable use of opal classes after opal_class_finalize
2015-08-31 12:13:58 -07:00
Ralph Castain
cf6137b530 Integrate PMIx 1.0 with OMPI.
Bring Slurm PMI-1 component online
Bring the s2 component online

Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways.

Bring the OMPI pubsub/pmi component online

Get comm_spawn working again

Ensure we always provide a cpuset, even if it is NULL

pmix/cray: adjust cray pmix component for pmix

Make changes so cray pmix can work within the integrated
ompi/pmix framework.

Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet

Cleanup comm_spawn - procs now starting, error in connect_accept

Complete integration
2015-08-29 16:04:10 -07:00
Nathan Hjelm
209a7a0721 opal/lifo: add load-linked store-conditional support
This commit adds implementations for opal_atomic_lifo_pop and
opal_atomic_lifo_push that make use of the load-linked and
store-conditional instructions. These instructions allow for a more
efficient implementation on supported platforms.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-18 14:01:52 -06:00
Nathan Hjelm
2a7e191dd8 opal/fifo: if available use load-linked store-conditional
These instructions allow a more efficient implementation of the
opal_fifo_pop_atomic function.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-18 14:01:52 -06:00
Nathan Hjelm
6265aaa354 Merge pull request #771 from hjelmn/lifo_fix
opal/lifo: add missing opal_atomic_wmb and remove unnecessary opal_atomic_rmb
2015-08-04 14:02:29 -06:00
Nathan Hjelm
6003a4dae1 opal/lifo: add missing opal_atomic_wmb and remove unnecessary opal_atomic_rmb
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-04 08:54:06 -06:00
Nathan Hjelm
9abccbd9fc opal/fifo: add missing memory barrier
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-03 16:22:28 -06:00
Jeff Squyres
df800286e4 Merge pull request #709 from avilcheslopez/master
Improving opal_pointer_array bounds checking.
2015-07-23 14:45:11 -04:00
Alejandro Vilches
994ed60b3d Improving opal_pointer_array bounds checking (using
OPAL_UNLIKELY).
2015-07-23 11:53:16 -07:00
Ralph Castain
61fb067f14 Update the opal_hotel class to support a given event base instead of defaulting to using opal_event_base
2015-07-11 06:42:23 -07:00