Commit graph

251 Commits

Author SHA1 Message Date
Brian Barrett
9ffac85650 build: Move libevent to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves libevent from an MCA framework to a stand-alone
library built outside of OPAL.  A wrapper in opal/util is provided
to minimize the unnecessary changes in the rest of the code.  When
using the internal Libevent, it will be installed as a stand-alone
libevent.a, instead of bundled in OPAL.  Any pre-installed version
of Libevent at or after 2.0.21 is preferred over the internal
version.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:55:58 +00:00
Christoph Niethammer
a3483e4b71 Fix null pointer arithmetic resulting in potential undefined behavior
NULL pointer arithmetic is undefined behaviour in C.
The payload_ptr can be NULL when the mpool is not initialized.

References from the C11 standard:
- 6.5.6 Additive operators
- 6.3.2.3 Pointers

Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
2020-07-03 22:18:17 +02:00
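The undefined behaviour described in this commit can be avoided by testing the base pointer before applying an offset. The following is a minimal sketch of that idea; `payload_offset` is a hypothetical helper introduced for illustration, not the code touched by the commit.

```c
#include <stddef.h>

/* Hypothetical helper: return base + offset, or NULL when base is NULL.
 * Arithmetic on a null pointer is undefined behaviour in C, so the NULL
 * case is handled before any addition takes place. */
static inline unsigned char *payload_offset(unsigned char *base, size_t offset)
{
    if (NULL == base) {
        return NULL;
    }
    return base + offset;
}
```

Callers that previously computed the offset unconditionally would call the helper instead and check the result for NULL.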
Nathan Hjelm
3a036f8486 opal/class: add additional object helper functions
This commit adds two additional helpers to opal/class:

 - OPAL_HASH_TABLE_FOREACH_PTR: Same as OPAL_HASH_TABLE_FOREACH but
   operating on ptr hash tables. This is needed because the _ptr
   iterator functions take an additional argument.

 - OPAL_LIST_FOREACH_DECL: Same as OPAL_LIST_FOREACH but declares
   the variable specified in the first argument.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-05-05 06:43:19 -07:00
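The difference between a foreach macro and its `_DECL` variant can be sketched on a toy list. Everything below (`struct node`, `LIST_FOREACH`, `LIST_FOREACH_DECL`, `list_sum`) is hypothetical and much simpler than opal_list_t; it only illustrates a macro that declares its own loop variable.

```c
#include <stddef.h>

/* Hypothetical minimal list node; opal_list_t is richer than this. */
struct node {
    struct node *next;
    int value;
};

/* Plain variant: the caller must declare `item` beforehand. */
#define LIST_FOREACH(item, head) \
    for ((item) = (head); NULL != (item); (item) = (item)->next)

/* _DECL variant: the macro declares `item`, scoped to the loop (C99). */
#define LIST_FOREACH_DECL(item, head) \
    for (struct node *item = (head); NULL != item; item = item->next)

/* Sum all values using the declaring variant. */
static int list_sum(struct node *head)
{
    int sum = 0;
    LIST_FOREACH_DECL(n, head) {
        sum += n->value;
    }
    return sum;
}
```

Declaring the variable inside the macro keeps it out of the enclosing scope, which avoids accidental reuse after the loop.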
Noah Evans
ee3517427e Add threads framework
Add a framework to support different types of threading models including
user space thread packages such as Qthreads and Argobots:

https://github.com/pmodels/argobots

https://github.com/Qthreads/qthreads

The default threading model is pthreads.  Alternate thread models are
specified at configure time using the --with-threads=X option.

The framework is static.  The threading model to use is selected at
Open MPI configure/build time.

mca/threads: implement Argobots threading layer

config: fix thread configury

- Add double quotations
- Change Argobot to Argobots
config: implement Argobots check

If the poll time is too long, MPI hangs.

This quick fix just sets it to 0, but it is not good for the
Pthreads version. Need to find a good way to abstract it.

Note that even 1 (= 1 millisecond) causes disastrous performance
degradation.

rework threads MCA framework configury

It now works more like the ompi/mca/rte configury,
modulo some edge items that are special for threading package
linking, etc.

qthreads module
some argobots cleanup

Signed-off-by: Noah Evans <noah.evans@gmail.com>
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-03-27 10:15:45 -06:00
Austen Lauria
df9745a251 Merge pull request #7299 from awlauria/fix_warnings
Fix some compiler warnings.
2020-01-17 11:33:41 -05:00
Austen Lauria
b65ec27307 Fix some compiler warnings.
Silence unused variables, incompatible pointer types,
un-initialized variables, and signed/unsigned comparisons.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-01-10 13:10:53 -05:00
Nathan Hjelm
1145abc0b7 opal: make interval tree resilient to similar intervals
There are cases where the same interval may be in the tree multiple
times. This generally isn't a problem when searching the tree but
may cause issues when attempting to delete a particular registration
from the tree. The issue is fixed by breaking a low value tie by
checking the high value then the interval data.

If the high, low, and data of a new insertion exactly matches an
existing interval then an assertion is raised.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-07 21:43:58 -07:00
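The tie-breaking order described in this commit (low value, then high value, then interval data) can be sketched as a three-level comparison. The `struct interval` layout and the function name are assumptions made for illustration, not the tree's actual types.

```c
#include <stdint.h>

/* Hypothetical interval record; field names are illustrative only. */
struct interval {
    uintptr_t low;
    uintptr_t high;
    void *data;
};

/* Order by low value, break ties on high, then on the data pointer.
 * Returns <0, 0, or >0; 0 means the intervals are exact duplicates,
 * the case for which the commit raises an assertion on insertion. */
static int interval_compare(const struct interval *a, const struct interval *b)
{
    if (a->low != b->low) {
        return a->low < b->low ? -1 : 1;
    }
    if (a->high != b->high) {
        return a->high < b->high ? -1 : 1;
    }
    if (a->data != b->data) {
        return (uintptr_t) a->data < (uintptr_t) b->data ? -1 : 1;
    }
    return 0;
}
```

With a total order like this, deleting a particular registration can find the exact node even when several intervals share the same low value.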
Geoffroy Vallee
98de17c6da Fix a typo in comments: insertted -> inserted
Signed-off-by: Geoffroy Vallee <geoffroy.vallee@gmail.com>
2019-11-24 00:36:06 -05:00
Nathan Hjelm
5c770a7bec opal/free_list: fix race condition
There was a race condition in opal_free_list_get. Code throughout the
Open MPI codebase was assuming that a NULL return from this function
was due to an out-of-memory condition. In some cases this can lead to
a fatal condition (MPI_Irecv and MPI_Isend in pml/ob1 for
example). Before this commit opal_free_list_get_mt looked like this:

```c
static inline opal_free_list_item_t *opal_free_list_get_mt (opal_free_list_t *flist)
{
    opal_free_list_item_t *item =
        (opal_free_list_item_t*) opal_lifo_pop_atomic (&flist->super);

    if (OPAL_UNLIKELY(NULL == item)) {
        opal_mutex_lock (&flist->fl_lock);
        opal_free_list_grow_st (flist, flist->fl_num_per_alloc);
        opal_mutex_unlock (&flist->fl_lock);
        item = (opal_free_list_item_t *) opal_lifo_pop_atomic (&flist->super);
    }

    return item;
}
```

The problem is that in a multithreaded environment it *is* possible for the
free list to be grown successfully but the thread calling
opal_free_list_get_mt to be left without an item. This happens if,
between the calls to opal_lifo_push_atomic in opal_free_list_grow_st
and the call to opal_lifo_pop_atomic, other threads pop all the items
added to the free list.

This commit fixes the issue by ensuring the thread that successfully
grew the free list **always** gets a free list item.

Fixes #2921

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-16 13:17:09 -06:00
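One way to guarantee that the growing thread always obtains an item is to reserve one of the newly allocated items for the caller and never publish it to the shared list. The sketch below illustrates that idea under simplifying assumptions: a mutex-protected stack stands in for the lock-free LIFO, and all names (`struct free_list`, `list_pop`, `list_grow_and_take`, `list_get`) are hypothetical rather than taken from the commit.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified free list: a mutex-protected stack stands in
 * for the lock-free LIFO used by the real code. */
struct item {
    struct item *next;
};

struct free_list {
    pthread_mutex_t lock;
    struct item *head;
    size_t num_per_alloc;
};

/* Pop one item from the shared list, or return NULL if it is empty. */
static struct item *list_pop(struct free_list *fl)
{
    pthread_mutex_lock(&fl->lock);
    struct item *it = fl->head;
    if (NULL != it) {
        fl->head = it->next;
    }
    pthread_mutex_unlock(&fl->lock);
    return it;
}

/* Allocate num_per_alloc items. The first is reserved for the caller and
 * never published, so no other thread can take it; the rest are pushed
 * onto the shared list. Returns NULL only if the first allocation fails. */
static struct item *list_grow_and_take(struct free_list *fl)
{
    struct item *reserved = malloc(sizeof(*reserved));
    if (NULL == reserved) {
        return NULL;
    }
    reserved->next = NULL;

    pthread_mutex_lock(&fl->lock);
    for (size_t i = 1; i < fl->num_per_alloc; ++i) {
        struct item *it = malloc(sizeof(*it));
        if (NULL == it) {
            break;
        }
        it->next = fl->head;
        fl->head = it;
    }
    pthread_mutex_unlock(&fl->lock);
    return reserved;
}

/* Get an item; a NULL return now really does mean out of memory. */
static struct item *list_get(struct free_list *fl)
{
    struct item *it = list_pop(fl);
    if (NULL == it) {
        it = list_grow_and_take(fl);
    }
    return it;
}
```

Because the reserved item is handed back directly, callers can again treat NULL as an out-of-memory condition.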
Nathan Hjelm
9ea5dfa799 class/opal_fifo: fix warning
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-15 19:18:31 -06:00
Nathan Hjelm
fe6528b0d5 opal/atomic: always use C11 atomics if available
This commit disables the use of both the builtin and hand-written
atomics if proper C11 atomic support is detected. This is the first
step towards requiring the availability of C11 atomics for the C
compiler used to build Open MPI.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:51:05 -06:00
Nathan Hjelm
000f9eed4d opal: add types for atomic variables
This commit updates the entire codebase to use specific opal types for
all atomic variables. This is a change from the prior atomic support
which required the use of the volatile keyword. This is the first step
towards implementing support for C11 atomics as that interface
requires the use of types declared with the _Atomic keyword.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:48:55 -06:00
Nathan Hjelm
8b090103e2 opal/fifo: fix 128-bit atomic fifo on Power9
This commit updates the atomic fifo code to fix a consistency issue
observed on Power9 systems when builtin atomics are used. The cause
was two things: 1) a missing write memory barrier in fifo push, and 2)
a read ordering issue when reading the fifo head non-atomically. This
commit fixes both issues and appears to correct the inconsistency.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-07-10 15:37:11 -06:00
Nathan Hjelm
f8dbf62879 opal/asm: change ll/sc atomics to macros
This commit fixes a hang that occurs with debug builds of Open MPI on
aarch64 and power/powerpc systems. When the ll/sc atomics are inline
functions the compiler emits load/store instructions for the function
arguments with -O0. These extra load/store arguments can cause the ll
reservation to be cancelled causing live-lock.

Note that we did attempt to fix this with always_inline but the extra
instructions are still emitted by the compiler (gcc). There may be
another fix but this has been tested and is working well.

References #3697. Close when applied to v3.0.x and v3.1.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-05-31 19:45:19 -06:00
bosilca
4ebed21b6d Merge pull request #4670 from ggouaillardet/topic/opal_bitmap
opal/bitmap: fix opal_bitmap_set_bit()
2018-05-29 10:50:21 -04:00
Nathan Hjelm
7f761d8434 opal_free_list: use lifo atomic functions in opal_free_list_wait_mt
This commit fixes a multi-threading bug when using the thread-safe
free list functions. opal_free_list_wait_mt() was using the
conditional version of opal_lifo_pop() and not the thread-safe call.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-26 10:16:42 -06:00
Nathan Hjelm
7163fc98a0 opal/class: add a new class: opal_interval_tree_t
This commit adds a new class to opal: opal_interval_tree_t. This is a
thread-safe implementation of a 1-dimensional interval tree. The data
structure is intended to provide a faster implementation of the
registration cache VMA tree.

The thread safety is provided by a relativistic red-black tree
implementation. This structure provides support for multiple-reader,
and single writer. There is one caveat, an item may appear in the tree
twice while the tree is being updated. Care needs to be taken to avoid
issues associated with this "feature". I don't anticipate a problem
with the current VMA tree usage.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-26 13:35:56 -07:00
Gilles Gouaillardet
9121eb4ff9 opal/lifo: fix an ABA problem in opal_lifo_pop_atomic
that was introduced in open-mpi/ompi@11bb8b09a0

Fixes open-mpi/ompi#4784

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-02-09 14:48:54 +09:00
Gilles Gouaillardet
125169f057 opal/bitmap: fix opal_bitmap_set_bit()
Correctly reallocate the bitmap when needed

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-27 14:56:43 +09:00
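Growing a bitmap on demand when a bit beyond its current size is set can be sketched as follows. This is an illustration only, with a hypothetical `struct bitmap` and `bitmap_set_bit`; the growth policy (grow to exactly the byte needed) is an assumption, not what opal_bitmap_set_bit() does.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte-array bitmap; `size` is the number of bytes. */
struct bitmap {
    unsigned char *bytes;
    size_t size;
};

/* Set `bit`, enlarging the array first if the bit lies past the end.
 * Newly added bytes are zeroed so stale memory is never read as set
 * bits. Returns 0 on success, -1 if reallocation fails. */
static int bitmap_set_bit(struct bitmap *bm, size_t bit)
{
    size_t index = bit / 8;

    if (index >= bm->size) {
        size_t new_size = index + 1;
        unsigned char *tmp = realloc(bm->bytes, new_size);
        if (NULL == tmp) {
            return -1;
        }
        memset(tmp + bm->size, 0, new_size - bm->size);
        bm->bytes = tmp;
        bm->size = new_size;
    }

    bm->bytes[index] |= (unsigned char) (1u << (bit % 8));
    return 0;
}
```

Assigning the result of realloc to a temporary keeps the original buffer valid if the reallocation fails.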
Nathan Hjelm
7893248c5a opal/asm: add fetch-and-op atomics
This commit adds support for fetch-and-op atomics. This is needed
because AND and OR are irreversible operations so there needs to be a
way to get the old value atomically. These are also the only semantics
supported by C11 (there is no atomic_op_fetch, just
atomic_fetch_op). The old op-and-fetch atomics have been defined in
terms of fetch-and-op.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:23 -07:00
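The relationship between the two families can be sketched with C11 `<stdatomic.h>`. The wrapper names below (`fetch_and_32`, `and_fetch_32`, `add_fetch_32`) are hypothetical and are not the opal function names; only `atomic_fetch_and` and `atomic_fetch_add` are standard.

```c
#include <stdatomic.h>
#include <stdint.h>

/* fetch-and-op: returns the value held *before* the operation. This is
 * the only form C11 provides. */
static inline int32_t fetch_and_32(_Atomic int32_t *addr, int32_t mask)
{
    return atomic_fetch_and(addr, mask);
}

/* op-and-fetch (returns the *new* value), defined in terms of
 * fetch-and-op by reapplying the operation to the returned old value. */
static inline int32_t and_fetch_32(_Atomic int32_t *addr, int32_t mask)
{
    return atomic_fetch_and(addr, mask) & mask;
}

static inline int32_t add_fetch_32(_Atomic int32_t *addr, int32_t delta)
{
    return atomic_fetch_add(addr, delta) + delta;
}
```

The reverse derivation is not possible for AND and OR: the new value alone does not reveal which bits were set beforehand, which is why the fetch-and-op form has to be the primitive.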
Nathan Hjelm
1282e98a01 opal/asm: rename existing arithmetic atomic functions
This commit renames the arithmetic atomic operations in opal to
indicate that they return the new value not the old value. This naming
differentiates these routines from new functions that return the old
value.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:22 -07:00
Nathan Hjelm
11bb8b09a0 opal/class: use new compare-and-swap functions
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-29 12:56:32 -07:00
Nathan Hjelm
84f63d0aca opal/asm: add opal_atomic_compare_exchange_strong functions
This commit adds a new set of compare-and-exchange functions. These
functions have a signature similar to the functions found in C11. The
old cmpset functions are now deprecated and defined in terms of the
new compare-and-exchange functions. All asm backends have been
updated.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-29 12:45:44 -07:00
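A C11-style compare-exchange and an older cmpset-style call built on top of it might look like the sketch below. The function names are hypothetical stand-ins, not the opal signatures; `atomic_compare_exchange_strong` is the standard C11 call.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* C11-style compare-exchange: on failure, *expected is overwritten with
 * the value actually found at addr. */
static inline bool compare_exchange_strong_32(_Atomic int32_t *addr,
                                              int32_t *expected,
                                              int32_t desired)
{
    return atomic_compare_exchange_strong(addr, expected, desired);
}

/* cmpset-style call defined in terms of compare-exchange: the expected
 * value is passed by value, so the observed value is simply discarded. */
static inline bool bool_cmpset_32(_Atomic int32_t *addr,
                                  int32_t oldval, int32_t newval)
{
    return compare_exchange_strong_32(addr, &oldval, newval);
}
```

The compare-exchange form saves a reload in retry loops, since a failed attempt already returns the current value through `expected`.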
Nathan Hjelm
3ff34af355 opal: rename opal_atomic_cmpset* to opal_atomic_bool_cmpset*
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preparation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-10-31 12:47:23 -06:00
Nathan Hjelm
76320a8ba5 opal: rename opal_atomic_init to opal_atomic_lock_init
This function is used to initialize an opal atomic lock. The old name
was confusing.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 14:15:11 -06:00
Nathan Hjelm
db973437e1 opal: fix coverity issues
Fixes coverity CIDs 1412984, and 1412983.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-23 08:15:34 -06:00
Nathan Hjelm
ffd8ee2dfd opal: use opal_list_t convenience macros
This commit cleans up code in opal to use OPAL_LIST_FOREACH(_SAFE),
OPAL_LIST_DESTRUCT, and OPAL_LIST_RELEASE.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-20 12:37:12 -06:00
George Bosilca
ba46b35515 Don't assume a size for constants with UL and ULL.
According to Section 6.4.4.1 of the C standard, we do not need to append
a type suffix to a constant to get the right size. The compiler will infer
the type according to the number of bits in the constant.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-06-05 22:07:53 -04:00
Gilles Gouaillardet
3c6631ff6c opal: fix FIND_FIRST_ZERO macro for opal_pointer_array internal handling
Thanks George for the patch.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-05-10 14:57:44 +09:00
bosilca
d7ebcca93f Add volatile to the pointer in the list_item structure. (#3468)
This change has the side effect of improving the performance of all
atomic data structures (in addition to making the code correct under a
certain interpretation of the volatile usage).
This commit fixes #3450.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-05-09 10:12:20 -04:00
bosilca
872cf44c28 Improve the opal_pointer_array & more (#3369)
* Complete rewrite of opal_pointer_array
Instead of a cache oblivious linear search use a bits array
to speed up the management of the free space. As a result we
slightly increase the memory used by the structure, but we get a
significant boost in performance.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Do not register datatypes in the f2c translation table.
The registration is now done up into the Fortran layer, by
forcing a call to MPI_Type_c2f.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-04-18 21:41:26 -04:00
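Tracking free slots in a bit array lets a search skip 64 occupied slots with a single comparison. The following sketch shows the general technique only; `struct slot_bits`, `SLOT_WORDS`, `find_first_zero`, and `set_slot` are hypothetical names and do not reproduce the opal_pointer_array implementation or its FIND_FIRST_ZERO macro.

```c
#include <stdint.h>

#define SLOT_WORDS 4 /* hypothetical capacity: 4 * 64 slots */

/* One bit per slot: 1 = occupied, 0 = free. */
struct slot_bits {
    uint64_t words[SLOT_WORDS];
};

/* Return the index of the first free slot, or -1 if every slot is
 * occupied. Fully occupied words are skipped with one comparison. */
static int find_first_zero(const struct slot_bits *sb)
{
    for (int w = 0; w < SLOT_WORDS; ++w) {
        uint64_t word = sb->words[w];
        if (UINT64_MAX == word) {
            continue;
        }
        for (int b = 0; b < 64; ++b) {
            if (0 == (word & (UINT64_C(1) << b))) {
                return w * 64 + b;
            }
        }
    }
    return -1;
}

/* Mark a slot as occupied. */
static void set_slot(struct slot_bits *sb, int index)
{
    sb->words[index / 64] |= UINT64_C(1) << (index % 64);
}
```

The extra memory is one bit per slot, which matches the trade-off the commit message describes: slightly more memory for a faster search.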
Clement Foyer
f371cc0a43 Fix minor typo
Return value in comment about opal_list_item_compare_fn_t typedef when a < b is indicated to be 11 instead of -1.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-02-23 16:10:32 +01:00
Gilles Gouaillardet
23a8f764bd opal: add the OPAL_HASH_TABLE_FOREACH macro
this is a convenience macro similar to the OPAL_LIST_FOREACH macro,
that can be used to iterate on all the key/value pairs of an opal_hash_table_t
2016-10-08 16:58:20 +09:00
Gilles Gouaillardet
014f917462 opal: fix comment in OPAL_LIST_FOREACH macro. no code change.
2016-10-08 16:58:19 +09:00
George Bosilca
fd57f5bccd Remove some of the clang warnings.
2016-08-20 14:21:42 -04:00
Nathan Hjelm
a8c3699484 Fix performance regression caused by enabling opal thread support
This commit adds opal_using_threads() protection around the atomic
operation in OBJ_RETAIN/OBJ_RELEASE. This resolves the performance
issues seen when running psm with MPI_THREAD_SINGLE.

To avoid issues with header dependencies opal_using_threads() has been
moved to a new header (thread_usage.h). The OPAL_THREAD_ADD* and
OPAL_THREAD_CMPSET* macros have also been relocated to this header.

This commit is cherry-picked off a fix that was submitted for the v1.8
release series but never applied to master. This fixes part of the
problem reported by @nysal in #1902.

(cherry picked from commit open-mpi/ompi-release@ce91307918)

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-07-28 07:01:27 -06:00
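Guarding a reference-count update so that the atomic read-modify-write is only paid for when threads are in use can be sketched as below. The flag `using_threads` is a hypothetical stand-in for opal_using_threads(), and the object and function names are illustrative, not the OBJ_RETAIN/OBJ_RELEASE macros themselves.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the runtime "are threads in use?" query. */
static bool using_threads = false;

struct object {
    _Atomic int32_t refcount;
};

/* Add delta and return the new count. With threads, use an atomic
 * read-modify-write; otherwise use a relaxed load and store, which
 * avoids the cost of a locked instruction. */
static inline int32_t thread_add_fetch(_Atomic int32_t *addr, int32_t delta)
{
    if (using_threads) {
        return atomic_fetch_add(addr, delta) + delta;
    }
    int32_t value = atomic_load_explicit(addr, memory_order_relaxed) + delta;
    atomic_store_explicit(addr, value, memory_order_relaxed);
    return value;
}

static inline void object_retain(struct object *obj)
{
    thread_add_fetch(&obj->refcount, 1);
}

/* Returns true when the last reference was dropped. */
static inline bool object_release(struct object *obj)
{
    return 0 == thread_add_fetch(&obj->refcount, -1);
}
```

The non-atomic path is only safe while a single thread touches the object, which is the MPI_THREAD_SINGLE situation named in the commit message.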
Jeff Squyres
017f242b1b opal: remove some unused variables / compiler warnings
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-03-26 03:50:57 -07:00
Gilles Gouaillardet
4b91237464 opal/class/opal_lifo: use a standard syntax to initialize a local variable
opal_lifo.h is now included from C++, and g++ does not seem to accept C99 initializer.
2016-03-23 09:46:46 +09:00
Nathan Hjelm
4d4fa28f75 opal: fix coverity issues
Fix CID 1345825 (1 of 1): Dereference before null check (REVERSE_INULL):

ib_proc should not be NULL in this case. Removed the check and added a
check for NULL after OBJ_NEW.

CID 1269821 (1 of 1): Dereference null return value (NULL_RETURNS):

I labeled this one as a false positive (which it is) but the code in
question could stand to be cleaned up.

Fix CID 1356424 (1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):

While trying to silence another Coverity issue another was
flagged. Protect the close of fd with if (fd >= 0).

CID 70772 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70773 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70774 (1 of 1): Dereference null return value (NULL_RETURNS):

None of these are errors and are intentional but now that we have a
list release function use that to make these go away. The cleanup is
similar to CID 1269821.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 15:56:08 -06:00
Gilles Gouaillardet
013aec894b opal/class/opal_lifo: rename a local variable initially called new
this file is now indirectly included from C++, and new is a reserved C++ keyword
2016-03-18 22:15:44 +09:00
Nathan Hjelm
c749d7d977 Merge pull request #1470 from hjelmn/fifo_fix
opal/fifo: use atomics to set fifo head in opal_fifo_push
2016-03-17 22:30:31 -06:00
Nathan Hjelm
dc000213ea opal/fifo: use atomics to set fifo head in opal_fifo_push
This commit changes the opal_fifo_push code to use
opal_update_counted_pointer to set the head. This fixes a data race
that occurs because the read of the fifo head in opal_fifo_pop
requires two instructions. This combined with the non-atomic update in
opal_fifo_push can lead to an ABA issue that puts the fifo in an
inconsistent state.

There are other ways this problem could be fixed. One way would be to
introduce an opal_atomic_read_128 implementation. On x86_64 this would
have to use the cmpxchg16b instruction. Since this instruction would
have to be in the pop path (and always executed) it would be slower
than the fix in this commit.

Closes open-mpi/ompi#1460.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 13:21:27 -06:00
Nathan Hjelm
852cc8cfbc opal: fix various coverity errors
Fix CID 1356358:  Null pointer dereferences  (REVERSE_INULL):

flist->fl_mpool can no longer be NULL. Removed the conditional.

Fix CID 1356357:  Resource leaks  (RESOURCE_LEAK):

Added the call to free the hints array.

Fix CID 1356356:  Resource leaks  (RESOURCE_LEAK):

This is a false error but it is safe to call close (-1) so just always
call close.

Fix CID 1356354:  Control flow issues  (MISSING_BREAK):
Fix CID 1356353:  Control flow issues  (MISSING_BREAK):

Add comments that indicate the fall-through is intentional.

Fix CID 1356351:  Null pointer dereferences  (FORWARD_NULL):

Fix potential SEGV if the page_size key is malformed.

Fix CID 1356350:  Error handling issues  (CHECKED_RETURN):

Add (void) to indicate that we do not care about the return code of
sscanf in this case.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 10:05:57 -06:00
Nathan Hjelm
d4afb16f5a opal: rework mpool and rcache frameworks
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:

 - Before this change a significant portion of the rcache
   functionality lived in mpool components. This meant that it was
   impossible to add a new memory pool to use with rdma networks
   (ugni, openib, etc) without duplicating the functionality of an
   existing mpool component. All the registration functionality has
   been removed from the mpool and placed in the rcache framework.

 - All registration cache mpools components (udreg, grdma, gpusm,
   rgpusm) have been changed to rcache components. rcaches are
   allocated and released in the same way mpool components were.

 - It is now valid to pass NULL as the resources argument when
   creating an rcache. At this time the gpusm and rgpusm components
   support this. All other rcache components require non-NULL
   resources.

 - A new mpool component has been added: hugepage. This component
   supports huge page allocations on linux.

 - Memory pools are now allocated using "hints". Each mpool component
   is queried with the hints and returns a priority. The current hints
   supported are NULL (uses posix_memalign/malloc), page_size=x (huge
   page mpool), and mpool=x.

 - The sm mpool has been moved to common/sm. This reflects that the sm
   mpool is specialized and not meant for any general
   allocations. This mpool may be moved back into the mpool framework
   if there is any objection.

 - The opal_free_list_init arguments have been updated. The unused0
   argument is now used to pass in the registration cache module. The
   mpool registration flags are now rcache registration flags.

 - All components have been updated to make use of the new framework
   interfaces.

As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Jeff Squyres
270cc11156 opal hotel: only delete events that have not yet fired
The eviction callback, for convenience (and to avoid code
duplication), used to call opal_hotel_checkout().  However,
opal_hotel_checkout() deletes the eviction event -- which is fine to
do when opal_hotel_checkout() is invoked by the application.  But when
it's invoked by the same event that it's deleting, it can cause Bad
Things to happen.

For simplicity, instead of invoking opal_hotel_checkout() from the
eviction callback, just duplicate the checkout logic into the eviction
callback function (and skip the delete-the-evict-event part).

For good measure, put a comment in all three places where the checkout
logic occurs (because it's inlined): don't change this logic without
changing all 3 places.

Finally, also add a line in the docs for opal_hotel_init() warning
users against calling opal_hotel_checkout() from their eviction
callback.
2016-01-13 10:59:06 -08:00
Nathan Hjelm
2c02294389 opal_free_list: fix strange size check
OPAL free lists can be initialized with a fragment size that differs
from the size of objects from a class. This allows the free list code
to support OPAL objects that have flexible array members.

Unfortunately the free list code will throw out the desired length in
some cases. The code in question was committed in
open-mpi/ompi@90fb58de. The side effects of this are varied and can
cause segmentation faults, assert failures, hangs, etc. This commit
adds a check to ensure the requested size is at least as large as the
class size and makes opal_free_list allocations always honor the
requested fragment size (as long as it is larger than the class
size).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-09 19:47:55 -07:00
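Honoring a requested fragment size for an object with a flexible array member, while checking it against the base structure size, can be sketched as follows. `struct frag` and `frag_alloc` are hypothetical, and rejecting an undersized request with NULL is an assumption of this sketch; the commit message does not say how the real code reacts.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object ending in a flexible array member. */
struct frag {
    size_t length;           /* usable bytes in payload */
    unsigned char payload[]; /* sized by the requested fragment size */
};

/* Allocate a fragment of exactly frag_size bytes. Requests smaller than
 * the base structure are rejected (an assumption of this sketch);
 * larger requests are honored as given rather than truncated. */
static struct frag *frag_alloc(size_t frag_size)
{
    if (frag_size < sizeof(struct frag)) {
        return NULL;
    }
    struct frag *f = calloc(1, frag_size);
    if (NULL != f) {
        f->length = frag_size - sizeof(struct frag);
    }
    return f;
}
```

If the allocator silently fell back to `sizeof(struct frag)`, writes into `payload` would run past the allocation, which is consistent with the crashes and hangs the commit message lists.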
Ralph Castain
a7045352e2 Modify the OPAL_LIST_RELEASE and OPAL_LIST_DESTRUCT macros to release the objects only when the list object is refcounted down to 1, which will then reach zero when destructed/released at the end of the macro
2015-10-27 16:42:46 -07:00
Ralph Castain
02cdd046bd Revert "Releasing the list items when list destructor is called"
This reverts commit 7579ae3086.
2015-10-27 15:24:48 -07:00
rhc54
0bc51375f3 Merge pull request #1004 from rppendya/rppendya_list_release
Releasing the list items when list destructor is called
2015-10-21 14:34:19 -07:00
Nathan Hjelm
039c7dbcd6 opal/mutex: add static mutex initializers
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-14 16:08:41 -06:00