1
1
Граф коммитов

715 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
e8277d9d06 tests/asm/run_tests: fix basename usage
Looks like this script was left over from quite a long time ago, and
was expecting CLI params from the "old"-style Automake test engine.
Update it to look for `--test-name` to get the test name, and update a
few other minor style things.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-17 10:23:13 -07:00
Rainer Keller
a8cdc0d38b Restore testing all datatypes.
Signed-off-by: Rainer Keller <rainer.keller@hs-esslingen.de>
2020-05-20 17:21:54 +02:00
Ralph Castain
bd29ab0ae9
Update dpm to handle deprecation of MPI_Info keys
Deprecate the current OMPI-specific MPI_Info key definitions for
MPI_Comm_spawn and replace them with their PMIx equivalents. Issue a
deprecation/conversion warning as this is done. Also issue deprecation
warnings for options such as "ompi_non_mpi" that are no longer used.

Handle both cases where the user might pass either the PMIx attribute
name itself (e.g., "PMIX_MAPBY") or the string value of the attribute
(e.g., PMIX_MAPBY, which translates to "pmix.mapby"). This can only be
done for PMIx v4 and above, so protect that code.

Silence a couple of Coverity warnings and add a test along the way.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-29 14:56:38 -07:00
Ralph Castain
6d29bbfde8
Cleanup heterogeneous builds
Consolidate the ompi_process_info and opal_process_info structs to
remove duplicate storage and conversion issues. Unwind some interweaving
of include files using opal.h. Silence a couple of warnings.

For now, set the arch to local if PMIX_ARCH is not found.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-22 12:46:27 -07:00
Jeff Squyres
b2e0957d6f
Merge pull request #7610 from bosilca/topic/fix_MPI_T
Follow the MPI_T guidelines on return errors.
2020-04-12 14:12:32 -04:00
Ralph Castain
a210f8046f
Cleanup ompi/dpm operations
Do some code cleanup in the connect/accept code. Ensure that the OMPI
layer has access to the PMIx identifier for the process. Add macros for
converting PMIx names to/from strings. Cleanup a few of the simple test
programs. Add a little more info to a btl/tcp error message.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-08 08:37:25 -07:00
George Bosilca
f4af1848c9
Follow the MPI_T guidelines on return errors.
As indicated in the MPI3.2 document 14.3.10 page 599 line 1, the only
MPI error code possible is MPI_SUCCESS. All other errors must be in the
error class MPI_T_ERR*.
Fix the return of few pvar/cvar function that failed to correctly
convert to an MPI error code.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-04-08 00:02:45 -04:00
Jeff Squyres
9687d5e867 Upgrade all www.open-mpi.org URLs to https
Found a handful of other URLs that weren't https-ized, so I updated
them, too (after verifying that they support https, of course).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-04-02 10:43:50 -04:00
Nathan Hjelm
160ff188b8
Merge pull request #7169 from hjelmn/fix_what_wg21_calls_our_problem_not_theirs_seriously__in_some_ways_they_are_correct_but_wtf
configure: use -iquote for non-system include paths
2020-03-30 09:22:54 -07:00
Shintaro Iwasaki
8cab081770 test/class: fix opal_fifo and opal_lifo
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
2020-03-27 10:16:03 -06:00
Noah Evans
ee3517427e Add threads framework
Add a framework to support different types of threading models including
user space thread packages such as Qthreads and argobot:

https://github.com/pmodels/argobots

https://github.com/Qthreads/qthreads

The default threading model is pthreads.  Alternate thread models are
specificed at configure time using the --with-threads=X option.

The framework is static.  The theading model to use is selected at
Open MPI configure/build time.

mca/threads: implement Argobots threading layer

config: fix thread configury

- Add double quotations
- Change Argobot to Argobots
config: implement Argobots check

If the poll time is too long, MPI hangs.

This quick fix just sets it to 0, but it is not good for the
Pthreads version. Need to find a good way to abstract it.

Note that even 1 (= 1 millisecond) causes disastrous performance
degradation.

rework threads MCA framework configury

It now works more like the ompi/mca/rte configury,
modulo some edge items that are special for threading package
linking, etc.

qthreads module
some argobots cleanup

Signed-off-by: Noah Evans <noah.evans@gmail.com>
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-03-27 10:15:45 -06:00
Gilles Gouaillardet
69bc2e8372 misc: fix <> vs "" includes throught the ompi codebase
This commit fixes an issue with the include usage in some
ompi source files. These source files are using the <> form
of include when the "" form is correct (as these are internal,
**not** system headers).

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-03-09 21:13:49 -04:00
Ralph Castain
dcf110d432
Add missing Makefile
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-22 13:17:34 -08:00
Ralph Castain
7e2874a83d
Save the old ORTE simple tests
Useful when debugging RTE-related issues

Not for inclusion in the tarball - just added to git repo for use by
developers.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-21 06:15:06 -08:00
Charles Shereda
cbc6feaab2 Created opal_gethostname() as safer gethostname substitute.
The opal_gethostname() function provides a more robust mechanism
to retrieve the hostname than gethostname(), which can return
results that are not null-terminated, and which can vary in its
behavior from system to system.

opal_gethostname() just returns the value in opal_process_info.nodename;
this is populated in opal_init_gethostname() inside opal_init.c.

-Changed all gethostname calls in opal subtree to opal_gethostname
-Changed all gethostname calls in orte subtree to opal_gethostname
-Changed all gethostname calls in ompi subdir to opal_gethostname
-Changed all gethostname calls in oshmem subdir to opal_gethostname
-Changed opal_if.c in test subdir to use opal_gethostname
-Changed opal_init.c to include opal_init_gethostname. This function
 returns an int and directly sets opal_process_info.nodename per
 jsquyres' modifications.

Relates to open-mpi#6801

Signed-off-by: Charles Shereda <cpshereda@lanl.gov>
2020-01-13 08:52:17 -08:00
Joseph Schuchart
c385c927fb Ensure proper alignment of memory provided by MPI
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-10-01 11:54:29 +02:00
Ralph Castain
373e816b37
Ensure buffer_unload leaves the buffer in a clean state
Silence a warning in orte/nidmap

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-04 08:32:27 -07:00
George Bosilca
82d632278a
Add a test for datatypes composed by multiple predefined
elements that can be merged into a larger UINT1 type.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-30 19:56:48 -04:00
Jeff Squyres
2ab8109be1 Update OPAL DDT variable names
These variables were renamed in
904276bb44caec207638247f23139bc21bc6a09e; update them to use the new
names.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-08-27 12:00:20 -07:00
George Bosilca
0a24f0374e
Small improvements on the test.
Rework the to_self test to be able to be used as a benchmark.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:09 -04:00
George Bosilca
6c75334162
Use the correct counter name in the example.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-05-29 00:54:56 -04:00
George Bosilca
d141bf7912 Update the datatype dump to match the actual types.
Update the comments to better reflect what is going on.
Minor indentations.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-05-10 18:03:57 -04:00
George Bosilca
e42b573cd3
Fix the PVAR allocation usage.
According to the MPI standard the obj_handle is a pointer to an MPI
object, and therefore cannot be MPI_COMM_WORLD. The MPI standard example
14.6 highlight this usage.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-02-02 19:03:43 -05:00
George Bosilca
5a82c4fd07
Provide a better fix for #6285.
The issue was a little complicated due to the internal stack used in the
convertor. The main issue was that in the case where we run out of iov
space to save the raw description of the data while hanbdling a
repetition (loop), instead of saving the current position and bailing out
directly we reading of the next predefined type element. It worked in
most cases, except the one identified by the HDF5 test. However, the
biggest issue here was the drop in performance for all ensuing calls to
the convertor pack/unpack, as instead of handling contiguous loops as a
whole (and minimizing the number of memory copies) we copied data
description by data description.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-01-31 10:01:48 -05:00
Nathan Hjelm
ea40d48899
Merge pull request #6295 from ggouaillardet/topic/opal_convertor_raw
opal/datatype: fix opal_convertor_raw()
2019-01-29 10:57:29 -07:00
Gilles Gouaillardet
45fb69b2b9 ompi/datatype: fix how we compute the space needed for the args
Refs. open-mpi/ompi#6275

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-01-28 15:26:11 +09:00
Gilles Gouaillardet
0832ab5acc opal/datatype: fix opal_convertor_raw
correctly handle the case in which iovec is full and the
last accessed element of the datatype is the beginning of a loop

Refs. open-mpi/ompi#6285

Thanks Axel Huebl for reporting this

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-01-23 15:38:43 +09:00
bosilca
182a2db2a4
Merge pull request #6029 from ggouaillardet/topic/large_datatypes
opal/datatype: correctly handle large datatypes
2018-12-24 12:49:52 -05:00
Nathan Hjelm
46255d0790 test: call opal_init/finalize_util in ddt tests
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-12-18 14:37:04 -07:00
Nathan Hjelm
0edfd328f8 opal: clean up init/finalize
This commit contains the following changes:

 - Remove the unused opal_test_init/opal_test_finalize
   functions. These functions are not used by anything in the code
   base or MTT. Tests use opal_init_util/opal_finalize_util instead.

 - Get rid of gotos in opal_init_util and opal_init. Replaced them
   with a cleaner solution.

 - Automatically register cleanup functions in init functions. The
   cleanup functions are executed in the reverse order of the
   initialization functions. The cleanup functions are run in
   opal_finalize_util() before tearing down the class system.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-12-18 14:37:04 -07:00
George Bosilca
1d8ad9281f Add more details about what is going on.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-12-06 13:30:58 +09:00
George Bosilca
88a693bf71 Add a test for very large data.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-12-06 13:30:58 +09:00
Brian Barrett
e9e4d2a4bc Handle asprintf errors with opal_asprintf wrapper
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error.  However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined.  Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
Nathan Hjelm
000f9eed4d opal: add types for atomic variables
This commit updates the entire codebase to use specific opal types for
all atomic variables. This is a change from the prior atomic support
which required the use of the volatile keyword. This is the first step
towards implementing support for C11 atomics as that interface
requires the use of types declared with the _Atomic keyword.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:48:55 -06:00
Gilles Gouaillardet
a02be5e91a test: protect <sys/mount.h> with the HAVE_SYS_MOUNT_H macro
Thanks Zoltan Mizsei for bringing this to our attention.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-08-24 17:03:54 +09:00
Nathan Hjelm
1c84f48640 config: remove OPAL_ENABLE_MULTI_THREADS config macro
We long ago hard-coded this value to 1. This commit cleans it out
entirely.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-23 13:47:02 -06:00
Thananon Patinyasakdikul
390d72addd
Merge pull request #4885 from davideberius/spc_pr
Initial Software-based Performance Counters PR
2018-06-12 14:04:49 -07:00
David Eberius
d377a6b6f4 Added Software-based Performance Counters driver code along with several counters.
This code is the implementation of Software-base Performance Counters as described in the paper 'Using Software-Base Performance Counters to Expose Low-Level Open MPI Performance Information' in EuroMPI/USA '17 (http://icl.cs.utk.edu/news_pub/submissions/software-performance-counters.pdf).  More practical usage information can be found here: https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI.

All software events functions are put in macros that become no-ops when SOFTWARE_EVENTS_ENABLE is not defined.  The internal timer units have been changed to cycles to avoid division operations which was a large source of overhead as discussed in the paper.  Added a --with-spc configure option to enable SPCs in the Open MPI build.  This defines SOFTWARE_EVENTS_ENABLE.  Added an MCA parameter, mpi_spc_enable, for turning on specific counters.  Added an MCA parameter, mpi_spc_dump_enabled, for turning on and off dumping SPC counters in MPI_Finalize.  Added an SPC test and example.

Signed-off-by: David Eberius <deberius@vols.utk.edu>
2018-06-11 22:48:16 -04:00
Nathan Hjelm
74563d22a0 test/ddt_lib: remove UB/LB tests
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-05-31 09:44:19 -06:00
Geoffrey Paulsen
cc9f713d09 Fixing 'make check' test opal_fifo.
xlc on ppc64le complains about incompatible pointer types discards qualifiers.
This fix allows 'make check' to pass on ppc64le Power9 for master and v3.1.x.

> opal_fifo.c:110:26: warning: assigning to 'opal_list_item_t *' (aka 'struct opal_list_item_t *') from
>       'volatile opal_list_item_t *volatile' (aka 'volatile struct opal_list_item_t *volatile') discards qualifiers
>       [-Wincompatible-pointer-types-discards-qualifiers]
>     for (count = 0, item = fifo->opal_fifo_head.data.item ; item != &fifo->opal_fifo_ghost ;
>                          ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 1 warning generated.

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2018-05-03 21:31:27 -05:00
Howard Pritchard
0751578cbf
Merge pull request #4949 from hppritcha/topic/memkind_update
mpool/memkind: refactor to use the current API
2018-04-25 07:24:28 -06:00
Howard Pritchard
824197f886 mpool/memkind: refactor to use the current API
The mpool/memkind component was using a deprecated "partitions" API.
This commit refactors the memkind component to make use of the
supported public API.

The public API uses 3 parameters to specify a mpool "kind":

- a memkind type (which for now is just default or HBM)
- a memkind policy
- a memkind_bits (partly to specify pagesize)

The MCA parameters were changed to reflect these memkind
parameters.

Add a make check test for sanity checking of the memkind component.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-04-24 22:11:21 -06:00
luz.paz
06b121eb70 Misc. trivial typos
Found via `codespell -q 3`

Signed-off-by: luz paz <luzpaz@users.noreply.github.com>
2018-04-09 11:45:58 -04:00
Ralph Castain
345916f2f3 Remove the orte_nidmap test
Moved to the ompi-tests repo

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-16 11:47:44 -08:00
Gilles Gouaillardet
c988011afd test/util: test the regx framework
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 16:12:42 +09:00
Nathan Hjelm
7893248c5a opal/asm: add fetch-and-op atomics
This commit adds support for fetch-and-op atomics. This is needed
because and and or are irreversible operations so there needs to be a
way to get the old value atomically. These are also the only semantics
supported by C11 (there is not atomic_op_fetch, just
atomic_fetch_op). The old op-and-fetch atomics have been defined in
terms of fetch-and-op.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:23 -07:00
Nathan Hjelm
1282e98a01 opal/asm: rename existing arithmetic atomic functions
This commit renames the arithmetic atomic operations in opal to
indicate that they return the new value not the old value. This naming
differentiates these routines from new functions that return the old
value.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:22 -07:00
Nathan Hjelm
84f63d0aca opal/asm: add opal_atomic_compare_exchange_strong functions
This commit adds a new set of compare-and-exchange functions. These
functions have a signature similar to the functions found in C11. The
old cmpset functions are now deprecated and defined in terms of the
new compare-and-exchange functions. All asm backends have been
updated.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-29 12:45:44 -07:00
Nathan Hjelm
3ff34af355 opal: rename opal_atomic_cmpset* to opal_atomic_bool_cmpset*
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preperation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-10-31 12:47:23 -06:00
George Bosilca
458ccc12e1
Move the profiling library in common/monitoring
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:18:23 -04:00