703 Commits

Author SHA1 Message Date
Ralph Castain
dcf110d432
Add missing Makefile
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-22 13:17:34 -08:00
Ralph Castain
7e2874a83d
Save the old ORTE simple tests
Useful when debugging RTE-related issues

Not for inclusion in the tarball - just added to git repo for use by
developers.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-21 06:15:06 -08:00
Charles Shereda
cbc6feaab2 Created opal_gethostname() as safer gethostname substitute.
The opal_gethostname() function provides a more robust mechanism
to retrieve the hostname than gethostname(), which can return
results that are not null-terminated, and which can vary in its
behavior from system to system.

opal_gethostname() just returns the value in opal_process_info.nodename;
this is populated in opal_init_gethostname() inside opal_init.c.

-Changed all gethostname calls in opal subtree to opal_gethostname
-Changed all gethostname calls in orte subtree to opal_gethostname
-Changed all gethostname calls in ompi subdir to opal_gethostname
-Changed all gethostname calls in oshmem subdir to opal_gethostname
-Changed opal_if.c in test subdir to use opal_gethostname
-Changed opal_init.c to include opal_init_gethostname. This function
 returns an int and directly sets opal_process_info.nodename per
 jsquyres' modifications.

Relates to open-mpi#6801

Signed-off-by: Charles Shereda <cpshereda@lanl.gov>
2020-01-13 08:52:17 -08:00
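The caching approach described in cbc6feaab2 can be sketched roughly as follows. This is a minimal illustration, not the Open MPI implementation: the names sketch_init_hostname, sketch_gethostname, and SKETCH_HOSTNAME_MAX are hypothetical stand-ins for opal_init_gethostname(), opal_gethostname(), and whatever buffer size the real code uses. It only shows why reading the name once into an explicitly terminated buffer avoids the non-terminated results that gethostname() can return.

```c
#define _POSIX_C_SOURCE 200112L /* assumption: makes gethostname() visible under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define SKETCH_HOSTNAME_MAX 256 /* hypothetical limit, not taken from Open MPI */

/* One extra byte so the terminator can never be overwritten by gethostname(). */
static char sketch_nodename[SKETCH_HOSTNAME_MAX + 1];
static int sketch_nodename_set = 0;

/* Query the system once and cache the result; returns 0 on success, -1 on error. */
int sketch_init_hostname(void)
{
    if (gethostname(sketch_nodename, SKETCH_HOSTNAME_MAX) != 0) {
        sketch_nodename[0] = '\0';
        return -1;
    }
    /* gethostname() may truncate without terminating, so terminate explicitly. */
    sketch_nodename[SKETCH_HOSTNAME_MAX] = '\0';
    sketch_nodename_set = 1;
    return 0;
}

/* Return the cached, always null-terminated name; no system call is made here. */
const char *sketch_gethostname(void)
{
    assert(sketch_nodename_set);
    return sketch_nodename;
}
```

Callers would run sketch_init_hostname() once during initialization and use sketch_gethostname() everywhere else, which mirrors the split between opal_init.c and the call sites named in the commit.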
Joseph Schuchart
c385c927fb Ensure proper alignment of memory provided by MPI
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-10-01 11:54:29 +02:00
Ralph Castain
373e816b37
Ensure buffer_unload leaves the buffer in a clean state
Silence a warning in orte/nidmap

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-04 08:32:27 -07:00
George Bosilca
82d632278a
Add a test for datatypes composed of multiple predefined
elements that can be merged into a larger UINT1 type.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-30 19:56:48 -04:00
Jeff Squyres
2ab8109be1 Update OPAL DDT variable names
These variables were renamed in
904276bb44caec207638247f23139bc21bc6a09e; update them to use the new
names.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-08-27 12:00:20 -07:00
George Bosilca
0a24f0374e
Small improvements on the test.
Rework the to_self test to be able to be used as a benchmark.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:09 -04:00
George Bosilca
6c75334162
Use the correct counter name in the example.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-05-29 00:54:56 -04:00
George Bosilca
d141bf7912 Update the datatype dump to match the actual types.
Update the comments to better reflect what is going on.
Minor indentations.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-05-10 18:03:57 -04:00
George Bosilca
e42b573cd3
Fix the PVAR allocation usage.
According to the MPI standard the obj_handle is a pointer to an MPI
object, and therefore cannot be MPI_COMM_WORLD. The MPI standard example
14.6 highlights this usage.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-02-02 19:03:43 -05:00
George Bosilca
5a82c4fd07
Provide a better fix for #6285.
The issue was a little complicated due to the internal stack used in the
convertor. The main issue was that in the case where we run out of iov
space to save the raw description of the data while handling a
repetition (loop), instead of saving the current position and bailing out
directly we continued reading the next predefined type element. It worked in
most cases, except the one identified by the HDF5 test. However, the
biggest issue here was the drop in performance for all ensuing calls to
the convertor pack/unpack, as instead of handling contiguous loops as a
whole (and minimizing the number of memory copies) we copied data
description by data description.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-01-31 10:01:48 -05:00
Nathan Hjelm
ea40d48899
Merge pull request #6295 from ggouaillardet/topic/opal_convertor_raw
opal/datatype: fix opal_convertor_raw()
2019-01-29 10:57:29 -07:00
Gilles Gouaillardet
45fb69b2b9 ompi/datatype: fix how we compute the space needed for the args
Refs. open-mpi/ompi#6275

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-01-28 15:26:11 +09:00
Gilles Gouaillardet
0832ab5acc opal/datatype: fix opal_convertor_raw
correctly handle the case in which iovec is full and the
last accessed element of the datatype is the beginning of a loop

Refs. open-mpi/ompi#6285

Thanks Axel Huebl for reporting this

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-01-23 15:38:43 +09:00
bosilca
182a2db2a4
Merge pull request #6029 from ggouaillardet/topic/large_datatypes
opal/datatype: correctly handle large datatypes
2018-12-24 12:49:52 -05:00
Nathan Hjelm
46255d0790 test: call opal_init/finalize_util in ddt tests
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-12-18 14:37:04 -07:00
Nathan Hjelm
0edfd328f8 opal: clean up init/finalize
This commit contains the following changes:

 - Remove the unused opal_test_init/opal_test_finalize
   functions. These functions are not used by anything in the code
   base or MTT. Tests use opal_init_util/opal_finalize_util instead.

 - Get rid of gotos in opal_init_util and opal_init. Replaced them
   with a cleaner solution.

 - Automatically register cleanup functions in init functions. The
   cleanup functions are executed in the reverse order of the
   initialization functions. The cleanup functions are run in
   opal_finalize_util() before tearing down the class system.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-12-18 14:37:04 -07:00
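The reverse-order cleanup scheme described in 0edfd328f8 can be sketched as a simple stack of function pointers. This is an illustration under assumptions, not the OPAL code: sketch_register_cleanup, sketch_run_cleanups, the fixed SKETCH_MAX_CLEANUPS capacity, and the two trace callbacks are all hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_cleanup_fn_t)(void);

#define SKETCH_MAX_CLEANUPS 32 /* hypothetical fixed capacity */

static sketch_cleanup_fn_t sketch_cleanups[SKETCH_MAX_CLEANUPS];
static size_t sketch_num_cleanups = 0;

/* Called by each init step right after it succeeds; returns -1 if the stack is full. */
int sketch_register_cleanup(sketch_cleanup_fn_t fn)
{
    assert(fn != NULL);
    if (sketch_num_cleanups == SKETCH_MAX_CLEANUPS) {
        return -1;
    }
    sketch_cleanups[sketch_num_cleanups++] = fn;
    return 0;
}

/* Called from the finalize path: pops callbacks so the last registered runs first. */
void sketch_run_cleanups(void)
{
    while (sketch_num_cleanups > 0) {
        sketch_cleanups[--sketch_num_cleanups]();
    }
}

/* Two demo callbacks that record the order in which they run. */
static char sketch_trace[8];
static size_t sketch_trace_len = 0;

static void sketch_cleanup_first(void)  { sketch_trace[sketch_trace_len++] = 'A'; }
static void sketch_cleanup_second(void) { sketch_trace[sketch_trace_len++] = 'B'; }
```

Registering sketch_cleanup_first and then sketch_cleanup_second and calling sketch_run_cleanups() runs the second callback before the first, which is the property that lets later subsystems be torn down before the ones they depend on.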
George Bosilca
1d8ad9281f Add more details about what is going on.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-12-06 13:30:58 +09:00
George Bosilca
88a693bf71 Add a test for very large data.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-12-06 13:30:58 +09:00
Brian Barrett
e9e4d2a4bc Handle asprintf errors with opal_asprintf wrapper
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error.  However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined.  Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
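The contract described in e9e4d2a4bc (the output pointer is always NULL on failure) can be sketched as below. The name sketch_asprintf is hypothetical and this is not the opal_asprintf() source. To stay within standard C, the sketch formats with vsnprintf() instead of calling asprintf() itself, so it illustrates the guaranteed behavior rather than the actual wrapper.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string.
 * Returns the string length, or -1 on error with *ptr guaranteed to be NULL. */
int sketch_asprintf(char **ptr, const char *fmt, ...)
{
    va_list ap, ap_copy;
    int len;

    assert(ptr != NULL && fmt != NULL);
    *ptr = NULL; /* defined value on every error path */

    va_start(ap, fmt);
    va_copy(ap_copy, ap);
    len = vsnprintf(NULL, 0, fmt, ap_copy); /* first pass: measure only */
    va_end(ap_copy);

    if (len >= 0) {
        *ptr = malloc((size_t)len + 1);
        if (*ptr != NULL) {
            vsnprintf(*ptr, (size_t)len + 1, fmt, ap); /* second pass: write */
        } else {
            len = -1;
        }
    }
    va_end(ap);
    return len;
}
```

With this contract a caller may test either the return value or the pointer, and calling free() on the pointer after a failure is harmless.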
Nathan Hjelm
000f9eed4d opal: add types for atomic variables
This commit updates the entire codebase to use specific opal types for
all atomic variables. This is a change from the prior atomic support
which required the use of the volatile keyword. This is the first step
towards implementing support for C11 atomics as that interface
requires the use of types declared with the _Atomic keyword.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:48:55 -06:00
Gilles Gouaillardet
a02be5e91a test: protect <sys/mount.h> with the HAVE_SYS_MOUNT_H macro
Thanks Zoltan Mizsei for bringing this to our attention.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-08-24 17:03:54 +09:00
Nathan Hjelm
1c84f48640 config: remove OPAL_ENABLE_MULTI_THREADS config macro
We long ago hard-coded this value to 1. This commit cleans it out
entirely.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-23 13:47:02 -06:00
Thananon Patinyasakdikul
390d72addd
Merge pull request #4885 from davideberius/spc_pr
Initial Software-based Performance Counters PR
2018-06-12 14:04:49 -07:00
David Eberius
d377a6b6f4 Added Software-based Performance Counters driver code along with several counters.
This code is the implementation of Software-Based Performance Counters as described in the paper 'Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information' in EuroMPI/USA '17 (http://icl.cs.utk.edu/news_pub/submissions/software-performance-counters.pdf).  More practical usage information can be found here: https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI.

All software event functions are put in macros that become no-ops when SOFTWARE_EVENTS_ENABLE is not defined.  The internal timer units have been changed to cycles to avoid division operations, which were a large source of overhead as discussed in the paper.  Added a --with-spc configure option to enable SPCs in the Open MPI build.  This defines SOFTWARE_EVENTS_ENABLE.  Added an MCA parameter, mpi_spc_enable, for turning on specific counters.  Added an MCA parameter, mpi_spc_dump_enabled, for turning on and off dumping SPC counters in MPI_Finalize.  Added an SPC test and example.

Signed-off-by: David Eberius <deberius@vols.utk.edu>
2018-06-11 22:48:16 -04:00
Nathan Hjelm
74563d22a0 test/ddt_lib: remove UB/LB tests
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-05-31 09:44:19 -06:00
Geoffrey Paulsen
cc9f713d09 Fixing 'make check' test opal_fifo.
xlc on ppc64le complains that an assignment between incompatible pointer types discards qualifiers.
This fix allows 'make check' to pass on ppc64le Power9 for master and v3.1.x.

> opal_fifo.c:110:26: warning: assigning to 'opal_list_item_t *' (aka 'struct opal_list_item_t *') from
>       'volatile opal_list_item_t *volatile' (aka 'volatile struct opal_list_item_t *volatile') discards qualifiers
>       [-Wincompatible-pointer-types-discards-qualifiers]
>     for (count = 0, item = fifo->opal_fifo_head.data.item ; item != &fifo->opal_fifo_ghost ;
>                          ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 1 warning generated.

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2018-05-03 21:31:27 -05:00
Howard Pritchard
0751578cbf
Merge pull request #4949 from hppritcha/topic/memkind_update
mpool/memkind: refactor to use the current API
2018-04-25 07:24:28 -06:00
Howard Pritchard
824197f886 mpool/memkind: refactor to use the current API
The mpool/memkind component was using a deprecated "partitions" API.
This commit refactors the memkind component to make use of the
supported public API.

The public API uses 3 parameters to specify a mpool "kind":

- a memkind type (which for now is just default or HBM)
- a memkind policy
- a memkind_bits (partly to specify pagesize)

The MCA parameters were changed to reflect these memkind
parameters.

Add a make check test for sanity checking of the memkind component.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-04-24 22:11:21 -06:00
luz.paz
06b121eb70 Misc. trivial typos
Found via `codespell -q 3`

Signed-off-by: luz paz <luzpaz@users.noreply.github.com>
2018-04-09 11:45:58 -04:00
Ralph Castain
345916f2f3 Remove the orte_nidmap test
Moved to the ompi-tests repo

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-16 11:47:44 -08:00
Gilles Gouaillardet
c988011afd test/util: test the regx framework
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 16:12:42 +09:00
Nathan Hjelm
7893248c5a opal/asm: add fetch-and-op atomics
This commit adds support for fetch-and-op atomics. This is needed
because AND and OR are irreversible operations, so there needs to be a
way to get the old value atomically. These are also the only semantics
supported by C11 (there is no atomic_op_fetch, just
atomic_fetch_op). The old op-and-fetch atomics have been defined in
terms of fetch-and-op.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:23 -07:00
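The relationship described in 7893248c5a between the two families can be sketched with C11 atomics. The function names below are hypothetical and are not the opal_atomic_* names; only atomic_fetch_and() from <stdatomic.h> is a real library call. The sketch shows that the new value can always be recomputed from the old one, whereas the old value cannot be recovered from the result of an AND.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* fetch-and-op: returns the value held BEFORE the AND was applied. */
static inline int32_t sketch_fetch_and_32(_Atomic int32_t *addr, int32_t mask)
{
    assert(addr != NULL);
    return atomic_fetch_and(addr, mask);
}

/* op-and-fetch: returns the value held AFTER the AND, defined in terms of
 * fetch-and-op by reapplying the operation to the old value. */
static inline int32_t sketch_and_fetch_32(_Atomic int32_t *addr, int32_t mask)
{
    return sketch_fetch_and_32(addr, mask) & mask;
}
```

The reverse derivation is impossible for AND and OR: once bits have been cleared or set, the result alone does not say what they were before, which is why the fetch-and-op form is the primitive.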
Nathan Hjelm
1282e98a01 opal/asm: rename existing arithmetic atomic functions
This commit renames the arithmetic atomic operations in opal to
indicate that they return the new value not the old value. This naming
differentiates these routines from new functions that return the old
value.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:22 -07:00
Nathan Hjelm
84f63d0aca opal/asm: add opal_atomic_compare_exchange_strong functions
This commit adds a new set of compare-and-exchange functions. These
functions have a signature similar to the functions found in C11. The
old cmpset functions are now deprecated and defined in terms of the
new compare-and-exchange functions. All asm backends have been
updated.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-29 12:45:44 -07:00
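The layering described in 84f63d0aca, where the old boolean compare-and-set is defined in terms of a C11-style compare-and-exchange, can be sketched as follows. The function names are hypothetical rather than the real opal_atomic_* symbols, and the sketch uses atomic_compare_exchange_strong() from <stdatomic.h> instead of the per-architecture asm backends the commit updates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* C11-style signature: on failure, *expected is overwritten with the value
 * actually found, so the caller can retry without a separate load. */
static inline bool sketch_compare_exchange_strong_32(_Atomic int32_t *addr,
                                                     int32_t *expected,
                                                     int32_t desired)
{
    assert(addr != NULL && expected != NULL);
    return atomic_compare_exchange_strong(addr, expected, desired);
}

/* Older cmpset-style interface expressed through the new one: the observed
 * value is discarded and only success or failure is reported. */
static inline bool sketch_bool_cmpset_32(_Atomic int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    return sketch_compare_exchange_strong_32(addr, &oldval, newval);
}
```

The extra information returned through the expected pointer is what makes the newer interface the more general of the two.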
Nathan Hjelm
3ff34af355 opal: rename opal_atomic_cmpset* to opal_atomic_bool_cmpset*
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preparation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-10-31 12:47:23 -06:00
George Bosilca
458ccc12e1
Move the profiling library in common/monitoring
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:18:23 -04:00
bosilca
a680b3ac6d Merge pull request #3853 from clementFoyer/master
OMPI monitoring: Simplify the communicator's name caching management + misc test changes
2017-09-25 12:14:36 -04:00
Brian Barrett
bffcc3bca0 util: move graph solver from usnic to util
Cisco wrote a bipartite graph solver to properly solve
interface pair selection for usNIC.  Using the reachable
framework, the TCP BTL (and possibly the runtime network
code) can use the graph solver to make more optimal pair
selection.  Jeff was happy to have the code more broadly
used, but didn't have time to do the move, hence this
commit.

There are a couple of minor changes to the code compared
to the usNIC version.  Obviously, the functions have
been renamed to match naming convention for their new
home.  Since it's easier to write unit tests for
util/ code, the unit tests have been made first class
tests run at "make check" time.  This last bit required
moving some of the definitions into a new header,
bipartite_graph_internal.h, so that they could be
included in both the library code and the test code.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-15 15:08:47 -07:00
Clement Foyer
d5c192c825 Fix typos. Fix improper output on test. Reorder benchmarks.
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-09-11 17:37:25 +02:00
Jeff Squyres
dee8cfbfd0 opal_path_nfs: ensure arrays are always long enough
This test used to have fixed-sized arrays for the mounts that it was
checking.  However, we periodically run across machines with more
mounts than can fit into those fixed-size arrays.  Rather than
periodically increasing the size of those arrays (after re-discovering
that the error is due to fixed-size arrays), just count how many
entries there are and make arrays that are big enough.

Additionally, add a check to ensure that we don't go over the max size
of the array when reading/filling them.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-09-06 07:01:45 -07:00
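The count-then-allocate approach described in dee8cfbfd0 can be sketched as a two-pass routine. This is not the opal_path_nfs test code: the function names are hypothetical, and a newline-separated string stands in for the system mount table so the example stays self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pass 1: count the non-empty newline-separated entries. */
static size_t sketch_count_entries(const char *text)
{
    size_t n = 0;
    const char *p = text;
    while (*p != '\0') {
        size_t len = strcspn(p, "\n");
        if (len > 0) {
            n++;
        }
        p += len;
        if (*p == '\n') {
            p++;
        }
    }
    return n;
}

/* Pass 2: allocate exactly enough slots, then fill them with a bounds check. */
char **sketch_collect_entries(const char *text, size_t *count_out)
{
    size_t capacity, used = 0;
    char **entries;
    const char *p = text;

    assert(text != NULL && count_out != NULL);
    *count_out = 0;
    capacity = sketch_count_entries(text);
    entries = calloc(capacity > 0 ? capacity : 1, sizeof(*entries));
    if (entries == NULL) {
        return NULL;
    }
    /* 'used < capacity' guards against overrun even if the two passes disagree. */
    while (*p != '\0' && used < capacity) {
        size_t len = strcspn(p, "\n");
        if (len > 0) {
            char *copy = malloc(len + 1);
            if (copy == NULL) {
                break;
            }
            memcpy(copy, p, len);
            copy[len] = '\0';
            entries[used++] = copy;
        }
        p += len;
        if (*p == '\n') {
            p++;
        }
    }
    *count_out = used;
    return entries;
}

/* Release everything returned by sketch_collect_entries(). */
void sketch_free_entries(char **entries, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        free(entries[i]);
    }
    free(entries);
}
```

The bounds check in the fill loop corresponds to the commit's additional guard: with a live mount table the number of entries could change between the counting pass and the filling pass.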
Mark Allen
9b029c1be3 removing nmcheck_prefix.pl due to false positives
This test has proven to produce too many false positives so far. I hope
to re-enable it in the future, but until it has a longer history of not
producing false positives it doesn't need to produce false nuisance
failures for everybody.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-08-24 13:01:39 -04:00
Jeff Squyres
9d09fe0151 nmcheck_prefix: more updates for more compilers
Ignore a few more symbols to pass Absoft and modern gcc.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 12:28:49 -07:00
Mark Allen
245006a23d updating nmcheck_prefix.pl to accept some more compiler-generated names
Someone posted an MTT test where libmpi_usempi_ignore_tkr.so ended
up with symbols like these being identified as errors:
    [error]   MPI
    [error]   _Cmpi_fortran_status_ignore
    [error]   _Cmpi_fortran_statuses_ignore
those must be compiler-generated names so we shouldn't identify them
as problematic.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-08-15 15:48:22 -04:00
KAWASHIMA Takahiro
04ed29ceac Merge pull request #4036 from kawashima-fj/pr/nmcheck
test: Update nmcheck_prefix.pl
2017-08-08 00:23:18 -05:00
Nathan Hjelm
76320a8ba5 opal: rename opal_atomic_init to opal_atomic_lock_init
This function is used to initialize an opal atomic lock. The old name
was confusing.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 14:15:11 -06:00
KAWASHIMA Takahiro
d468cdb7a6 test: Update nmcheck_prefix.pl
The linker of Linux/AArch64 (at least) generates `__bss_start__`,
`__bss_end__`, `_bss_end__`, and `__end__` symbols.

`libmpi_usempi_ignore_tkr.so` is added but `libmpi_usempif08.so`
is not added because `use-mpi-f08` has `contains` statements
in modules and compilers automatically generate compiler-specific
symbols for them. For example, gfortran 4.9 generates
`__mpi_f08_callbacks_MOD_mpi_comm_dup_fn` etc.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-08-07 13:54:15 +09:00
Nathan Hjelm
ebce88b7ad opal: remove generated asm code
Every modern compiler supports either inline assembly or builtin atomic
operations. Because of this it is time to delete all the code associated
with pre-built atomics.

This commit also cleans out the DEC and XLC asm checks. Neither check
does anything and the XLC compiler supports GCC ASM.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-03 09:18:58 -06:00
Mark Allen
f0af4636ce testcase to check for bad symbol name prefixes
This checks the main libs that would be directly or indirectly linked
against the user's executable (libmpi.so, libmpi_mpifh.so, libmpi_usempi.so,
libopen-rte, libopen-pal) using "nm" and looking for symbols without ompi_
opal_ mpi_ etc prefixes.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-07-11 02:13:21 -04:00