
30986 Commits

Author SHA1 Message Date
Brian Barrett
675a899532 build: Allow symbol search tool to skip directories
At the end of "make install", a tool is run to search for common
symbols in the built artifacts, to work around issues on MacOS.
This tool requires an exclude list for symbols that must be
in the common section (for example, symbols in executables rather
than libraries, or symbols that Fortran requires).

This commit adds the ability to exclude certain directories from
the search, such as directories that are 3rd party packages or
only contain tests/executables, which will not run into problems
on MacOS.

To simplify that change, the file search in find_common_syms was
also rewritten to use the Perl-standard File::Find package instead
of calling the find executable.  Theoretically, this should be
mildly faster, but is also significantly easier to modify.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-09-30 23:34:11 +00:00
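A minimal sketch of the directory-exclusion idea described in the commit above. The actual tool is written in Perl and uses File::Find; this is only a Python analogue built on `os.walk`, and the function name `find_files`, its parameters, and the default suffixes are hypothetical, not taken from find_common_syms.

```python
import os


def find_files(root, exclude_dirs=(), suffixes=(".o", ".a")):
    """Yield files under root that end in one of suffixes.

    exclude_dirs holds paths relative to root that must not be searched
    (hypothetical parameter, chosen for this illustration).
    """
    suffixes = tuple(suffixes)
    excluded = {os.path.normpath(os.path.join(root, d)) for d in exclude_dirs}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(dirpath, d)) not in excluded
        ]
        for name in sorted(filenames):
            if name.endswith(suffixes):
                yield os.path.join(dirpath, name)
```

Pruning the directory list during the walk, rather than filtering results afterwards, means excluded trees such as third-party packages are never traversed at all.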
Nathan Hjelm
1f5ed0b83d
Merge pull request #8070 from devreal/osc-page-align
OSC RDMA: put memory for each process into separate pages
2020-09-29 07:54:45 -06:00
Joseph Schuchart
52b52b8ebb OSC RDMA: only touch pages before memory registration, don't fill them
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-29 07:45:07 +02:00
Joseph Schuchart
d11ccbada9 OSC RDMA: put memory of each process into separate pages
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-29 07:44:17 +02:00
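The commit above only states that each process's memory is placed in separate pages. As a rough illustration of what that implies for the layout, the sketch below rounds each region up to a whole number of pages so that no two regions share a page. The function name, the 4096-byte default page size, and the contiguous layout are assumptions made for this example, not details from the OSC RDMA code.

```python
def page_aligned_offsets(sizes, page_size=4096):
    """Return (offsets, total_bytes) with every region starting on a page boundary.

    sizes: requested bytes per process (hypothetical input for this sketch).
    """
    offsets = []
    cursor = 0
    for size in sizes:
        offsets.append(cursor)
        # Round the region up to a whole number of pages (ceiling division).
        cursor += -(-size // page_size) * page_size
    return offsets, cursor


offsets, total = page_aligned_offsets([100, 4096, 5000])
print(offsets, total)
```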
bosilca
08f68671db
Merge pull request #8060 from bosilca/fix/ialltoallw
Prevent ranks that have no data to exchange from skipping the non-blocking collective tag increment.
2020-09-26 12:25:13 -04:00
Nathan Hjelm
920315611e
Merge pull request #8054 from hjelmn/kill_the_never_going_to_work_patcher_linux_component_to_prevent_future_confusion_as_to_its_effectiveness
patcher: remove the linux component
2020-09-24 19:27:02 -06:00
George Bosilca
96fea22cdd
Don't allow a rank to skip counting the collective if it has no data
to exchange.

This is the same logic as in 77eaa5c applied to ialltoallw.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-24 13:29:01 -04:00
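The rule behind this fix can be shown with a toy model: every rank reserves a tag for each non-blocking collective before checking whether it has anything to send, so the per-rank counters never drift apart. `TagCounter` and `start_collective` are hypothetical names for this sketch, and the incrementing counter is an assumption; the real tag scheme in Open MPI may differ.

```python
class TagCounter:
    """Per-rank counter handing out one tag per collective (toy model)."""

    def __init__(self):
        self.next_tag = 0

    def reserve(self):
        tag = self.next_tag
        self.next_tag += 1
        return tag


def start_collective(counter, has_data):
    # Reserve the tag unconditionally, before any early return.
    tag = counter.reserve()
    if not has_data:
        return None  # nothing to exchange, but the tag was still consumed
    return tag
```

If the early return came before `reserve()`, a rank with no data would fall one tag behind its peers, and its next collective would use a tag the others had already spent.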
Yossi Itigin
b532564643
Merge pull request #8041 from brminich/topic/shmem_scoll_fix
SHMEM/SCOLL: Fix inplace reductions
2020-09-23 13:56:10 +03:00
Mikhail Brinskii
dfe20e0472 SHMEM/SCOLL: Fix inplace reductions
Signed-off-by: Mikhail Brinskii <mikhailb@nvidia.com>
2020-09-23 10:06:36 +03:00
bosilca
21c9c666ab
Merge pull request #8039 from bosilca/fix/adapt
Fix some corner cases with ADAPT
2020-09-18 17:18:41 -04:00
George Bosilca
77eaa5c8b8
Keep the non-blocking collective tags globally in sync.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-18 12:52:14 -04:00
George Bosilca
c98e387a53
Many fixes and improvements to ADAPT
- Add support for fallback to previous coll module on non-commutative operations (#30)
- Replace mutexes by atomic operations.
- Use the correct nbc request type (for both ibcast and ireduce)
  * coll/base: document type casts in ompi_coll_base_retain_*
- add module-wide topology cache
- use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
- reduce number of memory allocations
- call the default request completion.
  - Remove the requests from the Fortran lookup conversion tables before completing
    and freeing them.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>

Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-18 12:50:17 -04:00
Nathan Hjelm
7fca99b2f7 patcher: remove the linux component
The Linux component was an attempt to hook calls by patching the dynamic
symbol table. It, unfortunately, does not work as it will always miss
calls made internally by glibc. For example, it might catch a user call
directly to munmap but will miss the chain free -> munmap. Since the
latter is the common case we were trying to hook, this made the component
unusable. This PR finally kills the component.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-09-18 10:23:01 -06:00
bosilca
eca00a7a3b
Merge pull request #8042 from bosilca/fix/sm_emu
Fix a copy/paste in the RDMA emulation.
2020-09-14 11:43:00 -04:00
Jeff Squyres
3a93e4f94d
Merge pull request #8038 from devreal/fix-opal-pmix-cond-init
Use correct conditional variable initializer in opal/mca/pmix/base
2020-09-14 09:38:43 -04:00
Jeff Squyres
d5791b2770
Merge pull request #8043 from ggouaillardet/topic/status_f2f08
mpif-h: fix a typo in MPI_Status_f2f08()
2020-09-14 09:32:34 -04:00
Gilles Gouaillardet
fb8bfccb83 mpif-h: fix a typo in MPI_Status_f2f08()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-14 13:56:16 +09:00
George Bosilca
49da998f33
Fix a copy/paste in the RDMA emulation.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-13 22:56:58 -04:00
Jeff Squyres
1b0dfcdfab
Merge pull request #7762 from ggouaillardet/topic/mpi_status_f08_c
Add missing MPI_Status conversion subroutines
2020-09-10 09:45:58 -04:00
Jeff Squyres
c04dc355de mpi/man: convert MPI_Status conversion man pages to Markdown
Convert the MPI_Status_f082f, MPI_Status_f082c, and MPI_Status_f2c man
pages to Markdown.  Fix some typos and improve the text a bit along
the way.

Left the raw NROFF redirect pages MPI_Status_f2f08, MPI_Status_c2f08,
and MPI_Status_c2f files as they were -- they're 1-line redirects, and
it seems simpler to leave those (vs. duplicating the Markdown).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-09-09 06:59:12 -07:00
Gilles Gouaillardet
e97d3ce645 Add missing MPI_Status conversion subroutines
Only in C bindings:
 - MPI_Status_c2f08()
 - MPI_Status_f082c()

In all bindings but mpif.h
 - MPI_Status_f082f()
 - MPI_Status_f2f08()

and the PMPI_* related subroutines

As initially intended by the MPI forum, the Fortran to/from Fortran 2008
conversion subroutines are *not* implemented in the mpif.h bindings.
See the discussion at https://github.com/mpi-forum/mpi-issues/issues/298

Refs. open-mpi/ompi#1475

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-09 06:59:12 -07:00
Gilles Gouaillardet
466a2b31e0 configury: cleanup .mod file
Manually clean up the generated .mod file in OMPI_FORTRAN_CHECK_BIND_C_TYPE

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-09 06:59:12 -07:00
Gilles Gouaillardet
7fce2f3057 update MPI_F08_status type
Make the C MPI_F08_status type definition match the updated
mpi_f08 type(MPI_Status) definition.

This fixes the inconsistency introduced in open-mpi/ompi@98bc7af7d4

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-09 06:59:12 -07:00
Joseph Schuchart
b78c7e93db Use correct conditional variable initializer in opal/mca/pmix/base
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-09 09:05:30 +02:00
Joseph Schuchart
43e3addca6
Merge pull request #8035 from devreal/osc-ucx-fix-win-dynamic-segfault
UCX: do not dereference NULL pointer in wpmem_[free|flush]
2020-09-04 17:56:45 +02:00
Joseph Schuchart
fc025c78df UCX: do not dereference NULL pointer in wpmem_[free|flush]
Flushing or freeing a newly created dynamic window causes NULL to be passed.

Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-04 09:31:18 +02:00
Jeff Squyres
560ebc5780
Merge pull request #7716 from bosilca/coll/adapt
ADAPT: Event-driven collective implementation
2020-09-01 11:29:53 -04:00
Nathan Hjelm
01dcc39170
Merge pull request #8031 from hjelmn/some_btl_interface_cleanup
btl: remove unused descriptor flags
2020-08-31 16:29:12 -06:00
Nathan Hjelm
556a4ac0da btl: remove unused descriptor flags
This PR removes the MCA_BTL_DES_FLAGS_PUT and MCA_BTL_DES_FLAGS_GET
descriptor flags. At some point these had some meaning but they were
replaced by the rcache access flags.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-08-31 13:07:32 -06:00
Jeff Squyres
c17968c738
Merge pull request #8028 from devreal/fix-mpi3-manpage
Fix MPI versions in MPI.3 manpage
2020-08-31 09:15:32 -04:00
Joseph Schuchart
4d420348f7 Fix MPI versions in MPI.3 manpage
Thanks to Andy Riebs for reporting that on the Open MPI user mailing list (https://www.mail-archive.com/users@lists.open-mpi.org/msg34103.html)

Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-08-31 09:21:26 +02:00
bosilca
2b62a2b8c1
Merge pull request #8023 from abouteiller/bugfix/ob1_err_abort
errors_are_fatal_comm_handler takes a pointer to the error constant
2020-08-29 11:50:57 -04:00
Aurelien Bouteiller
4df5fcf48c
errors_are_fatal_comm_handler takes a pointer to the error constant as
input.

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-08-26 16:05:30 -04:00
Howard Pritchard
c1c71b22b9
Merge pull request #8002 from hppritcha/topic/ofi_gni_prov_patch_for_mtl
OFI: patch OFI MTL for GNI provider
2020-08-26 12:30:50 -06:00
Jeff Squyres
8727e981b2
Merge pull request #8015 from hppritcha/topic/squash_icc_no_log_warning
suppress icc long double message
2020-08-26 10:40:26 -04:00
Howard Pritchard
d6ac41cbbd OFI: patch OFI MTL for GNI provider
Uncovered a problem using the GNI provider with the OFI MTL.
See https://github.com/ofiwg/libfabric/issues/6194.

Related to #8001

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2020-08-26 08:25:53 -06:00
Brian Barrett
b1874e400e
Merge pull request #8019 from wckzhang/rsbandrsfix
coll/tuned: Revert RSB and RS default algorithms
2020-08-25 15:02:31 -07:00
William Zhang
57b95bcb45 coll/tuned: Revert RSB and RS default algorithms
The reduce scatter block and reduce scatter algorithms were hitting
correctness issues for non-commutative strided tests. We will revert to
the original default algorithms for those two collectives (basic linear
and non-overlapping, respectively) in the non-commutative op case.

See #8010

Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-08-25 08:44:24 -07:00
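For context on why the non-commutative case needs care: a reduction that swaps operands gives the same answer only when the operation commutes. The sketch below illustrates that property with string concatenation. It is not the coll/tuned implementation, and the helper names are hypothetical.

```python
from functools import reduce
from operator import add


def linear_reduce(values, op):
    """Combine values strictly in rank order: ((v0 op v1) op v2) ..."""
    return reduce(op, values)


def swapped(op):
    """Return op with its two operands exchanged."""
    return lambda x, y: op(y, x)


ranks = ["a", "b", "c"]
in_order = linear_reduce(ranks, add)            # concatenation in rank order
reordered = linear_reduce(ranks, swapped(add))  # operands exchanged at each step
print(in_order, reordered)
```

With integer addition the two results coincide, which is why an operand-reordering algorithm is safe for commutative operations but not for a case like this one.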
George Bosilca
ee592f3672 Address the comments on the PR.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-08-24 12:13:38 -07:00
Xi Luo
e59bde912e Remove the code handling zero count cases in ADAPT.
Set request in ibcast.c to empty when the count is 0.

Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-08-24 12:13:38 -07:00
George Bosilca
c2970a3695 Correctly handle non-blocking collectives tags
As it is possible to have multiple outstanding non-blocking collectives
provided by different collective modules, we need a consistent
mechanism to allow them to select unique tags for each instance of a
collective.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-08-24 12:13:38 -07:00
George Bosilca
8582e10d2b Consistent handling of zero counts in the MPI API.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-08-24 12:13:38 -07:00
George Bosilca
d71264569e Fix the atomic management of the bcast and reduce freelist
API consistent with other collective modules
Add comments
Other minor cleanups.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-08-24 12:13:38 -07:00
bsergentm
a4be3bb93d Coll/adapt Bull (#15)
* piggybacking Bull functionalities

* coll/adapt: Fix naming conventions and C11 atomic use

This commit fixes some naming convention issues, such as function names
which should follow the naming ompi_coll_adapt instead of
mca_coll_adapt, reserved for component and module naming (cf. tuned
collective component);

It also fixes the use of _Atomic construct, which is only valid in C11.
OPAL constructs have already been adapted to that use, so use
opal_atomic_* types instead.

* coll/adapt: Remove unused component field in module

This commit removes an unneeded field referencing the component in the
module of adapt, as it is already available through the
mca_coll_adapt_component global variable.

Signed-off-by: Marc Sergent <marc.sergent@atos.net>
Co-authored-by: Lemarinier, Pierre <pierre.lemarinier@atos.net>
Co-authored-by: pierrele <31764860+pierrele@users.noreply.github.com>
2020-08-24 12:13:38 -07:00
Xi Luo
fe73586808 Add ADAPT module
Add comments in the ADAPT module

Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-08-24 12:13:38 -07:00
Howard Pritchard
6df0e53421 suppress icc long double message
Improve configury to check whether icc accepts -Wno-long-double.
This prevents seeing 100s of messages like this:

icc: command line warning #10148: option '-Wno-long-double' not supported

A similar patch will be needed for pmix.

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2020-08-19 21:38:11 +00:00
Howard Pritchard
eefaadf7f1
Merge pull request #8012 from hppritcha/topic/mprobe_with_ofi_fix
ofi mtl: fix problem with mrecv
2020-08-18 17:21:37 -06:00
Howard Pritchard
e6f81ed6d6 ofi mtl: fix problem with mrecv
the ofi mtl mrecv was not properly setting the message in/out
arg to MPI_MRECV to MPI_MESSAGE_NULL.

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2020-08-18 15:39:19 -06:00
Jeff Squyres
bf4e1b4376
Merge pull request #8008 from jsquyres/pr/cleanup-of-mpi-errors-and-exceptions
Cleanup of MPI errors and exceptions
2020-08-17 16:41:25 -04:00
Jeff Squyres
20c772e733 Cleanup language about MPI exceptions --> errors
MPI-4 is finally cleaning up its language: an MPI "exception" does not
actually exist.  The only thing that exists is an MPI "error" (and
associated handlers).  This commit replaces all relevant uses of the
word "exception" with "error".  Note that this is still applicable in
versions of the MPI standard less than MPI-4.0 (indeed, nearly all the
cases fixed in this commit are just changes to comments, anyway).

One exception to this is the Java bindings, where there's an
MPIException class.  In hindsight, it probably should have been named
MPIError, but changing it now would break anyone who is using the Java
bindings.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-08-17 13:57:47 -04:00