1
1

30998 Коммитов

Автор SHA1 Сообщение Дата
Brian Barrett
f500371a87 build: Fix precious variable passing to sub-configure
Be more careful than just exporting CPPFLAGS (or not) to sub-
configure scripts.  This fixes a bug in which --enable-visibility
would cause PRRTE's configure to fail, because the top-level
configure added -Wmissing-prototypes to CPPFLAGS and then
the subconfigure added -Werror at one point.  In general,
blindly exporting all the CPPFLAGS OMPI adds was a bad idea, so
we instead only export precious variables if they were
set in the calling environment, on the command line of the
top-level configure, or explicitly added to the sub-
configure environment (like CPPFLAGS for PMIx/PRRTE).

Add some envirnoment scrubbing/saving/restore wrappers and
modify PAC_CONFIG_SUBDIR_ARGS to play a little nicer with
precious variables so that this all works.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-05 19:55:33 +00:00
Brian Barrett
57df1084cd build: Fix sed expression used to ignore subconfig args
The ignore argument to PAC_CONFIG_SUBDIR_ARGS is an m4 list of
sed expressions.  --with-platform=.* ignored not just the platform
argument, but everything after it.  Fix the regular expressions to
ignore everything until the next whitespace.  This probably still
isn't entirely right, because it will fail if the argument has
spaces in it (like a path with spaces), but we fail that test
so many other places that it does not add to the fail.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-05 19:55:33 +00:00
Brian Barrett
e5d6952c9b build: Expose 3rd-party package CPPFLAGS
In reworking the 3rd-party package support for Libevent and HWLOC,
it appears that we missed exporting the opal_<package>_CPPFLAGS
variable (despite documentation).  Fix that shortcoming.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-03 16:45:42 +00:00
Gilles Gouaillardet
6c46da3245
Merge pull request #8076 from bwbarrett/bugfix/mpi-types-ignore
.gitignore: Ignore use-mpi's types file
2020-10-02 18:41:04 +09:00
Brian Barrett
8680ea99f2 build: Ignore use-mpi's types file
This file is autogenerated, so git should not try to track it.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 19:55:57 +00:00
Brian Barrett
7592eb3889
Merge pull request #8069 from bwbarrett/feature/3rdparty-packaging
Rework how Open MPI integrates with 3rd party packages
2020-10-01 12:28:31 -07:00
Brian Barrett
a30eb69cbb libevent: Upgrade Libevent to 2.1.12-stable
The refactoring patches move Libevent from a framework integration
to a 3rd-party package, but did not change the Libevent version
that Open MPI ships.  During that swap, we stopped running the
Autotools on Libevent and relied on the tools the Libevent authors
used when building the 2.0.22 release tarball.  The config.guess
in this release tarball did not work on the IBM systems.

This patch updates the release version of Libevent to 2.1.12-stable,
which will suck in a bunch of upstream bug fixes and updates
the config.guess so that the 3rd-party refactoring actually
compiles on the IBM Power systems.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:56:06 +00:00
Brian Barrett
c5d8037b85 build: Move PRRTE to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves the prrte submodule from the top-level to the
3rd-party directory, to match the behavior of other 3rd-party
packages like Libevent and PMIx.  Since Open MPI does not
support building with an external PRRTE, that functionality
is skipped in this patch.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:56:01 +00:00
Brian Barrett
8f89d15d31 build: Move PMIx to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves the PMIx library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL.  Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components.  Any pre-installed version of PMIx 3.0.0 or
newer is preferred over the internal version.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:56:00 +00:00
Brian Barrett
0e9581d478 build: Move hwloc to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves the hwloc library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL.  Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components.  Any pre-installed version of HWLOC 1.6 or
newer is preferred over the internal version.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:55:59 +00:00
Brian Barrett
9ffac85650 build: Move libevent to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves libevent from an MCA framework to a stand-alone
library built outside of OPAL.  A wrapper in opal/util is provided
to minimize the unnecessary changes in the rest of the code.  When
using the internal Libevent, it will be installed as a stand-alone
libevent.a, instead of bundled in OPAL.  Any pre-installed version
of Libevent at or after 2.0.21 is preferred over the internal
version.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:55:58 +00:00
Brian Barrett
389b4b3a78 build: Add third-party package infrastructure
With Open MPI 5.0, the decision was made to stop building 3rd-party
packages, such as Libevent, HWLOC, PMIx, and PRRTE as MCA components
and instead 1) start relying on external libraries whenever possible
and 2) Open MPI builds the 3rd party libraries (if needed) as
independent libraries, rather than linked into libopen-pal.

This patch is the first step in that process, providing foundational
changes required for supporting 3rd-party packages, such as changes
to autogen.pl, the top-level Makefile.am, and introducing two
Autoconf macros to support running sub-configure scripts; one
supporting source in tarball form and the other supporting
source in a sub-tree.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:55:48 +00:00
Brian Barrett
675a899532 build: Allow symbol search tool to skip directories
At the end of "make install", a tool is run to search for common
symbols in the built artifacts, to work around issues on MacOS.
This tool requires an exclude list for symbols that must be
in the common section (such as in executables instead of libraries
and because Fortran).

This commit adds the ability to exclude certain directories from
the search, such as directories that are 3rd party packages or
only contain tests/executables, which will not run into problems
on MacOS.

To simplify that change, the file search in find_common_syms was
also rewritten to use the Perl-standard File::Find package instead
of calling the find executable.  Theoretically, this should be
mildly faster, but is also significantly easier to modify.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-09-30 23:34:11 +00:00
Nathan Hjelm
1f5ed0b83d
Merge pull request #8070 from devreal/osc-page-align
OSC RDMA: put memory for each process into separate pages
2020-09-29 07:54:45 -06:00
Joseph Schuchart
52b52b8ebb OSC RDMA: only touch pages before memory registration, don't fill them
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-29 07:45:07 +02:00
Joseph Schuchart
d11ccbada9 OSC RDMA: put memory of each process into separate pages
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-29 07:44:17 +02:00
bosilca
08f68671db
Merge pull request #8060 from bosilca/fix/ialltoallw
Prevent some rank from not increasing the non-blocking collective tag if they have no data to exchange.
2020-09-26 12:25:13 -04:00
Nathan Hjelm
920315611e
Merge pull request #8054 from hjelmn/kill_the_never_going_to_work_patcher_linux_component_to_prevent_future_confusion_as_to_its_effectiveness
patcher: remove the linux component
2020-09-24 19:27:02 -06:00
George Bosilca
96fea22cdd
Don't allow some rank to don't count the collective if they have no data
to exchange.

This is the same logic as in 77eaa5c applied to ialltoallw.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-24 13:29:01 -04:00
Yossi Itigin
b532564643
Merge pull request #8041 from brminich/topic/shmem_scoll_fix
SHMEM/SCOLL: Fix inplace reductions
2020-09-23 13:56:10 +03:00
Mikhail Brinskii
dfe20e0472 SHMEM/SCOLL: Fix inplace reductions
Signed-off-by: Mikhail Brinskii <mikhailb@nvidia.com>
2020-09-23 10:06:36 +03:00
bosilca
21c9c666ab
Merge pull request #8039 from bosilca/fix/adapt
Fix some corner cases with ADAPT
2020-09-18 17:18:41 -04:00
George Bosilca
77eaa5c8b8
Keep the non-blocking collective tags globally in sync.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-18 12:52:14 -04:00
George Bosilca
c98e387a53
Many fixes and improvements to ADAPT
- Add support for fallback to previous coll module on non-commutative operations (#30)
- Replace mutexes by atomic operations.
- Use the correct nbc request type (for both ibcast and ireduce)
  * coll/base: document type casts in ompi_coll_base_retain_*
- add module-wide topology cache
- use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
- reduce number of memory allocations
- call the default request completion.
  - Remove the requests from the Fortran lookup conversion tables before completing
    and free it.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>

Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-18 12:50:17 -04:00
Nathan Hjelm
7fca99b2f7 patcher: remove the linux component
The Linux component was an attempt to hook calls by patching the dynamic
symbol table. It, unfortunately, does not work as it will always miss
calls made internally by glibc. For example, it might catch a user call
directly to munmap but will miss the chain free -> munmap. Since the
later is the common case we were trying to hook this made the component
unusable. This PR finally kills the component.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-09-18 10:23:01 -06:00
bosilca
eca00a7a3b
Merge pull request #8042 from bosilca/fix/sm_emu
Fix a copy/paste in the RDMA emulation.
2020-09-14 11:43:00 -04:00
Jeff Squyres
3a93e4f94d
Merge pull request #8038 from devreal/fix-opal-pmix-cond-init
Use correct conditional variable initializer in opal/mca/pmix/base
2020-09-14 09:38:43 -04:00
Jeff Squyres
d5791b2770
Merge pull request #8043 from ggouaillardet/topic/status_f2f08
mpif-h: fix a typo in MPI_Status_f2f08()
2020-09-14 09:32:34 -04:00
Gilles Gouaillardet
fb8bfccb83 mpif-h: fix a typo in MPI_Status_f2f08()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-14 13:56:16 +09:00
George Bosilca
49da998f33
Fix a copy/paste in the RDMA emulation.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-13 22:56:58 -04:00
Jeff Squyres
1b0dfcdfab
Merge pull request #7762 from ggouaillardet/topic/mpi_status_f08_c
Add missing MPI_Status conversion subroutines
2020-09-10 09:45:58 -04:00
Jeff Squyres
c04dc355de mpi/man: convert MPI_Status conversion man pages to Markdown
Convert the MPI_Status_f082f, MPI_Status_f082c, and MPI_Status_f2c man
pages to Markdown.  Fix some typos and improve the text a bit along
the way.

Left the raw NROFF redirect pages MPI_Status_f2f08, MPI_Status_c2f08,
and MPI_Status_c2f files as they were -- they're 1-line redirects, and
it seems simpler to leave those (vs. duplicating the Markdown).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-09-09 06:59:12 -07:00
Gilles Gouaillardet
e97d3ce645 Add missing MPI_Status conversion subroutines
Only in C bindings:
 - MPI_Status_c2f08()
 - MPI_Status_f082c()

In all bindings but mpif.h
 - MPI_Status_f082f()
 - MPI_Status_f2f08()

and the PMPI_* related subroutines

As initially inteded by the MPI forum, the Fortran to/from Fortran 2008
conversion subtoutines are *not* implemented in the mpif.h bindings.
See the discussion at https://github.com/mpi-forum/mpi-issues/issues/298

Refs. open-mpi/ompi#1475

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-09 06:59:12 -07:00
Gilles Gouaillardet
466a2b31e0 configury: cleanup .mod file
manually cleanup the generated .mod file in OMPI_FORTRAN_CHECK_BIND_C_TYPE

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-09 06:59:12 -07:00
Gilles Gouaillardet
7fce2f3057 update MPI_F08_status type
Make the C MPI_F08_status type definition match the updated
mpi_f08 type(MPI_Status) definition.

This fix the inconsistency introduced in open-mpi/ompi@98bc7af7d4

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-09 06:59:12 -07:00
Joseph Schuchart
b78c7e93db Use correct conditional variable initializer in opal/mca/pmix/base
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-09 09:05:30 +02:00
Joseph Schuchart
43e3addca6
Merge pull request #8035 from devreal/osc-ucx-fix-win-dynamic-segfault
UCX: do not dereference NULL pointer in wpmem_[free|flush]
2020-09-04 17:56:45 +02:00
Joseph Schuchart
fc025c78df UCX: do not dereference NULL pointer in wpmem_[free|flush]
Flushing or freeing a newly created dynamic window causes NULL to be passed.

Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-04 09:31:18 +02:00
Jeff Squyres
560ebc5780
Merge pull request #7716 from bosilca/coll/adapt
ADAPT: Event-driven collective implementation
2020-09-01 11:29:53 -04:00
Nathan Hjelm
01dcc39170
Merge pull request #8031 from hjelmn/some_btl_interface_cleanup
btl: remove unused descriptor flags
2020-08-31 16:29:12 -06:00
Nathan Hjelm
556a4ac0da btl: remove unused descriptor flags
This PR removes the MCA_BTL_DES_FLAGS_PUT and MCA_BTL_DES_FLAGS_GET
descriptor flags. At some point these had some meaning but they were
replaced by the rcache access flags.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-08-31 13:07:32 -06:00
Jeff Squyres
c17968c738
Merge pull request #8028 from devreal/fix-mpi3-manpage
Fix MPI versions in MPI.3 manpage
2020-08-31 09:15:32 -04:00
Joseph Schuchart
4d420348f7 Fix MPI versions in MPI.3 manpage
Thanks to Andy Riebs for reporting that on the Open MPI user mailing list (https://www.mail-archive.com/users@lists.open-mpi.org/msg34103.html)

Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-08-31 09:21:26 +02:00
bosilca
2b62a2b8c1
Merge pull request #8023 from abouteiller/bugfix/ob1_err_abort
errors_are_fatal_comm_handler takes a pointer to the error constant
2020-08-29 11:50:57 -04:00
Aurelien Bouteiller
4df5fcf48c
errors_are_fatal_comm_handler takes a pointer to the error constant as
input.

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-08-26 16:05:30 -04:00
Howard Pritchard
c1c71b22b9
Merge pull request #8002 from hppritcha/topic/ofi_gni_prov_patch_for_mtl
OFI: patch OFI MTL for GNI provider
2020-08-26 12:30:50 -06:00
Jeff Squyres
8727e981b2
Merge pull request #8015 from hppritcha/topic/squash_icc_no_log_warning
suppress icc long double message
2020-08-26 10:40:26 -04:00
Howard Pritchard
d6ac41cbbd OFI: patch OFI MTL for GNI provider
Uncovered a problem using the GNI provider with the OFI MTL.
See https://github.com/ofiwg/libfabric/issues/6194.

Related to #8001

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2020-08-26 08:25:53 -06:00
Brian Barrett
b1874e400e
Merge pull request #8019 from wckzhang/rsbandrsfix
coll/tuned: Revert RSB and RS default algorithms
2020-08-25 15:02:31 -07:00
William Zhang
57b95bcb45 coll/tuned: Revert RSB and RS default algorithms
Reduce scatter block and reduce scatter algorithms were hitting
correctness issues for non commutative strided tests. We will revert to
the original default algorithms for those two collectives (basic linear
and non overlapping respectively) in the non commutative op case.

See #8010

Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-08-25 08:44:24 -07:00