
31015 commits

Author SHA1 Message Date
George Bosilca
16b49dc5b3 A complete overhaul of the HAN code.
Among many other things:
- Fix an imbalance bug in MPI_allgather
- Accept more human readable configuration files. We can now specify
  the collective by name instead of a magic number, and the component
  we want to use also by name.
- Add the capability to have optional arguments in the collective
  communication configuration file. Right now the capability exists
  for segment lengths, but is yet to be connected with the algorithms.
- Redo the initialization of all HAN collectives.

Cleanup the fallback collective support.
- In case the module is unable to deliver the expected result, it will fall back
  to executing the collective operation on another collective component. This
  change makes the support for this fallback simpler to use.
- Implement a fallback allowing a HAN module to remove itself as a
  potential active collective module, and instead fall back to the
  next module in line.
- Completely disable the HAN modules on error. From the moment an error is
  encountered they remove themselves from the communicator, and if another
  module calls them they simply behave as a pass-through.
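
The disable-on-error behaviour described in the list above can be pictured with a toy model. This is only a sketch: the class names, the `balanced` flag used to trigger the error path, and the use of `sum` as the collective are all hypothetical and do not mirror the HAN source.

```python
class FlatModule:
    """Hypothetical stand-in for the next collective module in line."""

    def allreduce(self, values):
        return sum(values)


class HierarchicalModule:
    """Hypothetical module that disables itself after the first error."""

    def __init__(self, fallback):
        self.fallback = fallback
        self.disabled = False
        self.hierarchical_calls = 0

    def allreduce(self, values, balanced=True):
        if self.disabled:
            # Already removed from the communicator: act as a pass-through.
            return self.fallback.allreduce(values)
        if not balanced:
            # Error path: disable permanently, then fall back.
            self.disabled = True
            return self.fallback.allreduce(values)
        self.hierarchical_calls += 1
        return sum(values)  # placeholder for the hierarchical algorithm


module = HierarchicalModule(FlatModule())
module.allreduce([1, 2, 3])                  # hierarchical path
module.allreduce([1, 2, 3], balanced=False)  # error: disables and falls back
module.allreduce([1, 2, 3])                  # now a pass-through
print(module.disabled, module.hierarchical_calls)
```

The caller still gets a result on every call; only the path that produced it changes once the module has disabled itself.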

Communicator: provide ompi_comm_split_with_info to split and provide info at the same time
Add ompi_comm_coll_preference info key to control collective component selection

COLL HAN: use info keys instead of component-level variable to communicate topology level between abstraction layers
- The info value is a comma-separated list of entries, which are chosen with
  decreasing priorities. This overrides the priority of the component,
  unless the component has disqualified itself.
  An entry prefixed with ^ starts the ignore-list. Any entry following this
  character will be ignored during the collective component selection for the
  communicator.
  Example: "sm,libnbc,^han,adapt" gives sm the highest preference, followed
  by libnbc. The components han and adapt are ignored in the selection process.
- Allocate a temporary buffer for all lower-level leaders (length 2 segments)
- Fix the handling of MPI_IN_PLACE for gather and scatter.
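
The preference-string format described in the list above (comma-separated entries in decreasing priority, with ^ starting the ignore-list) can be sketched as a small parser. The function name `parse_coll_preference` is hypothetical, and this is an illustration of the format only, not the parsing code used in Open MPI.

```python
def parse_coll_preference(value):
    """Split a preference string into (preferred, ignored) component lists.

    Entries before the first '^' are kept in priority order; the entry
    carrying the '^' and every entry after it go to the ignore list.
    """
    preferred, ignored = [], []
    ignoring = False
    for entry in value.split(","):
        entry = entry.strip()
        if entry.startswith("^"):
            ignoring = True
            entry = entry[1:]
        if not entry:
            continue  # tolerate empty entries such as a trailing comma
        (ignored if ignoring else preferred).append(entry)
    return preferred, ignored


preferred, ignored = parse_coll_preference("sm,libnbc,^han,adapt")
print("preferred:", preferred)
print("ignored:", ignored)
```

For the example string from the commit message, `sm` and `libnbc` end up in the preferred list in that order, and `han` and `adapt` end up in the ignore list.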

COLL HAN: Fix topology handling
 - HAN should not rely on node names to determine the ordering of ranks.
   Instead, use the node leaders as identifiers and short-cut if the
   node-leaders agree that ranks are consecutive. Also, error out if
   the rank distribution is imbalanced for now.
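
A minimal sketch of the two checks mentioned above, assuming each rank is labelled with the rank of its node leader and that the leader is the lowest rank on its node (both are assumptions made for illustration; the function name `analyze_layout` is hypothetical and this is not the HAN implementation):

```python
from collections import Counter


def analyze_layout(leaders):
    """Return (balanced, consecutive) for a per-rank list of node leaders.

    leaders[r] is the rank of the leader of the node hosting rank r,
    assumed here to be the lowest rank on that node.
    """
    per_node = Counter(leaders)
    # Balanced: every node hosts the same number of ranks.
    balanced = len(set(per_node.values())) == 1
    # Consecutive: each node's ranks form the block [leader, leader + count).
    consecutive = all(
        leaders[r] <= r < leaders[r] + per_node[leaders[r]]
        for r in range(len(leaders))
    )
    return balanced, consecutive


print(analyze_layout([0, 0, 2, 2]))  # block placement, two ranks per node
print(analyze_layout([0, 1, 0, 1]))  # round-robin placement
print(analyze_layout([0, 0, 0, 3]))  # three ranks on one node, one on another
```

Nothing here depends on host names: the leader rank alone identifies the node.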

Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-10-25 18:13:16 -04:00
bsergentm
220b997a58 Coll/han Bull
* first import of Bull specific modifications to HAN

* Cleaning, renaming, and compilation fixes. Changed all "future" into "han".

* Import BULL specific modifications in coll/tuned and coll/base

* Fixed compilation issues in Han

* Changed han_output to directly point to coll framework output.

* The verbosity MCA parameter was removed as a duplicate of the coll verbosity

* Add fallback in han reduce when op cannot commute and ppn are imbalanced

* Added fallback for han bcast when nodes do not have the same number of processes

* Add fallback in han scatter when ppn are imbalanced

+ fixed missing scatter_fn pointer in the module interface

Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
Co-authored-by: a700850 <pierre.lemarinier@atos.net>
Co-authored-by: germainf <florent.germain@atos.net>
2020-10-09 14:17:46 -04:00
Xi Luo
182c333b21 Initial import of the HAN collective module
a hierarchical, architecture-aware collective communication module.

Add Reduce and remove up_seg_size and low_seg_size in Bcast
Increase HAN's priority

Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-10-09 14:17:46 -04:00
Jeff Squyres
0bcef049c9
Merge pull request #8053 from shintaro-iwasaki/topic/fix_issue_8036
opal/mca/threads/qthreads: Fix #8036
2020-10-08 09:40:58 -04:00
Jeff Squyres
4c86172886
Merge pull request #8081 from nariba-fj/pr/fix-typo-in-opal-util-stacktrace
opal/util: Fix typo
2020-10-08 09:06:12 -04:00
NARIBAYASHI Akira
3dc3bbc1b1 opal/util: Fix typo
Signed-off-by: NARIBAYASHI Akira <a.naribayashi@fujitsu.com>
2020-10-08 16:13:09 +09:00
bosilca
a541ab933d
Merge pull request #8066 from devreal/spc-pml-ucx
PML UCX: add SPC instrumentation for sent/received message sizes
2020-10-07 21:56:27 -04:00
bosilca
9f114c6afe
Merge pull request #8067 from devreal/fix-ob1-spc
PML OB1: Fix potential overcounting of SPC sent/received message sizes and account for fin messages
2020-10-05 22:26:24 -04:00
Brian Barrett
9779813cb6
Merge pull request #8077 from bwbarrett/bugfix/visibility
Fix 3rd-party package CPPFLAGS (and others) handling
2020-10-05 13:28:58 -07:00
Brian Barrett
78dfe451ed build: Fix PRRTE prefix_by_default handling
Fix typo that broke backward-compatible prefix-by-default argument
handling.  Remove some dead code while we're here.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-05 19:55:33 +00:00
Brian Barrett
f500371a87 build: Fix precious variable passing to sub-configure
Be more careful than just exporting CPPFLAGS (or not) to sub-
configure scripts.  This fixes a bug in which --enable-visibility
would cause PRRTE's configure to fail, because the top-level
configure added -Wmissing-prototypes to CPPFLAGS and then
the subconfigure added -Werror at one point.  In general,
blindly exporting all the CPPFLAGS OMPI adds was a bad idea, so
we instead only export precious variables if they were
set in the calling environment, on the command line of the
top-level configure, or explicitly added to the sub-
configure environment (like CPPFLAGS for PMIx/PRRTE).
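
The export rule described above can be sketched as a filter over the precious variables. The function name, the dictionary-based interface, and the precedence order (explicit additions first, then the command line, then the calling environment) are assumptions made for illustration; the real logic lives in the Autoconf macros.

```python
def subconfigure_env(precious, calling_env, cmdline, explicit):
    """Pick the precious variables to hand to a sub-configure script.

    A variable is exported only if it was explicitly added for the
    sub-configure, given on the top-level configure command line, or set
    in the calling environment. Flags that the top-level configure added
    on its own are never consulted, so they are not passed down.
    """
    exported = {}
    for name in precious:
        for source in (explicit, cmdline, calling_env):  # assumed precedence
            if name in source:
                exported[name] = source[name]
                break
    return exported


env = subconfigure_env(
    precious=["CC", "CFLAGS", "CPPFLAGS"],
    calling_env={"CC": "gcc"},
    cmdline={"CFLAGS": "-O2"},
    explicit={},
)
print(env)
```

In this example `CPPFLAGS` is not exported at all, which models the case where only the top-level configure had modified it.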

Add some environment scrubbing/saving/restore wrappers and
modify PAC_CONFIG_SUBDIR_ARGS to play a little nicer with
precious variables so that this all works.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-05 19:55:33 +00:00
Brian Barrett
57df1084cd build: Fix sed expression used to ignore subconfig args
The ignore argument to PAC_CONFIG_SUBDIR_ARGS is an m4 list of
sed expressions.  --with-platform=.* ignored not just the platform
argument, but everything after it.  Fix the regular expressions to
ignore everything until the next whitespace.  This probably still
isn't entirely right, because it will fail if the argument has
spaces in it (like a path with spaces), but we fail that test
so many other places that it does not add to the fail.
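
The difference between the greedy pattern and one that stops at whitespace can be reproduced with Python's `re` module. The argument string and the replacement pattern below are made up for illustration; the actual sed expressions in the commit are not shown here.

```python
import re

args = "--with-platform=optimized --enable-debug --prefix=/opt/ompi"

# Greedy: ".*" runs to the end of the string, so every later argument is lost.
greedy = re.sub(r"--with-platform=.*", "", args)

# Bounded: stop at the next space or tab, so only the one argument is removed.
bounded = re.sub(r"--with-platform=[^ \t]*", "", args)

print(repr(greedy))
print(bounded.split())

# The caveat from the commit message: a value containing a space is only
# partially removed by the bounded pattern.
spaced = "--with-platform=/my dir/file --enable-debug"
print(re.sub(r"--with-platform=[^ \t]*", "", spaced).split())
```

The last call leaves the tail of the path behind as a stray argument, which is the remaining failure mode the commit message acknowledges.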

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-05 19:55:33 +00:00
Shintaro Iwasaki
84dcb233bf mca/threads: set THREAD_* flags in the component's root configure.m4
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
2020-10-05 11:39:26 -05:00
Shintaro Iwasaki
919a16300c mca/threads/qthreads: implement missing functionalities
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
2020-10-05 11:39:18 -05:00
Shintaro Iwasaki
db3e598b6a mca/threads/qthreads: remove Argobots dependency
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
2020-10-05 11:39:09 -05:00
Shintaro Iwasaki
6cc17b0c6a mca/threads/qthreads: rework configury to be smarter
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
2020-10-05 11:39:01 -05:00
Brian Barrett
e5d6952c9b build: Expose 3rd-party package CPPFLAGS
In reworking the 3rd-party package support for Libevent and HWLOC,
it appears that we missed exporting the opal_<package>_CPPFLAGS
variable (despite documentation).  Fix that shortcoming.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-03 16:45:42 +00:00
Gilles Gouaillardet
6c46da3245
Merge pull request #8076 from bwbarrett/bugfix/mpi-types-ignore
.gitignore: Ignore use-mpi's types file
2020-10-02 18:41:04 +09:00
Brian Barrett
8680ea99f2 build: Ignore use-mpi's types file
This file is autogenerated, so git should not try to track it.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 19:55:57 +00:00
Brian Barrett
7592eb3889
Merge pull request #8069 from bwbarrett/feature/3rdparty-packaging
Rework how Open MPI integrates with 3rd party packages
2020-10-01 12:28:31 -07:00
Brian Barrett
a30eb69cbb libevent: Upgrade Libevent to 2.1.12-stable
The refactoring patches move Libevent from a framework integration
to a 3rd-party package, but did not change the Libevent version
that Open MPI ships.  During that swap, we stopped running the
Autotools on Libevent and relied on the tools the Libevent authors
used when building the 2.0.22 release tarball.  The config.guess
in this release tarball did not work on the IBM systems.

This patch updates the release version of Libevent to 2.1.12-stable,
which will suck in a bunch of upstream bug fixes and updates
the config.guess so that the 3rd-party refactoring actually
compiles on the IBM Power systems.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:56:06 +00:00
Brian Barrett
c5d8037b85 build: Move PRRTE to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves the prrte submodule from the top-level to the
3rd-party directory, to match the behavior of other 3rd-party
packages like Libevent and PMIx.  Since Open MPI does not
support building with an external PRRTE, that functionality
is skipped in this patch.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:56:01 +00:00
Brian Barrett
8f89d15d31 build: Move PMIx to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves the PMIx library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL.  Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components.  Any pre-installed version of PMIx 3.0.0 or
newer is preferred over the internal version.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:56:00 +00:00
Brian Barrett
0e9581d478 build: Move hwloc to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves the hwloc library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL.  Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components.  Any pre-installed version of HWLOC 1.6 or
newer is preferred over the internal version.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:55:59 +00:00
Brian Barrett
9ffac85650 build: Move libevent to a 3rd-party package
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.

This patch moves libevent from an MCA framework to a stand-alone
library built outside of OPAL.  A wrapper in opal/util is provided
to minimize the unnecessary changes in the rest of the code.  When
using the internal Libevent, it will be installed as a stand-alone
libevent.a, instead of bundled in OPAL.  Any pre-installed version
of Libevent at or after 2.0.21 is preferred over the internal
version.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:55:58 +00:00
Brian Barrett
389b4b3a78 build: Add third-party package infrastructure
With Open MPI 5.0, the decision was made to stop building 3rd-party
packages, such as Libevent, HWLOC, PMIx, and PRRTE as MCA components
and instead 1) start relying on external libraries whenever possible
and 2) Open MPI builds the 3rd party libraries (if needed) as
independent libraries, rather than linked into libopen-pal.

This patch is the first step in that process, providing foundational
changes required for supporting 3rd-party packages, such as changes
to autogen.pl, the top-level Makefile.am, and introducing two
Autoconf macros to support running sub-configure scripts; one
supporting source in tarball form and the other supporting
source in a sub-tree.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-10-01 16:55:48 +00:00
Brian Barrett
675a899532 build: Allow symbol search tool to skip directories
At the end of "make install", a tool is run to search for common
symbols in the built artifacts, to work around issues on MacOS.
This tool requires an exclude list for symbols that must be
in the common section (such as in executables instead of libraries
and because Fortran).

This commit adds the ability to exclude certain directories from
the search, such as directories that are 3rd party packages or
only contain tests/executables, which will not run into problems
on MacOS.

To simplify that change, the file search in find_common_syms was
also rewritten to use the Perl-standard File::Find package instead
of calling the find executable.  Theoretically, this should be
mildly faster, but is also significantly easier to modify.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-09-30 23:34:11 +00:00
Nathan Hjelm
1f5ed0b83d
Merge pull request #8070 from devreal/osc-page-align
OSC RDMA: put memory for each process into separate pages
2020-09-29 07:54:45 -06:00
Joseph Schuchart
52b52b8ebb OSC RDMA: only touch pages before memory registration, don't fill them
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-29 07:45:07 +02:00
Joseph Schuchart
d11ccbada9 OSC RDMA: put memory of each process into separate pages
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-29 07:44:17 +02:00
Joseph Schuchart
1fdf05f634 pml/ob1: fix SPC potential over-counting when sending ack and requesting put
mca_pml_ob1_recv_request_put_frag is used to request a put from the peer if get fails
mca_pml_ob1_recv_request_ack_send_btl is used to send an acknowledgement, not data

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-09-28 15:38:47 +02:00
Joseph Schuchart
a9ed53aa66 pml/ob1: add SPC instrumentation of sent fin messages
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-09-28 15:19:29 +02:00
Joseph Schuchart
91a94201d2 PML UCX: add SPC instrumentation for message size sent/received
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-09-28 15:12:24 +02:00
bosilca
08f68671db
Merge pull request #8060 from bosilca/fix/ialltoallw
Ensure every rank increases the non-blocking collective tag, even if it has no data to exchange.
2020-09-26 12:25:13 -04:00
Nathan Hjelm
920315611e
Merge pull request #8054 from hjelmn/kill_the_never_going_to_work_patcher_linux_component_to_prevent_future_confusion_as_to_its_effectiveness
patcher: remove the linux component
2020-09-24 19:27:02 -06:00
George Bosilca
96fea22cdd
Make every rank count the collective, even if it has no data
to exchange.

This is the same logic as in 77eaa5c applied to ialltoallw.
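
Why the tag has to advance on every rank can be seen in a toy simulation. The function and its arguments are hypothetical, and the model (one integer counter per rank, bumped once per collective call) is a simplification rather than the libnbc code.

```python
def run_collectives(calls, skip_when_empty):
    """Simulate per-rank tag counters over a sequence of collective calls.

    calls is a list of per-call lists holding the amount of data each rank
    exchanges. With skip_when_empty=True a rank with no data returns early
    without bumping its tag, which models the behaviour being fixed.
    """
    nranks = len(calls[0])
    tags = [0] * nranks
    for sizes in calls:
        for rank, size in enumerate(sizes):
            if skip_when_empty and size == 0:
                continue  # early return: this rank's tag falls behind
            tags[rank] += 1
    return tags


calls = [[4, 0], [4, 4]]  # rank 1 has nothing to send in the first call
print(run_collectives(calls, skip_when_empty=True))
print(run_collectives(calls, skip_when_empty=False))
```

Once the counters diverge, the two ranks would label the second collective with different tags, so their messages could no longer be matched.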

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-24 13:29:01 -04:00
Yossi Itigin
b532564643
Merge pull request #8041 from brminich/topic/shmem_scoll_fix
SHMEM/SCOLL: Fix inplace reductions
2020-09-23 13:56:10 +03:00
Mikhail Brinskii
dfe20e0472 SHMEM/SCOLL: Fix inplace reductions
Signed-off-by: Mikhail Brinskii <mikhailb@nvidia.com>
2020-09-23 10:06:36 +03:00
bosilca
21c9c666ab
Merge pull request #8039 from bosilca/fix/adapt
Fix some corner cases with ADAPT
2020-09-18 17:18:41 -04:00
George Bosilca
77eaa5c8b8
Keep the non-blocking collective tags globally in sync.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-18 12:52:14 -04:00
George Bosilca
c98e387a53
Many fixes and improvements to ADAPT
- Add support for fallback to previous coll module on non-commutative operations (#30)
- Replace mutexes by atomic operations.
- Use the correct nbc request type (for both ibcast and ireduce)
  * coll/base: document type casts in ompi_coll_base_retain_*
- add module-wide topology cache
- use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
- reduce number of memory allocations
- call the default request completion.
  - Remove the requests from the Fortran lookup conversion tables before completing
    and freeing them.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>

Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>
2020-09-18 12:50:17 -04:00
Nathan Hjelm
7fca99b2f7 patcher: remove the linux component
The Linux component was an attempt to hook calls by patching the dynamic
symbol table. It, unfortunately, does not work as it will always miss
calls made internally by glibc. For example, it might catch a user call
directly to munmap but will miss the chain free -> munmap. Since the
latter is the common case we were trying to hook, this made the component
unusable. This PR finally kills the component.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-09-18 10:23:01 -06:00
bosilca
eca00a7a3b
Merge pull request #8042 from bosilca/fix/sm_emu
Fix a copy/paste in the RDMA emulation.
2020-09-14 11:43:00 -04:00
Jeff Squyres
3a93e4f94d
Merge pull request #8038 from devreal/fix-opal-pmix-cond-init
Use correct conditional variable initializer in opal/mca/pmix/base
2020-09-14 09:38:43 -04:00
Jeff Squyres
d5791b2770
Merge pull request #8043 from ggouaillardet/topic/status_f2f08
mpif-h: fix a typo in MPI_Status_f2f08()
2020-09-14 09:32:34 -04:00
Gilles Gouaillardet
fb8bfccb83 mpif-h: fix a typo in MPI_Status_f2f08()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-14 13:56:16 +09:00
George Bosilca
49da998f33
Fix a copy/paste in the RDMA emulation.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-09-13 22:56:58 -04:00
Jeff Squyres
1b0dfcdfab
Merge pull request #7762 from ggouaillardet/topic/mpi_status_f08_c
Add missing MPI_Status conversion subroutines
2020-09-10 09:45:58 -04:00
Jeff Squyres
c04dc355de mpi/man: convert MPI_Status conversion man pages to Markdown
Convert the MPI_Status_f082f, MPI_Status_f082c, and MPI_Status_f2c man
pages to Markdown.  Fix some typos and improve the text a bit along
the way.

Left the raw NROFF redirect pages MPI_Status_f2f08, MPI_Status_c2f08,
and MPI_Status_c2f files as they were -- they're 1-line redirects, and
it seems simpler to leave those (vs. duplicating the Markdown).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-09-09 06:59:12 -07:00
Gilles Gouaillardet
e97d3ce645 Add missing MPI_Status conversion subroutines
Only in C bindings:
 - MPI_Status_c2f08()
 - MPI_Status_f082c()

In all bindings but mpif.h
 - MPI_Status_f082f()
 - MPI_Status_f2f08()

and the PMPI_* related subroutines

As initially intended by the MPI Forum, the Fortran to/from Fortran 2008
conversion subroutines are *not* implemented in the mpif.h bindings.
See the discussion at https://github.com/mpi-forum/mpi-issues/issues/298

Refs. open-mpi/ompi#1475

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-09-09 06:59:12 -07:00