Among many other things:
- Fix an imbalance bug in MPI_Allgather
- Accept more human readable configuration files. We can now specify
the collective by name instead of a magic number, and the component
we want to use also by name.
- Add the capability to have optional arguments in the collective
communication configuration file. Right now the capability exists
for segment lengths, but is yet to be connected with the algorithms.
- Redo the initialization of all HAN collectives.
Clean up the fallback collective support.
- If the module is unable to deliver the expected result, it falls back to
executing the collective operation on another collective component. This change
makes the support for this fallback simpler to use.
- Implement a fallback allowing a HAN module to remove itself as a
potential active collective module and instead fall back to the
next module in line.
- Completely disable the HAN modules on error. From the moment an error is
encountered they remove themselves from the communicator, and if some
other module still calls them they simply behave as a pass-through.
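The disable-on-error behaviour described above can be pictured with a minimal Python sketch. It is only an illustration of the pattern, not the HAN code: the class `FallbackModule`, its `bcast` method, and the two callables are hypothetical names, and the real modules are C and work through the communicator's collective function table.

```python
class FallbackModule:
    """Hypothetical stand-in for a collective module that has a fallback."""

    def __init__(self, impl, fallback):
        self.impl = impl          # the module's own algorithm
        self.fallback = fallback  # the next module in line
        self.enabled = True

    def bcast(self, data):
        if self.enabled:
            try:
                return self.impl(data)
            except RuntimeError:
                # On the first error the module disables itself for good.
                self.enabled = False
        # Once disabled, every call is passed straight through.
        return self.fallback(data)


def failing_impl(data):
    raise RuntimeError("unsupported configuration")


def previous_module(data):
    return data


module = FallbackModule(failing_impl, previous_module)
print(module.bcast([1, 2, 3]))  # handled by the fallback
print(module.enabled)           # False from now on
```

After the first failure the module's own algorithm is never tried again, which mirrors the pass-through behaviour in the bullet above.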
Communicator: provide ompi_comm_split_with_info to split and provide info at the same time
Add ompi_comm_coll_preference info key to control collective component selection
COLL HAN: use info keys instead of component-level variable to communicate topology level between abstraction layers
- The info value is a comma-separated list of entries, which are chosen with
decreasing priorities. This overrides the priority of the component,
unless the component has disqualified itself.
An entry prefixed with ^ starts the ignore-list. Any entry following this
character will be ignored during the collective component selection for the
communicator.
Example: "sm,libnbc,^han,adapt" gives sm the highest preference, followed
by libnbc. The components han and adapt are ignored in the selection process.
- Allocate a temporary buffer for all lower-level leaders (length 2 segments)
- Fix the handling of MPI_IN_PLACE for gather and scatter.
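As a sketch of the ompi_comm_coll_preference value format described above, the following Python models how such a string could be split into a preference list and an ignore list. The function names, the use of `None` to mark a component that disqualified itself, and the rule that unlisted components keep their base-priority order are assumptions for illustration; the actual selection logic is in C and may differ.

```python
def parse_coll_preference(value):
    """Split an info value into (preferred, ignored) component names."""
    preferred, ignored = [], []
    ignoring = False
    for entry in value.split(","):
        entry = entry.strip()
        if entry.startswith("^"):
            # Everything from this entry onward goes on the ignore-list.
            ignoring = True
            entry = entry[1:]
        if not entry:
            continue
        (ignored if ignoring else preferred).append(entry)
    return preferred, ignored


def order_components(available, value):
    """Order component names; `available` maps name -> base priority.

    Assumption: a priority of None means the component disqualified itself.
    """
    preferred, ignored = parse_coll_preference(value)
    usable = {n: p for n, p in available.items()
              if p is not None and n not in ignored}
    head = [n for n in preferred if n in usable]
    tail = sorted((n for n in usable if n not in head),
                  key=lambda n: -usable[n])
    return head + tail


preferred, ignored = parse_coll_preference("sm,libnbc,^han,adapt")
print(preferred)  # ['sm', 'libnbc']
print(ignored)    # ['han', 'adapt']
```

With the example string, sm and libnbc move to the front regardless of their base priority, while han and adapt never take part in the selection.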
COLL HAN: Fix topology handling
- HAN should not rely on node names to determine the ordering of ranks.
Instead, use the node leaders as identifiers and short-cut if the
node-leaders agree that ranks are consecutive. Also, error out if
the rank distribution is imbalanced for now.
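The check described above can be sketched in Python from a global point of view. This is an assumption-laden illustration: `leaders[rank]` is taken to be the lowest rank on that rank's node, and the function name `check_topology` is hypothetical. The real implementation is distributed C code in which the node leaders have to agree on the result.

```python
from collections import Counter


def check_topology(leaders):
    """Return (ranks_per_node, consecutive) or raise on imbalance.

    leaders[rank] is assumed to be the lowest rank on that rank's node.
    """
    sizes = set(Counter(leaders).values())
    if len(sizes) != 1:
        # Imbalanced distributions are rejected, as in the commit message.
        raise ValueError("imbalanced rank distribution")
    ppn = sizes.pop()
    # Ranks are consecutive if node k holds exactly ranks k*ppn .. k*ppn+ppn-1.
    consecutive = all(leader == (rank // ppn) * ppn
                      for rank, leader in enumerate(leaders))
    return ppn, consecutive


print(check_topology([0, 0, 2, 2]))  # (2, True): by-node blocks
print(check_topology([0, 1, 0, 1]))  # (2, False): round-robin placement
```

When the result is consecutive no reordering is needed, which is the short-cut case. Node names never enter the computation.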
Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
* first import of Bull specific modifications to HAN
* Cleaning, renaming, and compilation fixes. Changed all future into han.
* Import BULL specific modifications in coll/tuned and coll/base
* Fixed compilation issues in Han
* Changed han_output to directly point to coll framework output.
* The verbosity MCA parameter was removed as a duplicate of coll verbosity
* Add fallback in han reduce when op cannot commute and ppn are imbalanced
* Added fallback for han bcast when nodes do not have the same number of processes
* Add fallback in han scatter when ppn are imbalanced
+ fixed missing scatter_fn pointer in the module interface
Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
Co-authored-by: a700850 <pierre.lemarinier@atos.net>
Co-authored-by: germainf <florent.germain@atos.net>
a hierarchical, architecture-aware collective communication module.
Add Reduce and remove up_seg_size and low_seg_size in Bcast
Increase HAN's priority
Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Fix typo that broke backward-compatible prefix-by-default argument
handling. Remove some dead code while we're here.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Be more careful than just exporting CPPFLAGS (or not) to sub-
configure scripts. This fixes a bug in which --enable-visibility
would cause PRRTE's configure to fail, because the top-level
configure added -Wmissing-prototypes to CPPFLAGS and then
the subconfigure added -Werror at one point. In general,
blindly exporting all the CPPFLAGS OMPI adds was a bad idea, so
we instead only export precious variables if they were
set in the calling environment, on the command line of the
top-level configure, or explicitly added to the sub-
configure environment (like CPPFLAGS for PMIx/PRRTE).
Add some environment scrubbing/saving/restore wrappers and
modify PAC_CONFIG_SUBDIR_ARGS to play a little nicer with
precious variables so that this all works.
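The export rule described above can be sketched in Python. This is a model of the policy only, not the m4 implementation: the function name, its arguments, the precedence of command-line values over environment values, and the example paths are assumptions made for illustration.

```python
def subconfigure_env(precious, calling_env, cmdline_vars, explicit=None):
    """Pick the precious variables to hand to a sub-configure script."""
    exported = {}
    for name in precious:
        if name in cmdline_vars:
            # Set on the top-level configure command line.
            exported[name] = cmdline_vars[name]
        elif name in calling_env:
            # Set in the environment that invoked configure.
            exported[name] = calling_env[name]
    # Values added on purpose for this sub-configure (hypothetical path below).
    exported.update(explicit or {})
    return exported


env = subconfigure_env(
    precious=["CC", "CFLAGS", "CPPFLAGS", "LDFLAGS"],
    calling_env={"CC": "gcc", "HOME": "/home/user"},
    cmdline_vars={"CFLAGS": "-O2"},
    explicit={"CPPFLAGS": "-I/opt/pmix/include"},
)
print(env)
```

Flags that the top-level configure appended on its own, such as -Wmissing-prototypes, come from none of the three sources and so never reach the sub-configure.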
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
The ignore argument to PAC_CONFIG_SUBDIR_ARGS is an m4 list of
sed expressions. --with-platform=.* ignored not just the platform
argument, but everything after it. Fix the regular expressions to
ignore everything until the next whitespace. This probably still
isn't entirely right, because it will fail if the argument has
spaces in it (like a path with spaces), but we already fail that test in
so many other places that this does not add a new failure mode.
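The difference between the two patterns can be reproduced with Python's `re` module. The argument string and the bounded pattern `[^ \t]*` are illustrative assumptions. The actual sed expressions in the macro are not shown here and may be written differently.

```python
import re

args = "--prefix=/opt/ompi --with-platform=optimized --enable-debug"

# Greedy pattern: removes the platform argument and everything after it.
greedy = re.sub(r"--with-platform=.*", "", args)

# Bounded pattern: stops at the next whitespace character.
bounded = re.sub(r"--with-platform=[^ \t]*", "", args)

print(greedy.split())   # ['--prefix=/opt/ompi']
print(bounded.split())  # ['--prefix=/opt/ompi', '--enable-debug']

# Known limitation: a value containing a space is only partly removed.
spaced = re.sub(r"--with-platform=[^ \t]*", "",
                "--with-platform=/my path/file --enable-debug")
print(spaced.split())   # ['path/file', '--enable-debug']
```

The last case shows the remaining weakness the message mentions: the tail of a value containing spaces is left behind.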
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
In reworking the 3rd-party package support for Libevent and HWLOC,
it appears that we missed exporting the opal_<package>_CPPFLAGS
variable (despite documentation). Fix that shortcoming.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
The refactoring patches move Libevent from a framework integration
to a 3rd-party package, but did not change the Libevent version
that Open MPI ships. During that swap, we stopped running the
Autotools on Libevent and relied on the tools the Libevent authors
used when building the 2.0.22 release tarball. The config.guess
in this release tarball did not work on the IBM systems.
This patch updates the release version of Libevent to 2.1.12-stable,
which will suck in a bunch of upstream bug fixes and updates
the config.guess so that the 3rd-party refactoring actually
compiles on the IBM Power systems.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.
This patch moves the prrte submodule from the top-level to the
3rd-party directory, to match the behavior of other 3rd-party
packages like Libevent and PMIx. Since Open MPI does not
support building with an external PRRTE, that functionality
is skipped in this patch.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.
This patch moves the PMIx library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL. Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components. Any pre-installed version of PMIx 3.0.0 or
newer is preferred over the internal version.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.
This patch moves the hwloc library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL. Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components. Any pre-installed version of HWLOC 1.6 or
newer is preferred over the internal version.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.
This patch moves libevent from an MCA framework to a stand-alone
library built outside of OPAL. A wrapper in opal/util is provided
to minimize the unnecessary changes in the rest of the code. When
using the internal Libevent, it will be installed as a stand-alone
libevent.a, instead of bundled in OPAL. Any pre-installed version
of Libevent at or after 2.0.21 is preferred over the internal
version.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
With Open MPI 5.0, the decision was made to stop building 3rd-party
packages, such as Libevent, HWLOC, PMIx, and PRRTE as MCA components
and instead 1) start relying on external libraries whenever possible
and 2) Open MPI builds the 3rd party libraries (if needed) as
independent libraries, rather than linked into libopen-pal.
This patch is the first step in that process, providing foundational
changes required for supporting 3rd-party packages, such as changes
to autogen.pl, the top-level Makefile.am, and introducing two
Autoconf macros to support running sub-configure scripts; one
supporting source in tarball form and the other supporting
source in a sub-tree.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
At the end of "make install", a tool is run to search for common
symbols in the built artifacts, to work around issues on MacOS.
This tool requires an exclude list for symbols that must be
in the common section (such as in executables instead of libraries
and because Fortran).
This commit adds the ability to exclude certain directories from
the search, such as directories that are 3rd party packages or
only contain tests/executables, which will not run into problems
on MacOS.
To simplify that change, the file search in find_common_syms was
also rewritten to use the Perl-standard File::Find package instead
of calling the find executable. Theoretically, this should be
mildly faster, but is also significantly easier to modify.
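The idea of walking the tree in-process and pruning excluded directories can be sketched in Python with `os.walk`. This is an analogue for illustration only: the real tool is Perl and uses File::Find, and the function name, the `.o` suffix, and the directory names below are assumptions.

```python
import os


def find_files(root, suffixes=(".o",), exclude_dirs=()):
    """Return files under root with the given suffixes, skipping excluded dirs."""
    excluded = {os.path.normpath(os.path.join(root, d)) for d in exclude_dirs}
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normpath(os.path.join(dirpath, d)) not in excluded
        ]
        for name in filenames:
            if name.endswith(suffixes):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Example call; "3rd-party" and "test" are illustrative directory names.
print(find_files(".", exclude_dirs=["3rd-party", "test"]))
```

Keeping the walk inside the script means an exclusion is a one-line filter rather than extra arguments to an external find command.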
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
mca_pml_ob1_recv_request_put_frag is used to request a put from the peer if get fails
mca_pml_ob1_recv_request_ack_send_btl is used to send an acknowledgement, not data
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
- Add support for fallback to previous coll module on non-commutative operations (#30)
- Replace mutexes by atomic operations.
- Use the correct nbc request type (for both ibcast and ireduce)
* coll/base: document type casts in ompi_coll_base_retain_*
- add module-wide topology cache
- use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
- reduce number of memory allocations
- call the default request completion.
- Remove the requests from the Fortran lookup conversion tables before completing
and freeing them.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>
The Linux component was an attempt to hook calls by patching the dynamic
symbol table. It, unfortunately, does not work as it will always miss
calls made internally by glibc. For example, it might catch a user call
directly to munmap but will miss the chain free -> munmap. Since the
latter is the common case we were trying to hook, this made the component
unusable. This PR finally kills the component.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
Convert the MPI_Status_f082f, MPI_Status_f082c, and MPI_Status_f2c man
pages to Markdown. Fix some typos and improve the text a bit along
the way.
Left the raw NROFF redirect pages MPI_Status_f2f08, MPI_Status_c2f08,
and MPI_Status_c2f files as they were -- they're 1-line redirects, and
it seems simpler to leave those (vs. duplicating the Markdown).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Only in C bindings:
- MPI_Status_c2f08()
- MPI_Status_f082c()
In all bindings but mpif.h
- MPI_Status_f082f()
- MPI_Status_f2f08()
and the PMPI_* related subroutines
As initially intended by the MPI forum, the Fortran to/from Fortran 2008
conversion subroutines are *not* implemented in the mpif.h bindings.
See the discussion at https://github.com/mpi-forum/mpi-issues/issues/298
Refs. open-mpi/ompi#1475
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>