1
1

11 Коммитов

Автор SHA1 Сообщение Дата
Xi Luo
e65fa4ff5c Bring ADAPT collective to 4.1
This is a meta commit, that encapsulate all the ADAPT commits in the master
into a single PR for 4.1. The master commits included here are:
fe73586, a4be3bb, d712645, c2970a3, e59bde9, ee592f3 and c98e387.

Here is a detailed list of added capabilities:
* coll/adapt: Fix naming conventions and C11 atomic use
* coll/adapt: Remove unused component field in module
* Consistent handling of zero counts in the MPI API.
* Correctly handle non-blocking collectives tags
  * As it is possible to have multiple outstanding non-blocking collectives
    provided by different collective modules, we need a consistent
    mechanism to allow them to select unique tags for each instance of a
    collective.
* Add support for fallback to previous coll module on non-commutative operations (#30)
* Replace mutexes by atomic operations.
* Use the correct nbc request type (for both ibcast and ireduce)
  * coll/base: document type casts in ompi_coll_base_retain_*
* add module-wide topology cache
* use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
* reduce number of memory allocations
* call the default request completion.
  * Remove the requests from the Fortran lookup conversion tables before completing
    and free it.
* piggybacking Bull functionalities

Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Marc Sergent <marc.sergent@atos.net>
Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>
Co-authored-by: Lemarinier, Pierre <pierre.lemarinier@atos.net>
Co-authored-by: pierrele <31764860+pierrele@users.noreply.github.com>
2020-09-23 11:45:45 -04:00
Gilles Gouaillardet
c9e4240e70 mpi: retain operation and datatype in non blocking collectives
MPI standard states a user MPI_Op and/or user MPI_Datatype can be free'd
after a call to a non blocking collective and before the non-blocking
collective completes.
Retain user (only) MPI_Op and MPI_Datatype when the non blocking call is
invoked, and set a request callback so they are free'd when the MPI_Request
completes.

Thanks Thomas Ponweiser for reporting this

Fixes open-mpi/ompi#2151
Fixes open-mpi/ompi#1304

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@0fe756d416)
2019-07-12 10:27:04 +09:00
David Eberius
d377a6b6f4 Added Software-based Performance Counters driver code along with several counters.
This code is the implementation of Software-base Performance Counters as described in the paper 'Using Software-Base Performance Counters to Expose Low-Level Open MPI Performance Information' in EuroMPI/USA '17 (http://icl.cs.utk.edu/news_pub/submissions/software-performance-counters.pdf).  More practical usage information can be found here: https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI.

All software events functions are put in macros that become no-ops when SOFTWARE_EVENTS_ENABLE is not defined.  The internal timer units have been changed to cycles to avoid division operations which was a large source of overhead as discussed in the paper.  Added a --with-spc configure option to enable SPCs in the Open MPI build.  This defines SOFTWARE_EVENTS_ENABLE.  Added an MCA parameter, mpi_spc_enable, for turning on specific counters.  Added an MCA parameter, mpi_spc_dump_enabled, for turning on and off dumping SPC counters in MPI_Finalize.  Added an SPC test and example.

Signed-off-by: David Eberius <deberius@vols.utk.edu>
2018-06-11 22:48:16 -04:00
George Bosilca
366d64b7e5 Move the collective structure outside the communicator.
As we changed the ABI (forcing a major release), we can limit
the size of the predefined communicators by moving the collective
structure outside the communicator. This might have a minimal,
but unnoticeable, impact on performance. This approach has been
discussed during the January 2017 devel meeting.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-27 11:54:17 -06:00
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b7817d5891ff06a447ea190b0e77479.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
Gilles Gouaillardet
53b952dc2b oshmem: invoke the C PMPI_* subroutines instead of the MPI_* ones
when profiling is built.
This prevents oshmem subroutines from being wrapped twice by third
party tools (e.g. once in oshmem and once in MPI)
see discussion starting at http://www.open-mpi.org/community/lists/devel/2015/08/17842.php

Thanks to Bert Wesarg for bringing this to our attention
2015-10-13 08:52:03 +09:00
Gilles Gouaillardet
16d65a2762 fortran/mpif-h: invoke the C PMPI_* subroutines instead of the MPI_* ones
when profiling is built.
This prevents Fortran subroutines from being wrapped twice by third
party tools (e.g. once in Fortran and once in C)
see discussion starting at http://www.open-mpi.org/community/lists/devel/2015/08/17842.php
2015-10-13 08:52:02 +09:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Jeff Squyres
1448522d15 In an MPI_IBCAST, we cannot shortcut if there's only 1 process.
cmr=v1.7.4:reviewer=brbarret:subject=Fix IBCAST for COMM_SELF
-This line, and those below, will be ignored--

M    c/ibcast.c

This commit was SVN r30054.
2013-12-22 22:55:58 +00:00
Brian Barrett
b9e8e4aeb9 * Initial merge of the non-blocking collectives interface. No implementation of
the back-end yet, coming real soon now, need to solve some tag issues first.

This commit was SVN r26641.
2012-06-22 20:54:12 +00:00