2005-11-07 01:05:50 +03:00
/*
 * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
 *                         University Research and Technology
 *                         Corporation.  All rights reserved.
 * Copyright (c) 2004-2015 The University of Tennessee and The University
 *                         of Tennessee Research Foundation.  All rights
 *                         reserved.
 * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
 *                         University of Stuttgart.  All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 *                         All rights reserved.
 * Copyright (c) 2008      Sun Microsystems, Inc.  All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 */

#include "ompi_config.h"

#include "mpi.h"
#include "ompi/constants.h"
- Split the datatype engine into two parts: an MPI specific part in
OMPI
and a language agnostic part in OPAL. The convertor is completely
moved into OPAL. This offers several benefits as described in RFC
http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
namely:
- Fewer basic types (int* and float* types, boolean and wchar
- Fixing naming scheme to ompi-nomenclature.
- Usability outside of the ompi-layer.
- Due to the fixed nature of simple opal types, their information is
completely
known at compile time and therefore constified
- With fewer datatypes (22), the actual sizes of bit-field types may be
reduced
from 64 to 32 bits, allowing reorganizing the opal_datatype
structure, eliminating holes and keeping data required in convertor
(upon send/recv) in one cacheline...
This has implications to the convertor-datastructure and other parts
of the code.
- Several performance tests have been run, the netpipe latency does not
change with
this patch on Linux/x86-64 on the smoky cluster.
- Extensive tests have been done to verify correctness (no new
regressions) using:
1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and
ompi-ddt:
a. running both trunk and ompi-ddt resulted in no differences
(except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run
correctly).
b. with --enable-memchecker and running under valgrind (one buglet
when run with static found in test-suite, commited)
2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
all passed (except for the dynamic/ tests failed!! as trunk/MTT)
3. compilation and usage of HDF5 tests on Jaguar using PGI and
PathScale compilers.
4. compilation and usage on Scicortex.
- Please note, that for the heterogeneous case, (-m32 compiled
binaries/ompi), neither
ompi-trunk, nor ompi-ddt branch would successfully launch.
This commit was SVN r21641.
2009-07-13 08:56:31 +04:00
#include "ompi/datatype/ompi_datatype.h"
#include "ompi/communicator/communicator.h"
#include "ompi/mca/coll/base/base.h"
#include "ompi/mca/coll/coll.h"
#include "ompi/mca/coll/base/coll_tags.h"
#include "coll_tuned.h"

/*
 *   Notes on evaluation rules and ordering
 *
 *   The order is:
 *      use file based rules if presented (-coll_tuned_dynamic_rules_filename = rules)
 *   Else
 *      use forced rules (-coll_tuned_dynamic_ALG_intra_algorithm = algorithm-number)
 *   Else
 *      use fixed (compiled) rule set (or nested ifs)
 *
 */
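
The evaluation order described in the notes above can be sketched in isolation. This is a minimal illustration only: `sketch_module`, `sketch_decide`, and the `decision_source` enum are hypothetical stand-ins, not the real `mca_coll_tuned_module_t` or any Open MPI API. It assumes, as the functions in this file do, that an algorithm number of 0 means "no choice made".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real Open MPI types. */
enum decision_source { FROM_FILE_RULES, FROM_FORCED, FROM_FIXED };

struct decision {
    enum decision_source source;  /* which tier produced the choice */
    int algorithm;                /* chosen algorithm number */
};

struct sketch_module {
    int has_file_rules;       /* nonzero if file based rules exist for this collective */
    int file_rule_algorithm;  /* algorithm the file rules yield; 0 means no match */
    int forced_algorithm;     /* user-forced algorithm; 0 means not forced */
    int fixed_algorithm;      /* what the compiled-in rule set would choose */
};

/* Apply the three tiers in order: file rules, then forced, then fixed. */
struct decision sketch_decide(const struct sketch_module *m)
{
    struct decision d;

    assert(m != NULL);

    if (m->has_file_rules && m->file_rule_algorithm != 0) {
        d.source = FROM_FILE_RULES;
        d.algorithm = m->file_rule_algorithm;
        return d;
    }
    if (m->forced_algorithm != 0) {
        d.source = FROM_FORCED;
        d.algorithm = m->forced_algorithm;
        return d;
    }
    d.source = FROM_FIXED;
    d.algorithm = m->fixed_algorithm;
    return d;
}
```

Note that a rules file that yields no match (algorithm 0) does not end the search: the forced setting is consulted next, and the fixed rule set is the final fallback.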

/*
 *  allreduce_intra
 *
 *  Function:   - allreduce using other MPI collectives
 *  Accepts:    - same as MPI_Allreduce()
 *  Returns:    - MPI_SUCCESS or error code
 */
int
ompi_coll_tuned_allreduce_intra_dec_dynamic (void *sbuf, void *rbuf, int count,
                                             struct ompi_datatype_t *dtype,
                                             struct ompi_op_t *op,
                                             struct ompi_communicator_t *comm,
                                             mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream, "ompi_coll_tuned_allreduce_intra_dec_dynamic"));

    /* check to see if we have some filebased rules */
    if (tuned_module->com_rules[ALLREDUCE]) {
        /* we do, so calc the message size or what ever we need and use this for the evaluation */
        int alg, faninout, segsize, ignoreme;
        size_t dsize;

        ompi_datatype_type_size (dtype, &dsize);
        dsize *= count;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[ALLREDUCE],
                                                        dsize, &faninout, &segsize, &ignoreme);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_allreduce_intra_do_this (sbuf, rbuf, count, dtype, op,
                                                            comm, module,
                                                            alg, faninout, segsize);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[ALLREDUCE].algorithm) {
        return ompi_coll_tuned_allreduce_intra_do_forced (sbuf, rbuf, count, dtype, op,
                                                          comm, module);
    }
    return ompi_coll_tuned_allreduce_intra_dec_fixed (sbuf, rbuf, count, dtype, op,
                                                      comm, module);
}

/*
 *    alltoall_intra_dec
 *
 *    Function:    - selects alltoall algorithm to use
 *    Accepts:    - same arguments as MPI_Alltoall()
 *    Returns:    - MPI_SUCCESS or error code (passed from the alltoall implementation)
 */

int ompi_coll_tuned_alltoall_intra_dec_dynamic(void *sbuf, int scount,
                                               struct ompi_datatype_t *sdtype,
                                               void* rbuf, int rcount,
                                               struct ompi_datatype_t *rdtype,
                                               struct ompi_communicator_t *comm,
                                               mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream, "ompi_coll_tuned_alltoall_intra_dec_dynamic"));

    /* check to see if we have some filebased rules */
    if (tuned_module->com_rules[ALLTOALL]) {
        /* we do, so calc the message size or what ever we need and use this for the evaluation */
        int comsize;
        int alg, faninout, segsize, max_requests;
        size_t dsize;

        ompi_datatype_type_size (sdtype, &dsize);
        comsize = ompi_comm_size(comm);
        dsize *= (ptrdiff_t)comsize * (ptrdiff_t)scount;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[ALLTOALL],
                                                        dsize, &faninout, &segsize, &max_requests);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_alltoall_intra_do_this (sbuf, scount, sdtype,
                                                           rbuf, rcount, rdtype,
                                                           comm, module,
                                                           alg, faninout, segsize, max_requests);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[ALLTOALL].algorithm) {
        return ompi_coll_tuned_alltoall_intra_do_forced (sbuf, scount, sdtype,
                                                         rbuf, rcount, rdtype,
                                                         comm, module);
    }
    return ompi_coll_tuned_alltoall_intra_dec_fixed (sbuf, scount, sdtype,
                                                     rbuf, rcount, rdtype,
                                                     comm, module);
}
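
The alltoall decision above keys its rule lookup on the total number of bytes sent, and widens `comsize` and `scount` before multiplying them. A minimal sketch of why the widening matters follows; `sketch_alltoall_bytes` is a hypothetical helper written for illustration, not part of Open MPI, and it uses `size_t` casts where the original uses `ptrdiff_t`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: total bytes an alltoall sends from one rank,
 * i.e. type_size * comm_size * scount.
 *
 * The two int operands are converted to size_t before the multiplication.
 * Multiplying them as plain ints could overflow (undefined behavior in C)
 * for large communicators or counts, before the result is ever widened.
 */
size_t sketch_alltoall_bytes(size_t type_size, int comm_size, int scount)
{
    assert(comm_size >= 0 && scount >= 0);
    return type_size * ((size_t)comm_size * (size_t)scount);
}
```

For example, with a 4-byte type, 100000 ranks, and a count of 100000, the int product 100000 * 100000 does not fit in a 32-bit int, whereas the widened product does fit in a 64-bit `size_t`.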

/*
 *    Function:    - selects alltoallv algorithm to use
 *    Accepts:    - same arguments as MPI_Alltoallv()
 *    Returns:    - MPI_SUCCESS or error code
 */

int ompi_coll_tuned_alltoallv_intra_dec_dynamic(void *sbuf, int *scounts, int *sdisps,
                                                struct ompi_datatype_t *sdtype,
                                                void* rbuf, int *rcounts, int *rdisps,
                                                struct ompi_datatype_t *rdtype,
                                                struct ompi_communicator_t *comm,
                                                mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream, "ompi_coll_tuned_alltoallv_intra_dec_dynamic"));

    /**
     * check to see if we have some filebased rules. As we don't have global
     * knowledge about the total amount of data, use the first available rule.
     * This allows users to specify the alltoallv algorithm to be used based
     * only on the communicator size.
     */
    if (tuned_module->com_rules[ALLTOALLV]) {
        int alg, faninout, segsize, max_requests;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[ALLTOALLV],
                                                        0, &faninout, &segsize, &max_requests);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_alltoallv_intra_do_this (sbuf, scounts, sdisps, sdtype,
                                                            rbuf, rcounts, rdisps, rdtype,
                                                            comm, module,
                                                            alg);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[ALLTOALLV].algorithm) {
        return ompi_coll_tuned_alltoallv_intra_do_forced(sbuf, scounts, sdisps, sdtype,
                                                         rbuf, rcounts, rdisps, rdtype,
                                                         comm, module);
    }
    return ompi_coll_tuned_alltoallv_intra_dec_fixed(sbuf, scounts, sdisps, sdtype,
                                                     rbuf, rcounts, rdisps, rdtype,
                                                     comm, module);
}
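
The alltoallv decision above passes a message size of 0 so that "the first available rule" is used. A rough sketch of how such a lookup could behave is shown below. It assumes, without confirmation from this file, that rules are sorted by ascending message-size threshold and that the last rule whose threshold does not exceed the message size wins; `sketch_msg_rule` and `sketch_lookup` are hypothetical names and do not reproduce `ompi_coll_tuned_get_target_method_params`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rule entry: "from min_msg_size bytes upward, use algorithm". */
struct sketch_msg_rule {
    size_t min_msg_size;
    int algorithm;
};

/*
 * Assumed lookup policy (an illustration, not the Open MPI implementation):
 * rules are sorted by ascending min_msg_size, and the last rule whose
 * threshold is <= msg_size is selected. Returns 0 if no rule applies.
 */
int sketch_lookup(const struct sketch_msg_rule *rules, size_t n_rules,
                  size_t msg_size)
{
    int alg = 0;
    size_t i;

    assert(rules != NULL || n_rules == 0);

    for (i = 0; i < n_rules; i++) {
        if (rules[i].min_msg_size > msg_size) {
            break;  /* later rules only apply to larger messages */
        }
        alg = rules[i].algorithm;
    }
    return alg;
}
```

Under this policy, a lookup with `msg_size` 0 can only match a rule whose threshold is 0, which is typically the first entry; that is why a caller without knowledge of the total data volume ends up selecting by communicator size alone.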

/*
 *    barrier_intra_dec
 *
 *    Function:    - selects barrier algorithm to use
 *    Accepts:    - same arguments as MPI_Barrier()
 *    Returns:    - MPI_SUCCESS or error code (passed from the barrier implementation)
 */
int ompi_coll_tuned_barrier_intra_dec_dynamic(struct ompi_communicator_t *comm,
                                              mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream,"ompi_coll_tuned_barrier_intra_dec_dynamic"));

    /* check to see if we have some filebased rules */
    if (tuned_module->com_rules[BARRIER]) {
        /* we do, so calc the message size or what ever we need and use this for the evaluation */
        int alg, faninout, segsize, ignoreme;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[BARRIER],
                                                        0, &faninout, &segsize, &ignoreme);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_barrier_intra_do_this (comm, module,
                                                          alg, faninout, segsize);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[BARRIER].algorithm) {
        return ompi_coll_tuned_barrier_intra_do_forced (comm, module);
    }
    return ompi_coll_tuned_barrier_intra_dec_fixed (comm, module);
}

/*
 *   bcast_intra_dec
 *
 *   Function:   - selects broadcast algorithm to use
 *   Accepts:    - same arguments as MPI_Bcast()
 *   Returns:    - MPI_SUCCESS or error code (passed from the bcast implementation)
 */
int ompi_coll_tuned_bcast_intra_dec_dynamic(void *buff, int count,
                                            struct ompi_datatype_t *datatype, int root,
                                            struct ompi_communicator_t *comm,
                                            mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream, "coll:tuned:bcast_intra_dec_dynamic"));

    /* check to see if we have some filebased rules */
    if (tuned_module->com_rules[BCAST]) {
        /* we do, so calc the message size or what ever we need and use this for the evaluation */
        int alg, faninout, segsize, ignoreme;
        size_t dsize;

        ompi_datatype_type_size (datatype, &dsize);
        dsize *= count;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[BCAST],
                                                        dsize, &faninout, &segsize, &ignoreme);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_bcast_intra_do_this (buff, count, datatype, root,
                                                        comm, module,
                                                        alg, faninout, segsize);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[BCAST].algorithm) {
        return ompi_coll_tuned_bcast_intra_do_forced (buff, count, datatype, root,
                                                      comm, module);
    }
    return ompi_coll_tuned_bcast_intra_dec_fixed (buff, count, datatype, root,
                                                  comm, module);
}

/*
 *    reduce_intra_dec
 *
 *    Function:    - selects reduce algorithm to use
 *    Accepts:    - same arguments as MPI_Reduce()
 *    Returns:    - MPI_SUCCESS or error code (passed from the reduce implementation)
 *
 */
int ompi_coll_tuned_reduce_intra_dec_dynamic( void *sendbuf, void *recvbuf,
                                              int count, struct ompi_datatype_t* datatype,
                                              struct ompi_op_t* op, int root,
                                              struct ompi_communicator_t* comm,
                                              mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream, "coll:tuned:reduce_intra_dec_dynamic"));

    /* check to see if we have some filebased rules */
    if (tuned_module->com_rules[REDUCE]) {

        /* we do, so calc the message size or what ever we need and use this for the evaluation */
        int alg, faninout, segsize, max_requests;
        size_t dsize;

        ompi_datatype_type_size (datatype, &dsize);
        dsize *= count;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[REDUCE],
                                                        dsize, &faninout, &segsize, &max_requests);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_reduce_intra_do_this (sendbuf, recvbuf, count, datatype,
                                                         op, root,
                                                         comm, module,
                                                         alg, faninout,
                                                         segsize,
                                                         max_requests);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[REDUCE].algorithm) {
        return ompi_coll_tuned_reduce_intra_do_forced (sendbuf, recvbuf, count, datatype,
                                                       op, root,
                                                       comm, module);
    }
    return ompi_coll_tuned_reduce_intra_dec_fixed (sendbuf, recvbuf, count, datatype,
                                                   op, root,
                                                   comm, module);
}

/*
 *    reduce_scatter_intra_dec
 *
 *    Function:    - selects reduce_scatter algorithm to use
 *    Accepts:    - same arguments as MPI_Reduce_scatter()
 *    Returns:    - MPI_SUCCESS or error code (passed from
 *                  the reduce_scatter implementation)
 *
 */
int ompi_coll_tuned_reduce_scatter_intra_dec_dynamic(void *sbuf, void *rbuf,
                                                     int *rcounts,
                                                     struct ompi_datatype_t *dtype,
                                                     struct ompi_op_t *op,
                                                     struct ompi_communicator_t *comm,
                                                     mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream, "coll:tuned:reduce_scatter_intra_dec_dynamic"));

    /* check to see if we have some filebased rules */
    if (tuned_module->com_rules[REDUCESCATTER]) {
        /* we do, so calc the message size or what ever we need and use
           this for the evaluation */
        int alg, faninout, segsize, ignoreme, i, count, size;
        size_t dsize;
        size = ompi_comm_size(comm);
        for (i = 0, count = 0; i < size; i++) { count += rcounts[i];}
        ompi_datatype_type_size (dtype, &dsize);
        dsize *= count;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[REDUCESCATTER],
                                                        dsize, &faninout,
                                                        &segsize, &ignoreme);
        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_reduce_scatter_intra_do_this (sbuf, rbuf, rcounts,
                                                                 dtype, op,
                                                                 comm, module,
                                                                 alg, faninout,
                                                                 segsize);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[REDUCESCATTER].algorithm) {
        return ompi_coll_tuned_reduce_scatter_intra_do_forced (sbuf, rbuf, rcounts,
                                                               dtype, op,
                                                               comm, module);
    }
    return ompi_coll_tuned_reduce_scatter_intra_dec_fixed (sbuf, rbuf, rcounts,
                                                           dtype, op,
                                                           comm, module);
}
|
|
|
|
|
/*
 *    allgather_intra_dec
 *
 *    Function:    - selects allgather algorithm to use
 *    Accepts:     - same arguments as MPI_Allgather()
 *    Returns:     - MPI_SUCCESS or error code (passed from the selected
 *                   allgather function).
 */

int ompi_coll_tuned_allgather_intra_dec_dynamic(void *sbuf, int scount,
                                                struct ompi_datatype_t *sdtype,
                                                void* rbuf, int rcount,
                                                struct ompi_datatype_t *rdtype,
                                                struct ompi_communicator_t *comm,
                                                mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream,
                 "ompi_coll_tuned_allgather_intra_dec_dynamic"));

    if (tuned_module->com_rules[ALLGATHER]) {
        /* We have file based rules:
           - calculate message size and other necessary information */
        int comsize;
        int alg, faninout, segsize, ignoreme;
        size_t dsize;

        ompi_datatype_type_size (sdtype, &dsize);
        comsize = ompi_comm_size(comm);
        dsize *= (ptrdiff_t)comsize * (ptrdiff_t)scount;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[ALLGATHER],
                                                        dsize, &faninout, &segsize, &ignoreme);
        if (alg) {
            /* we have found a valid choice from the file based rules for
               this message size */
            return ompi_coll_tuned_allgather_intra_do_this (sbuf, scount, sdtype,
                                                            rbuf, rcount, rdtype,
                                                            comm, module,
                                                            alg, faninout, segsize);
        }
    }

    /* We do not have file based rules */
    if (tuned_module->user_forced[ALLGATHER].algorithm) {
        /* User-forced algorithm */
        return ompi_coll_tuned_allgather_intra_do_forced (sbuf, scount, sdtype,
                                                          rbuf, rcount, rdtype,
                                                          comm, module);
    }

    /* Use default decision */
    return ompi_coll_tuned_allgather_intra_dec_fixed (sbuf, scount, sdtype,
                                                      rbuf, rcount, rdtype,
                                                      comm, module);
}
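
/*
 * Illustration only, not part of the tuned component: the function above
 * follows a three-step cascade (file based rules, then a user-forced
 * algorithm, then the fixed decision). The minimal sketch below mirrors that
 * control flow under simplifying assumptions. Every name in it
 * (sketch_module, sketch_source, sketch_decide) is hypothetical, and the rule
 * lookup is reduced to a stored integer where 0 means "no matching rule".
 */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the three places a decision can come from. */
enum sketch_source {
    SKETCH_FROM_FILE_RULES  = 1,
    SKETCH_FROM_USER_FORCED = 2,
    SKETCH_FROM_FIXED       = 3
};

/* Hypothetical stand-in for the per-collective state of the tuned module. */
struct sketch_module {
    int has_file_rules;   /* plays the role of com_rules[COLL] != NULL        */
    int file_rule_alg;    /* result of the rule lookup; 0 means no match      */
    int user_forced_alg;  /* plays the role of user_forced[COLL].algorithm    */
};

/* Return which step of the cascade would pick the algorithm. */
enum sketch_source sketch_decide(const struct sketch_module *m)
{
    if (m->has_file_rules) {
        int alg = m->file_rule_alg;
        if (alg) {
            return SKETCH_FROM_FILE_RULES;
        }
        /* A rule set exists but nothing matched: fall through. */
    }
    if (m->user_forced_alg) {
        return SKETCH_FROM_USER_FORCED;
    }
    return SKETCH_FROM_FIXED;
}
```

/*
 * Note that a rule set that yields no match does not end the search: the
 * sketch, like the function above, still checks the user-forced setting
 * before falling back to the fixed decision.
 */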

/*
 *    allgatherv_intra_dec
 *
 *    Function:    - selects allgatherv algorithm to use
 *    Accepts:     - same arguments as MPI_Allgatherv()
 *    Returns:     - MPI_SUCCESS or error code (passed from the selected
 *                   allgatherv function).
 */

int ompi_coll_tuned_allgatherv_intra_dec_dynamic(void *sbuf, int scount,
                                                 struct ompi_datatype_t *sdtype,
                                                 void* rbuf, int *rcounts,
                                                 int *rdispls,
                                                 struct ompi_datatype_t *rdtype,
                                                 struct ompi_communicator_t *comm,
                                                 mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream,
                 "ompi_coll_tuned_allgatherv_intra_dec_dynamic"));

    if (tuned_module->com_rules[ALLGATHERV]) {
        /* We have file based rules:
           - calculate message size and other necessary information */
        int comsize, i;
        int alg, faninout, segsize, ignoreme;
        size_t dsize, total_size;

        comsize = ompi_comm_size(comm);
        ompi_datatype_type_size (sdtype, &dsize);
        total_size = 0;
        for (i = 0; i < comsize; i++) { total_size += dsize * rcounts[i]; }

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[ALLGATHERV],
                                                        total_size, &faninout, &segsize, &ignoreme);
        if (alg) {
            /* we have found a valid choice from the file based rules for
               this message size */
            return ompi_coll_tuned_allgatherv_intra_do_this (sbuf, scount, sdtype,
                                                             rbuf, rcounts,
                                                             rdispls, rdtype,
                                                             comm, module,
                                                             alg, faninout, segsize);
        }
    }

    /* We do not have file based rules */
    if (tuned_module->user_forced[ALLGATHERV].algorithm) {
        /* User-forced algorithm */
        return ompi_coll_tuned_allgatherv_intra_do_forced (sbuf, scount, sdtype,
                                                           rbuf, rcounts,
                                                           rdispls, rdtype,
                                                           comm, module);
    }

    /* Use default decision */
    return ompi_coll_tuned_allgatherv_intra_dec_fixed (sbuf, scount, sdtype,
                                                       rbuf, rcounts,
                                                       rdispls, rdtype,
                                                       comm, module);
}
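
/*
 * Illustration only, not part of the tuned component: the allgatherv decision
 * above keys its rule lookup on the total number of bytes received, i.e. the
 * element size times the sum of the per-rank receive counts. The sketch below
 * isolates that arithmetic. The helper name sketch_allgatherv_total_size is
 * hypothetical, and the counts are assumed to be non-negative.
 */

```c
#include <assert.h>
#include <stddef.h>

/* Sum dsize * rcounts[i] over all comsize ranks (counts assumed >= 0). */
size_t sketch_allgatherv_total_size(size_t dsize, const int *rcounts, int comsize)
{
    size_t total = 0;
    for (int i = 0; i < comsize; i++) {
        total += dsize * (size_t)rcounts[i];
    }
    return total;
}
```

/*
 * For example, with 4-byte elements and receive counts {1, 2, 3} the helper
 * returns 4 * (1 + 2 + 3) = 24 bytes, which is the value that would be
 * compared against the message-size thresholds of the rule set.
 */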

int ompi_coll_tuned_gather_intra_dec_dynamic(void *sbuf, int scount,
                                             struct ompi_datatype_t *sdtype,
                                             void* rbuf, int rcount,
                                             struct ompi_datatype_t *rdtype,
                                             int root,
                                             struct ompi_communicator_t *comm,
                                             mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream,
                 "ompi_coll_tuned_gather_intra_dec_dynamic"));

    /**
     * check to see if we have some filebased rules.
     */
    if (tuned_module->com_rules[GATHER]) {
        int comsize, alg, faninout, segsize, max_requests;
        size_t dsize;

        comsize = ompi_comm_size(comm);
        ompi_datatype_type_size (sdtype, &dsize);
        dsize *= comsize;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[GATHER],
                                                        dsize, &faninout, &segsize, &max_requests);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_gather_intra_do_this (sbuf, scount, sdtype,
                                                         rbuf, rcount, rdtype,
                                                         root, comm, module,
                                                         alg, faninout, segsize);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[GATHER].algorithm) {
        return ompi_coll_tuned_gather_intra_do_forced (sbuf, scount, sdtype,
                                                       rbuf, rcount, rdtype,
                                                       root, comm, module);
    }

    return ompi_coll_tuned_gather_intra_dec_fixed (sbuf, scount, sdtype,
                                                   rbuf, rcount, rdtype,
                                                   root, comm, module);
}

int ompi_coll_tuned_scatter_intra_dec_dynamic(void *sbuf, int scount,
                                              struct ompi_datatype_t *sdtype,
                                              void* rbuf, int rcount,
                                              struct ompi_datatype_t *rdtype,
                                              int root, struct ompi_communicator_t *comm,
                                              mca_coll_base_module_t *module)
{
    mca_coll_tuned_module_t *tuned_module = (mca_coll_tuned_module_t*) module;

    OPAL_OUTPUT((ompi_coll_tuned_stream,
                 "ompi_coll_tuned_scatter_intra_dec_dynamic"));

    /**
     * check to see if we have some filebased rules.
     */
    if (tuned_module->com_rules[SCATTER]) {
        int comsize, alg, faninout, segsize, max_requests;
        size_t dsize;

        comsize = ompi_comm_size(comm);
        ompi_datatype_type_size (sdtype, &dsize);
        dsize *= comsize;

        alg = ompi_coll_tuned_get_target_method_params (tuned_module->com_rules[SCATTER],
                                                        dsize, &faninout, &segsize, &max_requests);

        if (alg) {
            /* we have found a valid choice from the file based rules for this message size */
            return ompi_coll_tuned_scatter_intra_do_this (sbuf, scount, sdtype,
                                                          rbuf, rcount, rdtype,
                                                          root, comm, module,
                                                          alg, faninout, segsize);
        } /* found a method */
    } /*end if any com rules to check */

    if (tuned_module->user_forced[SCATTER].algorithm) {
        return ompi_coll_tuned_scatter_intra_do_forced (sbuf, scount, sdtype,
                                                        rbuf, rcount, rdtype,
                                                        root, comm, module);
    }

    return ompi_coll_tuned_scatter_intra_dec_fixed (sbuf, scount, sdtype,
                                                    rbuf, rcount, rdtype,
                                                    root, comm, module);
}