533633b8cb
* Various cosmetic/style updates in the btl sm
* Clean up concept of mpool module (I think that code was written way back when the concept of "modules" was fuzzy)
* Bring over some old fixes from the /tmp/timattox-sm-coll/ tree to fix potential segv's when mmap'ed regions were at different addresses in different processes (thanks Tim!).
* Change sm coll to no longer use mpool as its main source of shmem; rather, just mmap its own segment (because it's fixed size -- there was nothing to be gained by using mpool; shedding the use of mpool saved a lot of complexity in the sm coll setup). This effectively made Tim's fixes moot (because now everything is an offset into the mmap that is computed locally; there are no global pointers). :-)
* Slightly updated common/sm to allow making mmap's for a specific set of procs (vs. ''all'' procs in the process). This potentially allows for same-host-inter-proc mmaps -- yay!
* Fixed many, many things in the coll sm (particularly in reduce):
  * Fixed handling of MPI_IN_PLACE in reduce and allreduce
  * Fixed handling of non-contiguous datatypes in reduce
  * Changed the order of reductions to go from process (n-1)'s data to process 0's data, because that's how all other OMPI coll components work
  * Fixed lots of usage of ddt functions
  * When using a non-contiguous datatype, if the root process is not (n-1), now we used a 2nd convertor to copy from shmem to the rbuf (saves a memory copy vs. what was done before)
* Lots and lots of little cleanups, clarifications, and minor optimizations (although still more could be done -- e.g., I think the use of write memory barriers is fairly sub-optimal; they could be ganged together at the root, for example)

I'm marking this as "fixes trac:1988" and closing the ticket; if something is still broken, we can re-open the ticket.

This commit was SVN r21967.

The following Trac tickets were found above:
  Ticket 1988 --> https://svn.open-mpi.org/trac/ompi/ticket/1988
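
The commit message notes that everything is now an offset into the mmap that is computed locally, with no global pointers. A minimal sketch of that idea follows. It is not the coll sm code: `seg_header_t`, `seg_init`, and `seg_data` are hypothetical names invented for illustration, and ordinary buffers stand in for mmap'ed regions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header stored at the start of a shared segment.
 * It records where the payload lives as a byte offset, never as a pointer. */
typedef struct {
    size_t data_offset;   /* bytes from the segment base to the payload */
    size_t data_len;      /* payload length in bytes */
} seg_header_t;

/* Write the header into a freshly mapped segment of seg_len bytes. */
static void seg_init(void *base, size_t seg_len, size_t data_len)
{
    seg_header_t *hdr = (seg_header_t *) base;

    assert(sizeof(*hdr) + data_len <= seg_len);
    hdr->data_offset = sizeof(*hdr);
    hdr->data_len = data_len;
}

/* Resolve the payload address against this process's own mapping base. */
static void *seg_data(void *base)
{
    const seg_header_t *hdr = (const seg_header_t *) base;

    return (char *) base + hdr->data_offset;
}
```

Because the segment holds only offsets, each process can call `seg_data()` with whatever base address its own mapping returned, so it does not matter if the region lands at different addresses in different processes.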
61 lines
2.2 KiB
C
/*
 * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
 *                         University Research and Technology
 *                         Corporation.  All rights reserved.
 * Copyright (c) 2004-2005 The University of Tennessee and The University
 *                         of Tennessee Research Foundation.  All rights
 *                         reserved.
 * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
 *                         University of Stuttgart.  All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 *                         All rights reserved.
 * Copyright (c) 2009      Cisco Systems, Inc.  All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 */
/** @file */

#include "ompi_config.h"

#include "ompi/constants.h"
#include "ompi/communicator/communicator.h"
#include "coll_sm.h"


/**
 * Shared memory allreduce.
 *
 * For the moment, all we're doing is a reduce to root==0 and then a
 * broadcast.  It is possible that we'll do something better someday.
 */
int mca_coll_sm_allreduce_intra(void *sbuf, void *rbuf, int count,
                                struct ompi_datatype_t *dtype,
                                struct ompi_op_t *op,
                                struct ompi_communicator_t *comm,
                                mca_coll_base_module_t *module)
{
    int ret;

    /* Note that only the root can pass MPI_IN_PLACE to MPI_REDUCE, so
       have slightly different logic for that case. */

    if (MPI_IN_PLACE == sbuf) {
        int rank = ompi_comm_rank(comm);
        if (0 == rank) {
            ret = mca_coll_sm_reduce_intra(sbuf, rbuf, count, dtype, op, 0,
                                           comm, module);
        } else {
            ret = mca_coll_sm_reduce_intra(rbuf, NULL, count, dtype, op, 0,
                                           comm, module);
        }
    } else {
        ret = mca_coll_sm_reduce_intra(sbuf, rbuf, count, dtype, op, 0,
                                       comm, module);
    }
    return (ret == OMPI_SUCCESS) ?
        mca_coll_sm_bcast_intra(rbuf, count, dtype, 0, comm, module) : ret;
}
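
The reduce-then-broadcast structure above, including the MPI_IN_PLACE case, can be modelled in a single process. The sketch below is only an illustration under stated assumptions: it does not call Open MPI, it hard-codes an integer sum, and every name in it (`SIM_NPROCS`, `SIM_MAX_COUNT`, `SIM_IN_PLACE`, `sim_reduce_sum`, `sim_allreduce_sum`) is hypothetical. Arrays indexed by rank stand in for the per-process buffers.

```c
#include <assert.h>
#include <string.h>

#define SIM_NPROCS 4       /* hypothetical number of simulated ranks */
#define SIM_MAX_COUNT 16   /* hypothetical cap on elements per buffer */

/* Hypothetical stand-in for MPI_IN_PLACE: a unique sentinel address. */
static int sim_in_place_marker;
#define SIM_IN_PLACE ((void *) &sim_in_place_marker)

/* Simulated sum-reduce to root 0.  Contributions are combined from rank
 * (n-1) down to rank 0, the order the commit message describes.  A local
 * accumulator is used because the root's input may alias root_rbuf. */
static void sim_reduce_sum(int *contrib[SIM_NPROCS], int *root_rbuf, int count)
{
    int acc[SIM_MAX_COUNT];

    assert(count >= 0 && count <= SIM_MAX_COUNT);
    memcpy(acc, contrib[SIM_NPROCS - 1], (size_t) count * sizeof(int));
    for (int rank = SIM_NPROCS - 2; rank >= 0; --rank) {
        for (int i = 0; i < count; ++i) {
            acc[i] += contrib[rank][i];
        }
    }
    memcpy(root_rbuf, acc, (size_t) count * sizeof(int));
}

/* Simulated allreduce: reduce to root 0, then copy the result to all ranks. */
static void sim_allreduce_sum(void *sbuf[SIM_NPROCS], int *rbuf[SIM_NPROCS],
                              int count)
{
    int *contrib[SIM_NPROCS];

    for (int rank = 0; rank < SIM_NPROCS; ++rank) {
        /* With the in-place marker, a rank's input lives in its receive
         * buffer, so that buffer is what feeds the reduction. */
        contrib[rank] = (SIM_IN_PLACE == sbuf[rank]) ? rbuf[rank]
                                                     : (int *) sbuf[rank];
    }
    sim_reduce_sum(contrib, rbuf[0], count);

    /* "Broadcast" from root 0 to every other rank. */
    for (int rank = 1; rank < SIM_NPROCS; ++rank) {
        memcpy(rbuf[rank], rbuf[0], (size_t) count * sizeof(int));
    }
}
```

The branch on `SIM_IN_PLACE` mirrors the non-root case in `mca_coll_sm_allreduce_intra`, where `rbuf` is handed to the reduce as the send buffer because only the root may pass MPI_IN_PLACE to a reduce.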