openmpi/ompi/mca/coll/sm/coll_sm_allreduce.c
Jeff Squyres 533633b8cb Fixes trac:1988. The little bug that turned out to be huge. Yoinks.
* Various cosmetic/style updates in the btl sm
 * Clean up concept of mpool module (I think that code was written way
   back when the concept of "modules" was fuzzy)
 * Bring over some old fixes from the /tmp/timattox-sm-coll/ tree to
   fix potential segv's when mmap'ed regions were at different
   addresses in different processes (thanks Tim!).
 * Change sm coll to no longer use mpool as its main source of shmem;
   rather, just mmap its own segment (because it's fixed size --
   there was nothing to be gained by using mpool; shedding the use of
   mpool saved a lot of complexity in the sm coll setup).  This
   effectively made Tim's fixes moot (because now everything is an
   offset into the mmap that is computed locally; there are no global
   pointers).  :-)
 * Slightly updated common/sm to allow making mmap's for a specific
   set of procs (vs. "all" procs in the process).  This potentially
   allows for same-host-inter-proc mmaps -- yay!
 * Fixed many, many things in the coll sm (particularly in reduce):
   * Fixed handling of MPI_IN_PLACE in reduce and allreduce
   * Fixed handling of non-contiguous datatypes in reduce
   * Changed the order of reductions to go from process (n-1)'s data
     to process 0's data, because that's how all other OMPI coll
     components work
   * Fixed lots of usage of ddt functions
   * When using a non-contiguous datatype, if the root process is not
     (n-1), we now use a 2nd convertor to copy from shmem to the rbuf
     (saves a memory copy vs. what was done before)
   * Lots and lots of little cleanups, clarifications, and minor
     optimizations (although still more could be done -- e.g., I think
     the use of write memory barriers is fairly sub-optimal; they
     could be ganged together at the root, for example)

I'm marking this as "fixes trac:1988" and closing the ticket; if something
is still broken, we can re-open the ticket.

This commit was SVN r21967.

The following Trac tickets were found above:
  Ticket 1988 --> https://svn.open-mpi.org/trac/ompi/ticket/1988
2009-09-15 00:25:21 +00:00

61 lines
2.2 KiB
C

/*
 * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
 *                         University Research and Technology
 *                         Corporation.  All rights reserved.
 * Copyright (c) 2004-2005 The University of Tennessee and The University
 *                         of Tennessee Research Foundation.  All rights
 *                         reserved.
 * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
 *                         University of Stuttgart.  All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 *                         All rights reserved.
 * Copyright (c) 2009      Cisco Systems, Inc.  All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 */
/** @file */
#include "ompi_config.h"
#include "ompi/constants.h"
#include "ompi/communicator/communicator.h"
#include "coll_sm.h"
/**
 * Shared memory allreduce.
 *
 * For the moment, all we're doing is a reduce to root==0 and then a
 * broadcast.  It is possible that we'll do something better someday.
 */
int mca_coll_sm_allreduce_intra(void *sbuf, void *rbuf, int count,
                                struct ompi_datatype_t *dtype,
                                struct ompi_op_t *op,
                                struct ompi_communicator_t *comm,
                                mca_coll_base_module_t *module)
{
    int ret;

    /* Note that only the root can pass MPI_IN_PLACE to MPI_REDUCE, so
       have slightly different logic for that case: the root forwards
       MPI_IN_PLACE unchanged, while every other rank passes its rbuf
       (which holds its contribution) as the send buffer. */
    if (MPI_IN_PLACE == sbuf) {
        int rank = ompi_comm_rank(comm);
        if (0 == rank) {
            ret = mca_coll_sm_reduce_intra(sbuf, rbuf, count, dtype, op, 0,
                                           comm, module);
        } else {
            ret = mca_coll_sm_reduce_intra(rbuf, NULL, count, dtype, op, 0,
                                           comm, module);
        }
    } else {
        ret = mca_coll_sm_reduce_intra(sbuf, rbuf, count, dtype, op, 0,
                                       comm, module);
    }

    /* Broadcast the reduced result from root 0 only if the reduce
       succeeded; otherwise return the reduce error code. */
    return (ret == OMPI_SUCCESS) ?
        mca_coll_sm_bcast_intra(rbuf, count, dtype, 0, comm, module) : ret;
}
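
The reduce-then-broadcast pattern above can be illustrated without MPI. The following is a minimal sketch, not Open MPI code: it simulates the ranks as array slots inside one process, assumes an integer sum as the operation, and walks the contributions from rank (n-1) down to rank 0 as described in the commit message. The names `sketch_allreduce_sum` and `SKETCH_IN_PLACE` are hypothetical and exist only for this illustration; `SKETCH_IN_PLACE` is a stand-in for `MPI_IN_PLACE`, not the real constant.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for MPI_IN_PLACE (not the real MPI constant). */
static const int sketch_in_place_marker = 0;
#define SKETCH_IN_PLACE (&sketch_in_place_marker)

/*
 * Simulated sum-allreduce over nprocs "ranks" living in one process.
 * sbuf[r] is rank r's send buffer (or SKETCH_IN_PLACE), rbuf[r] is its
 * receive buffer.  Returns 0 on success, -1 on allocation failure.
 */
static int sketch_allreduce_sum(const int *const *sbuf, int *const *rbuf,
                                int nprocs, int count)
{
    int *acc;
    int r, i;

    assert(nprocs > 0 && count > 0);

    /* Zero-initialized accumulator, so the root's rbuf can be both an
       input (in-place case) and the output without aliasing trouble. */
    acc = calloc((size_t) count, sizeof(int));
    if (NULL == acc) {
        return -1;
    }

    /* Step 1: reduce to root 0, walking from rank (n-1) down to rank 0.
       A rank that passed SKETCH_IN_PLACE contributes its rbuf instead. */
    for (r = nprocs - 1; r >= 0; --r) {
        const int *contrib =
            (SKETCH_IN_PLACE == sbuf[r]) ? rbuf[r] : sbuf[r];
        for (i = 0; i < count; ++i) {
            acc[i] += contrib[i];
        }
    }
    memcpy(rbuf[0], acc, (size_t) count * sizeof(int));
    free(acc);

    /* Step 2: broadcast the root's result to every other rank. */
    for (r = 1; r < nprocs; ++r) {
        memcpy(rbuf[r], rbuf[0], (size_t) count * sizeof(int));
    }
    return 0;
}
```

With three simulated ranks contributing {1, 2}, {10, 20}, and {100, 200}, every receive buffer ends up holding {111, 222}, whether the contributions arrive in separate send buffers or in place in the receive buffers. The real component differs in that each rank is a separate process and the data moves through a shared memory segment.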