3742c3550c
deactivated by default. It is activated by setting either of the following two MCA parameters to a value greater than 0:

 * coll_sync_barrier_before
 * coll_sync_barrier_after

If coll_sync_barrier_before is >0, the sync coll component inserts itself before the underlying collective operations and invokes a barrier before every Nth collective operation (N == coll_sync_barrier_before). The same applies to coll_sync_barrier_after. Note that N is a _per communicator_ value; it is not global to the MPI process.

If both are 0 (which is the default), this component returns NULL for the comm query, meaning that it is not inserted into the coll module stack.

The intent of this component is to provide a workaround for applications with large numbers of collectives of short messages. Specifically, some iterative collective communication patterns can cause unbounded unexpected messages. Forcing a barrier before or after every Nth collective operation prevents that behavior by forcing applications to synchronize (and thereby consume any outstanding unexpected messages caused by collectives on the same communicator). Open MPI still needs to bound the resources consumed by unexpected messages at the receiver, but this is a viable workaround for at least some symptoms of the problem.

Additionally, there has been anecdotal evidence of some applications that "perform better" when they put barriers after other collective operations. This could be due to many factors, including a shorter unexpected message queue. Putting this component in Open MPI allows people to try this with their own applications and give real-world feedback on this kind of behavior.

This commit was SVN r20584.
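The "barrier every Nth collective, counted per communicator" rule described above can be sketched in isolation. This is a minimal model, not the component's code: `sync_state_t`, `sync_tick`, and `sync_collective` are hypothetical names, the barrier is simulated by a counter, and treating a value of 0 or less as "disabled" is an assumption drawn from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-communicator state; this is NOT the real
 * mca_coll_sync_module_t layout. */
typedef struct {
    int before_count;  /* collectives seen since the last "before" barrier */
    int after_count;   /* collectives seen since the last "after" barrier  */
    int barriers_run;  /* number of simulated barriers                      */
} sync_state_t;

/* Advance one counter; return true when the Nth collective is reached.
 * n <= 0 means the feature is off (assumed from "values greater than 0"). */
static bool sync_tick(int *counter, int n)
{
    assert(counter != NULL);
    if (n <= 0) {
        return false;
    }
    if (++(*counter) == n) {
        *counter = 0;
        return true;
    }
    return false;
}

/* Simulate one collective on the communicator that owns `s`. */
static void sync_collective(sync_state_t *s, int n_before, int n_after)
{
    assert(s != NULL);
    if (sync_tick(&s->before_count, n_before)) {
        s->barriers_run++;          /* a barrier would run here */
    }
    /* ... the underlying collective would run here ... */
    if (sync_tick(&s->after_count, n_after)) {
        s->barriers_run++;          /* a barrier would run here */
    }
}
```

Because each communicator carries its own `sync_state_t`, ten collectives on one communicator with N = 3 trigger three barriers there and none on any other communicator. In a real run the two parameters would typically be set on the command line, for example `mpirun --mca coll_sync_barrier_before 3 ./app`, where `./app` is a placeholder.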
51 lines
1.8 KiB
C
/*
 * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
 *                         University Research and Technology
 *                         Corporation.  All rights reserved.
 * Copyright (c) 2004-2005 The University of Tennessee and The University
 *                         of Tennessee Research Foundation.  All rights
 *                         reserved.
 * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
 *                         University of Stuttgart.  All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 *                         All rights reserved.
 * Copyright (c) 2009      Cisco Systems, Inc.  All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 */

#include "ompi_config.h"

#include "coll_sync.h"


/*
 *	gather
 *
 *	Function:	- gather
 *	Accepts:	- same arguments as MPI_Gather()
 *	Returns:	- MPI_SUCCESS or error code
 */
int mca_coll_sync_gather(void *sbuf, int scount,
                         struct ompi_datatype_t *sdtype,
                         void *rbuf, int rcount,
                         struct ompi_datatype_t *rdtype,
                         int root, struct ompi_communicator_t *comm,
                         mca_coll_base_module_t *module)
{
    mca_coll_sync_module_t *s = (mca_coll_sync_module_t*) module;

    if (s->in_operation) {
        return s->c_coll.coll_gather(sbuf, scount, sdtype,
                                     rbuf, rcount, rdtype, root, comm,
                                     s->c_coll.coll_gather_module);
    } else {
        COLL_SYNC(s, s->c_coll.coll_gather(sbuf, scount, sdtype,
                                           rbuf, rcount, rdtype, root, comm,
                                           s->c_coll.coll_gather_module));
    }
}
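
The `in_operation` test in `mca_coll_sync_gather` suggests a re-entrancy guard: when a collective is already running under the sync wrapper, a nested call goes straight to the underlying implementation. The `COLL_SYNC` macro is defined in `coll_sync.h`, which is not shown here, so the following is only a sketch of how such a guard might behave. Every name in it (`mock_module_t`, `mock_sync_call`, `leaf_op`, `composite_op`) is hypothetical, and no real MPI calls are made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct mock_module mock_module_t;
typedef int (*mock_op_fn)(mock_module_t *m);

/* Hypothetical stand-in for the module state; not the real structure. */
struct mock_module {
    bool in_operation;      /* true while a wrapped collective is running   */
    int  wrapped_calls;     /* calls that went through the sync bookkeeping */
    int  passthrough_calls; /* nested calls that skipped it                 */
    int  leaf_runs;         /* times the innermost operation actually ran   */
};

/* Run `op` under the guard. Only the outermost call does the bookkeeping
 * (where barrier counting would live); nested calls pass straight through. */
static int mock_sync_call(mock_module_t *m, mock_op_fn op)
{
    int err;

    assert(m != NULL && op != NULL);
    if (m->in_operation) {
        m->passthrough_calls++;
        return op(m);
    }
    m->in_operation = true;
    m->wrapped_calls++;
    err = op(m);
    m->in_operation = false;
    return err;
}

/* Innermost operation: stands in for an underlying collective. */
static int leaf_op(mock_module_t *m)
{
    m->leaf_runs++;
    return 0;
}

/* A collective built on top of another one: it re-enters the wrapper. */
static int composite_op(mock_module_t *m)
{
    return mock_sync_call(m, leaf_op);
}
```

Calling `mock_sync_call(&m, composite_op)` on a zero-initialized module performs the bookkeeping once, lets the nested call pass through, and leaves `in_operation` cleared afterwards. Without the guard, a collective implemented in terms of other collectives would be counted more than once toward N.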