1
1
openmpi/ompi/mca/coll/sync/help-coll-sync.txt
Jeff Squyres 3742c3550c Add "sync" collective component. This component is totally
deactivated by default.  It is activated by setting either of the
following two MCA parameters to values greater than 0:

 * coll_sync_barrier_before
 * coll_sync_barrier_after

If !_before is >0, then the sync coll collective will insert itself
before the underlying collective operations and invoke a barrier
before every Nth barrier (N == coll_sync_barrier_before).  Similar for
!_after.  Note that N is a _per communicator_ value; not global to the
MPI process.

If both are 0 (which is the default), this component returns NULL for
the comm query, meaning that it is not insertted into the coll module
stack. 

The intent of this component is to provide a a workaround for
applications with large numbers of collectives of short messages that
can cause unbounded unexpected messages.  Specifically, it is possible
for some iterative collective communication patterns to cause
unbounded unexpected messages.  Forcing a barrier before or after
every Nth collective operation would prevent that behavior by forcing
applications to synchronize (and thereby consume any outstanding
unexpected messages caused by collectives on the same communicator).

Open MPI still needs to bound unexpected messages resource consumption
at the receiver, but this is a viable workaround for at least some
symptoms of the problem.

Additionally, there has been anecdotal evidence of some applications
that "perfom better" when they put barriers after other collective
operations.  This could be due to many factors -- including shortening
the unexpected message queue.  Putting this component in Open MPI
allows people to try this with their own applications and give real
world feedback on this kind of behavior.

This commit was SVN r20584.
2009-02-18 23:32:44 +00:00

24 строки
619 B
Plaintext

# -*- text -*-
#
# Copyright (c) 2009 Cisco Systems, Inc. All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the US/English general help file for Open MPI's sync
# collective component.
#
[missing collective]
The sync collective component in Open MPI was activated on a
communicator where it did not find an underlying collective operation
defined. This usually means that the sync collective module's
priority was not set high enough. Please try increasing sync's
priority.
Local host: %s
Sync coll module priority: %d
First discovered missing collective: %s