deactivated by default. It is activated by setting either of the
following two MCA parameters to values greater than 0:
* coll_sync_barrier_before
* coll_sync_barrier_after
If coll_sync_barrier_before is >0, then the sync coll component will
insert itself before the underlying collective operations and invoke a
barrier before every Nth collective operation
(N == coll_sync_barrier_before). Similarly for coll_sync_barrier_after.
Note that N is a _per communicator_ value; it is not global to the
MPI process.
If both are 0 (which is the default), this component returns NULL for
the comm query, meaning that it is not inserted into the coll module
stack.
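The counting rule described above can be sketched in C as follows. This
is only an illustration, not the component's actual code: the
sync_state_t struct and the sync_comm_query, sync_barrier_before and
sync_barrier_after functions are hypothetical names, and the real
component would invoke the underlying barrier rather than return a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-communicator state; not the component's real struct. */
typedef struct {
    int before_interval; /* stands in for coll_sync_barrier_before */
    int after_interval;  /* stands in for coll_sync_barrier_after */
    int before_count;    /* collectives seen since the last "before" barrier */
    int after_count;     /* collectives seen since the last "after" barrier */
} sync_state_t;

/* Mirrors the comm query rule: if neither interval is positive, decline
 * by returning NULL so the component stays out of the coll module stack. */
sync_state_t *sync_comm_query(sync_state_t *storage, int before, int after)
{
    assert(storage != NULL);
    if (before <= 0 && after <= 0) {
        return NULL;
    }
    storage->before_interval = before > 0 ? before : 0;
    storage->after_interval = after > 0 ? after : 0;
    storage->before_count = 0;
    storage->after_count = 0;
    return storage;
}

/* Advance one counter; report true on every Nth call when interval > 0. */
static bool sync_tick(int interval, int *count)
{
    if (interval <= 0) {
        return false;
    }
    if (++(*count) < interval) {
        return false;
    }
    *count = 0;
    return true;
}

/* Call once per collective, before the underlying operation runs. */
bool sync_barrier_before(sync_state_t *s)
{
    return sync_tick(s->before_interval, &s->before_count);
}

/* Call once per collective, after the underlying operation returns. */
bool sync_barrier_after(sync_state_t *s)
{
    return sync_tick(s->after_interval, &s->after_count);
}
```

Because each communicator carries its own sync_state_t, the counters
advance independently, which matches the per-communicator meaning of N.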
The intent of this component is to provide a workaround for
applications with large numbers of collectives of short messages.
Specifically, it is possible for some iterative collective
communication patterns of this kind to cause unbounded unexpected
messages. Forcing a barrier before or after
every Nth collective operation would prevent that behavior by forcing
applications to synchronize (and thereby consume any outstanding
unexpected messages caused by collectives on the same communicator).
Open MPI still needs to bound the resource consumption of unexpected messages
at the receiver, but this is a viable workaround for at least some
symptoms of the problem.
Additionally, there has been anecdotal evidence of some applications
that "perform better" when they put barriers after other collective
operations. This could be due to many factors -- including shortening
the unexpected message queue. Putting this component in Open MPI
allows people to try this with their own applications and give real
world feedback on this kind of behavior.
This commit was SVN r20584.
Also create an MCA parameter to force daemonization (the previous
behavior), which might be needed on larger clusters or
to make use of the -notify flag with qsub.
This fixes trac:1783.
This commit was SVN r20582.
The following Trac tickets were found above:
Ticket 1783 --> https://svn.open-mpi.org/trac/ompi/ticket/1783
* The main thing done here is to convert from multiple FIFOs/queues per
receiver (each receiver has one FIFO for each sender) to a single FIFO/queue
per receiver (all senders sharing the same FIFO for a given receiver).
* This requires rewriting the FIFO support, so that
ompi/class/ompi_[circular_buffer_]fifo.h is no longer used and FIFO
support is instead in btl_sm.h.
* The number of FIFOs per receiver is actually an MCA tunable parameter,
but it appears that 1 or possibly 2 FIFOs (even for 112 local processes)
per receiver is sufficient.
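The shared-FIFO layout described above can be sketched in C as follows.
This is a single-threaded illustration only and is not the code in
btl_sm.h: the names shared_fifo_t, fifo_item_t, fifo_write, fifo_read
and fifo_index_for_sender are hypothetical, the modulo mapping of
senders onto FIFOs is an assumption, and a real shared-memory FIFO
would also need locking or atomic operations, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_SLOTS 8 /* illustrative capacity, not a real default */

/* One queued entry, tagged with the sender that wrote it. */
typedef struct {
    int sender;
    int payload;
} fifo_item_t;

/* A single circular queue that all senders of one receiver share. */
typedef struct {
    fifo_item_t slots[FIFO_SLOTS];
    size_t head;  /* next slot to read */
    size_t tail;  /* next slot to write */
    size_t count; /* number of occupied slots */
} shared_fifo_t;

void fifo_init(shared_fifo_t *f)
{
    f->head = 0;
    f->tail = 0;
    f->count = 0;
}

/* Assumed mapping: spread senders over the receiver's FIFOs by rank.
 * With num_fifos == 1, every sender lands on the same FIFO. */
size_t fifo_index_for_sender(int sender_rank, size_t num_fifos)
{
    assert(sender_rank >= 0 && num_fifos > 0);
    return (size_t)sender_rank % num_fifos;
}

/* Append an item; returns false if the FIFO is full. */
bool fifo_write(shared_fifo_t *f, int sender, int payload)
{
    if (f->count == FIFO_SLOTS) {
        return false;
    }
    f->slots[f->tail].sender = sender;
    f->slots[f->tail].payload = payload;
    f->tail = (f->tail + 1) % FIFO_SLOTS;
    f->count++;
    return true;
}

/* Remove the oldest item; returns false if the FIFO is empty. */
bool fifo_read(shared_fifo_t *f, fifo_item_t *out)
{
    if (f->count == 0) {
        return false;
    }
    *out = f->slots[f->head];
    f->head = (f->head + 1) % FIFO_SLOTS;
    f->count--;
    return true;
}
```

The receiver polls one queue instead of one queue per sender, and each
entry carries its sender so the receiver can still tell them apart.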
This commit was SVN r20578.
known-bad memory access pattern. Specifically, a NULL pointer is
passed in a system call as part of a probe to figure out which
affinity API this system has. We know it's a NULL and we did it on
purpose, so don't have Valgrind yell about it.
This commit was SVN r20572.
Often, orte/util/show_help.h is included although none of its
functionality is required; most often what is actually needed is
opal_output.h or orte/mca/rml/rml_types.h.
Please see orte_show_help_replacement.sh, committed next.
- Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
actually showed two *missing* #include "orte/util/show_help.h"
in orte/mca/odls/base/odls_base_default_fns.c and
in orte/tools/orte-top/orte-top.c
Manually added these.
Let's let MTT have the last word.
This commit was SVN r20557.
knows what it can and cannot free (these pointers are largely unused
and therefore otherwise uninitialized in user-defined op's and
MPI_REPLACE).
This commit was SVN r20532.