MPICH2 for "small" commutative operations in the reduce_scatter basic
implementation. "small" is currently pretty big, as it doesn't take
much to beat reduce/scatterv. Need to do much more than this for
better all around performance of MPI_Reduce_scatter, but this was enough
to solve the problems I was having.
This commit was SVN r13348.
Found another places that we were incorrectly casting a C++ MPI handle
array to the corresponding C array type and hoping for the best (which
won't work at all). This commit fixes things so that we now do the
proper conversion between C<-->C++ handles.
This commit was SVN r13346.
and George on these refinements):
* Rename the static OBJ initializer macro to be
OPAL_OBJ_STATIC_INIT(class)
* Ensure that all static OBJ initializations get a refcount of 1
(doesn't ''really'' matter, since they're static, it should never
get to the point where the OBJ is DESTRUCTed, but more correct
nonetheless)
* Add a "magic number" to the OBJ when compiling with debug support.
The magic number does some rudimentary support to ensure that
you're operating on a valid OBJ (and fails an assertion if you're
not). Check to ensure that the memory contains the magic number
when performing actions of OBJ's. Also remove the magic number
when DESTRUCTing OBJs, so that if, for example, an OBJ is
DESTRUCTed more than once, we'll fail the magic number assert.
This commit was SVN r13338.
The following SVN revision numbers were found above:
r13227 --> open-mpi/ompi@96030de97b
r13228 --> open-mpi/ompi@c2e9075d29
- post isends in reverse order of posting irecvs.
if the messages arrive approximately in order, this should
minimize the time spent in matching the requests.
I did not see any performance difference over MX up to 64 nodes, but
the change makes sense and may have some impact when we have (many)
more nodes.
This commit was SVN r13337.
configured with --disable-mpi-cxx so that the default -I flags in the
wrapper compilers don't point to a directory that doesn't exist.
Thanks to Martin Audet for identifying the problem.
This commit was SVN r13296.
- Allreduce algorithms:
- Recursive doubling is used for small messages (up to 10KB) and can be used for
both commutative and non-commutative operations.
Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and
Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with
MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes.
- Ring algorithms performs well for larger messages but cannot be used for
non-commutative operations. It passed the same tests as recursive doubling, except
some of the non-commutative tests in Intel benchmarks Allreduce_loc_c and Allreduce_user_c
(which was expected).
- MPI_Allreduce with new decision function passed all of the tests mentioned above.
- Cleaning up coll_tuned_util. Moving isendrecv to static inline just like sendrecv.
This commit was SVN r13252.
not the component. This potentially allows for a mix of HCAs that
support eager RDMA and those who do not on a port-by-port basis.
This commit was SVN r13242.
* If the text to cite where the problem occurred is "\n", prettyprint
somethign a little nicer so that it's clear that we're talking
about the end of line
* Add a missing help message ("ini file:unknown field"), and display
it a little better (i.e., show the erroneous field, not a
misleading "end of line" marker)
* It's "OpenIB", not "Open IB"
This commit was SVN r13241.
needlessly registered in multiple different places, and none of them
had a good help string. There was also an inconsistent check for
setting both mpi_leave_pinned and mpi_leave_pinned_pipeline (i.e., it
was only in ob1). This commit moves the registration of these params
to one central place (ompi/runtime/ompi_mpi_params.c, with all other
mpi_* MCA params) and uses globals to propagate the values as
relevant. The error check was also moved to the central location to
ensure that we can consistency everywhere.
This commit was SVN r13226.
return the buffer address from Fortran. It is not expected
behavior. For MPI_Buffer_attach, adjust the address of
the buffer handed in so it is always aligned.
Refs trac:750
Buffer detach reviewed by Jeff Squyres
Buffer attach alignment reviewed by George Bosilca
This commit was SVN r13205.
The following Trac tickets were found above:
Ticket 750 --> https://svn.open-mpi.org/trac/ompi/ticket/750
OB1 always use first element from array of BTLs available for RDMA. The patch
change the array creation algorithm, it puts different BTL in the first element
in round robin fashion.
This commit was SVN r13174.
to the F90 binding for MPI_INITIALIZED was wrong (should have been
logical, not integer).
Fixes trac:782.
This commit was SVN r13170.
The following Trac tickets were found above:
Ticket 782 --> https://svn.open-mpi.org/trac/ompi/ticket/782
- removing static qualification on ompi_coll_tuned_sendrecv
- adding ompi_coll_tuned_isendrecv function which posts isend and irecv requests
These changes are separate from but necessary for new algorithms I am working on.
This commit was SVN r13161.
so there's no longer a race there (I used to do the unlock request last, after local completion of all the
requests completed, to try to avoid having the passive side reply to the active side, but I don't do that
anymore). The unlock side will not "unlock" the window until it actually receives the correct number of results,
so we're good there.
This fixes an issue where we would receive data on the remote side we weren't expecting that could cause
us to release a lock before it really should have been released to the requesting peer. It could also
cause a deadlock if one of the processes trying to unlock was "self", as that would result in the active
unlock never sending the unlock request, even though it sent the payload, which could cause a counter
that should always be positive to hit -1, causing an infinite loop that could only be solved by
popping up the stack, which was an impossibility.
Refs trac:785
This commit was SVN r13160.
The following Trac tickets were found above:
Ticket 785 --> https://svn.open-mpi.org/trac/ompi/ticket/785
during their send calls by dropping the loop through the list of pending control messages if any are marked
as completed.
Refs trac:784
This commit was SVN r13159.
The following Trac tickets were found above:
Ticket 784 --> https://svn.open-mpi.org/trac/ompi/ticket/784
MPI_Aint. On 64-bit big endian machines, these can have some unpleasent
issues.
Refs trac:734
This commit was SVN r13140.
The following Trac tickets were found above:
Ticket 734 --> https://svn.open-mpi.org/trac/ompi/ticket/734
Otherwise, we end up recursively calling into the progress functions
and corrupting a list that doesn't like to be corrupted.
Refs trac:561
This commit was SVN r13138.
The following Trac tickets were found above:
Ticket 561 --> https://svn.open-mpi.org/trac/ompi/ticket/561