http://www.open-mpi.org/community/lists/devel/2014/04/14496.php
Revamp the opal database framework, including renaming it to "dstore" to reflect that it isn't a "database". Move the "db" framework to ORTE for now, soon to move to ORCM
This commit was SVN r31557.
This commit fixes a bug that can cause request and communicator leaks
when cleaning up an OSC window. The should prevent a hang seen with
IMB-EXT.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r31539.
We will track #4568 from the 1.8 CMR.
Closes trac:4568
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r31535.
The following Trac tickets were found above:
Ticket 4568 --> https://svn.open-mpi.org/trac/ompi/ticket/4568
This commit will improve the message rate when using the sendi function
by not waiting for the send to get to the remote process.
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r31526.
feature
This commit should fix a hang seen when running some of the one-sided
tests. The downside of this fix is it reduces the maximum size of the
messages that use the fast boxes. I will fix this in a later commit.
To improve performance under a heavy load I introduced sequencing to
ensure messages are given to the pml in order. I have seen little-no
impact on the message rate or latency with this change and there is a
clear improvement to the heavy message rate case.
Lets let this sit in the trunk for a couple of days to ensure that
everything is working correctly.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r31522.
Patch from Gilles Gouaillardet on #4517 to fix handling 0-sized
messages in coll tuned with MPI_ALLTOALLV and MPI_IN_PLACE.
Reviewed by Jeff Squyres.
Fixes trac:4517
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r31521.
The following Trac tickets were found above:
Ticket 4517 --> https://svn.open-mpi.org/trac/ompi/ticket/4517
Patch from Gilles Gouaillardet on #4506 to correctly handle 0-sized
messages in coll/basic MPI_Alltoallv and MPI_Alltoallw.
Reviewed by Jeff Squyres.
Fixes trac:4506.
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r31519.
The following Trac tickets were found above:
Ticket 4506 --> https://svn.open-mpi.org/trac/ompi/ticket/4506
Ensure to also OBJ_RELEASE the neightbor and ineighbor modules.
Fixes trac:4444 (this patch is from that ticket).
This commit was SVN r31516.
The following Trac tickets were found above:
Ticket 4444 --> https://svn.open-mpi.org/trac/ompi/ticket/4444
The algorithm was failing ibm/collective/allgather and iallgather. I
cleaned up the code to eliminate duplicate code paths and tracked the
issue down to an error in the way extra nodes in the knomial exchange
are handled. The new code is more compact and has been tested with up
to 64 ranks with the ibm test suite.
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31419.
The file coll_ml_ibarrier.c wasn't included in coll/ml's Makefile.am
and the setup code from coll_ml_hier_algorithms_ibarrier.c was not
being called. It looks like this code is stale and has long since been
replaced by the code in coll_ml_barrier.c
Once all these little CMRs are approved I may make it into one roll-up
CMR to make it easier on the RM.
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31418.
a segmentation fault in the reduce cleanup
Some of the changes address false warnings produced by scan-build. I
added asserts and changed some malloc calls to calloc to silence these
warnings.
The was one issue in cleanup for reduce since the component_functions
member is changed by the allreduce call. There may be other issues
with how this code works but releasing the allocated
component_functions after setting up the static functions addresses
the primary issue (SIGSEGV).
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31417.
Two things to note:
- This change will allow us to expand the BTL interface without
having to worry about modifying BTLs that will not support the new
interfaces. More on this will come later this year as part of the
1.9 series.
- C99 guarantees that uninitialed members of structs declared outside
of functions (DATA binary section) will be initialized with
0's. This allows us to drop stuff like .btl_flags = 0, or .btl_get
= NULL.
This commit was SVN r31388.
When sending PUT_LONG, the data is sent before headers, and sometimes
the header is not flushed immediately. This creates a lot of unexpected
receives in the peer, since it would posts a receive only when gets the
header, which makes it run out of receive buffers. When the sender
eventually flushes the window, the receiver already has no buffers to
receive the header, which causes a deadlock.
The fix is to always flush the headers when doing put_long.
cmr=v1.8.1:reviewer=hjelmn
This commit was SVN r31378.
While testing one-sided on LANL systems I found a couple more OSC
bugs that were not caught during the initial testing:
- In the passive target code we read the read lock count as a
char instead of the intended uint32_t. This causes lock to
lockup when using shared locks after 127 iterations.
- The post code used the wrong group when trying to increment post
counters. This causes a segmentation fault.
- Both the post and wait code used the wrong check in the inner
loop leading to an infinite loop.
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31354.
There was a typo in the ompi_osc_gacc_long_start that was causing a
segmentation fault when executing long get accumulate operations.
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31353.
some of the collective modules, the shared memory and the profiling
interface. I left out VT, dynamic fcoll and seq rmaps.
cmr=v1.8.1:reviewer=jsquyres:subject=silence Coverity reported warnings
This commit was SVN r31309.
This commit fixes two nasty races:
- One can occur if the connection request message and connection completion
message arrive out of order. This can happen normally when adaptive routing
is used and also in a timeout situation where a UD message is lost.
- One occurs when handling an ack at the same time as we are handling the
message timeout. In this case we can not free the message or the timeout
will be operating on invalid data. This fix is a band-aid until I can come
up with a better approach. Instead of freeing the message it is marked
as inactive and the event callback is triggered immediately (this has no
affect if the callback is already active). The callback then frees the
message if it is inactive.
cmr=v1.8.1:reviewer=pasha
This commit was SVN r31305.
The last fix prevented a hang but had some cases where the results were
wrong. Fixed. Tested with armci, openmpi/ibm, openmpi/onesided.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31284.
It might be possible (don't know) for a datatype to made of a contiguous block
of a primitive datatype and have an lb. If this is ever the case the code
would have done the wrong thing. Add the lb in to be safe.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31283.
Discussed this with Manju and we decided to back this one out until a later time.
This reverts commit r31188 and closes trac:4435
This commit was SVN r31282.
The following SVN revision numbers were found above:
r31188 --> open-mpi/ompi@f1dd589092
The following Trac tickets were found above:
Ticket 4435 --> https://svn.open-mpi.org/trac/ompi/ticket/4435
There are differences between how active and passive messages are
accounted for in this component. Active message counts on the sender
side are set to zero before the control message is sent so we do not
have to add one to the expected number of messages or we end up
double counting the control message. This commit should fix that error.
Fixes regression in one-sided/test_rma1
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31281.
There were a couple of issues with the memory leak fixes and several more verbose
issues. This fixes those issues.
cmr=v1.8.1:ticket=trac:4473
This commit was SVN r31273.
The following Trac tickets were found above:
Ticket 4473 --> https://svn.open-mpi.org/trac/ompi/ticket/4473
This fixes more issues identified by armci. More issues still remain and fixes are
coming for those as well.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31272.
Also, since I put some of the macros for these silent/verbose rules up
in the top-level Makefile.man-page-rules file, I renamed it to
Makefile.ompi-rules.
I've had this sitting around for a while; now seems like as good a
time as any to commit it.
This commit was SVN r31271.
directory not the job's
This bug didn't affect the correctness of the vader results just the
cleanup. This commit removes an error message about removing a non-existent
file.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31265.
Thanks to ggouaillardet for finding and fixing these issues.
Closes trac:4460
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31264.
The following Trac tickets were found above:
Ticket 4460 --> https://svn.open-mpi.org/trac/ompi/ticket/4460
If ompi_modex_recv() fails with OPAL_ERR_DATA_VALUE_NOT_FOUND, it
simply means that the peer process did not put any usnic BTL modex
info -- it is not an error. So have the usnic BTL simply ignore that
peer (vs. disqualifying itself / treating this like a real error).
Refs trac:4442.
This commit was SVN r31258.
The following Trac tickets were found above:
Ticket 4442 --> https://svn.open-mpi.org/trac/ompi/ticket/4442
This commit should finish the work started for #869. Closing that ticket
with this commit.
Closes trac:869
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31257.
The following Trac tickets were found above:
Ticket 869 --> https://svn.open-mpi.org/trac/ompi/ticket/869