When we are only using local ranks basesmuma needs to provide an allreduce
function for both large and small message or else the coll/ml selection
logic will fail. In the future this logic should probably be updated to
just disable allreduce in coll/ml instead of disabling coll/ml. For now
it should be correct to say the basesmuma allgather works for larger
messages.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31190.
a hierarchy actually matches a bcol that is in use.
There was a bug in one of the paths to calculate the ml buffer size. I fixed
the bug and squashed all the paths together to avoid further issues (the
result was correct in another path that calculated the same value).
Additionally, the i_hier was being used as the bcol_index. This is not
correct in a couple of cases so I added a variable to keep track of the
real bcol_index.
cmr=v1.8:reviewer=pasha
This commit was SVN r31189.
bound.
This case is correctly handled by coll/ml so remove the check that diables
coll/ml in the not bound case.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31188.
This patch fixes two leaks:
- Fix typo in fallback collective code that caused coll/ml to retain
the ibcast module twice but only release it once. One of those ibcast
saves was supposed to be bcast.
- Do not check for module initialization in the module destructor. It
is possible to destruct a module that is partially setup.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31187.
Without this, an attribute copy function could return non-success, but
it would not be propagated upwards. This caused the intel
MPI_Keyval3_* tests to fail.
cmr=v1.8:reviewer=hjelmn
This commit was SVN r31147.
When you call MPI_Graph_create with a old_comm of size N, and pass
nnodes=(N=1), then the Nth proc is supposed to get MPI_COMM_NULL out.
The code in this base function didn't properly handle the proc(s) that
are supposed to get MPI_COMM_NULL out.
cmr=v1.7.5:reviewer=hjelmn
This commit was SVN r31145.
This isn't causing any errors that I know about but it does fix an
annoying valgrind warning. Simple fix, no review required.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r31130.
There are situations where coll/ml does not initialize properly. These will
eventually need to be fixed but in the meantime it is better to not always
print an error message because the collective framework can still fall back
on another collective module. This commit reduces the verbose output.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31129.
It is usually not a good idea to assert when something is not implemented
or something goes wrong. Replace asserts with debug output and return.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31128.
The necessary information is stored in the proc object. There is no need
to allgather the local process data to determine if another rank is on
the same socket.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31127.
After discussion with Manju we decided to update these the process count
limits of the shared memory collectives to an arbitrarily large number.
cmr=v1.7.5:ticket=trac:4405
This commit was SVN r31126.
The following SVN revision numbers were found above:
r31096 --> open-mpi/ompi@3f469d08e7
The following Trac tickets were found above:
Ticket 4405 --> https://svn.open-mpi.org/trac/ompi/ticket/4405
* Ensure that all endpoints[x] values are initialized to NULL
* If ibv_create_ah fails, remove each endpoint from the
module->all_endpoints list so that the endpoint can be destructed
properly.
Submitted by Jeff Squyres, reviewed by Dave Goodell.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r31111.
This provides full locality - i.e., not just node-level, but all the way down to whatever common binding level exists between the procs.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31106.
Also fixed spelling: IS_NOT_RECHABLE -> IS_NOT_REACHABLE.
Also mark a few places where opal_show_help() should have been used;
Manju will take care of these.
This commit was SVN r31104.
In r31071 I modified the logic to not increment the hierarchy level if
no processes were selected by that sbgp. That fixed a problem seen on
systems where we don't support process binding. The problem is there
is a case where we actually did select processes yet the number of
selected processes is 0. We need to increment the hierarchy in this case
as well.
This should fix the segmentation fault found by recent MTT runs. Once
this is committed to 1.7.5 remove the .ompi_ignore's from coll/ml and
bcol/ptpcoll. Tested with ompi-tests/ibm.
cmr=v1.7.5:reviewer=rhc
This commit was SVN r31081.
The following SVN revision numbers were found above:
r31071 --> open-mpi/ompi@1911d97044
This was causing JVMs to run out of stack space, and all manner of
badness ensued.
Instead, use the heap -- that's what it's there for.
cmr=v1.7.5:reviewer=rhc:subject=make coll/ml use the heap for large debug array
This commit was SVN r31073.
fails to select any processes on any nodes.
Also modified basesmsocket to only print debugging info to the framework
output.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31071.
These parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).
This commit was SVN r31056.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
* Several parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).
* Added missing PMPI F08 OMPI interfaces
This commit was SVN r31049.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
These parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).
This commit was SVN r31048.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
It is not valid to call flush outside a passive target epoch nor is
it valid to call lock/lock_all when no_locks is set. In the former
we were just semantically incorrect and the later would crash and
burn.
cmr=v1.7.5:ticket=trac:4382
This commit was SVN r31046.
The following Trac tickets were found above:
Ticket 4382 --> https://svn.open-mpi.org/trac/ompi/ticket/4382
This fixes a bug in r31029 which removes the use of the pml base request
(also not a good way since cm doesn't use the base request). We now allocate
a data structure (ugh) to determine the needed information. Tested with
mtt/onesided.
cmr=v1.7.5:ticket=trac:4379
This commit was SVN r31044.
The following SVN revision numbers were found above:
r31029 --> open-mpi/ompi@29e00f9161
The following Trac tickets were found above:
Ticket 4379 --> https://svn.open-mpi.org/trac/ompi/ticket/4379
- Return an error if the caller specified both MPI_MODE_NOPRECEDE and
MPI_MODE_NOSUCCEED to MPI_Win_fence.
- Return an error if the caller attempts to enter an active target
epoch while already in a passive target epoch.
- End an active target epoch if MPI_Win_fence is called with
MPI_MODE_NOSUCCEED.
cmr=v1.7.5:ticket=trac:4382
This commit was SVN r31043.
The following Trac tickets were found above:
Ticket 4382 --> https://svn.open-mpi.org/trac/ompi/ticket/4382
need to enable the access epoch in MPI_Win_fence.
I missed this change when I fixed the semantics of MPI_Win_create. With
this commit our one-sided MTT runs are now running clean.
cmr=v1.7.5:reviewer=dgoodell
This commit was SVN r31041.
See https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/377
This ticket adds the following functions to the standard:
- MPI_T_cvar_get_index, MPI_T_pvar_get_index, and MPI_T_category_get_index
The ticket has passed and the functions are part of MPI-3.1 that will
be released sometime later this year. In Open MPI the functions expose
existing internal functionality so they are low-risk to add to 1.8.0. I
will leave it up to Ralph whether he wants to accept these into 1.8.
cmr=v1.8:reviewer=rhc
This commit was SVN r31037.
We no longer specify interfaces with choice buffers in the TKR "mpi"
module implementation -- MPI-3 prohibits it (see r30169 and r30170 for
more details).
cmr=v1.7.5:ticket=trac:4372
This commit was SVN r31033.
The following SVN revision numbers were found above:
r30169 --> open-mpi/ompi@759ee33fd4
r30170 --> open-mpi/ompi@776f6144af
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
It seems we can't release accumulate buffers in completion callbacks
because the btls don't release registration resources until after the
callback has fired. The fix is to keep track of the unused buffers and
free them later. This should resolve issues when running IMB-EXT and
IMB-RMA.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31029.
Issue found by Absoft MTT runs.
cmr=v1.7.5:ticket=trac:4372
This commit was SVN r31028.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372