* Superfluous use of MPI_User_function in comm_create_keyval_f08.F90
* Missed adding "value" keyword to function pointer arguments in pmpi
C interfaces
Submitted by Craig, reviewed by Jeff.
Refs trac:4512
This commit was SVN r31455.
The following Trac tickets were found above:
Ticket 4512 --> https://svn.open-mpi.org/trac/ompi/ticket/4512
Changed:
- Use ompi_mpi_group_null instead of MPI_GROUP_NULL.
- Asserts don't always quiet the clang static analyser. Change them to
ifs to really quiet the warnings.
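A minimal sketch of the assert-versus-if difference, assuming a build where NDEBUG may be defined. The function names and the -1 return value are hypothetical and are not taken from the Open MPI sources.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the guard is compiled out when NDEBUG is defined, so a
 * static analyzer can still see a possible NULL dereference. */
int example_first_asserted(const int *values)
{
    assert(NULL != values);
    return values[0];
}

/* After: the guard is an ordinary branch and is present in every build. */
int example_first_checked(const int *values)
{
    if (NULL == values) {
        return -1;   /* hypothetical error value */
    }
    return values[0];
}
```

The explicit branch costs one comparison, but the analyzer sees the same control flow in debug and optimized builds.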
cmr=v1.8.1:ticket=trac:4527:reviewer=jsquyres
This commit was SVN r31424.
The following Trac tickets were found above:
Ticket 4527 --> https://svn.open-mpi.org/trac/ompi/ticket/4527
The algorithm was failing ibm/collective/allgather and iallgather. I
cleaned up the code to eliminate duplicate code paths and tracked the
issue down to an error in the way extra nodes in the knomial exchange
are handled. The new code is more compact and has been tested with up
to 64 ranks with the ibm test suite.
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31419.
The file coll_ml_ibarrier.c wasn't included in coll/ml's Makefile.am
and the setup code from coll_ml_hier_algorithms_ibarrier.c was not
being called. It looks like this code is stale and has long since been
replaced by the code in coll_ml_barrier.c
Once all these little CMRs are approved I may make it into one roll-up
CMR to make it easier on the RM.
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31418.
a segmentation fault in the reduce cleanup
Some of the changes address false warnings produced by scan-build. I
added asserts and changed some malloc calls to calloc to silence these
warnings.
There was one issue in cleanup for reduce since the component_functions
member is changed by the allreduce call. There may be other issues
with how this code works but releasing the allocated
component_functions after setting up the static functions addresses
the primary issue (SIGSEGV).
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31417.
communicator code.
Many of the warnings were false warnings. These were silenced by
adding the appropriate asserts. Other warnings identified some
potential issues in error paths that should now be resolved.
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31416.
This commit addresses bugs discovered by ggouaillardet.
- Fix hang when creating an intercommunicator
- Fix memory leak
- Fix coverity warning cid70288
- Fix false coverity warning cid1196589
Fixes trac:4507
Fixes trac:4522
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31415.
The following Trac tickets were found above:
Ticket 4507 --> https://svn.open-mpi.org/trac/ompi/ticket/4507
Ticket 4522 --> https://svn.open-mpi.org/trac/ompi/ticket/4522
This man page contains the prototype and descriptions for both
MPI_TYPE_INDEXED and MPI_TYPE_CREATE_HINDEXED. Bastian Beischer
noticed that the type of the array_of_displacements argument in the
MPI_TYPE_CREATE_HINDEXED was wrong.
Also, a minor update to MPI_Type_hindexed.3in: indicate that the C
type is MPI_Aint and the Fortran type is INTEGER (which is why this
function was deprecated and then deleted by the MPI Forum!).
cmr=v1.8.1:reviewer=dgoodell
This commit was SVN r31411.
Nothing is generated in this file; this commit essentially just
renames ompi_config.h.in -> ompi_config.h.
cmr=v1.8.1:reviewer=dgoodell
This commit was SVN r31395.
It's a singular filename because there's only 1 interface in the
file. Also, r31375 missed updating the name in a few places, and
broke the build for compilers that supported the mpi_f08 interface.
This commit was SVN r31389.
The following SVN revision numbers were found above:
r31375 --> open-mpi/ompi@fe1935de14
Two things to note:
- This change will allow us to expand the BTL interface without
having to worry about modifying BTLs that will not support the new
interfaces. More on this will come later this year as part of the
1.9 series.
- C99 guarantees that uninitialized members of structs declared outside
of functions (DATA binary section) will be initialized with
0's. This allows us to drop stuff like .btl_flags = 0, or .btl_get
= NULL.
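As a sketch of the second point: the struct below is a hypothetical, much-reduced stand-in for a BTL module (only the member names btl_flags and btl_get come from the text above). It shows that members left out of a designated initializer at file scope end up as 0 / NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BTL module structure. */
struct example_btl_module {
    unsigned int btl_flags;
    int (*btl_put)(void *dst, const void *src, size_t len);
    int (*btl_get)(void *dst, const void *src, size_t len);
};

static int example_put(void *dst, const void *src, size_t len)
{
    (void) dst; (void) src; (void) len;
    return 0;
}

/* Static storage duration: members not named in the initializer are
 * zero-initialized, so btl_flags is 0 and btl_get is NULL here. */
struct example_btl_module example_module = {
    .btl_put = example_put,
};

/* Callers can test for an optional entry point before using it. */
int example_supports_get(const struct example_btl_module *module)
{
    assert(NULL != module);
    return NULL != module->btl_get;
}
```

A component that never sets a newly added member therefore reports it as unsupported without any source change.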
This commit was SVN r31388.
When sending PUT_LONG, the data is sent before the header, and sometimes
the header is not flushed immediately. This creates a lot of unexpected
receives on the peer, since it posts a receive only when it gets the
header, which makes it run out of receive buffers. When the sender
eventually flushes the window, the receiver has no buffers left to
receive the header, which causes a deadlock.
The fix is to always flush the headers when doing put_long.
cmr=v1.8.1:reviewer=hjelmn
This commit was SVN r31378.
Since we only build the "small" size of the "mpi" module now, it
does not take a long time to compile. So remove the warning that is
emitted.
Also remove a vestige of Windows support that was left over in the
Fortran area (i.e., building mpi.obj).
This commit was SVN r31374.
Differentiate the pre-defined attribute and conversion interfaces into
those with INTEGER handles and those with TYPE(MPI_*) handles.
Refs trac:4157
cmr=v1.8.1:ticket=trac:4512
This commit was SVN r31372.
The following Trac tickets were found above:
Ticket 4157 --> https://svn.open-mpi.org/trac/ompi/ticket/4157
Ticket 4512 --> https://svn.open-mpi.org/trac/ompi/ticket/4512
Use type(c_funptr) to "cast" the fortran function pointers to
arbitrary C pointers. In C, we then pick up the appropriate function
pointer type.
Tested with ifort 14.0.2 and gfortran 4.9 snapshot (which is what
identified that the previous method of passing function pointers was
not Fortran'08-compliant).
Refs trac:4157
This commit was SVN r31371.
The following Trac tickets were found above:
Ticket 4157 --> https://svn.open-mpi.org/trac/ompi/ticket/4157
Junchao Zhang pointed out to me that we had the wrong parameter name
and string length specification for the "version" parameter. This
matters because Fortran allows passing by parameter name
(vs. parameter ordering). Specifically, we had the interface as:
{{{
subroutine MPI_Get_library_version_f08(name,resultlen,ierror)
character(len=MPI_MAX_PROCESSOR_NAME), intent(out) :: name
...etc.
}}}
but it should be:
{{{
subroutine MPI_Get_library_version_f08(version,resultlen,ierror)
character(len=MPI_MAX_LIBRARY_VERSION_STRING), intent(out) :: version
...etc.
}}}
Thankfully, MPI_MAX_PROCESSOR_NAME and MPI_MAX_LIBRARY_VERSION_STRING
are both 255 in OMPI, so there's no ABI issue caused by changing the
length from MMPN --> MMLVS.
The ABI is also unaffected by the parameter name change: if you
compile/link an MPI application calling MPI_GET_LIBRARY_VERSION with
1.8, it'll still run-time link with this change.
However, if an MPI program uses parameter name passing with
the old/incorrect parameter name ("name"), it won't be able to compile
with the new/correct parameter name ("version"). But this will only
happen for an incorrect MPI application (because the MPI-3 mandated
parameter name is "version", not "name"), so they deserve what they
get.
cmr=v1.8.1:reviewer=dgoodell
This commit was SVN r31365.
While testing one-sided on LANL systems I found a couple more OSC
bugs that were not caught during the initial testing:
- In the passive target code we read the read lock count as a
char instead of the intended uint32_t. This causes lock to
lock up when using shared locks after 127 iterations.
- The post code used the wrong group when trying to increment post
counters. This causes a segmentation fault.
- Both the post and wait code used the wrong check in the inner
loop leading to an infinite loop.
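The first bug can be illustrated with a small sketch. The function names are hypothetical, and int8_t is used in place of char so the behaviour does not depend on whether plain char is signed. Converting an out-of-range value to a signed type is implementation-defined in C; on the usual two's-complement compilers 128 becomes -128.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy read: only the low 8 bits of the counter survive, as a
 * signed value. */
int32_t example_read_count_narrow(uint32_t stored)
{
    int8_t narrow = (int8_t) stored;
    return narrow;
}

/* Intended read: the full 32-bit counter. */
uint32_t example_read_count_full(uint32_t stored)
{
    return stored;
}

/* Usage: both reads agree up to 127 and diverge after that. */
void example_read_count_demo(void)
{
    assert(example_read_count_narrow(127) == 127);
    assert(example_read_count_narrow(128) != 128);
    assert(example_read_count_full(128) == 128);
}
```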
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31354.
There was a typo in the ompi_osc_gacc_long_start that was causing a
segmentation fault when executing long get accumulate operations.
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31353.
some of the collective modules, the shared memory and the profiling
interface. I left out VT, dynamic fcoll and seq rmaps.
cmr=v1.8.1:reviewer=jsquyres:subject=silence Coverity reported warnings
This commit was SVN r31309.
This commit fixes two nasty races:
- One can occur if the connection request message and connection completion
message arrive out of order. This can happen normally when adaptive routing
is used and also in a timeout situation where a UD message is lost.
- One occurs when handling an ack at the same time as we are handling the
message timeout. In this case we can not free the message or the timeout
will be operating on invalid data. This fix is a band-aid until I can come
up with a better approach. Instead of freeing the message it is marked
as inactive and the event callback is triggered immediately (this has no
effect if the callback is already active). The callback then frees the
message if it is inactive.
cmr=v1.8.1:reviewer=pasha
This commit was SVN r31305.
The last fix prevented a hang but had some cases where the results were
wrong. Fixed. Tested with armci, openmpi/ibm, openmpi/onesided.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31284.
It might be possible (don't know) for a datatype to be made of a contiguous block
of a primitive datatype and have an lb. If this is ever the case the code
would have done the wrong thing. Add the lb in to be safe.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31283.
Discussed this with Manju and we decided to back this one out until a later time.
This reverts commit r31188 and closes trac:4435
This commit was SVN r31282.
The following SVN revision numbers were found above:
r31188 --> open-mpi/ompi@f1dd589092
The following Trac tickets were found above:
Ticket 4435 --> https://svn.open-mpi.org/trac/ompi/ticket/4435
There are differences between how active and passive messages are
accounted for in this component. Active message counts on the sender
side are set to zero before the control message is sent so we do not
have to add one to the expected number of messages or we end up
double counting the control message. This commit should fix that error.
Fixes regression in one-sided/test_rma1
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31281.
There were a couple of issues with the memory leak fixes and several more verbose
issues. This fixes those issues.
cmr=v1.8.1:ticket=trac:4473
This commit was SVN r31273.
The following Trac tickets were found above:
Ticket 4473 --> https://svn.open-mpi.org/trac/ompi/ticket/4473
This fixes more issues identified by armci. More issues still remain and fixes are
coming for those as well.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31272.
Also, since I put some of the macros for these silent/verbose rules up
in the top-level Makefile.man-page-rules file, I renamed it to
Makefile.ompi-rules.
I've had this sitting around for a while; now seems like as good a
time as any to commit it.
This commit was SVN r31271.
directory not the job's
This bug didn't affect the correctness of the vader results just the
cleanup. This commit removes an error message about removing a non-existent
file.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31265.
Thanks to ggouaillardet for finding and fixing these issues.
Closes trac:4460
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31264.
The following Trac tickets were found above:
Ticket 4460 --> https://svn.open-mpi.org/trac/ompi/ticket/4460
Also added some missing values and sentinels.
cmr=v1.8:ticket=trac:4470
This commit was SVN r31263.
The following SVN revision numbers were found above:
r31260 --> open-mpi/ompi@69036437b7
The following Trac tickets were found above:
Ticket 4470 --> https://svn.open-mpi.org/trac/ompi/ticket/4470
If ompi_modex_recv() fails with OPAL_ERR_DATA_VALUE_NOT_FOUND, it
simply means that the peer process did not put any usnic BTL modex
info -- it is not an error. So have the usnic BTL simply ignore that
peer (vs. disqualifying itself / treating this like a real error).
Refs trac:4442.
This commit was SVN r31258.
The following Trac tickets were found above:
Ticket 4442 --> https://svn.open-mpi.org/trac/ompi/ticket/4442
This commit should finish the work started for #869. Closing that ticket
with this commit.
Closes trac:869
cmr=v1.8.1:reviewer=jsquyres
This commit was SVN r31257.
The following Trac tickets were found above:
Ticket 869 --> https://svn.open-mpi.org/trac/ompi/ticket/869
Add a lot more information about the --level CLI option, and the nine
levels.
Also remove some now-erroneous examples regarding --version.
cmr=v1.8:reviewer=rhc
This commit was SVN r31246.
in ompi_mtl_mxm_add_procs, define the ep_index variable only
for an older version of mxm.
submitted by Alina, reviewed by Mike.
cmr=v1.8:reviewer=ompi-rm1.8
This commit was SVN r31245.
The error doesn't prevent the user from running so there is no reason
to display it unless the user requested it (through coll_ml_verbose).
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31242.
Fix a one line bug when dealing with non-contiguous sends in prepare_src. Bug was
identified by the intel test suite.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31232.
the case fix in ompi_osc_base_process_op in r31204.
There are two cases that needed to be handled:
- The target is a simple datatype (contiguous block of a primitive
type) but the origin is not. In this case we still need to pack
the origin data but we can not rely on the convertor to do the
unpack (see r31204).
- Both the origin and target datatypes are simple datatypes. In this
case we can use ompi_op_reduce to do the accumulation without having
to pack the origin data.
cmr=v1.8:ticket=trac:4449
This commit was SVN r31231.
The following SVN revision numbers were found above:
r31204 --> open-mpi/ompi@949abe45cd
The following Trac tickets were found above:
Ticket 4449 --> https://svn.open-mpi.org/trac/ompi/ticket/4449
Fixed two bugs:
- Use module->comm NOT comm to get the CID for the shared memory backing
file. This fixes the case where there are multiple shared memory windows
at the same time.
- Remember to unlink the shared memory backing file.
Refs trac:4438
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31227.
The following Trac tickets were found above:
Ticket 4438 --> https://svn.open-mpi.org/trac/ompi/ticket/4438
This commit fixes two bugs:
- We were not correctly setting the lock type in the outstanding lock
for lock_all. This caused undefined behavior.
- flush_all was incorrectly checking for comm size - 1 lock acks but
comm size flush acks. This is the reverse of what was intended.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31226.
In most cases, bad messages received by the connectivity checker are
just dropped. However, in one specific code path, a bad packet caused
an abort. Doh!
This commit does two things:
1. Improve verbose messages for all these cases
2. Simply drop incoming messages that cannot be identified as ACKs or PINGs
Submitted by Jeff Squyres, reviewed by Dave Goodell.
cmr=v1.8:reviewer=ompi-rm1.8
This commit was SVN r31225.
It is possible to get into a situation where a small accumulate operation
can not be completed because a large accumulate operation holds the lock.
In this case we may return from wait/flush/etc before the operation is
complete. To handle this case increment the expected incoming fragment
count when queuing an accumulate operation and increment the incoming
fragment count after processing the accumulate operation.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31224.
of the primitive datatype
In this case we can not use the convertor to run the accumulate operation
since the datatype is more or less a primitive type.
cmr=v1.8:ticket=trac:4449
This commit was SVN r31222.
The following Trac tickets were found above:
Ticket 4449 --> https://svn.open-mpi.org/trac/ompi/ticket/4449
This commit fixes three issues:
- osc/rdma: The target side of an accumulate was using the target datatype
in the receive to the packed buffer. This was conflicting with the way
the reduction is done into the target buffer. Changed the receive to use
the primitive datatype.
- osc/base: The copy table was completely wrong. Fixed the table to match
the underlying datatypes (which are opal not ompi datatypes).
- osc/base: There is a problem using the optimized description. Fall back
on using the non-optimized description until we can understand what is
going wrong.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31204.
results
The code to handle completion messages did not correctly increment the
number of expected messages. This could cause wait to return before all
incoming messages are complete.
I also added a check to ensure that start returns an error if we are in
a passive access epoch.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31203.
ROMIO code assumes all processes will use the same ROMIO driver. We
were not reaching the "find a common file system" logic when NFS was
enabled: everyone stat-ed the file system without errors, but some
processes found a different file system (for example, if some processes
are writing to NFS and others to UFS).
See discussion beginning here:
http://lists.mpich.org/pipermail/discuss/2014-March/002403.html
Tested-by: Jeff Squyres <jsquyres@cisco.com>
Submitted by Rob Lathan, reviewed by Jeff Squyres
cmr=v1.8:reviewer=ompi-rm1.8
This commit was SVN r31201.
the ompi_common_verbs_find_ports function had a call to
ompi_ibv_get_device_list, but not to ompi_ibv_free_device_list.
fixed by Alina, reviewed by Vasily/Mike.
cmr=v1.8:reviewer=ompi-rm1.8
This commit was SVN r31200.
This commit adds large datatype description support to the osc/rdma
component. Support is provided by an additional send/recv of the datatype
description if the description does not fit in an eager buffer. The
code is designed to require minimal new code and not for speed. We
consider this code path to be a slow path.
Refs trac:1905
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31197.
The following Trac tickets were found above:
Ticket 1905 --> https://svn.open-mpi.org/trac/ompi/ticket/1905
When we are only using local ranks basesmuma needs to provide an allreduce
function for both large and small message or else the coll/ml selection
logic will fail. In the future this logic should probably be updated to
just disable allreduce in coll/ml instead of disabling coll/ml. For now
it should be correct to say the basesmuma allgather works for larger
messages.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31190.
a hierarchy actually matches a bcol that is in use.
There was a bug in one of the paths to calculate the ml buffer size. I fixed
the bug and squashed all the paths together to avoid further issues (the
result was correct in another path that calculated the same value).
Additionally, the i_hier was being used as the bcol_index. This is not
correct in a couple of cases so I added a variable to keep track of the
real bcol_index.
cmr=v1.8:reviewer=pasha
This commit was SVN r31189.
bound.
This case is correctly handled by coll/ml so remove the check that disables
coll/ml in the not bound case.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31188.
This patch fixes two leaks:
- Fix typo in fallback collective code that caused coll/ml to retain
the ibcast module twice but only release it once. One of those ibcast
saves was supposed to be bcast.
- Do not check for module initialization in the module destructor. It
is possible to destruct a module that is partially setup.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31187.
Without this, an attribute copy function could return non-success, but
it would not be propagated upwards. This caused the intel
MPI_Keyval3_* tests to fail.
cmr=v1.8:reviewer=hjelmn
This commit was SVN r31147.
When you call MPI_Graph_create with an old_comm of size N, and pass
nnodes=(N-1), then the Nth proc is supposed to get MPI_COMM_NULL out.
The code in this base function didn't properly handle the proc(s) that
are supposed to get MPI_COMM_NULL out.
cmr=v1.7.5:reviewer=hjelmn
This commit was SVN r31145.
This isn't causing any errors that I know about but it does fix an
annoying valgrind warning. Simple fix, no review required.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r31130.
There are situations where coll/ml does not initialize properly. These will
eventually need to be fixed but in the meantime it is better to not always
print an error message because the collective framework can still fall back
on another collective module. This commit reduces the verbose output.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31129.
It is usually not a good idea to assert when something is not implemented
or something goes wrong. Replace asserts with debug output and return.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31128.
The necessary information is stored in the proc object. There is no need
to allgather the local process data to determine if another rank is on
the same socket.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31127.
After discussion with Manju we decided to update the process count
limits of the shared memory collectives to an arbitrarily large number.
cmr=v1.7.5:ticket=trac:4405
This commit was SVN r31126.
The following SVN revision numbers were found above:
r31096 --> open-mpi/ompi@3f469d08e7
The following Trac tickets were found above:
Ticket 4405 --> https://svn.open-mpi.org/trac/ompi/ticket/4405
* Ensure that all endpoints[x] values are initialized to NULL
* If ibv_create_ah fails, remove each endpoint from the
module->all_endpoints list so that the endpoint can be destructed
properly.
Submitted by Jeff Squyres, reviewed by Dave Goodell.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r31111.
This provides full locality - i.e., not just node-level, but all the way down to whatever common binding level exists between the procs.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31106.
Also fixed spelling: IS_NOT_RECHABLE -> IS_NOT_REACHABLE.
Also mark a few places where opal_show_help() should have been used;
Manju will take care of these.
This commit was SVN r31104.
In r31071 I modified the logic to not increment the hierarchy level if
no processes were selected by that sbgp. That fixed a problem seen on
systems where we don't support process binding. The problem is there
is a case where we actually did select processes yet the number of
selected processes is 0. We need to increment the hierarchy in this case
as well.
This should fix the segmentation fault found by recent MTT runs. Once
this is committed to 1.7.5 remove the .ompi_ignore's from coll/ml and
bcol/ptpcoll. Tested with ompi-tests/ibm.
cmr=v1.7.5:reviewer=rhc
This commit was SVN r31081.
The following SVN revision numbers were found above:
r31071 --> open-mpi/ompi@1911d97044
This was causing JVMs to run out of stack space, and all manner of
badness ensued.
Instead, use the heap -- that's what it's there for.
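A minimal sketch of the stack-to-heap change, assuming a large scratch array. The size, names and return convention are hypothetical and are not taken from coll/ml.

```c
#include <assert.h>
#include <stdlib.h>

#define EXAMPLE_DEBUG_ENTRIES (1024 * 1024)   /* hypothetical size */

/* A local `int scratch[EXAMPLE_DEBUG_ENTRIES];` would need about 4 MB
 * of stack with a 4-byte int. Allocating it on the heap keeps the
 * stack frame small. Returns -1 if the allocation fails. */
long example_use_debug_array(void)
{
    long result;
    int *scratch = calloc(EXAMPLE_DEBUG_ENTRIES, sizeof(*scratch));

    if (NULL == scratch) {
        return -1;
    }
    assert(0 == scratch[0]);   /* calloc zero-fills the array */

    scratch[EXAMPLE_DEBUG_ENTRIES - 1] = 42;
    result = (long) scratch[0] + scratch[EXAMPLE_DEBUG_ENTRIES - 1];

    free(scratch);
    return result;
}
```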
cmr=v1.7.5:reviewer=rhc:subject=make coll/ml use the heap for large debug array
This commit was SVN r31073.
fails to select any processes on any nodes.
Also modified basesmsocket to only print debugging info to the framework
output.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31071.
These parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).
This commit was SVN r31056.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
* Several parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).
* Added missing PMPI F08 OMPI interfaces
This commit was SVN r31049.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
These parameters should not be marked as INTENT(OUT) (they aren't in
the MPI-3 standard).
This commit was SVN r31048.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
It is not valid to call flush outside a passive target epoch nor is
it valid to call lock/lock_all when no_locks is set. In the former
we were just semantically incorrect and the latter would crash and
burn.
cmr=v1.7.5:ticket=trac:4382
This commit was SVN r31046.
The following Trac tickets were found above:
Ticket 4382 --> https://svn.open-mpi.org/trac/ompi/ticket/4382
This fixes a bug in r31029 which removes the use of the pml base request
(also not a good way since cm doesn't use the base request). We now allocate
a data structure (ugh) to determine the needed information. Tested with
mtt/onesided.
cmr=v1.7.5:ticket=trac:4379
This commit was SVN r31044.
The following SVN revision numbers were found above:
r31029 --> open-mpi/ompi@29e00f9161
The following Trac tickets were found above:
Ticket 4379 --> https://svn.open-mpi.org/trac/ompi/ticket/4379
- Return an error if the caller specified both MPI_MODE_NOPRECEDE and
MPI_MODE_NOSUCCEED to MPI_Win_fence.
- Return an error if the caller attempts to enter an active target
epoch while already in a passive target epoch.
- End an active target epoch if MPI_Win_fence is called with
MPI_MODE_NOSUCCEED.
cmr=v1.7.5:ticket=trac:4382
This commit was SVN r31043.
The following Trac tickets were found above:
Ticket 4382 --> https://svn.open-mpi.org/trac/ompi/ticket/4382
need to enable the access epoch in MPI_Win_fence.
I missed this change when I fixed the semantics of MPI_Win_create. With
this commit our one-sided MTT runs are now running clean.
cmr=v1.7.5:reviewer=dgoodell
This commit was SVN r31041.
See https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/377
This ticket adds the following functions to the standard:
- MPI_T_cvar_get_index, MPI_T_pvar_get_index, and MPI_T_category_get_index
The ticket has passed and the functions are part of MPI-3.1 that will
be released sometime later this year. In Open MPI the functions expose
existing internal functionality so they are low-risk to add to 1.8.0. I
will leave it up to Ralph whether he wants to accept these into 1.8.
cmr=v1.8:reviewer=rhc
This commit was SVN r31037.
We no longer specify interfaces with choice buffers in the TKR "mpi"
module implementation -- MPI-3 prohibits it (see r30169 and r30170 for
more details).
cmr=v1.7.5:ticket=trac:4372
This commit was SVN r31033.
The following SVN revision numbers were found above:
r30169 --> open-mpi/ompi@759ee33fd4
r30170 --> open-mpi/ompi@776f6144af
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
It seems we can't release accumulate buffers in completion callbacks
because the btls don't release registration resources until after the
callback has fired. The fix is to keep track of the unused buffers and
free them later. This should resolve issues when running IMB-EXT and
IMB-RMA.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31029.
Issue found by Absoft MTT runs.
cmr=v1.7.5:ticket=trac:4372
This commit was SVN r31028.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
Issue found by Absoft MTT runs.
cmr=v1.7.5:ticket=trac:4372
This commit was SVN r31027.
The following Trac tickets were found above:
Ticket 4372 --> https://svn.open-mpi.org/trac/ompi/ticket/4372
Dave Goodell correctly pointed out that it is unusual to return MPI
error classes from internal ompi functions. Correct this in the RMA
case by adding an internal error code to match MPI_ERR_RMA_SYNC.
This does change OMPI_ERR_MAX. I don't think this will cause any
problems with ABI.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31012.
I have only checked that these bindings compile without warnings. They
appear to work with both intel's compilers and gfortran.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31010.
This commit resolves a number of crashes discovered by the onesided
tests in MTT. The functions in question were operating on the assumption
the user was calling RMA functions correctly.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31008.
NOTE: I transferred the oshmem-disabled-by-default from the 1.7 branch to the trunk to minimize future disruption if/when we change that option.
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31006.
- -check-shmem-params is OFF by default. It checks OSHMEM API params and will abort on bad input
- hcoll do not save fallback coll pointers for unsupported collectives.
fixed by Val, Roman, reviewed by Miked/Igor
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30995.
The datatype unpacking code assumes that the packed datatype buffer has the
same alignment as an OPAL_PTRDIFF_TYPE. This was not enforced by the rdma
one-sided component. I changed the ordering and sizes of various osc/rdma
headers to ensure their sizes are a multiple of 8-bytes and modified the
fragment allocation call to ensure all headers are 8-byte aligned. While
not the cleanest way to handle this situation it should resolve the issue.
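The two parts of this approach can be sketched as follows. The header layout and names are hypothetical and do not reproduce the actual osc/rdma headers; the sketch only shows a header whose size is a multiple of 8 bytes and a helper that rounds an allocation size up to an 8-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: members ordered largest first and padded
 * explicitly so that the size is a multiple of 8 bytes. */
struct example_header {
    uint64_t base;
    uint32_t len;
    uint16_t tag;
    uint8_t  type;
    uint8_t  padding;
};

_Static_assert(sizeof(struct example_header) % 8 == 0,
               "header size must be a multiple of 8 bytes");

/* Round size up to the next multiple of alignment, which must be a
 * power of two. */
size_t example_align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

If every header size and every fragment offset is a multiple of 8, whatever follows a header in the buffer also starts on an 8-byte boundary.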
Fixes trac:4315
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r30974.
The following Trac tickets were found above:
Ticket 4315 --> https://svn.open-mpi.org/trac/ompi/ticket/4315
assumes the send request is derived from mca_pml_base_send_request_t,
but this is not true for pml cm, so we end up freeing an invalid pointer.
We cannot take the data pointer from the pml send request, so we pass
the allocated buffer pointer in req_complete_cb_data, and put the
osc_rdma_module pointer in that buffer as well.
Previously, osc_pt2pt was used with pml_cm which didn't have this
problem.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30967.
* clang warning stomp
* memory barrier for volatile variable use
These can go to 1.7.5 or can slip to v1.8 -- RM decision.
Submitted by Jeff Squyres, reviewed by Dave Goodell
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30944.
* Older versions of libusnic_verbs actually return 0 when querying for
an unknown port. So also check for a magic ID in the returned data
to *really* know if the usnic extensions are supported.
* Use a union (in the common_verbs area) and memcpy (in the btl) to
avoid undefined C type aliasing behavior.
* Ensure to memset the function table to 0 if the usnic extensions
are not supported.
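A sketch of the memcpy half of the second point, with hypothetical helper names. Casting a byte buffer to (uint32_t *) and dereferencing it can violate the C aliasing (and alignment) rules; copying through memcpy is well defined, and compilers typically reduce it to a single load or store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an arbitrary position in a byte buffer. */
uint32_t example_load_u32(const unsigned char *buf)
{
    uint32_t value;

    assert(NULL != buf);
    memcpy(&value, buf, sizeof(value));
    return value;
}

/* Write a 32-bit value to an arbitrary position in a byte buffer. */
void example_store_u32(unsigned char *buf, uint32_t value)
{
    assert(NULL != buf);
    memcpy(buf, &value, sizeof(value));
}

/* Usage: round-trip through a deliberately misaligned offset. */
int example_roundtrip_ok(uint32_t value)
{
    unsigned char buf[8] = { 0 };

    example_store_u32(buf + 1, value);
    return example_load_u32(buf + 1) == value;
}
```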
Submitted by Jeff Squyres, reviewed by Dave Goodell.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30935.
When compiling --with-ft there are a few compiler warnings about
unused variables. This patch fixes those compiler warnings.
This commit was SVN r30927.
Realistically, the usnic BTL doesn't need to know anything about the
underlying transport except for its header length (so that it knows
where the payload begins in a received buffer). So remove the use of
the specific transport prefix union and just rely on the usnic verbs
extension to tell us what the header length is if we're using the
usNIC/UDP transport, or sizeof(struct ibv_grh) if we're using usNIC/L2
transport.
This commit was SVN r30914.
Check the IBV_TRANSPORT_* values. In the case of IBV_TRANSPORT_IWARP,
there's an ambiguity and we need to also check to see whether the
usnic verbs extension probe exists.
This commit was SVN r30913.
If they exist, call the usnic verbs extensions to both enable UDP
support and get the UD receiver header length that should be used
(rather than assume 40/struct GRH).
This commit was SVN r30912.
- now add_procs can be called more than once (during MPI_INIT and Inter_Comm_Create)
- adjust MXM to this reality
fixed by Alina, reviewed by Yossi/Mike
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30907.
This allows compilers to know that the code path(s) where
ompi_rte_abort() is invoked won't return (and therefore won't warn in
certain cases).
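A sketch of the effect, using the C11 _Noreturn specifier on a hypothetical wrapper. The actual change may well use a compiler attribute macro instead; that detail is an assumption here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical abort wrapper standing in for a function such as
 * ompi_rte_abort(). _Noreturn tells the compiler that control never
 * comes back from this call. */
_Noreturn void example_abort(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    abort();
}

/* Without the specifier, compilers may warn that control reaches the
 * end of this non-void function on the error path. */
int example_checked_div(int a, int b)
{
    if (0 != b) {
        return a / b;
    }
    example_abort("division by zero");
}

/* Usage: the normal path returns as usual. */
void example_checked_div_demo(void)
{
    assert(example_checked_div(10, 2) == 5);
}
```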
cmr=v1.8:reviewer=rhc
This commit was SVN r30891.
We don't use this functionality any more; we use the transport_type
and device name to identify usnic devices. It's slightly easier
because we can get transport_type+name from ibv_device_open() and don't
have to do an additional ibv_query_device() to get its attributes.
Reviewed by Dave Goodell.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30882.
Follow on to SVN trunk r30850: consolidate the ibv_create_ah() calls
into a single loop, MPI_WAITALL-style. That is, call the (effectively
non-blocking) ibv_create_ah() for each endpoint. If we get
NULL+EAGAIN, it means that the UDP ARP is still ongoing down in the
kernel, so just try again later. We put these all into a single loop
because it allows us to parallelize the ARP progress in the kernel.
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30879.
The following SVN revision numbers were found above:
r30850 --> open-mpi/ompi@3641500442
r30852 --> open-mpi/ompi@4e282a3295
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Basically: since usnic is a connectionless transport, we do not get
OS-provided services "for free" that connection-oriented transports
get, namely: "hey, I wasn't able to make a connection to peer X", and
"hey, your connection to peer X has died."
This connectivity-checker runs in a separate progress thread in the
usnic BTL in local rank 0 on each server. Upon first send in any
process, the connectivity-checker agent will send some UDP pings to the
peer to ensure that we can reach it. If we can't, we'll abort the job
with a nice show_help message.
A lengthy comment in btl_usnic_connectivity.h explains the
scheme and how it works.
Reviewed by Dave Goodell.
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30860.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
finalize.
Closes trac:4290
cmr=v1.7.5:reviewer=miked
This commit was SVN r30854.
The following Trac tickets were found above:
Ticket 4290 --> https://svn.open-mpi.org/trac/ompi/ticket/4290
- Fix several typos in osc/rdma.
- Fix a locking issue in osc/sm that was caused by an incorrect
assumption about the semantics of opal_atomic_add_32.
- Always unlock the accumulation lock in osc/sm.
- The base of a process's shared memory window should be NULL if
the size is zero. Fixed.
cmr=v1.7.5:ticket=trac:4304
This commit was SVN r30853.
The following Trac tickets were found above:
Ticket 4304 --> https://svn.open-mpi.org/trac/ompi/ticket/4304
Follow on to SVN trunk r30850: consolidate the ibv_create_ah() calls
into a single loop, MPI_WAITALL-style. That is, call the (effectively
non-blocking) ibv_create_ah() for each endpoint. If we get
NULL+EAGAIN, it means that the UDP ARP is still ongoing down in the
kernel, so just try again later. We put these all into a single loop
because it allows us to parallelize the ARP progress in the kernel.
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30852.
The following SVN revision numbers were found above:
r30850 --> open-mpi/ompi@3641500442
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
ibv_create_ah() may need to effect an ARP resolution, which may take
some time. Rather than blocking in ibv_create_ah(), the usNIC driver
may return NULL and set errno to EAGAIN indicating that we should try
again (i.e., the ARP resolution is proceeding under the covers).
So add a simple loop here to loop over ibv_create_ah() until it
returns non-(NULL+EAGAIN). A future commit will make this a bit more
efficient.
Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Dave Goodell <dgoodell@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30850.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Prior to this commit we matched local interfaces to remote interfaces in
order to create endpoints in a simplistic way. If any remote interfaces
were on the same subnet as any of our local interfaces then only local
interfaces would be paired (IP-routed remote interfaces would be
ignored).
This commit introduces a more general scheme which attempts to make the
"best" pairing of local interfaces to remote interfaces. We now cast
the problem as a graph theory problem known as the "Assignment Problem",
or finding a maximum-cardinality, minimum-weight bipartite matching. We
solve this problem by reducing the bipartite graph of interface
connectivity to a flow network and then solving for a minimum cost flow.
This is then easily converted back into a matching on the original
bipartite graph.
In the new scheme, interfaces on the same subnet are preferred over
interfaces requiring intermediate routing hops and higher bandwidth
links are preferred over lower bandwidth links.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30849.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Querying the OS routing table is important for making decisions about
which local and remote interfaces should be paired into reliable
communication channels.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30848.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This code is intended to support usNIC interface matching functionality.
We currently view that problem as essentially the "Assignment Problem"
(http://en.wikipedia.org/wiki/Assignment_problem), for which there are
many possible solution approaches, including flow-network analysis. In
the future, we might transition to a more nuanced view of the problem
which would likely also be flow-network based.
To this end, the current code focuses on providing one major algorithm
to the core usnic BTL: `ompi_btl_usnic_solve_bipartite_assignment`. It
also exposes several typical and necessary functions for constructing,
manipulating, and querying weighted, directed graphs.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30847.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30846.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This commit adds mechanisms for writing and running unit tests in the
usnic BTL. The short version of how to run the tests is:
1. Configure with `--enable-ompi-btl-usnic-unit-tests`. This will cause
the unit testing code and test runner utility to be built.
2. Run the tests by running `ompi_btl_usnic_run_tests`.
See `README.test` for full details.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30845.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
These includes only exist in the Cisco-internal usnic-v1.6 code base,
but they should not exist anywhere except btl_usnic_compat.h in order to
minimize source differences between usnic-v1.6 and v1.7/trunk.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30844.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Lower layer (hardware or software) bugs can result in a mismatch between
our BTL-layer payload size and the actual packet length. We now check
that in order to catch these cases, which otherwise can result in
MPI-layer message corruption.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30843.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
We were missing a debug message for a very common recv case, making it a
bit harder to follow a debug log.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30842.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
There was a duplicated subnet check in the sender hash lookup routine.
This caused receivers to always fail the sender hash lookup if the
sender was in a different subnet, so the receiver would discard the
packet as though it were coming from a different job.
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30841.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
If ibv_create_ah fails, we will not initialize the `endpoint->proc`.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30840.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This functionality is required for routable UDP/IP usnic traffic.
Previously we would only setup endpoints for remote interfaces on the
same subnet as the current module's local interface. This behavior
still holds if two processes share any common subnets. However, if the
two processes have no subnets in common then we assume that all
interfaces are reachable from all other interfaces and wire them up in a
1-1, randomly-matched order somewhat similarly to the "tcp" BTL's
behavior.
Only match in different subnets if we detect UDP support in the lower
layer.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30839.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This commit decouples OMPI deployment from the version(s) of the lower
layers of the stack by probing for UDP support.
Verbs applications assume a 40-byte header (there is no current
mechanism for querying payload offset). So to support a 42-byte UDP
header without causing existing applications like ibv_ud_pingpong or
older versions of OMPI to crash, we must inform libusnic_verbs that we
are aware of the nonstandard payload offset. We do this by overriding
the `transport_type` field of the device to be 42 before calling
`ibv_open_device`. If the library resets it to something else, then we
know the lower layers are UDP capable. Otherwise we use the older
custom-L2 format.
This necessitated some minor ugliness in common_verbs, but it's as tidy
as Jeff and I know how to make it right now.
This commit only adds support for UDP headers and connectivity over the
same L2 network, it does not touch routing or interface pairing.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30838.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Just trying to be deliberate about keeping fastpath-accessed fields
grouped together to fit into the same 64-byte cache line.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30837.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30836.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30835.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Reese Faucette <rfaucett@cisco.com>
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30834.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Reese Faucette <rfaucett@cisco.com>
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30833.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Valgrind showed this one, just a bit of sloppiness with the reference
counting.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30832.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
The logic did not correctly perform the OR behavior that is described
in the doxy docs for this function. This commit fixes the logic so
that a port will be included if it supports any of the
capabilities indicated by the passed-in flags.
Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Dave Goodell <dgoodell@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30831.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
1. Changed rng_buff_t --> opal_rng_buff_t
2. All global variables obey the prefix rule
3. Old code has been removed
4. Found a couple of unnecessary includes
Refs trac:4298
This commit was SVN r30807.
The following Trac tickets were found above:
Ticket 4298 --> https://svn.open-mpi.org/trac/ompi/ticket/4298
We're going to be bringing a bunch of usnic code to the SVN trunk
soon, and I basically brought this commit over out of order. So I'm
reverting it for now; the same functionality will come back shortly.
This commit was SVN r30805.
The following SVN revision numbers were found above:
r30804 --> open-mpi/ompi@5bedcc15bf
These constants are now upstream (see
https://git.kernel.org/cgit/libs/infiniband/libibverbs.git/commit/?id=f57a9c67eabb9e7f19c624ac3c8c27b7be55796c),
so let's support them properly in Open MPI.
Added bonus: consolidating these checks up in
ompi_check_openfabrics.m4 allowed removing some custom checks and
AC_DEFINE's from the usnic configure.m4 script.
Also change the usnic/configure.m4 check for IBV_EVENT_GID_CHANGE to
use AC_CHECK_DECLS (vs. AC_CHECK_DECL).
cmr=v1.7.5:reviewer=dgoodell
This commit was SVN r30804.
* Use the prefix rule for global variables
* Eliminate seed_prng() since it isn't necessary any more
These files will need to get edited again when the RNG type obeys the
prefix rule.
Refs trac:4298
This commit was SVN r30803.
The following SVN revision numbers were found above:
r30801 --> open-mpi/ompi@e39d9f4080
The following Trac tickets were found above:
Ticket 4298 --> https://svn.open-mpi.org/trac/ompi/ticket/4298
- Move the ptrdiff_t tests up higher in configure.ac to be with the
rest of the type tests.
- Create new OMPI_FIND_MPI_AINT_COUNT_OFFSET for finding the
corresponding types of MPI_Aint, MPI_Count, and MPI_Offset.
Consolidate all the old C and Fortran tests into this new macro (and
.m4 file).
- Fix Fortran MPI_*_KIND tests that incorrectly keyed off assumed
types (e.g., int64_t) rather than whatever the corresponding C
MPI_Aint, MPI_Count, MPI_Offset types turned out to be.
- Add new logic to ensure that sizeof(MPI_Count) <= sizeof(size_t),
because our entire PML, BTL, and convertor infrastructure requires
this. As a side effect, just make MPI_Offset the same type as
MPI_Count (because MPI_Count has to be able to hold an MPI_Offset,
so we can't let MPI_Offset be larger than an MPI_Count).
This commit was SVN r30776.
The following Trac tickets were found above:
Ticket 4205 --> https://svn.open-mpi.org/trac/ompi/ticket/4205
- MXM uses the libtool versioning scheme, which is enough; no additional versioning is needed in OMPI
reviewed by yossi
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30768.
Optimization of the MPI_Dims_create function which omits the usage of
precalculated prime numbers and factorizes directly, as discussed on the
developer list.
cmr=v1.7.5:ticket=4217:reviewer=jsquyres
This commit was SVN r30695.
The following Trac tickets were found above:
Ticket 4217 --> https://svn.open-mpi.org/trac/ompi/ticket/4217
The freeprocs variable was obtained from nnodes, so check the value of nnodes at
the beginning of the MPI_PARAM_CHECK code section instead, as discussed on the
developer list.
cmr=v1.7.5:reviewer=jsquyres:subject=move parameter check to begin
jsquyres, please review this CMR. Thanks.
This commit was SVN r30694.
Some older versions of libibverbs do not have `ibv_event_type_str`,
leading to compilation failures on older machines, irrespective of
whether they could ever support usNIC anyway. If we encounter any other
build issues related to "old verbs" then we should just cause the usnic
BTL to disqualify itself when it encounters "old" traits.
Thanks to Paul Hargrove for reporting the issue:
http://www.open-mpi.org/community/lists/devel/2014/02/14056.php
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30674.
only goes up to VADER_MAX_ADDRESS instead of 0xfffffffffffffffful.
cmr=v1.7.5:ticket=trac:4216
This commit was SVN r30669.
The following Trac tickets were found above:
Ticket 4216 --> https://svn.open-mpi.org/trac/ompi/ticket/4216
It turns out that ASYNCHRONOUS should not be used with ignore TKR
dummy parameters (some compilers will [correctly] warn about this).
Many thanks to Rolf Rabenseifner and Christoph Niethammer, who noticed
the problem.
Submitted by Rolf Rabenseifner, reviewed by Jeff.
cmr=v1.7.5:reviewer=ompi-rm1.7:subject=Remove ASYNCHRONOUS from the ignore TKR mpi_f08 module.
This commit was SVN r30665.
The error was caused by leaving the pipe to the async thread uninitialized, then writing to it regardless of this.
The fix is to check the existence of the async thread and the pipe to it.
Reviewed by miked
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30644.
The initialization code did several allgathers on void *'s using
MPI_LONG_LONG_INT. This will produce the wrong result on 32-bit
platforms. Instead use MPI_BYTE with count = sizeof (void *).
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30627.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
Found two bugs in basesmuma:
- Release all resources when tearing down the bcol module.
- Always call the allreduce in the smcm code. We do not know
beforehand whether all procs have all the files mapped.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30623.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
This is a hot-fix patch for the issue reported by Ralph.
In the future we plan to restructure the ml data structure layout.
Tested by Nathan.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30619.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
just plain wrong (i.e., it gives wrong answers).
When time permits, perhaps we can put in a better algorithm for
MPI_DIMS_CREATE (Andreas Schäfer mentioned that nnodes can now be on
the order of millions, and the current algorithm is... inefficient, at
best).
This commit was SVN r30606.
The following SVN revision numbers were found above:
r30539 --> open-mpi/ompi@fb67d98867
r30540 --> open-mpi/ompi@4417ed2133
This commit was SVN r30605.
The following SVN revision numbers were found above:
r30600 --> open-mpi/ompi@7d2c4cb468
r30602 --> open-mpi/ompi@9e751a0302
r30604 --> open-mpi/ompi@3012c280cf
Revision number ranges (suitable for "git log"):
r30602-30604 --> open-mpi/ompi@9e751a03^..3012c280
The problem was caused by the static request optimization. The buffered send case
is much like the isend case in that the request structure may be needed after
MPI_Bsend completes. Fix this case by calling isend and freeing the resulting
request.
cmr=v1.7.5:ticket=trac:4149
This commit was SVN r30601.
The following Trac tickets were found above:
Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
them, but it's going to take a little time (at least one day). So
Nathan says it's ok to .ompi_ignore coll ml until he's able to fix it.
This commit was SVN r30600.
This change does not appear to increase the small message latency of ping-pong
benchmarks and fixes an issue found by our ibm datatype tests.
Fixes trac:4232
cmr=v1.7.5:ticket=trac:4149
This commit was SVN r30598.
The following Trac tickets were found above:
Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
Ticket 4232 --> https://svn.open-mpi.org/trac/ompi/ticket/4232
* Fix some comments
* Fix some spacing in the non-verbose "make" output
* Make javadoc non-verbose output like other non-verbose output
* Remove the use of JAVA_CLASS_FILES; it wasn't correct anyway (it
both derived names from JAVA_SRC_FILES ''and'' used mpi/*.class, so
many files were listed twice)
* Move the generation of javadoc files to "make" time (vs. "make
install" time) by putting the "doc" subdirectory in BUILT_SOURCES
* Make doc dependent upon mpi/MPI.class, not mpi.jar -- we only need
the classes to exist, not the final jarfile.
* Make jdoc-install dependent upon a real build artifact (the doc
dir), not an artificial name that will never exist (jdoc)
* Separate the removal of the doc (and mpi) subdirectories during
"make clean" off into the clean-local target, because CLEANFILES
can really only have ''files'' added to it.
These changes also fix parallel builds.
cmr=v1.7.5:ticket=trac:4214
This commit was SVN r30547.
The following SVN revision numbers were found above:
r30531 --> open-mpi/ompi@6ca8e68e4b
The following Trac tickets were found above:
Ticket 4214 --> https://svn.open-mpi.org/trac/ompi/ticket/4214
primes. This considerably reduces the computational load when
freeprocs is large.
cmr=v1.7.5:reviewer=hjelmn:subject=MPI_Dims_create optimization
This commit was SVN r30539.
opal does not always define MB. It is recommended that opal_atomic_[rw]mb is
called instead. We will need to address the cases where these functions are
no-ops on weak-memory ordered cpus.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30534.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
Several changes are contained in this commit:
- Clean up tabs and trailing whitespaces
- Use consistent indentation in changed files
- Remove unused code. None of the removed code will ever have been
used in a trunk build.
- Clean up the smcm code quite a bit
- Do not fflush stderr and use opal_output instead of fprintf.
These changes have been tested on Cray XE-6 and PSM systems.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30533.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
MPI_SUBARRAYS_SUPPORTED and MPI_ASYNC_PROTECTS_NONBLOCKING in the F08
descriptor prototype.
This commit fixes the F08 descriptor prototype in the same way as
r30519 did for the non-F08-descriptor implementation.
Thanks to Mike Dubman for finding the issue.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30532.
The following SVN revision numbers were found above:
r30519 --> open-mpi/ompi@caaab7e8a3
Ensure that these two flags are in all of mpif.h, the mpi module, and
the mpi_f08 module. Thanks to Rolf Rabenseifner for pointing out the
issue.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30519.
During the commits to make the C/R code compile again the
blocking receive calls were replaced by non-blocking
which broke the code. This patch uses ORTE_WAIT_FOR_COMPLETION()
to wait until the non-blocking calls have finished.
This commit was SVN r30486.
This commit fixes one warning that should have caused coll/ml to segfault
on reduce. The fix should be correct but we will continue to investigate.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30477.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
for 32-bit architectures.
This commit also modifies _OMPI_CHECK_HEADER to use AC_CHECK_HEADERS instead
of AC_CHECK_HEADER. This allows components to check for multiple headers
instead of just one. The new semantics of the header check in OMPI_CHECK_PACKAGE
are to return success if at least one of the specified headers exists. The new
semantics will not break current usage.
cmr=v1.7.5:ticket=trac:4053
This commit was SVN r30476.
The following Trac tickets were found above:
Ticket 4053 --> https://svn.open-mpi.org/trac/ompi/ticket/4053
After IM with Nathan, apply patch from ticket after verification by Paul Hargrove that it fixes the problem on non-x86 32-bit platforms
Verified by Paul, RM-approved
cmr=v1.7.4:reviewer=ompi-gk1.7
This commit was SVN r30411.
The following Trac tickets were found above:
Ticket 4143 --> https://svn.open-mpi.org/trac/ompi/ticket/4143
ROMIO and Lustre 2.4.0. It has been solved upstream already; here's
the ticket:
http://trac.mpich.org/projects/mpich/ticket/1973
And here's the commit that fixed it:
a0c4278f14
OMPI does not have the other code referred to in that git commit (in
ad_lustre_hints.c).
Thanks to Adam Moody for reporting the issue.
cmr=v1.7.4:reviewer=hjelmn:subject=Fix ROMIO compile error w/ Lustre 2.4
This commit was SVN r30393.
The dist graph functions are on the trunk and have long-since been
added to the relevant lists.
cmr=v1.7.5:ticket=4163
This commit was SVN r30382.
The following Trac tickets were found above:
Ticket 4163 --> https://svn.open-mpi.org/trac/ompi/ticket/4163
The attribute and conversion callback subroutine interfaces
are used by all 3 modules, and belong in the fortran/base directory,
not the directory of a specific module.
Also clean up some comments.
cmr=v1.7.4:ticket=4162
This commit was SVN r30378.
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
Also fix the interfaces that have logical parameters (the
non-profiling versions were added/fixed a long time ago; it looks like
the profiling versions were inadvertently skipped).
cmr=v1.7.4:ticket=4162
This commit was SVN r30377.
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
Somehow these interfaces were missed when the other interfaces were added.
cmr=v1.7.4:ticket=4162
This commit was SVN r30376.
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
r30273 made the use of the Fortran "protected" keyword be
compiler-specific (i.e., configure/macro-ized it). But it
inadvertently added the use of "protected" to some sentinel constants
that should not be protected (e.g., MPI_STATUS_IGNORE).
This commit reverts the addition of "protected" to the constants that
should not be protected.
cmr=v1.7.4:subject=Rollup of Fortran fixes for v1.7.4
This commit was SVN r30375.
The following SVN revision numbers were found above:
r30273 --> open-mpi/ompi@5f17bc3c2c
btl sendi functions currently can not handle the descriptor being NULL. The
send inline optimization was assuming (incorrectly) that NULL was ok.
cmr=v1.7.5:ticket=trac:4149
This commit was SVN r30364.
The following Trac tickets were found above:
Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
allgather.
The new collectives provide a significant performance increase over tuned for
small and medium messages. We are initially setting the priority lower than
tuned until this has had some time to soak in the trunk. Please set
coll_ml_priority to 90 for MTT runs.
Credit for this work goes to Manjunath Gorentla Venkata (ORNL), Pavel Shamis (ORNL),
and Nathan Hjelm (LANL).
Commit details (for reference):
Import ORNL's collectives for MPI_Allreduce, MPI_Reduce, and MPI_Allgather.
We need to take the basesmuma header into account when calculating the
ptpcoll small message thresholds. Add a define to bcol.h indicating the
maximum header size so we can take the header into account while not
making ptpcoll dependent on information from basesmuma.
This resolves an issue with allreduce where ptpcoll overwrites the
header of the next buffer in the basesmuma bank.
Fix reduce and make a sequential collective launcher in coll_ml_inlines.h
The root calculation for reduce was wrong for any root != 0. There are
four possibilities for the root:
- The root is not the current process but is in the current hierarchy. In
this case the root is the index of the global root as specified in the
root vector.
- The root is not the current process and is not in the next level of the
hierarchy. In this case 0 must be the local root since this process will
never communicate with the real root.
- The root is not the current process but will be in the next level of the
hierarchy. In this case the current process must be the root.
- I am the root. The root is my index.
Tested with IMB which rotates the root on every call to MPI_Reduce. Consider
IMB the reproducer for the issue this commit solves.
Make the bcast algorithm decision an enumerated variable
Resolve various assert failures when destructing coll ml requests.
Two issues:
- Always reset the request to be invalid before returning it to the
free list. This will avoid an assert in ompi_request_t's destructor.
OMPI_REQUEST_FINI does this (and also releases the fortran handle
index).
- Never explicitly construct or destruct the superclass of an opal
object. This screws up the class function tables and will cause
either an assert failure or a segmentation fault when destructing
coll ml requests.
Cleanup allgather.
I removed the duplicate non-blocking and blocking functions and modeled
the cleanup after what I found in allreduce. Also cleaned up the code
somewhat.
Don't bother copying from the send to the receive buffer in
bcol_basesmuma_allreduce_intra_fanin_fanout if the pointers are the
same.
This eliminates a warning about memcpy and aliasing and avoids an
unnecessary call to memcpy.
Always call CHECK_AND_RELEASE on memsync collectives.
There was a call to OBJ_RETAIN on the collective communicator but
because CHECK_AND_RECYCLE was never called there was no matching call
to OBJ_RELEASE. This caused coll ml to leak communicators.
Make allreduce use the sequential collective launcher in coll_ml_inlines.h
Just launch the next collective in the component progress.
I am a little unsure about this patch. There appears to be some sort
of race between collectives that causes buffer exhaustion in some cases
(IMB Allreduce is a reproducer). Changing progress to only launch the
next bcol seems to resolve the issue but might not be the best fix.
Note that I see little to no performance penalty for this change.
Fix allreduce when there are extra sources.
There was an issue with the buffer offset calculation when there are
extra sources. In the case of extra sources == 1 the offset was set
to buffer_size (just past the header of the next buffer). I adjusted
the buffer size to take into account the maximum header size (see the
earlier commit that added this) and simplified the offset calculation.
Make reduce/allreduce non-blocking. This is required for MPI_Comm_idup
to work correctly.
This has been tested with various layouts using the ibm testsuite and
imb and appears to have the same performance as the old blocking version.
Fix allgather for non-contiguous layouts and simplify parsing the
topology.
Some things in this patch:
- There were several comments to the effect that level 0 of the
hierarchy MUST contain all of the ranks. At least one function
made this assumption but it was not true. I changed the sbgp
components and the coll ml initialization code to enforce this
requirement.
- Ensure that hierarchy level 0 has the ranks in the correct
scatter gather order. This removes the need for a separate
sort list and fixes the offset calculation for allgather.
- There were several passes over the hierarchy to determine
properties of the hierarchy. I eliminated these extra passes
and the memory allocation associated with them and calculate the
tree properties on the fly. The same DFS recursion also handles
the re-order of level 0.
All these changes have been verified with MPI_Allreduce, MPI_Reduce, and
MPI_Allgather. All functions now pass all IBM/Open MPI, and IMB tests.
coll/ml: correct pointer usage for MPI_BOTTOM
Since contiguous datatypes are copied via memcpy (bypassing the convertor) we
need to adjust for the lb of the datatype. This corrects problems found testing
code that uses MPI_BOTTOM (NULL) as the send pointer.
Add fallback collectives for allreduce and reduce.
cmr=v1.7.5:reviewer=pasha
This commit was SVN r30363.
Per RFC. There are two optimizations in this commit:
- Allocate requests for blocking sends and receives on the stack. This
bypasses the request free list and saves two atomics on the critical path.
This change improves the small message ping-pong by 50-200ns on both AMD
and Intel CPUs.
- For small messages try to use the btl sendi function before initializing a
send request. If the sendi fails or the btl does not have a sendi function,
silently fall back to the standard send path.
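The two optimizations can be sketched roughly as follows. All names here
(transport_t, send_immediate, blocking_send, eager_limit) are hypothetical
and are not the pml or btl interfaces; the sketch only shows the control
flow: a request on the stack, an optional immediate-send attempt for small
messages, and a silent fallback to the standard path.

```c
#include <stddef.h>

/* Hypothetical transport: send_immediate may be NULL, like a btl that
 * does not provide a sendi function. It returns 0 on success. */
typedef struct {
    int (*send_immediate)(const void *buf, size_t len);
    size_t eager_limit; /* largest message tried on the fast path */
} transport_t;

typedef struct {
    const void *buf;
    size_t len;
    int initialized;
} send_request_t;

/* Illustrative immediate-send implementations. */
static int sendi_ok(const void *buf, size_t len)   { (void) buf; (void) len; return 0; }
static int sendi_busy(const void *buf, size_t len) { (void) buf; (void) len; return -1; }

/* Standard path: only here is the request actually initialized. */
static int standard_send(send_request_t *req, const void *buf, size_t len)
{
    req->buf = buf;
    req->len = len;
    req->initialized = 1;
    return 0;
}

/* Returns 1 if the fast path completed the send, 0 if the standard
 * path was used. */
static int blocking_send(const transport_t *t, const void *buf, size_t len)
{
    send_request_t req = {0}; /* on the stack, not from a free list */

    if (len <= t->eager_limit && NULL != t->send_immediate &&
        0 == t->send_immediate(buf, len)) {
        return 1; /* sent without ever initializing the request */
    }

    /* sendi missing or failed: fall back without reporting an error */
    (void) standard_send(&req, buf, len);
    return 0;
}
```

The request lives only for the duration of the call, which is safe for
blocking operations because the caller cannot observe it after return.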
cmr=v1.7.5:reviewer=brbarret
This commit was SVN r30343.
Gilles Gouaillardet's solution attached to ticket #4145.
Closes trac:4145.
cmr=v1.7.4:reviewer=ompi-rm1.7
cmr=v1.6.6:reviewer=ompi-rm1.6
This commit was SVN r30342.
The following Trac tickets were found above:
Ticket 4145 --> https://svn.open-mpi.org/trac/ompi/ticket/4145
- Adds the coll_hcoll_np mca parameter, similar to that of the fca component
(defaults to 32). Those who use hcoll should be aware that from now on
communicators with fewer than 32 procs will run w/o hcoll by default.
- Resolves the fallback issue in case libhcoll runs out of allowed contexts.
The solution is moving hcoll_context_create from comm_enable to comm_query.
In short, comm_enable should never return OMPI_ERROR in the coll component
with the highest priority (hcoll). Otherwise the ompi coll_base_select will
unselect the coll function pointers and module references, leaving the
communicator w/o coll pointers, which causes the failure. The same behavior
can be reproduced even with tuned if one hardcodes a "return OMPI_ERROR"
into its module_enable function.
- Additionally, removed all the dead code under #if 0; removed unused
variables (path for library, active_modules list) and classes (module list
wrapper).
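A rough sketch of the query/enable split, assuming a simplified component
model. module_t, comm_query(), module_enable() and the context callbacks are
hypothetical names for illustration, not the coll framework or libhcoll
API. The point is that every step that can fail happens in the query, where
declining is harmless, so enable has nothing left to fail on.

```c
#include <stddef.h>

typedef struct {
    int   comm_size;
    void *context;
} module_t;

static int np_threshold = 32; /* mirrors the default of coll_hcoll_np */

/* Illustrative context factories: one succeeds, one models the library
 * having run out of allowed contexts. */
static int dummy_context;
static void *context_ok(void)        { return &dummy_context; }
static void *context_exhausted(void) { return NULL; }

/* Query: returns NULL to decline the communicator, so that selection
 * can move on to the next component. */
static module_t *comm_query(module_t *storage, int comm_size,
                            void *(*create_context)(void))
{
    void *ctx;

    if (comm_size < np_threshold) {
        return NULL; /* small communicators run without this component */
    }

    ctx = create_context();
    if (NULL == ctx) {
        return NULL; /* out of contexts: decline here, not in enable */
    }

    storage->comm_size = comm_size;
    storage->context = ctx;
    return storage;
}

/* Enable: all fallible work was done in comm_query, so this always
 * succeeds (returns 0). */
static int module_enable(const module_t *m)
{
    (void) m;
    return 0;
}
```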
Fixed by Val, Reviewed by Devendar/Josh/Miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30341.
This commit fixes an error path that occurs when huge page allocations are
enabled. In this case we allocate a huge page and try to register it but fail.
We then were calling free on the opal object. Fix this by calling the proper free
function.
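A minimal sketch of the kind of fix described, assuming a hypothetical
region_t that remembers how its memory was obtained. malloc/free stand in
for the real huge page allocation and release calls; the names are
illustrative and not taken from the actual code.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    void *base;
    void (*release)(void *base); /* must match how base was allocated */
} region_t;

static int huge_release_calls = 0; /* counts calls, for illustration */

/* Stand-in for the huge page specific release function. */
static void huge_page_release(void *base)
{
    huge_release_calls++;
    free(base);
}

/* Stand-in for a huge page allocation; records the matching release. */
static int region_alloc_huge(region_t *r, size_t len)
{
    r->base = malloc(len);
    r->release = huge_page_release;
    return (NULL != r->base) ? 0 : -1;
}

/* Illustrative registration callbacks. */
static int registration_fails(void *base) { (void) base; return -1; }
static int registration_ok(void *base)    { (void) base; return 0; }

/* On registration failure, release through the function recorded at
 * allocation time instead of calling a generic free on the region. */
static int region_register(region_t *r, int (*reg)(void *base))
{
    if (0 != reg(r->base)) {
        r->release(r->base);
        r->base = NULL;
        return -1;
    }
    return 0;
}
```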
cmr=v1.7.4:reviewer=rhc
This commit was SVN r30289.
Also add a verbose flag so one can see what devices are selected as well as another flag to override
locality information and use all devices on the node.
This commit was SVN r30287.
NetBSD puts the AIO functions in -lrt, vs. the usual libc. So we
need the fbtl/posix configure.m4 to test for -lrt properly.
Reviewed by Jeff Squyres.
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=Fix NetBSD use of -laio
This commit was SVN r30274.
Add a configure test to see if the Fortran compiler supports the
PROTECTED keyword. If it does, use in mpi-f08-types.F90 (via a macro
defined in configure-fortran-output-bottom.h).
This is needed to support the PGI 9 Fortran compiler, which does not
support the PROTECTED keyword.
Note that regardless of whether we want to support the PGI 9 Fortran
compiler + mpi_f08, we need to correctly detect whether PROTECTED
works or not, and then use that determination as a criterion for
building the mpi_f08 module. Previously, mpi-f08-types.F90 used
PROTECTED unconditionally, and we didn't test for it in configure. So
if a compiler (e.g., PGI 9) supported everything else but didn't
support PROTECTED, it would try to compile the mpi_f08 stuff and choke
on the use of PROTECTED.
Refs trac:4093
This commit was SVN r30273.
The following Trac tickets were found above:
Ticket 4093 --> https://svn.open-mpi.org/trac/ompi/ticket/4093
1. Canary compile-time test: this is compiled whenever you compile
the entire OMPI tree. It's a noinst standalone library comprised
of a single .c file, so no one will notice its addition, and it
doesn't get linked/installed to any real build products. If we
are out of padding space on any predefined MPI object type, it
will fail to compile. This will alert/annoy a human, who will be
able to fix the real problem.
1. Added a "make check" test that will print out the amount of
predefined padding left on all the MPI object types.
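The idea behind both pieces can be sketched as below. PREDEFINED_PAD,
struct real_object and the other names are hypothetical, and the negative
array size trick is only one common way to force a compile-time failure;
it is not necessarily how the canary file itself is written.

```c
#include <stddef.h>

#define PREDEFINED_PAD 64 /* hypothetical fixed size of a predefined object */

/* Hypothetical internal object that may grow over time. */
struct real_object {
    int   refcount;
    void *impl;
};

/* The predefined handle is padded out to a fixed size so that adding
 * fields to real_object does not change the size seen by applications. */
union predefined_object {
    struct real_object obj;
    char padding[PREDEFINED_PAD];
};

/* Canary: if real_object outgrows the padding, the array size becomes
 * negative and this translation unit fails to compile. */
typedef char padding_canary[(sizeof(struct real_object) <= PREDEFINED_PAD) ? 1 : -1];

/* What a "make check" style test could print: bytes of padding left. */
static size_t padding_left(void)
{
    return PREDEFINED_PAD - sizeof(struct real_object);
}
```

Because the canary is a type definition only, it adds nothing to any
build product; its sole effect is to stop the build when the limit is hit.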
This commit was SVN r30268.
NOTE: launch performance will be absolutely awful if you do this with BTLs that aren't configured to modex_recv on first message!
Even with "modex on demand", we still have to do a barrier in place of the modex - we simply don't move any data around, which does reduce the time impact. The barrier is required to ensure that the other proc has in fact registered all its BTL info and therefore is prepared to hand over a complete data package. Otherwise, you may not get the info you need. In addition, the shared memory BTL can fail to properly rendezvous as it expects the barrier to be in place.
This behavior will *only* take effect under the following conditions:
1. launched via mpirun
2. #procs is greater than ompi_hostname_cutoff, which defaults to UINT32_MAX
3. mca param rte_orte_direct_modex is set to 1. At the moment, we are having problems getting this param to register properly, so only the first two conditions are in effect. Still, the bottom line is you have to *want* this behavior to get it.
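The intended opt-in logic, with all three conditions, could be sketched as
a single predicate. The function and parameter names are hypothetical and
this is not the RTE code; as noted above, the third condition is not yet
effective in this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true only when every condition listed above holds. */
static bool use_direct_modex(bool launched_by_mpirun,
                             uint32_t nprocs,
                             uint32_t hostname_cutoff,
                             bool direct_modex_param)
{
    return launched_by_mpirun &&
           nprocs > hostname_cutoff &&
           direct_modex_param;
}
```

With the default cutoff of UINT32_MAX the comparison can never be true for
a 32-bit process count, so the behavior stays off unless the cutoff is
lowered explicitly.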
The planned next evolution of this will be to make the direct modex be non-blocking - this will require two fixes:
1. if the remote proc doesn't have the required info, then let it delay its response until it does. This means we need a way for the MPI layer to tell the RTE "I am done entering modex data".
2. adjust the SM rendezvous logic to loop until the required file has been created
Creating a placeholder to bring this over to 1.7.5 when ready.
cmr=v1.7.5:reviewer=hjelmn:subject=Enable direct modex at scale
This commit was SVN r30259.
This configure option was only relevant when we were generating TKR
"use mpi" interfaces for MPI subroutines with choice buffers. Now
that we aren't, the only interface that needs to accept a choice
buffer is MPI_SIZEOF (which we have to provide).
And since there's now only several dozen interfaces in the "mpi" TKR
module, there's no reason to not generate ''all'' possible array rank
values (when there were thousands of interfaces, generating 4-vs-7
array ranks per interface per type was a big deal). The default used
to be 4; now we can just hard-code it to 7, the max possible value for
Fortran 2003 (the max was raised to 15 in F2008, but let's
not go there for now).
cmr=v1.7.5:reviewer=dgoodell:subject=Remove even more dead Fortran configury
This commit was SVN r30257.
BIND(C), but not ''all'' of it. So expand our configure checks to
look for multiple different forms of BIND(C):
* ISO_C_BINDING
* SUBROUTINE ... BIND(C)
* TYPE, BIND(C)
* TYPE(foo), BIND(C, name="bar")
If the compiler supports all of these, then declare that we support
BIND(C), and the rest of the mpi_f08 checks can continue. If we miss
any one of those, don't bother continuing -- we won't build the
mpi_f08 module.
Also push the results of all of these tests down to ompi_info so that
they can be reported easily (e.g., "Hey, why doesn't my OMPI
installation have the mpi_f08 module?").
cmr=v1.7.4:reviewer=jsquyres:subject=Expand Fortran BIND(C) configure checks
This commit was SVN r30247.
LIBADD libmpi.la
cmr=v1.7.4:reviewer=brbarret:subject=Add libmpi to libmpi_usempif08_LIBADD
This commit was SVN r30245.
The following SVN revision numbers were found above:
r30244 --> open-mpi/ompi@7015343951
TKR LIBADDs libmpi_mpifh; there is no library for libmpi_usempi ignore
TKR).
Refs trac:4085
This commit was SVN r30244.
The following Trac tickets were found above:
Ticket 4085 --> https://svn.open-mpi.org/trac/ompi/ticket/4085