We don't use this functionality any more; we use the transport_type
and device name to identify usnic devices. It's slightly easier
because we can get transport_type+name from ibv_open_device() and don't
have to do an additional ibv_query_device() to get its attributes.
Reviewed by Dave Goodell.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30882.
Follow on to SVN trunk r30850: consolidate the ibv_create_ah() calls
into a single loop, MPI_WAITALL-style. That is, call the (effectively
non-blocking) ibv_create_ah() for each endpoint. If we get
NULL+EAGAIN, it means that the UDP ARP is still ongoing down in the
kernel, so just try again later. We put these all into a single loop
because it allows us to parallelize the ARP progress in the kernel.
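The consolidated loop can be sketched roughly as follows. This is a simulation only, written in Python for brevity: `FakeVerbs` and `create_all_ahs` are hypothetical names, and the fake `create_ah()` stands in for ibv_create_ah() returning NULL with errno set to EAGAIN while ARP resolution is in flight.

```python
import errno


class FakeVerbs:
    """Hypothetical stand-in for the verbs layer: create_ah() reports
    EAGAIN until the simulated ARP resolution for that endpoint is done."""

    def __init__(self, polls_needed):
        # endpoint name -> number of EAGAIN answers before success
        self.remaining = dict(polls_needed)

    def create_ah(self, endpoint):
        if self.remaining[endpoint] > 0:
            self.remaining[endpoint] -= 1
            return None, errno.EAGAIN
        return "ah:" + endpoint, 0


def create_all_ahs(verbs, endpoints):
    """Issue a request for every endpoint, then retry only those that
    reported EAGAIN, until all address handles exist."""
    handles = {}
    pending = list(endpoints)
    passes = 0
    while pending:
        passes += 1
        still_pending = []
        for ep in pending:
            ah, err = verbs.create_ah(ep)
            if ah is not None:
                handles[ep] = ah
            elif err == errno.EAGAIN:
                still_pending.append(ep)  # ARP still ongoing; try again later
            else:
                raise OSError(err, "create_ah failed for " + ep)
        pending = still_pending
    return handles, passes
```

Because every endpoint is polled in each pass, the number of passes tracks the slowest ARP resolution rather than the sum of all of them. A real loop would presumably also yield or make other progress between passes.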
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30879.
The following SVN revision numbers were found above:
r30850 --> open-mpi/ompi@3641500442
r30852 --> open-mpi/ompi@4e282a3295
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Basically: since usnic is a connectionless transport, we do not get
OS-provided services "for free" that connection-oriented transports
get, namely: "hey, I wasn't able to make a connection to peer X", and
"hey, your connection to peer X has died."
This connectivity-checker runs in a separate progress thread in the
usnic BTL in local rank 0 on each server. Upon first send in any
process, the connectivity-checker agent will send some UDP pings to the
peer to ensure that we can reach it. If we can't, we'll abort the job
with a nice show_help message.
A lengthy comment in btl_usnic_connectivity.h explains the
scheme and how it works.
Reviewed by Dave Goodell.
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30860.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
finalize.
Closes trac:4290
cmr=v1.7.5:reviewer=miked
This commit was SVN r30854.
The following Trac tickets were found above:
Ticket 4290 --> https://svn.open-mpi.org/trac/ompi/ticket/4290
- Fix several typos in osc/rdma.
- Fix a locking issue in osc/sm that was caused by an incorrect
assumption about the semantics of opal_atomic_add_32.
- Always unlock the accumulation lock in osc/sm.
- The base of a process's shared memory window should be NULL if
the size is zero. Fixed.
cmr=v1.7.5:ticket=trac:4304
This commit was SVN r30853.
The following Trac tickets were found above:
Ticket 4304 --> https://svn.open-mpi.org/trac/ompi/ticket/4304
Follow on to SVN trunk r30850: consolidate the ibv_create_ah() calls
into a single loop, MPI_WAITALL-style. That is, call the (effectively
non-blocking) ibv_create_ah() for each endpoint. If we get
NULL+EAGAIN, it means that the UDP ARP is still ongoing down in the
kernel, so just try again later. We put these all into a single loop
because it allows us to parallelize the ARP progress in the kernel.
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30852.
The following SVN revision numbers were found above:
r30850 --> open-mpi/ompi@3641500442
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
ibv_create_ah() may need to effect an ARP resolution, which may take
some time. Rather than blocking in ibv_create_ah(), the usNIC driver
may return NULL and set errno to EAGAIN indicating that we should try
again (i.e., the ARP resolution is proceeding under the covers).
So add a simple loop here to loop over ibv_create_ah() until it
returns non-(NULL+EAGAIN). A future commit will make this a bit more
efficient.
Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Dave Goodell <dgoodell@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30850.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Prior to this commit we matched local interfaces to remote interfaces in
order to create endpoints in a simplistic way. If any remote interfaces
were on the same subnet as any of our local interfaces then only local
interfaces would be paired (IP-routed remote interfaces would be
ignored).
This commit introduces a more general scheme which attempts to make the
"best" pairing of local interfaces to remote interfaces. We now cast
the problem as a graph theory problem known as the "Assignment Problem",
or finding a maximum-cardinality, minimum-weight bipartite matching. We
solve this problem by reducing the bipartite graph of interface
connectivity to a flow network and then solving for a minimum cost flow.
This is then easily converted back into a matching on the original
bipartite graph.
In the new scheme, interfaces on the same subnet are preferred over
interfaces requiring intermediate routing hops and higher bandwidth
links are preferred over lower bandwidth links.
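As an illustration of the reduction described above, here is a small Python sketch that turns a weighted bipartite graph into a flow network and solves it with successive shortest augmenting paths (Bellman-Ford). It is not the usnic BTL code: the function name, the node numbering, and the example weights are assumptions, with lower weight standing for a "better" pairing (same subnet, higher bandwidth).

```python
def solve_bipartite_assignment(n_left, n_right, weights):
    """weights maps (left_index, right_index) -> cost of that pairing.
    Returns a maximum-cardinality, minimum-total-cost list of pairs."""
    source, sink = 0, 1 + n_left + n_right
    n = sink + 1
    to, cap, cost = [], [], []          # edge k and k ^ 1 are a forward/reverse pair
    adj = [[] for _ in range(n)]

    def add_edge(u, v, c, w):
        adj[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        adj[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    for i in range(n_left):
        add_edge(source, 1 + i, 1, 0)
    for j in range(n_right):
        add_edge(1 + n_left + j, sink, 1, 0)
    pair_edges = {}
    for (i, j), w in weights.items():
        pair_edges[len(to)] = (i, j)    # remember the forward edge index
        add_edge(1 + i, 1 + n_left + j, 1, w)

    inf = float("inf")
    while True:
        # Bellman-Ford: cheapest source->sink path in the residual graph
        dist = [inf] * n
        prev_edge = [-1] * n
        dist[source] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == inf:
                    continue
                for e in adj[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]
                        prev_edge[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[sink] == inf:
            break                       # no augmenting path left
        v = sink
        while v != source:              # push one unit of flow along the path
            e = prev_edge[v]
            cap[e] -= 1
            cap[e ^ 1] += 1
            v = to[e ^ 1]
    # saturated interface-to-interface edges form the matching
    return sorted(pair for e, pair in pair_edges.items() if cap[e] == 0)
```

In the test case used to check this sketch, a greedy choice of the cheapest single pairing would force an expensive second pairing; the flow formulation avoids that by rerouting through the reverse edge.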
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30849.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Querying the OS routing table is important for making decisions about
which local and remote interfaces should be paired into reliable
communication channels.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30848.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This code is intended to support usNIC interface matching functionality.
We currently view that problem as essentially the "Assignment Problem"
(http://en.wikipedia.org/wiki/Assignment_problem), for which there are
many possible solution approaches, including flow-network analysis. In
the future, we might transition to a more nuanced view of the problem
which would likely also be flow-network based.
To this end, the current code focuses on providing one major algorithm
to the core usnic BTL: `ompi_btl_usnic_solve_bipartite_assignment`. It
also exposes several typical and necessary functions for constructing,
manipulating, and querying weighted, directed graphs.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30847.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30846.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This commit adds mechanisms for writing and running unit tests in the
usnic BTL. The short version of how to run the tests is:
1. Configure with `--enable-ompi-btl-usnic-unit-tests`. This will cause
the unit testing code and test runner utility to be built.
2. Run the tests by running `ompi_btl_usnic_run_tests`.
See `README.test` for full details.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30845.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
These includes only exist in the Cisco-internal usnic-v1.6 code base,
but they should not exist anywhere except btl_usnic_compat.h in order to
minimize source differences between usnic-v1.6 and v1.7/trunk.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30844.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Lower layer (hardware or software) bugs can result in a mismatch between
our BTL-layer payload size and the actual packet length. We now check
that in order to catch these cases, which otherwise can result in
MPI-layer message corruption.
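A minimal sketch of this kind of check, in Python for brevity. The wire format here (a 2-byte big-endian payload length followed by the payload) is purely hypothetical and is not the usnic BTL header; only the idea of comparing the claimed payload size against the received length comes from the text above.

```python
import struct

# Hypothetical header for this sketch only: one 16-bit payload length.
HEADER = struct.Struct("!H")


def check_packet(packet):
    """Return the payload if the header's payload length matches the
    number of bytes actually received; otherwise raise ValueError."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than header")
    (claimed,) = HEADER.unpack_from(packet)
    actual = len(packet) - HEADER.size
    if claimed != actual:
        raise ValueError(
            "length mismatch: header says %d, received %d" % (claimed, actual))
    return packet[HEADER.size:]
```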
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30843.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
We were missing a debug message for a very common recv case, making it a
bit harder to follow a debug log.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30842.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
There was a duplicated subnet check in the sender hash lookup routine.
This caused receivers to always fail the sender hash lookup if the
sender was in a different subnet, so the receiver would discard the
packet as though it were coming from a different job.
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30841.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
If ibv_create_ah fails, we will not initialize the `endpoint->proc`.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30840.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This functionality is required for routable UDP/IP usnic traffic.
Previously we would only set up endpoints for remote interfaces on the
same subnet as the current module's local interface. This behavior
still holds if two processes share any common subnets. However, if the
two processes have no subnets in common then we assume that all
interfaces are reachable from all other interfaces and wire them up in a
1-1, randomly-matched order somewhat similarly to the "tcp" BTL's
behavior.
Only match in different subnets if we detect UDP support in the lower
layer.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30839.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
This commit decouples OMPI deployment from the version(s) of the lower
layers of the stack by probing for UDP support.
Verbs applications assume a 40-byte header (there is no current
mechanism for querying payload offset). So to support a 42-byte UDP
header without causing existing applications like ibv_ud_pingpong or
older versions of OMPI to crash, we must inform libusnic_verbs that we
are aware of the nonstandard payload offset. We do this by overriding
the `transport_type` field of the device to be 42 before calling
`ibv_open_device`. If the library resets it to something else, then we
know the lower layers are UDP capable. Otherwise we use the older
custom-L2 format.
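The probe can be modeled as follows. This is a Python simulation, not verbs code: `FakeDevice`, `open_udp_aware`, `open_legacy`, and `probe_udp_support` are hypothetical names, the value the newer library resets the field to is an assumption, and restoring the saved value in the legacy case is also an assumption.

```python
PROBE_VALUE = 42  # written into transport_type before opening the device


class FakeDevice:
    """Hypothetical stand-in for struct ibv_device."""

    def __init__(self, transport_type):
        self.transport_type = transport_type


def open_udp_aware(device):
    """Simulates a UDP-capable library: it sees the probe value and
    resets the field to something else (0 here, an assumption)."""
    if device.transport_type == PROBE_VALUE:
        device.transport_type = 0
    return "ctx"


def open_legacy(device):
    """Simulates an older library that never inspects transport_type."""
    return "ctx"


def probe_udp_support(device, open_device):
    """Return (context, udp_capable) using the override-and-check trick."""
    saved = device.transport_type
    device.transport_type = PROBE_VALUE
    ctx = open_device(device)
    udp_capable = device.transport_type != PROBE_VALUE
    if not udp_capable:
        device.transport_type = saved  # assumption: undo the override
    return ctx, udp_capable
```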
This necessitated some minor ugliness in common_verbs, but it's as tidy
as Jeff and I know how to make it right now.
This commit only adds support for UDP headers and connectivity over the
same L2 network; it does not touch routing or interface pairing.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30838.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Just trying to be deliberate about keeping fastpath-accessed fields
grouped together to fit into the same 64-byte cache line.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30837.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30836.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30835.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Reese Faucette <rfaucett@cisco.com>
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30834.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Authored-by: Reese Faucette <rfaucett@cisco.com>
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30833.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
Valgrind showed this one, just a bit of sloppiness with the reference
counting.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30832.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
The logic did not correctly perform the OR behavior that is described
in the doxy docs for this function. This commit fixes the logic so
that a port will be included if it supports any of the
capabilities indicated by the passed-in flags.
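The difference between the intended OR behavior and an accidental AND is easy to see in a small sketch. The flag names and port list below are hypothetical and are not the actual common_verbs constants.

```python
# Hypothetical capability bits for this sketch.
FLAG_UD, FLAG_RC, FLAG_UC = 0x1, 0x2, 0x4


def select_ports_any(ports, flags):
    """OR semantics: keep a port if it supports at least one requested flag."""
    return [name for name, caps in ports if caps & flags]


def select_ports_all(ports, flags):
    """AND semantics, shown only for contrast: every flag is required."""
    return [name for name, caps in ports if caps & flags == flags]
```

With ports supporting UD only, RC+UC, and nothing, a request for `FLAG_UD | FLAG_RC` keeps the first two under OR semantics and none under AND semantics.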
Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Dave Goodell <dgoodell@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30831.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
1. Changed rng_buff_t --> opal_rng_buff_t
2. All global variables obey the prefix rule
3. Old code has been removed
4. Found a couple of unnecessary includes
Refs trac:4298
This commit was SVN r30807.
The following Trac tickets were found above:
Ticket 4298 --> https://svn.open-mpi.org/trac/ompi/ticket/4298
We're going to be bringing a bunch of usnic code to the SVN trunk
soon, and I basically brought this commit over out of order. So I'm
reverting it for now; the same functionality will come back shortly.
This commit was SVN r30805.
The following SVN revision numbers were found above:
r30804 --> open-mpi/ompi@5bedcc15bf
These constants are now upstream (see
https://git.kernel.org/cgit/libs/infiniband/libibverbs.git/commit/?id=f57a9c67eabb9e7f19c624ac3c8c27b7be55796c),
so let's support them properly in Open MPI.
Added bonus: consolidating these checks up in
ompi_check_openfabrics.m4 allowed removing some custom checks and
AC_DEFINE's from the usnic configure.m4 script.
Also change the usnic/configure.m4 check for IBV_EVENT_GID_CHANGE to
use AC_CHECK_DECLS (vs. AC_CHECK_DECL).
cmr=v1.7.5:reviewer=dgoodell
This commit was SVN r30804.
* Use the prefix rule for global variables
* Eliminate seed_prng() since it isn't necessary any more
These files will need to get edited again when the RNG type obeys the
prefix rule.
Refs trac:4298
This commit was SVN r30803.
The following SVN revision numbers were found above:
r30801 --> open-mpi/ompi@e39d9f4080
The following Trac tickets were found above:
Ticket 4298 --> https://svn.open-mpi.org/trac/ompi/ticket/4298
- Move the ptrdiff_t tests up higher in configure.ac to be with the
rest of the type tests.
- Create new OMPI_FIND_MPI_AINT_COUNT_OFFSET for finding the
corresponding types of MPI_Aint, MPI_Count, and MPI_Offset.
Consolidate all the old C and Fortran tests into this new macro (and
.m4 file).
- Fix Fortran MPI_*_KIND tests that incorrectly keyed off assumed
types (e.g., int64_t) rather than whatever the corresponding C
MPI_Aint, MPI_Count, MPI_Offset types turned out to be.
- Add new logic to ensure that sizeof(MPI_Count) <= sizeof(size_t),
because our entire PML, BTL, and convertor infrastructure requires
this. As a side effect, just make MPI_Offset the same type as
MPI_Count (because MPI_Count has to be able to hold an MPI_Offset,
so we can't let MPI_Offset be larger than a MPI_Count).
This commit was SVN r30776.
The following Trac tickets were found above:
Ticket 4205 --> https://svn.open-mpi.org/trac/ompi/ticket/4205
- MXM uses the libtool versioning scheme, which is enough; no additional versioning is needed in OMPI
Reviewed by yossi
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30768.
Optimization of the MPI_Dims_create function: it no longer uses
precalculated prime numbers and instead factorizes directly, as discussed on
the developer list.
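A rough Python sketch of factorizing directly by trial division, with no precomputed prime table. The greedy distribution of factors in `balanced_dims` is an assumption made for illustration and is not claimed to be the algorithm Open MPI uses; both function names are hypothetical.

```python
def factorize(n):
    """Prime factors of n, with multiplicity, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2   # after 2, only odd candidates
    if n > 1:
        factors.append(n)          # whatever remains is itself prime
    return factors


def balanced_dims(nnodes, ndims):
    """Greedy sketch: hand the largest remaining factor to the currently
    smallest dimension, then report dims in non-increasing order."""
    dims = [1] * ndims
    for f in sorted(factorize(nnodes), reverse=True):
        dims[dims.index(min(dims))] *= f
    return sorted(dims, reverse=True)
```

Trial division only needs to test candidates up to the square root of the remaining value, so no table sized for the largest possible nnodes is required.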
cmr=v1.7.5:ticket=4217:reviewer=jsquyres
This commit was SVN r30695.
The following Trac tickets were found above:
Ticket 4217 --> https://svn.open-mpi.org/trac/ompi/ticket/4217
The freeprocs variable is derived from nnodes, so check the value of nnodes at
the beginning of the MPI_PARAM_CHECK code section instead, as discussed on the
developer list.
cmr=v1.7.5:reviewer=jsquyres:subject=move parameter check to begin
jsquyres, please review this CMR. Thanks.
This commit was SVN r30694.
Some older versions of libibverbs do not have `ibv_event_type_str`,
leading to compilation failures on older machines, irrespective of
whether they could ever support usNIC anyway. If we encounter any other
build issues related to "old verbs" then we should just cause the usnic
BTL to disqualify itself when it encounters "old" traits.
Thanks to Paul Hargrove for reporting the issue:
http://www.open-mpi.org/community/lists/devel/2014/02/14056.php
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30674.
only goes up to VADER_MAX_ADDRESS instead of 0xfffffffffffffffful.
cmr=v1.7.5:ticket=trac:4216
This commit was SVN r30669.
The following Trac tickets were found above:
Ticket 4216 --> https://svn.open-mpi.org/trac/ompi/ticket/4216
It turns out that ASYNCHRONOUS should not be used with ignore TKR
dummy parameters (some compilers will [correctly] warn about this).
Many thanks to Rolf Rabenseifner and Christoph Niethammer, who noticed
the problem.
Submitted by Rolf Rabenseifner, reviewed by Jeff.
cmr=v1.7.5:reviewer=ompi-rm1.7:subject=Remove ASYNCHRONOUS from the ignore TKR mpi_f08 module.
This commit was SVN r30665.
The error was caused by leaving the pipe to the async thread uninitialized and then writing to it anyway.
The fix is to check for the existence of the async thread and the pipe to it.
Reviewed by miked
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30644.
The initialization code did several allgathers on void *'s using
MPI_LONG_LONG_INT. This will produce the wrong result on 32-bit
platforms. Instead use MPI_BYTE with count = sizeof (void *).
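The failure mode can be illustrated without MPI by looking at the byte layout alone. This Python sketch is a simulation under stated assumptions (little-endian layout, hypothetical helper names): it packs pointer values at their native width and then reinterprets the buffer with a given element size.

```python
def pack_pointers(ptrs, ptr_size):
    """Serialize pointer values as raw bytes, ptr_size bytes each
    (the MPI_BYTE with count = sizeof(void *) approach)."""
    return b"".join(p.to_bytes(ptr_size, "little") for p in ptrs)


def unpack(buf, elem_size):
    """Reinterpret buf as consecutive little-endian elements of elem_size bytes."""
    return [int.from_bytes(buf[i:i + elem_size], "little")
            for i in range(0, len(buf), elem_size)]
```

With 4-byte pointers, reading the buffer back in 4-byte units recovers the values, while treating it as 8-byte elements (as MPI_LONG_LONG_INT would) fuses two neighboring pointers into one bogus value.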
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30627.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
Found two bugs in basesmuma:
- Release all resources when tearing down the bcol module.
- Always call the allreduce in the smcm code. We do not know
beforehand whether all procs have all the files mapped.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30623.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
This is a hot-fix patch for the issue reported by Ralph.
In the future we plan to restructure the ml data structure layout.
Tested by Nathan.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30619.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
just plain wrong (i.e., it gives wrong answers).
When time permits, perhaps we can put in a better algorithm for
MPI_DIMS_CREATE (Andreas Schäfer mentioned that nnodes can now be on
the order of millions, and the current algorithm is... inefficient, at
best).
This commit was SVN r30606.
The following SVN revision numbers were found above:
r30539 --> open-mpi/ompi@fb67d98867
r30540 --> open-mpi/ompi@4417ed2133
This commit was SVN r30605.
The following SVN revision numbers were found above:
r30600 --> open-mpi/ompi@7d2c4cb468
r30602 --> open-mpi/ompi@9e751a0302
r30604 --> open-mpi/ompi@3012c280cf
Revision number ranges (suitable for "git log"):
r30602-30604 --> open-mpi/ompi@9e751a03^..3012c280
The problem was caused by the static request optimization. The buffered send case
is much like the isend case in that the request structure may be needed after
MPI_Bsend completes. Fix this case by calling isend and freeing the resulting
request.
cmr=v1.7.5:ticket=trac:4149
This commit was SVN r30601.
The following Trac tickets were found above:
Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
them, but it's going to take a little time (at least one day). So
Nathan says it's ok to .ompi_ignore coll ml until he's able to fix it.
This commit was SVN r30600.
This change does not appear to increase the small message latency of ping-pong
benchmarks and fixes an issue found by our ibm datatype tests.
Fixes trac:4232
cmr=v1.7.5:ticket=trac:4149
This commit was SVN r30598.
The following Trac tickets were found above:
Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
Ticket 4232 --> https://svn.open-mpi.org/trac/ompi/ticket/4232
* Fix some comments
* Fix some spacing in the non-verbose "make" output
* Make javadoc non-verbose output like other non-verbose output
* Remove the use of JAVA_CLASS_FILES; it wasn't correct anyway (it
both derived names from JAVA_SRC_FILES ''and'' used mpi/*.class, so
many files were listed twice)
* Move the generation of javadoc files to "make" time (vs. "make
install" time) by putting the "doc" subdirectory in BUILT_SOURCES
* Make doc dependent upon mpi/MPI.class, not mpi.jar -- we only need
the classes to exist, not the final jarfile.
* Make jdoc-install dependent upon a real build artifact (the doc
dir), not an artificial name that will never exist (jdoc)
* Separate the removal of the doc (and mpi) subdirectories during
"make clean" off into the clean-local target, because CLEANFILES
can really only have ''files'' added to it.
These changes also fix parallel builds.
cmr=v1.7.5:ticket=trac:4214
This commit was SVN r30547.
The following SVN revision numbers were found above:
r30531 --> open-mpi/ompi@6ca8e68e4b
The following Trac tickets were found above:
Ticket 4214 --> https://svn.open-mpi.org/trac/ompi/ticket/4214
primes. This considerably reduces the computational load when
freeprocs is large.
cmr=v1.7.5:reviewer=hjelmn:subject=MPI_Dims_create optimization
This commit was SVN r30539.
opal does not always define MB. It is recommended that opal_atomic_[rw]mb is
called instead. We will need to address the cases where these functions are
no-ops on weak-memory ordered cpus.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30534.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
Several changes are contained in this commit:
- Clean up tabs and trailing whitespaces
- Use consistent indentation in changed files
- Remove unused code. None of the removed code will ever have been
used in a trunk build.
- Clean up the smcm code quite a bit
- Do not fflush stderr and use opal_output instead of fprintf.
These changes have been tested on Cray XE-6 and PSM systems.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30533.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
MPI_SUBARRAYS_SUPPORTED and MPI_ASYNC_PROTECTS_NONBLOCKING in the F08
descriptor prototype.
This commit fixes the F08 descriptor prototype in the same way as
r30519 did for the non-F08-descriptor implementation.
Thanks to Mike Dubman for finding the issue.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30532.
The following SVN revision numbers were found above:
r30519 --> open-mpi/ompi@caaab7e8a3
Ensure that these two flags are in all of mpif.h, the mpi module, and
the mpi_f08 module. Thanks to Rolf Rabenseifner for pointing out the
issue.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30519.
During the commits to make the C/R code compile again the
blocking receive calls were replaced by non-blocking
which broke the code. This patch uses ORTE_WAIT_FOR_COMPLETION()
to wait until the non-blocking calls have finished.
This commit was SVN r30486.
This commit fixes one warning that should have caused coll/ml to segfault
on reduce. The fix should be correct but we will continue to investigate.
cmr=v1.7.5:ticket=trac:4158
This commit was SVN r30477.
The following Trac tickets were found above:
Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
for 32-bit architectures.
This commit also modifies _OMPI_CHECK_HEADER to use AC_CHECK_HEADERS instead
of AC_CHECK_HEADER. This allows components to check for multiple headers
instead of just one. The new semantics of the header check in OMPI_CHECK_PACKAGE
are to return success if at least one of the specified headers exists. The new
semantics will not break current usage.
cmr=v1.7.5:ticket=trac:4053
This commit was SVN r30476.
The following Trac tickets were found above:
Ticket 4053 --> https://svn.open-mpi.org/trac/ompi/ticket/4053
After IM with Nathan, apply patch from ticket after verification by Paul Hargrove that it fixes the problem on non-x86 32-bit platforms
Verified by Paul, RM-approved
cmr=v1.7.4:reviewer=ompi-gk1.7
This commit was SVN r30411.
The following Trac tickets were found above:
Ticket 4143 --> https://svn.open-mpi.org/trac/ompi/ticket/4143
ROMIO and Lustre 2.4.0. It has been solved upstream already; here's
the ticket:
http://trac.mpich.org/projects/mpich/ticket/1973
And here's the commit that fixed it:
a0c4278f14
OMPI does not have the other code referred to in that git commit (in
ad_lustre_hints.c).
Thanks to Adam Moody for reporting the issue.
cmr=v1.7.4:reviewer=hjelmn:subject=Fix ROMIO compile error w/ Lustre 2.4
This commit was SVN r30393.
The dist graph functions are on the trunk and have long-since been
added to the relevant lists.
cmr=v1.7.5:ticket=4163
This commit was SVN r30382.
The following Trac tickets were found above:
Ticket 4163 --> https://svn.open-mpi.org/trac/ompi/ticket/4163
The attribute and conversion callback subroutine interfaces
are used by all 3 modules, and belong in the fortran/base directory,
not the directory of a specific module.
Also clean up some comments.
cmr=v1.7.4:ticket=4162
This commit was SVN r30378.
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
Also fix the interfaces that have logical parameters (the
non-profiling versions were added/fixed a long time ago; it looks like
the profiling versions were inadvertently skipped).
cmr=v1.7.4:ticket=4162
This commit was SVN r30377.
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
Somehow these interfaces were missed when adding these interfaces.
cmr=v1.7.4:ticket=4162
This commit was SVN r30376.
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
r30273 made the use of the Fortran "protected" keyword be
compiler-specific (i.e., configure/macro-ized it). But it
inadvertently added the use of "protected" to some sentinel constants
that should not be protected (e.g., MPI_STATUS_IGNORE).
This commit reverts the addition of "protected" to the constants that
should not be protected.
cmr=v1.7.4:subject=Rollup of Fortran fixes for v1.7.4
This commit was SVN r30375.
The following SVN revision numbers were found above:
r30273 --> open-mpi/ompi@5f17bc3c2c
btl sendi functions currently can not handle the descriptor being NULL. The
send inline optimization was assuming (incorrectly) that NULL was ok.
cmr=v1.7.5:ticket=trac:4149
This commit was SVN r30364.
The following Trac tickets were found above:
Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
allgather.
The new collectives provide a significant performance increase over tuned for
small and medium messages. We are initially setting the priority lower than
tuned until this has had some time to soak in the trunk. Please set
coll_ml_priority to 90 for MTT runs.
Credit for this work goes to Manjunath Gorentla Venkata (ORNL), Pavel Shamis (ORNL),
and Nathan Hjelm (LANL).
Commit details (for reference):
Import ORNL's collectives for MPI_Allreduce, MPI_Reduce, and MPI_Allgather.
We need to take the basesmuma header into account when calculating the
ptpcoll small message thresholds. Add a define to bcol.h indicating the
maximum header size so we can take the header into account while not
making ptpcoll dependent on information from basesmuma.
This resolves an issue with allreduce where ptpcoll overwrites the
header of the next buffer in the basesmuma bank.
Fix reduce and make a sequential collective launcher in coll_ml_inlines.h
The root calculation for reduce was wrong for any root != 0. There are
four possibilities for the root:
- The root is not the current process but is in the current hierarchy. In
this case the root is the index of the global root as specified in the
root vector.
- The root is not the current process and is not in the next level of the
hierarchy. In this case 0 must be the local root since this process will
never communicate with the real root.
- The root is not the current process but will be in next level of the
hierarchy. In this case the current process must be the root.
- I am the root. The root is my index.
Tested with IMB which rotates the root on every call to MPI_Reduce. Consider
IMB the reproducer for the issue this commit solves.
Make the bcast algorithm decision an enumerated variable
Resolve various assert failures when destructing coll ml requests.
Two issues:
- Always reset the request to be invalid before returning it to the
free list. This will avoid an assert in ompi_request_t's destructor.
OMPI_REQUEST_FINI does this (and also releases the fortran handle
index).
- Never explicitly construct or destruct the superclass of an opal
object. This screws up the class function tables and will cause
either an assert failure or a segmentation fault when destructing
coll ml requests.
Cleanup allgather.
I removed the duplicate non-blocking and blocking functions and modeled
the cleanup after what I found in allreduce. Also cleaned up the code
somewhat.
Don't bother copying from the send to the receive buffer in
bcol_basesmuma_allreduce_intra_fanin_fanout if the pointers are the
same.
This eliminates a warning about memcpy and aliasing and avoids an
unnecessary call to memcpy.
Always call CHECK_AND_RELEASE on memsync collectives.
There was a call to OBJ_RELEASE on the collective communicator but
because CHECK_AND_RECYCLE was never called there was no matching call
to OBJ_RELEASE. This caused coll ml to leak communicators.
Make allreduce use the sequential collective launcher in coll_ml_inlines.h
Just launch the next collective in the component progress.
I am a little unsure about this patch. There appears to be some sort
of race between collectives that causes buffer exhaustion in some cases
(IMB Allreduce is a reproducer). Changing progress to only launch the
next bcol seems to resolve the issue but might not be the best fix.
Note that I see little to no performance penalty for this change.
Fix allreduce when there are extra sources.
There was an issue with the buffer offset calculation when there are
extra sources. In the case of extra sources == 1 the offset was set
to buffer_size (just past the header of the next buffer). I adjusted
the buffer size to take into account the maximum header size (see the
earlier commit that added this) and simplified the offset calculation.
Make reduce/allreduce non-blocking. This is required for MPI_Comm_idup
to work correctly.
This has been tested with various layouts using the ibm testsuite and
imb and appears to have the same performance as the old blocking version.
Fix allgather for non-contiguous layouts and simplify parsing the
topology.
Some things in this patch:
- There were several comments to the effect that level 0 of the
hierarchy MUST contain all of the ranks. At least one function
made this assumption but it was not true. I changed the sbgp
components and the coll ml initialization code to enforce this
requirement.
- Ensure that hierarchy level 0 has the ranks in the correct
scatter gather order. This removes the need for a separate
sort list and fixes the offset calculation for allgather.
- There were several passes over the hierarchy to determine
properties of the hierarchy. I eliminated these extra passes
and the memory allocation associated with them and calculate the
tree properties on the fly. The same DFS recursion also handles
the re-order of level 0.
All these changes have been verified with MPI_Allreduce, MPI_Reduce, and
MPI_Allgather. All functions now pass all IBM/Open MPI, and IMB tests.
coll/ml: correct pointer usage for MPI_BOTTOM
Since contiguous datatypes are copied via memcpy (bypassing the convertor) we
need to adjust for the lb of the datatype. This corrects problems found testing
code that uses MPI_BOTTOM (NULL) as the send pointer.
Add fallback collectives for allreduce and reduce.
cmr=v1.7.5:reviewer=pasha
This commit was SVN r30363.
Per RFC. There are two optimizations in this commit:
- Allocate requests for blocking sends and receives on the stack. This
bypasses the request free list and saves two atomics on the critical path.
This change improves the small message ping-pong by 50-200ns on both AMD
and Intel CPUs.
- For small messages try to use the btl sendi function before initializing a
send request. If the sendi fails or the btl does not have a sendi function
silently fall back on the standard send path.
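The second optimization can be pictured with a minimal sketch. Everything named here (struct transport, send_message, small_limit, the sendi_always_* helpers) is hypothetical and only illustrates the "try the immediate path, then silently fall back" control flow; it is not the actual PML or BTL interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical immediate-send hook: returns 0 on success, nonzero if the
 * transport cannot take the message right now. */
typedef int (*sendi_fn)(const void *buf, size_t len);

struct transport {
    sendi_fn sendi;       /* NULL when the transport has no immediate path */
    size_t   small_limit; /* assumed cutoff for "small" messages */
};

enum send_path { PATH_SENDI, PATH_STANDARD };

/* Two stand-in hooks used to exercise both outcomes. */
int sendi_always_ok(const void *buf, size_t len)   { (void)buf; (void)len; return 0; }
int sendi_always_busy(const void *buf, size_t len) { (void)buf; (void)len; return 1; }

/* Stand-in for the regular path that builds a full send request. */
static int standard_send(const void *buf, size_t len)
{
    (void)buf; (void)len;
    return 0;
}

/* Try the immediate path first; on any failure use the standard path
 * without reporting an error to the caller. */
enum send_path send_message(const struct transport *t, const void *buf, size_t len)
{
    assert(t != NULL);
    if (t->sendi != NULL && len <= t->small_limit && t->sendi(buf, len) == 0) {
        return PATH_SENDI;
    }
    (void)standard_send(buf, len);
    return PATH_STANDARD;
}
```

The caller never sees a sendi failure; it only observes which path carried the message.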
cmr=v1.7.5:reviewer=brbarret
This commit was SVN r30343.
Gilles Gouaillardet solution attached to ticket #4145.
Closes trac:4145.
cmr=v1.7.4:reviewer=ompi-rm1.7
cmr=v1.6.6:reviewer=ompi-rm1.6
This commit was SVN r30342.
The following Trac tickets were found above:
Ticket 4145 --> https://svn.open-mpi.org/trac/ompi/ticket/4145
- Adds the coll_hcoll_np MCA parameter, similar to that of the fca component (defaults to 32). Those who use hcoll should be aware that from now on communicators with fewer than 32 procs will run without hcoll by default.
- Resolves a fallback issue when libhcoll runs out of allowed contexts. The solution is to move hcoll_context_create from comm_enable to comm_query. In short, comm_enable should never return OMPI_ERROR in the coll component with the highest priority (hcoll). Otherwise the ompi coll_base_select will unselect the coll function pointers and module references, leaving the communicator without coll pointers, which causes the failure. The same behavior can be reproduced even with tuned by hard-coding a "return OMPI_ERROR" into its module_enable function.
- Additionally, removed all the dead code under #if 0, and removed unused variables (path for library, active_modules list) and classes (module list wrapper).
Fixed by Val, Reviewed by Devendar/Josh/Miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30341.
This commit fixes an error path that occurs when huge page allocations are
enabled. In this case we allocate a huge page and try to register it but fail.
We then were calling free on the opal object. Fix this by calling the proper free
function.
cmr=v1.7.4:reviewer=rhc
This commit was SVN r30289.
Also add a verbose flag so one can see what devices are selected as well as another flag to override
locality information and use all devices on the node.
This commit was SVN r30287.
NetBSD puts the AIO functions in -lrt, vs. the usual libc. So we
need the fbtl/posix configure.m4 to test for -lrt properly.
Reviewed by Jeff Squyres.
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=Fix NetBSD use of -laio
This commit was SVN r30274.
Add a configure test to see if the Fortran compiler supports the
PROTECTED keyword. If it does, use in mpi-f08-types.F90 (via a macro
defined in configure-fortran-output-bottom.h).
This is needed to support the PGI 9 Fortran compiler, which does not
support the PROTECTED keyword.
Note that regardless of whether we want to support the PGI 9 Fortran
compiler + mpi_f08, we need to correctly detect whether PROTECTED
works or not, and then use that determination as a criterion for
building the mpi_f08 module. Previously, mpi-f08-types.F90 used
PROTECTED unconditionally, and we didn't test for it in configure. So
if a compiler (e.g., PGI 9) supported everything else but didn't
support PROTECTED, it would try to compile the mpi_f08 stuff and choke
on the use of PROTECTED.
Refs trac:4093
This commit was SVN r30273.
The following Trac tickets were found above:
Ticket 4093 --> https://svn.open-mpi.org/trac/ompi/ticket/4093
1. Canary compile-time test: this is compiled whenever you compile
the entire OMPI tree. It's a noinst standalone library comprised
of a single .c file, so no one will notice its addition, and it
doesn't get linked/installed to any real build products. If we
are out of padding space on any predefined MPI object type, it
will fail to compile. This will alert/annoy a human, who will be
able to fix the real problem.
1. Added a "make check" test that will print out the amount of
predefined padding left on all the MPI object types.
This commit was SVN r30268.
NOTE: launch performance will be absolutely awful if you do this with BTLs that aren't configured to modex_recv on first message!
Even with "modex on demand", we still have to do a barrier in place of the modex - we simply don't move any data around, which does reduce the time impact. The barrier is required to ensure that the other proc has in fact registered all its BTL info and therefore is prepared to hand over a complete data package. Otherwise, you may not get the info you need. In addition, the shared memory BTL can fail to properly rendezvous as it expects the barrier to be in place.
This behavior will *only* take effect under the following conditions:
1. launched via mpirun
2. #procs is greater than ompi_hostname_cutoff, which defaults to UINT32_MAX
3. mca param rte_orte_direct_modex is set to 1. At the moment, we are having problems getting this param to register properly, so only the first two conditions are in effect. Still, the bottom line is you have to *want* this behavior to get it.
The planned next evolution of this will be to make the direct modex be non-blocking - this will require two fixes:
1. if the remote proc doesn't have the required info, then let it delay its response until it does. This means we need a way for the MPI layer to tell the RTE "I am done entering modex data".
2. adjust the SM rendezvous logic to loop until the required file has been created
Creating a placeholder to bring this over to 1.7.5 when ready.
cmr=v1.7.5:reviewer=hjelmn:subject=Enable direct modex at scale
This commit was SVN r30259.
This configure option was only relevant when we were generating TKR
"use mpi" interfaces for MPI subroutines with choice buffers. Now
that we aren't, the only interface that needs to accept a choice
buffer is MPI_SIZEOF (which we have to provide).
And since there's now only several dozen interfaces in the "mpi" TKR
module, there's no reason to not generate ''all'' possible array rank
values (when there were thousands of interfaces, generating 4-vs-7
array ranks per interface per type was a big deal). The default used
to be 4; now we can just hard-code it to 7, the max possible value for
Fortran 2003 (I think the max was raised to 15 in F2008, but let's
not go there for now).
cmr=v1.7.5:reviewer=dgoodell:subject=Remove even more dead Fortran configury
This commit was SVN r30257.
BIND(C), but not ''all'' of it. So expand our configure checks to
look for multiple different forms of BIND(C):
* ISO_C_BINDING
* SUBROUTINE ... BIND(C)
* TYPE, BIND(C)
* TYPE(foo), BIND(C, name="bar")
If the compiler supports all of these, then declare that we support
BIND(C), and the rest of the mpi_f08 checks can continue. If we miss
any one of those, don't bother continuing -- we won't build the
mpi_f08 module.
Also push the results of all of these tests down to ompi_info so that
they can be reported easily (e.g., "Hey, why doesn't my OMPI
installation have the mpi_f08 module?").
cmr=v1.7.4:reviewer=jsquyres:subject=Expand Fortran BIND(C) configure checks
This commit was SVN r30247.
LIBADD libmpi.la
cmr=v1.7.4:reviewer=brbarret:subject=Add libmpi to libmpi_usempif08_LIBADD
This commit was SVN r30245.
The following SVN revision numbers were found above:
r30244 --> open-mpi/ompi@7015343951
TKR LIBADDs libmpi_mpifh; there is no library for libmpi_usempi ignore
TKR).
Refs trac:4085
This commit was SVN r30244.
The following Trac tickets were found above:
Ticket 4085 --> https://svn.open-mpi.org/trac/ompi/ticket/4085
intended to be used and emits a compile-time warning.
Thanks to Paul Hargrove for identifying the issue.
cmr=v1.7.4:reviewer=hjelmn:subject=remove/replace malloc.h
This commit was SVN r30231.
Avoid compiler warning about (unnecessarily) initializing 2 variables
during instantiation at the top of a switch block (but outside of any
case statements): just declare the variables at the top of the outer
block. They're already safely initialized, so don't worry about
initializing them in the instantiation.
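As a minimal sketch of the pattern (the function and variable names are made up for illustration): an initializer written between "switch (x) {" and the first case label is never executed, because control jumps straight to a label, so compilers warn about it. Declaring the variables in the enclosing block avoids the warning.

```c
#include <assert.h>

/* Hypothetical example: 'weight' and 'bonus' are declared in the outer
 * block rather than at the top of the switch body, where an initializer
 * would be skipped by the jump to the matching case label. */
int classify(int kind)
{
    int weight = 0;
    int bonus = 0;

    switch (kind) {
    case 1:
        weight = 10;
        bonus = 1;
        break;
    case 2:
        weight = 20;
        break;
    default:
        break;
    }
    return weight + bonus;
}
```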
Reviewed by Dave Goodell.
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=Don't instantiate+init variables in a switch block
This commit was SVN r30228.
ob1 dummy registration to actually be used when using udreg. Fix this by
always setting reg to NULL when mpool/udreg's register function fails.
cmr=v1.7.4:reviewer=rhc
This commit was SVN r30214.
It is now possible for orte_proc_applied_binding to be NULL (e.g., if
you mpirun --bind-to none), so we need to ensure we don't pass it down
to opal_hwloc_base_cset2*str().
Also, take the opportunity to de-duplicate some strings that are used
in multiple places.
Refs trac:4073
This commit was SVN r30204.
The following Trac tickets were found above:
Ticket 4073 --> https://svn.open-mpi.org/trac/ompi/ticket/4073
Set comm attribute with keyval.
Wait for pending hcoll module tasks in the comm delete callback, where the
PML is still valid on the communicator. Safely destroy the hcoll context
during the hcoll module destructor.
Author: Devendar Bureddy
reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30175.
http://www.open-mpi.org/community/lists/users/2014/01/23327.php
Revert the Fortran mpi module default size to "small", meaning that we
won't provide interfaces for MPI subroutines that take a choice buffer
any more. The short version is that MPI-3 p610:34-41 disallows it.
This commit simply removes all these subroutines from the build
process (i.e., remove them from nodist_libmpi_usempi_la_SOURCES).
Since MPI-3 actually forbids providing these interfaces, I'll do a
second commit to actually remove all the scripts and associated
Makefile.am junk.
cmr=v1.7.4:reviewer=dgoodell:subject=Remove choice buffer interfaces from Fortran mpi module
This commit was SVN r30169.
needed for correctness. The if_include/if_exclude are level 1, and
the TCP port range params are level 2; this parameter seems to be on
par with the TCP port range params.
Refs trac:4019
This commit was SVN r30161.
The following Trac tickets were found above:
Ticket 4019 --> https://svn.open-mpi.org/trac/ompi/ticket/4019
* Remove some set-but-not-used variables
* Make a convenience function return void (we weren't using the
return code, anyway)
* Mark a function as inline (it was supposed to be inline anyway)
Reviewed by Dave Goodell.
cmr=v1.7.5:reviewer=ompi-rm1.7:subject=Fix usnic BTL compiler warnings
This commit was SVN r30160.
Thanks to Tetsuya Mishima for detecting it!
cmr=v1.7.4:reviewer=jsquyres:subject=Correct tcp_not_use_nodelay option processing
This commit was SVN r30157.
- HCOLL close without init
- Call hcoll progress after comm finalize
- mpirun default for coll_hcoll_enable is 1
fixed by Igor, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30156.
upcoming GCC/gfortran 4.9's ignore TKR interface.
This was originally committed in a side mercurial repo, but I sadly
completely forgot about it until Tobias reminded me.
cmr=v1.7.5:reviewer=dgoodell:subject=Add support for gfortran 4.9 Fortran ignore TKR
This commit was SVN r30152.
configury/Makefile.am changes; this commit renames the internal
installdirs.h framework struct field names to match the configury macro
names:
* pkgdatadir -> ompidatadir
* pkglibdir -> ompilibdir
* pkgincludedir -> ompiincludedir
This commit was SVN r30145.
The following SVN revision numbers were found above:
r30140 --> open-mpi/ompi@8b778903d8
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi. This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.
This commit was SVN r30140.
Complements r30073: tighten up the string parsing of the vendor parts
ID MCA param a bit. Also fix a small memory leak: be sure to free the
array of uint32_t's parsed out of the MCA param.
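A strict parse of a comma-separated ID list might look like the following sketch. The function name parse_part_ids and the sample values in the usage are hypothetical; this is not the usnic BTL code, only an illustration of rejecting malformed input and leaving exactly one array for the caller to free.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a string such as "207,211" into a malloc'ed array of uint32_t.
 * Returns the number of IDs, or -1 on malformed input or allocation
 * failure (in which case nothing is left allocated).  On success the
 * caller owns *out and must free() it. */
int parse_part_ids(const char *s, uint32_t **out)
{
    assert(s != NULL && out != NULL);

    /* Upper bound on the number of entries: commas + 1. */
    size_t cap = 1;
    for (const char *p = s; *p != '\0'; p++) {
        if (*p == ',') {
            cap++;
        }
    }

    uint32_t *ids = malloc(cap * sizeof(*ids));
    if (ids == NULL) {
        return -1;
    }

    int n = 0;
    const char *p = s;
    for (;;) {
        char *end = NULL;
        /* Require a digit so that signs, spaces, and empty fields fail. */
        if (!isdigit((unsigned char)*p)) {
            free(ids);
            return -1;
        }
        errno = 0;
        unsigned long v = strtoul(p, &end, 10);
        if (errno != 0 || v > UINT32_MAX || (*end != ',' && *end != '\0')) {
            free(ids);
            return -1;
        }
        ids[n++] = (uint32_t)v;
        if (*end == '\0') {
            break;
        }
        p = end + 1;
    }

    *out = ids;
    return n;
}
```

Pairing every successful call with a single free() of the returned array is what closes the kind of leak described above.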
This commit was SVN r30128.
The following SVN revision numbers were found above:
r30073 --> open-mpi/ompi@6003702a51
The following Trac tickets were found above:
Ticket 4301 --> https://svn.open-mpi.org/trac/ompi/ticket/4301
This commit adds support for placing the send memory segment in a
traditional shared memory segment when XPMEM is not available. The
current default is to reserve 4MB for shared memory on each process.
The latest benchmarks show vader performing better than sm on both
Intel and AMD CPUs.
For large messages vader will now use CMA if it is available (and
XPMEM is not).
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r30123.
Per RFC which expired two weeks ago:
We are planning to make a change to Open MPI to always set up the btls. This
means the btl init will be called even if add_procs is never called for that
btl. In the openib btl free lists fragments are currently allocated in btl_init.
To avoid wasting that memory this commit moves that final device setup to
the add_procs function. This included allocating free lists, and starting the
async event thread.
At this time this change is safe since we have a barrier after add_procs in
MPI_Init. If this changes we will need to re-think some of the initialization
since we might have the possibility of a connection request before add_procs
is called.
Tested with Mellanox ConnectX2 and QLogic HCAs.
Commit also cleans up tabs in btl_openib_async.c.
cmr=v1.7.5:reviewer=miked
This commit was SVN r30122.
1. Fix ompi_info memory leak in usnic BTL: do not allocate memory in
the component register function, because ompi_info only calls the
component register function and then dlclose's the component -- it
does not call component finalize. Instead, defer parsing the MCA
param (and alloc'ing memory) until the component init function so
that any allocated memory can be freed in the component close
function.
1. Also add a new check to ensure that we actually have some part
numbers to check. Add a show_help message if we don't find any
vendor part IDs to check.
1. Add a verbose output if usnic disqualifies itself from selection
because THREAD_MULTIPLE was specified.
cmr=v1.7.5:reviewer=dgoodell
This commit was SVN r30073.
- Modifications to coll/hcoll component related to the changes in the libhcoll API.
Now, hcoll_destroy_context accepts one more parameter that indicates if the context was
really destroyed as a result of the call.
This new "non-blocking" context destruction fixes hang discovered in IMB with mcast enabled.
- Clean up all the left contexts (if any) on the comm_world destruction.
fixed by Val, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30055.
This patch changes all send/send_buffer occurrences in the C/R code
to send_nb/send_buffer_nb.
The new code compiles but does not work.
Changes from V1:
* #ifdef out the code (so it is preserved for later re-design)
* marked the broken C/R code with ENABLE_FT_FIXED
Changes from V2:
* just replace the blocking calls with the non-blocking calls
* all #ifdef's introduced in V1 are gone
* send_* returns error code or ORTE_SUCCESS (not the number of bytes)
This commit was SVN r30036.
This patch changes all recv/recv_buffer occurrences in the C/R code
to recv_nb/recv_buffer_nb.
The old code is still there but disabled using ifdefs (ENABLE_FT_FIXED).
The new code compiles but does not work.
Changes from V1:
* #ifdef out the code (so it is preserved for later re-design)
* marked the broken C/R code with ENABLE_FT_FIXED
Changes from V2:
* only #ifdef out the code where the behaviour is changed
(used to be blocking; now non-blocking)
This commit was SVN r30035.
Fix comm_spawn on a single host - with the new default mapping scheme, we were incorrectly computing the number of procs to put on the node.
Refs trac:4003
This commit was SVN r30033.
The following Trac tickets were found above:
Ticket 4003 --> https://svn.open-mpi.org/trac/ompi/ticket/4003
Some NFS scenarios can result in an infinite ESTALE return, which will
hang ROMIO. This commit causes ROMIO to error out after a large number
of retries instead of spinning forever.
This is MPICH commit b250d338:
http://git.mpich.org/mpich.git/commit/b250d338e66667a8a1071a5f73a4151fd59f83b2
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r29993.
function pointers set to the _map functions, and we get segv's in MTT
testing (e.g., the C++ suite, which actually calls MPI_Cart_map and
MPI_Graph_map).
cmr=v1.7.4:reviewer=bosilca:subject=Fix topo _map function pointer assignments
This commit was SVN r29988.
branch (it's not necessary on trunk/v1.7 because they require C99,
which allows variadic macros).
Also fix another compiler warning (using %p to print a (void*)).
Submitted by Jeff, reviewed by Dave.
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=two usnic BTL fixes
This commit was SVN r29966.
usnic_channel_finalize() was deregistering recv buffers before
destroying the QP to which they were posted. The QP needs to be
destroyed first so that the NIC does not attempt to write to
deregistered memory, causing the DMAR messages.
Submitted by Reese, reviewed by Jeff.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r29963.
debugger code (not mca_topo_base_module_2_1_0_t).
I checked: we do a similar thing for coll in the communicator struct
(i.e., leave the version number off the module struct). I confess to
not remembering ''why'' we leave the version number off, but it seems
to be consistent this way...
cmr=v1.7.4:reviewer=bosilca:subject=fix debugger type symbol lookup for mca_topo_base_module_t
This commit was SVN r29953.
The following Trac tickets were found above:
Ticket 3958 --> https://svn.open-mpi.org/trac/ompi/ticket/3958
* automatically retrieve the hostname (and all RTE info) for all procs during MPI_Init if nprocs < cutoff
* if nprocs > cutoff, retrieve the hostname (and all RTE info) for a proc upon the first call to modex_recv for that proc. This would provide the hostname for debugging purposes as we only report errors on messages, and so we must have called modex_recv to get the endpoint info
* BTLs are not to call modex_recv until they need the endpoint info for first message - i.e., not during add_procs so we don't call it for every process in the job, but only those with whom we communicate
My understanding is that only some BTLs have been modified to meet that third requirement, but those include the Cray ones where jobs are big enough that launch times were becoming an issue. Other BTLs would hopefully be modified as time went on and interest in using them at scale arose. Meantime, those BTLs would call modex_recv on every proc, and we would therefore be no worse than the prior behavior.
This commit revises the MPI-RTE interface to pass the ompi_proc_t instead of the ompi_process_name_t for the proc so that the hostname can be easily inserted. I have advised the ORNL folks of the change.
cmr=v1.7.4:reviewer=jsquyres:subject=Fix thread deadlock
This commit was SVN r29931.
The following SVN revision numbers were found above:
r29917 --> open-mpi/ompi@1a972e2c9d
includes various fixes all over the C/R code which are
hard to group like the other patches.
Changes from V1:
* explain why mca_base_component_distill_checkpoint_ready no longer works
* compare return result of opal functions with OPAL_* values
Changes from V2:
* use orte_rml_oob_ft_event() instead of referencing through the modules
* properly protect variable (thanks to --enable-picky)
This commit was SVN r29922.
discovered when removing some components.
This commit was SVN r29895.
The following SVN revision numbers were found above:
r29894 --> open-mpi/ompi@58ed00296c
cmr=v1.7.4:reviewer=brbarret:subject=Disqualify sm btl for hetero procs
This commit was SVN r29882.
The following Trac tickets were found above:
Ticket 2433 --> https://svn.open-mpi.org/trac/ompi/ticket/2433
For example, mca_btl_sm.knem_fd remained 0, and mca_btl_sm_component_close() ended up closing fd 0, which belongs to someone else.
fixed by Yossi, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r29875.
flexible members.
UDCM is ready to go for 1.7.4 with this patch.
cmr=v1.7.4:ticket=3940
This commit was SVN r29861.
The following Trac tickets were found above:
Ticket 3940 --> https://svn.open-mpi.org/trac/ompi/ticket/3940
Note that this event should never happen within a single OMPI job,
because OMPI will ignore usnic ports that are down. The PORT_ACTIVE
event should only occur if a port ''was'' down and is now ''up''. But
what the heck -- if we ever do get this event, it is harmless -- just
ignore it.
This commit was SVN r29852.
This is helpful in the work for #3694: ensure that many places that
eventually end up in configure don't overly-pollute the global shell
variable space (because debugging accidental shell variable pollution
can be a real pain).
Refs trac:3694
This commit was SVN r29830.
The following Trac tickets were found above:
Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
On the off chance that the PML is twiddling fields that it really
shouldn't be...
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
This commit was SVN r29804.
MOFED apparently has a /usr/include/infiniband/verbs.h that also
defines a (slightly different but fully compatible) container_of
macro. So put proper #ifndef protection around our definition of
container_of.
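A guarded definition along these lines lets either header win without a redefinition conflict. This is a sketch using the common offsetof-based form of the macro; the structs and the frag_from_link helper are hypothetical, and the exact macro body used in the usnic BTL or in MOFED's verbs.h may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Only define container_of if no earlier header already provided one. */
#ifndef container_of
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#endif

/* Hypothetical types: a list link embedded inside a larger struct. */
struct list_item {
    struct list_item *next;
};

struct frag {
    int id;
    struct list_item link;
};

/* Recover the enclosing struct frag from a pointer to its link member. */
struct frag *frag_from_link(struct list_item *item)
{
    assert(item != NULL);
    return container_of(item, struct frag, link);
}
```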
Thanks to Rolf vandeVaart for pointing out the issue.
Reviewed by Dave Goodell.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r29799.
Originally udcm acks used the immediate data to indicate which message was
being acknowledged. This data was (mysteriously) junk when using QLogic HCAs so I
updated udcm to use the source info (slid, qp, etc) to determine which message was being
acked. This works as long as we don't have two messages simultaneously in flight
to a particular peer and then lose the first of the two messages. The chances of this
happening are tiny. To fix this case I updated the udcm message header to include
a pointer to the in flight message. This pointer is then sent back to the sending
process to ack receipt.
cmr=v1.7.4:ticket=trac:3940
This commit was SVN r29775.
The following Trac tickets were found above:
Ticket 3940 --> https://svn.open-mpi.org/trac/ompi/ticket/3940
This commit updates the udcm cpc to support xrc. The steps followed by udcm
mimic those in the removed xoob cpc. This update has been tested with both XRC
and RC.
Mellanox, this is intended to go into 1.7.4. Please review carefully and let
me know if there are any issues.
cmr=v1.7.4:reviewer=miked
This commit was SVN r29767.
(aka the root). This commit is based on a patch provided by Pierre
Jolivet.
Fix all the output to match the failing MPI call.
This commit was SVN r29761.
- added preprocessor conditional for vt_cupti_events_enabled
(fixes compile error when CUDA-RT wrappers are enabled and CUPTI is disabled (as reported at: https://svn.open-mpi.org/trac/ompi/changeset/29752 by Jörg Bornschein))
This commit was SVN r29754.
Fixed warnings about the need of the 'subdir-objects' option when using Automake v1.14.
Due to a bug in Automake (see http://debbugs.gnu.org/cgi/bugreport.cgi?bug=13928) the 'subdir-objects' option cannot be enabled.
To get around this problem, external source files are symlinked into the current build directory (as done in ompi/mpi/c/profile) to lead Automake to believe that all source files are in the same directory.
This commit was SVN r29732.
To support the new mpool two changes were made to the mpool infrastructure:
1) Added an mpool flag to indicate that an mpool does not need the memory
hooks to use the leave pinned protocols. This flag is checked in the
mpool lookup.
2) Add a mpool context to the base registration. This new member is used
by the udreg mpool to store the udreg context associated with the
particular registration. The new member will not break the ABI
compatibility as the new member is only currently used by the udreg
mpool.
Dynamics support for Cray systems makes use of the global rank provided by
orte to give the ugni library a unique rank for each process. Dynamics
support is not available under direct-launch (srun.)
cmr=v1.7.4
This commit was SVN r29719.
This isn't being used yet - just enabling Nathan to do what he needs.
***** NOTE: any use of the OMPI_DB_GLOBAL_RANK database key must be protected by #ifdef OMPI_DB_GLOBAL_RANK as not all RTE's will define this key. *****
This commit was SVN r29708.
http://www.open-mpi.org/community/lists/devel/2013/10/13072.php
Add support for pinning GPU Direct RDMA in openib BTL for better small message latency of GPU buffers.
Note that none of this is compiled in unless CUDA-aware support is requested.
This commit was SVN r29680.
libmpi.<OPAL_DYN_LIB_SUFFIX>, where OPAL_DYN_LIB_SUFFIX was determined
by configure.
Thanks to Ömer Demirel for reporting the issue.
Refs trac:3905.
This commit was SVN r29676.
The following Trac tickets were found above:
Ticket 3905 --> https://svn.open-mpi.org/trac/ompi/ticket/3905
Gah! The "device" variable isn't used at all in this loop (my eye
glossed over the next line and thought that "device" was used in the
free() statement, but it's actually "devices" -- not "device").
This commit was SVN r29665.
The following Trac tickets were found above:
Ticket 3091 --> https://svn.open-mpi.org/trac/ompi/ticket/3091
<usnic device name>,<eth device>,<ip address>/<CIDR prefix>
For example:
usnic_0,eth4,10.1.0.15/16
This is just handy for mapping the usnic_X device back to the IP
network to which it corresponds.
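Producing a line in that format is a one-line snprintf. The helper below is a hypothetical sketch (its name and its truncation handling are not taken from the usnic BTL); it only shows how the four fields map onto the output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format "<usnic device name>,<eth device>,<ip address>/<CIDR prefix>"
 * into out.  Returns 0 on success, -1 on error or truncation. */
int format_usnic_mapping(char *out, size_t len, const char *usnic_dev,
                         const char *eth_dev, const char *ip_addr,
                         unsigned prefix_len)
{
    assert(out != NULL && len > 0);
    int n = snprintf(out, len, "%s,%s,%s/%u",
                     usnic_dev, eth_dev, ip_addr, prefix_len);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```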
This commit was SVN r29656.
Resolves a hang when using scif for shared memory transfers. This is a
simple change and doesn't require a review.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r29653.
Cisco v1.6 git commit 913ec6c and upstream trunk r29593 (segfault fix)
introduced a performance regression by inadvertently disabling the
`module_recv_buffers` functionality. With those changes in place, the
`btl_usnic_recv.c` logic would end up mallocing a buffer that should
have otherwise come from a `module_recv_buffers` pool. It also resulted
in a small, bounded memory leak (128 buffers at each power-of-two size
interval).
The new version just places the buffer after the free list item with a
flexible array member. I bumped the pool to allocate all 128 elements
up front because the deferred allocation was modestly impacting IMB
Sendrecv performance at a few sizes.
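The layout described above can be sketched with a C99 flexible array member. The struct and function names are hypothetical stand-ins for the free list item and the receive buffer; the real usnic types differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a free list item header. */
struct list_item {
    struct list_item *next;
};

/* The data buffer lives directly after the list item in one allocation,
 * so no separate malloc (and no separate free) is needed per buffer. */
struct recv_buf {
    struct list_item super;
    size_t size;
    char buf[];            /* flexible array member: 'size' bytes follow */
};

/* Allocate header and payload together.  Caller releases with free(). */
struct recv_buf *recv_buf_alloc(size_t size)
{
    struct recv_buf *rb = malloc(sizeof(*rb) + size);
    if (rb == NULL) {
        return NULL;
    }
    rb->super.next = NULL;
    rb->size = size;
    return rb;
}
```

Writing into rb->buf leaves the list item header intact, which is the property the reassembly code depends on.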
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
This commit was SVN r29631.
The following SVN revision numbers were found above:
r29593 --> open-mpi/ompi@1ed9b8ff43
should have been all along and fix one place that uses the file
Update opal_portable_platform.h with changes to mpi_portable_platform.h made
in r29608.
Make mpi_portable_platform.h a symlink to opal_portable_platform.h, so that
they won't get out of sync. I'd like to remove mpi_portable_platform.h, but
we don't automatically add -I${includedir}/openmpi/ to make that sane from
a header include point of view, so that's future work.
This commit was SVN r29618.
The following SVN revision numbers were found above:
r29608 --> open-mpi/ompi@b71bd51cdd
Only use Portals on communicators with more than one rank
Fix computation of number of children when using the hypercube tree
This commit was SVN r29616.
patch. See ticket #3885, comment 10 for an explanation of why calling
_STRINGIFY on something that's not a numerical constant is always a bad idea.
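The ticket's reasoning is not reproduced here, but one general hazard of two-level stringify macros is easy to show: the argument is macro-expanded before it is turned into a string, so an identifier that happens to be a macro does not come out as its own name. The macro and identifier names below are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define STR_RAW(x) #x            /* stringify exactly what was written */
#define STR_EXPANDED(x) STR_RAW(x) /* expand macros first, then stringify */

#define VERSION_MAJOR 4          /* numeric constant: expansion is harmless */

/* Hypothetical trap: 'fast' is itself a macro, so the name is lost. */
#define fast 1
#define compiler_name fast

const char *major_text(void)    { return STR_EXPANDED(VERSION_MAJOR); }
const char *name_expanded(void) { return STR_EXPANDED(compiler_name); }
const char *name_raw(void)      { return STR_RAW(compiler_name); }

/* Small helper so callers can compare the results. */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

For the numeric constant the expanded form is what one wants; for the identifier it silently yields the value of an unrelated macro.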
This commit was SVN r29613.
The following SVN revision numbers were found above:
r29608 --> open-mpi/ompi@b71bd51cdd
This line results in a compile error when you configure thusly:
./configure CC=icc CXX=icpc FC=ifort FCFLAGS=-i8
cmr=v1.7.4:reviewer=hjelmn:subject=fix Fortran compile with -i8
This commit was SVN r29602.
Without this commit, if you run IMB pingpong between two nodes with only
one usnic selected (e.g., via `--mca btl_usnic_if_include usnic_0`) then
the run will seem fine but will segfault at MPI_Finalize time.
This behavior has happened since Cisco v1.6 git commit ec7ddf8, upstream
trunk r29484, and upstream v1.7 r29507.
Root cause was that the free list element was being used as the recv
buffer instead of the data buffer associated with the element. So the
reassembly code would stomp all over the free list element, which would
cause the destructor to explode when the free list attempted to clean up
all of its elements. This surprisingly did not cause any other problems
until now.
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
This commit was SVN r29593.
The following SVN revision numbers were found above:
r29484 --> open-mpi/ompi@a6ed232a10
r29507 --> open-mpi/ompi@790d269ce8
If we need to use a convertor, go back to stashing that convertor in the
frag and populating segments "on the fly" (in
ompi_btl_usnic_module_progress_sends). Previously we would pack into a
chain of chunk segments at prepare_src time, unnecessarily consuming
additional memory.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
This commit was SVN r29592.
This makes it a little easier to see what's happening with callbacks to
the PML.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
This commit was SVN r29591.
This includes suppressing picky-mode warnings about __VA_ARGS__, which
we know are supported by any compilers we care about.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
This commit was SVN r29590.
Ensure that they never are touched by checking in their destructors.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Reese Faucette <rfaucett@cisco.com>
This commit was SVN r29589.
Imagine that two btls in btl_openib_component_init() both point to the same openib_btl->device and, as a result, share the same openib_btl->device->endpoints array.
The finalization phase calls mca_btl_openib_finalize()->mca_btl_openib_finalize_resources() twice.
mca_btl_openib_finalize_resources() frees the endpoints related to the btl, but the second call of mca_btl_openib_finalize_resources() checks an endpoint that was already released by the previous call.
fixed by Igor, reviewed by miked/vasily
cmr=v1.7.4:reviewer=ompi-gk1.7
This commit was SVN r29563.
This commit moves all the module stats into their own struct so that
the stats only need to appear as a single line in the module_t
definition, and then moves all the logic for reporting the stats into
btl_usnic_stats.c|h.
Further, the stats are now exported as MPI_T_BIND_NO_OBJECT entities
(i.e., not bound to any particular MPI handle), and are marked as
READONLY and CONTINUOUS. They currently all default to verbose level
5 ("Application tuner / detailed", according to
https://svn.open-mpi.org/trac/ompi/wiki/MCAParamLevels).
Most of the statistics are counters, but a small number are high
watermark values. Due to how counters are reported via MPI_T, none of
the counters are exported through MPI_T if the MCA param
btl_usnic_stats_relative=1 (i.e., the module resets the stats back to
zero at a given frequency).
When MPI_T_pvar_handle_alloc() is invoked on any of these pvars, it
will return a count that is equal to the number of active usnic BTL
modules. The values returned for any given pvar (e.g.,
num_total_sends) are an array containing one value for each active
usnic BTL module. The ordering of values in the array is both
consistent across all usnic pvars and stable throughout a single job:
array slot 0 corresponds to module X, array slot 1 corresponds to
module Y, etc.
Mapping which array slot corresponds to which underlying Linux usnic_X
device works as follows:
* The btl_usnic_devices MPI_T state pvar is associated with a
btl_usnic_device MPI_T enum, and can be obtained via
MPI_T_pvar_get_info().
* If all usNIC pvars are of length N, the values [0,N) in the
btl_usnic_device enum are associated with strings of the
corresponding underlying Linux device.
For example, to look up which Linux device is reported in all usNIC
pvars' array slot 1, look up the int value 1 in the btl_usnic_devices
enum. Its corresponding string value is the underlying Linux device name
(e.g., "usnic_1").
cmr=v1.7.4:subject="usnic BTL MPI_T pvars"
This commit was SVN r29545.
r29479.
This fixes some issues reported awhile ago in the openib btl. There
are a couple more unchecked mallocs but they are a bit more difficult
to fix since they are in void functions (btl_openib_endpoint.c).
Refs trac:2401.
cmr=v1.7.4:reviewer=miked
This commit was SVN r29543.
The following SVN revision numbers were found above:
r29479 --> open-mpi/ompi@d6ead2a3a5
The following Trac tickets were found above:
Ticket 2401 --> https://svn.open-mpi.org/trac/ompi/ticket/2401