Based on current implementation it is faster to use a blocking
send than the non-blocking version. Switch the exchange function
used in the barrier to use the blocking version combined with
the non-blocking version of the receive.
* atomic: add support for __atomic builtins
This commit adds support for the gcc __atomic builtins. The __sync
builtins are deprecated and have been replaced by these atomics. In
addition, the new atomics support atomic exchange which was not
supported by __sync.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
* atomic: add support for transactional memory
This commit adds support for using transactional memory when using
opal atomic locks. This feature is enabled if the __HLE__ feature is
available and the gcc builtin atomics are in use.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The request code was setting the request as pml_complete before
calling MCA_PML_OB1_SEND_REQUEST_MPI_COMPLETE. This was causing
MCA_PML_OB1_SEND_REQUEST_RETURN to be called twice in some cases. The
code now mirrors the recvreq code and only sets the request as pml
complete if the request has not already been freed.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This fixes https://github.com/open-mpi/ompi/issues/1732: i.e., the
case where the outer project has its own check for
<valgrind/valgrind.h>, but also supplements CPPFLAGS (to find
Valgrind's header files) before doing that check.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Ideally, we would tell OMPI to disable autoconf's caching of our
valgrind check result so that its check gets the right result after
adding CPPFLAGS. Not sure if we can do that.
For now, just disable our Valgrind code in embedded mode.
This will keep the x86 backend enabled under Valgrind but
it will auto-disable itself when finding identical APIC ids anyway
(because CPUID returns same outputs for all PUs).
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Fixesopen-mpi/ompi#1732
(cherry picked from commit open-mpi/hwloc@8b44fb1c81)
Add the following to zsh shell completion:
* --get-stack-traces
* --report-state-upon-timeout
* --timeout
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Add descriptions for the new --report-state-on-timeout and
--get-stack-traces options.
Also add --timeout, and cross-reference MPIEXEC_TIMEOUT with it.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Remove all @open-mpi-git-mirror entries; those are no longer necessary
since the official migration to Git/Github.
Add aliases for @users.noreply.github.com addresses.
Add fixes for what look like accidental name mispellings /
common-name-isms.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit fixes a programming error when using an aries nic. The
documentation of ugni shows that only the local alignment restriction
for get was lifted on aries. There is still a remote address alignment
restriction.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Note that this cannot be used for MPI performance testing. It is really only useful for ORTE scaling tests. It also only works with the rsh/ssh launcher.
If requested, obtain stacktraces for each application process and report it to stderr upon timeout
stack traces: minor improvements
- Also include the hostname and PID of the each process for which
we're sending the stack traces (vs. just including the ORTE process
name)
- Send a specific error message if we couldn't find "gstack" in the
$PATH (e.g., on OS X)
- Send a sepcific error message if gstack fails to run
- Print a message that obtaining the stack traces may take a few
seconds so that users don't wonder what's happening
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
help-orterun.txt: minor tweaks
Trivial update: show "--timeout" (instead of "-timeout") in the help
message, just to encourage the use of double-dash options.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
trivial: stacktrace -> stack trace
Trivial word smything.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit fixes two bugs in MPI_Wait_any:
- If all requests are inactive then the sync wait would hang forever
because no requests are attached to the sync.
- The request pointer was pointing to the request before the completed
request which caused the wrong request to be freed or marked inactive.
MPI_Wait_some had a similar issue if all the requests were pending.
These issues were identified by MTT.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a programming error when using an aries nic. The
documentation of ugni shows that only the local alignment restriction
for get was lifted on aries. There is still a remote address alignment
restriction.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds support for Cray Aries atomic operations. This
includes 32-bit and floating point support.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit add support for more atomic operations and type. The
operations added are logical and, logical or, logical xor, swap, min,
and max. New types are 32-bit int by using the
MCA_BTL_ATOMIC_FLAG_32BIT flag, 64-bit float by using the
MCA_BTL_ATOMIC_FLAG_FLOAT flag, and 32-bit float by using both
flags. Floating point numbers are supported by packing the number in
as an int64_t or int32_t. We will update the btl interface in the
future to make this less confusing.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>