* Fix typo in the `opal_atomic_wmb` declaration.
* Fix lingering `eieio` reference in the XL assembly to be `lwsync`
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
Sometimes, the ethernet interfaces can get quite high kernel indices. struct
ifreq (see netdevice(7)) defines ifr_ifindex to be int's. The OOB component
used int16_t internally for matching (in case of -mca oob_tcp_if_[in|ex]clude)
which meant that any interface index > 32767 would never be matched because the
integer would be truncated to int16_t upon return from the function. OOB would
then refuse to work because it didn't find any usable interfaces and MPI job
would abort.
Signed-off-by: Wojtek Wasko <wwasko@nvidia.com>
so the bool type is defined when using old compilers that do not support gcc builtin atomics (such as gcc 4.1.x from CentOS 5)
Fixesopen-mpi/ompi#4478
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preperation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds additional atomics math operations that are needed
throughout the codebase. The semantics of the new operations are
consistent with the existing atomics (op then fetch).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The problem is that the waiting thread is cycling using OMPI_LAZY_WAIT_FOR_COMPLETION so it can exercise opal_progress. This probably isn't as critical for the modex step, but definitely necessary for the barrier at the end of mpi_init. The problem this creates is that the lazy macro exits as soon as "active" becomes false, and then we destruct the lock.
However, wakeup_thread sets "active" to false - and then calls the condition broadcast to wakeup any waiting threads. So there is a race condition between that broadcast and the lock destruct.
Add OPAL_ACQUIRE_OBJECT and OPAL_POST_OBJECT memory barriers to help protect against thread race conditions on some platforms
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Search external libevent library in both DIR/lib64 and DIR/lib
when --with-libevent=DIR is specified but --with-libevent-libdir=DIR is not
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Fix an apparent typo in external libevent configury
Require external libevent for install of separate libpmix
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
gcc 7.[1,2] (at least) fails to correctly parse the OSX 10.13 sys/syslog.h
header. As a results we need to potect syslog support in OPAL, PMIX and
ORTE.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Turns out UCX PML calls opal_pmix.fence in its del procs
method without checking whether or not the fence method
for the pmix component was defined. Rather than patch
UCX PML, actually define a fence method for the cray pmix.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
We no longer officially support MIPS or ARM before v6. This commit
updates the configury to check for sync builtins on these
architectures and removes the MIPS and IA64 assembly from
opal/include/opal/sys.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Give packagers a configure CLI option to set the value of the MCA
variable mca_base_component_show_load_errors.
The --disable form of this option is intended for Open MPI packagers
who tend to enable support for many different types of networks and
systems in their packages. For example, consider a packager who
includes support for both the FOO and BAR networks in their Open MPI
package, both of which require support libraries (libFOO.so and
libBAR.so). If an end user only has BAR hardware, they likely only
have libBAR.so available on their systems -- not libFOO.so. Disabling
load errors by default will prevent the user from seeing potentially
confusing warnings about the FOO components failing to load because
libFOO.so is not available on their systems.
Conversely, system administrators tend to build an Open MPI that is
targeted at their specific environment, and contains few (if any)
components that are not needed. In such cases, they might want their
users to be warned that the FOO network components failed to load
(e.g., if libFOO.so was mistakenly unavailable), because Open MPI may
otherwise silently failover to a slower network path for MPI traffic.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Debounce "unreachable" notifications for tools when they disconnect
Enable the -x cmd line option for prun
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 0a5b36180a22959654461ac1303cec35313f8b4a)
Remove some build product. Tell PMIx that we don't need a new nspace generated when OMPI calls connect
Add missing Makefile
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Their is racing condition in TCP connection establishment
during simultaneous handshake. This PR handles the fix for
it.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>