if PMIx (version > 1.x) is active since all diagnostic messages will instead flow thru
the PMIx connection. Unfortunately, PMIx v1 does not support this
feature, but we can remove the stddiag support once PMIx v1 slides out
of the support window
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preperation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The problem is that the waiting thread is cycling using OMPI_LAZY_WAIT_FOR_COMPLETION so it can exercise opal_progress. This probably isn't as critical for the modex step, but definitely necessary for the barrier at the end of mpi_init. The problem this creates is that the lazy macro exits as soon as "active" becomes false, and then we destruct the lock.
However, wakeup_thread sets "active" to false - and then calls the condition broadcast to wakeup any waiting threads. So there is a race condition between that broadcast and the lock destruct.
Add OPAL_ACQUIRE_OBJECT and OPAL_POST_OBJECT memory barriers to help protect against thread race conditions on some platforms
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Search external libevent library in both DIR/lib64 and DIR/lib
when --with-libevent=DIR is specified but --with-libevent-libdir=DIR is not
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Fix an apparent typo in external libevent configury
Require external libevent for install of separate libpmix
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
gcc 7.[1,2] (at least) fails to correctly parse the OSX 10.13 sys/syslog.h
header. As a results we need to potect syslog support in OPAL, PMIX and
ORTE.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Turns out UCX PML calls opal_pmix.fence in its del procs
method without checking whether or not the fence method
for the pmix component was defined. Rather than patch
UCX PML, actually define a fence method for the cray pmix.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Give packagers a configure CLI option to set the value of the MCA
variable mca_base_component_show_load_errors.
The --disable form of this option is intended for Open MPI packagers
who tend to enable support for many different types of networks and
systems in their packages. For example, consider a packager who
includes support for both the FOO and BAR networks in their Open MPI
package, both of which require support libraries (libFOO.so and
libBAR.so). If an end user only has BAR hardware, they likely only
have libBAR.so available on their systems -- not libFOO.so. Disabling
load errors by default will prevent the user from seeing potentially
confusing warnings about the FOO components failing to load because
libFOO.so is not available on their systems.
Conversely, system administrators tend to build an Open MPI that is
targeted at their specific environment, and contains few (if any)
components that are not needed. In such cases, they might want their
users to be warned that the FOO network components failed to load
(e.g., if libFOO.so was mistakenly unavailable), because Open MPI may
otherwise silently failover to a slower network path for MPI traffic.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Debounce "unreachable" notifications for tools when they disconnect
Enable the -x cmd line option for prun
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 0a5b36180a22959654461ac1303cec35313f8b4a)
Remove some build product. Tell PMIx that we don't need a new nspace generated when OMPI calls connect
Add missing Makefile
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Their is racing condition in TCP connection establishment
during simultaneous handshake. This PR handles the fix for
it.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
Make sure hostnames are null terminated, even when they were
too long to fit in the hostname buffer.
Fixes: CID 1418232
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Be a little more deliberate about convering OMPI's --with-cuda CLI
value to hwloc's --enable-cuda configure option.
Also, unconditionally disable hwloc NVML support (because Open MPI is
not currently using it).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
purpose. Continue to link the new library back to libopen-pal to resolve the renamed symbols.
Update opal configure logic to set disable_dlopen when disable_mca_dso is given. Fix typos in disable_dlopen when setting variables (incorrect inclusion of quotes)
Signed-off-by: Ralph Castain <rhc@open-mpi.org>