Fix the test that determined whether we output "writeable" or
"read-only" for MCA vars (it was checking the wrong flag).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 176da51aec0955a51f21157b33b21f60b6f28092)
The important part of this fix is a couple places 5 was hard-coded that needed to be
strlen(OPAL_INFO_SAVE_PREFIX).
But also this contains a fix for a gcc 7.3.0 compiler warning about snprintf(). There
was an "if" statement making sure all the arguments had appropriate strlen(), but gcc
still complained about the following snprintf() because the size of the struct element
is iterator->ie_key[OPAL_MAX_INFO_KEY + 1].
Signed-off-by: Mark Allen <markalle@us.ibm.com>
libevent does not support multiple threads calling the event loop on
the same event base. This causes external libevent's to print out
re-entrant warning messages. This commit fixes the issue by protecting
the call to the event loop with an atomic swap check.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Remove this component pending re-architecture of the overall OFI
components. We have had similar issues before when multiple components
use the same library - typical issues are race conditions, initialize
and finalize errors, etc. We are seeing similar problems here as we get
broader exposure to different library version and environment
combinations.
The correct fix in the past has been to centralize the library
interactions in a "common" component. We will pursue that here by moving
some additional functions (e.g., endpoint creation) into the existing
opal/mca/common/ofi component. We can't do that and thoroughly test it
in time for the v4.0.0 release, so we'll simply remove this component
from the release.
Once we have things correctly fixed, we'll submit a PR to restore the
component plus the related fixes to some future v4.x release.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Disable async receive for CUDA under OpenIB. While a performance
optimization, it also causes incorrect results for transfers
larger than the GPUDirect RDMA limit. This change has been validated
and approved by Akshay.
References #3972
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 9344afd485d23c41a18ab57d6df8550826c24b9e)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
KNC is effectively dead. Remove corresponding SCIF
support in Open MPI.
cherry pick of PR #5737
+
news update
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit b9ac3d8931c38661132544d59be085a50df01420)
Adds device ids of different Broadcom adapters from
BCM57XXX and BCM58XXX family of HCAs.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
(cherry-picked from a53a6f7650035d1644018fd09647c5e4db6b0694)
To ensure fast box entries are complete when processed by the
receiving process the tag must be written last. This includes a zero
header for the next fast box entry (in some cases). This commit fixes
two instances where the tag was written too early. In one case, on
32-bit systems it is possible for the tag part of the header to be
written before the size. The second instance is an ordering issue. The
zero header was being written after the fastbox header.
Fixes#5375, #5638
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 850fbff441756b2f9cde1007ead3e37ce22599c2)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit updates the patcher component to either use the
__clear_cache intrinsic or the correct assembly to flush the
instruction cache.
Fixes#5631
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 1cdbceb0951c30a04b730b78f31d07543d8c3d2a)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Since openib is on its long, slow way out the door, don't let it
complain about not being able to find any NICs at run time.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 098ec55e37261aee4b9672ec393726c2510247a1)
This commit fixes a bug when using the UCT btl with the UCX memory
hooks disabled. We were misssing a call to
opal_mem_hooks_unregister_release to remove the btl memory hook
callback.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 36c206d2d616578c97853d1d69727a1d6e165c1e)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
If someone specifies --with-verbs-usnic, actually do a configury check
to ensure that it will compile (vs. assuming that it will compile if
someone asks for it).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 05e5f61fe1c961927eae5bb5c0eb2021ac99afa6)
- added synonim to common ucx variables to allow
to print it in opal_info -a
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit e00f7a68ba0b1012f954910e39b26f6075f3d006)
Since version hwloc 2.0.0 has a new organization of NUMA nodes on the
topology tree. This commit adds the detection of local NUMA object for
hwloc => 2.0.0, which fixes the procs bindings policy for rmaps mindist
component.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
(cherry picked from commit e5291ccc34a621295a9d3fc3b1d470a4e4c790e2)
The write memory barrier was intended to precede setting a fast-box
header but instead follows it. This commit moves the memory barrier to
the intended location.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit dca3516765a4b5927b1877ca59d952baec42bc4a)
The Autoconf AC_CONFIG_* macros can only be instantiated exacly once
for any given file, *and* they must be in a code execution path at run
time for the target file to be generated at the end of configure.
For example, if you want to generate file ABC at the end of configure,
you must invoke the AC_CONFIG_FILES(ABC) macro in a code path that
will get executed when configure is run.
That's pretty straightforward.
What's not straightforward is two corner cases:
1. You cannot invoke the AC_CONFIG_FILES(ABC) macro for the same file
more than once. If you do, autoreconf will fail (even before you
can run configure).
2. If AC_CONFIG_FILES(ABC) is not in a code path that is executed by
configure, the file ABC is not registered properly, and ABC will
not be generated at the end of configure.
This applies to hwloc because hwloc's HWLOC_SETUP_CORE macro calls
both AC_CONFIG_FILES and AC_CONFIG_HEADER to setup its Makefiles
(etc.) so that targets like "make distclean" and "make distcheck" will
work properly. Hence, we *have* to invoke HWLOC_SETUP_CORE.
However, the MCA_opal_hwloc_hwloc201_CONFIG macro has a few side
effects. It would be nice to do able to do something like this:
```
if hwloc:extern is going to be used:
Invoke minimal HWLOC_SETUP_CORE (with no side effects)
else
Invoke full HWLOC_SETUP_CORE (with side effects)
fi
```
But we can't, because autoreconf will detect that AC_CONFIG_FILES has
been invoked on the same files more than once (regardless of whether
those code paths will be executed at run time or not). Kaboom.
Similarly, we can't do this:
```
if hwloc:extern is not going to be used:
Invoke full HWLOC_SETUP_CORE (with side effects)
fi
```
Because then hwloc's AC_CONFIG_FILES won't be registered properly when
hwloc:external *is* used (i.e., when the HWLOC_SETUP_CORE macro is not
in a code path that is executed at run time), and targets like "make
distclean" will fail because hwloc's Makefiles won't have been setup.
Kaboom.
But remember that the hwloc framework is a bit special: there will
only ever be 2 comoponents: external and internal. External is
guaranteed to be configured first because of its priority. So the
internal component (i.e., this component) immediately knows if it is
going to be used or not based on whether the external component
configuration succeeded or failed.
Specifically: regardless of whether the internal component (i.e., this
component) is going to be used, we have to invoke HWLOC_SETUP_CORE.
But we can manage the side effects: allow the side effects when
this/internal component is going to be used, and avoid the side
effects when this/internal component is not going to be used.
This is a little less clean than I would have liked, but because of
Autoconf's oddity about its AC_CONFIG_* macros, this is the only
solution I could come up with.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 01e4570af759b113b965b63df7bfc72a78d69654)
In order to make "make distclean" (and friends) work, we need to
*always* invoke the embedded configure script -- even if we know that
we're not going to use this component.
But in cases where we know we're not going to use this component, we
also need to avoid the side effects of the code path that is used when
we *do* want to use this component. So split the two possibilities
into two different macros:
1. MCA_opal_event_libevent2022_FAKE_CONFIG: which does almost nothing
except invoke the underlying "configure" script.
2. MCA_opal_event_libevent2022_REAL_CONFIG: which does all the real
work (including invoking the underlying "configure" script).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 69aa46e1676c00bb41e54b95bbcf1df3b00dc9c1)