PMIx registration callback functions are used when registering a PMIx
event handler.
This patch adjusts two such callback functions:
model_registration_callback()
in ompi/interlib/interlib.c and
ompi_errhandler_registration_callback()
in ompi/errhandler/errhandler.c
Both of them employ the following code structure:
    static void xxx_callback(int status,
                             size_t errhandler_ref,
                             void *cbdata)
    {
        myreg_t *trk = (myreg_t*)cbdata;
        trk->status = status;
        interlibhandler_id = errhandler_ref;
        trk->active = false;
    }
The workflow is:
1. The caller calls opal_pmix.register_evhandler() with the
   callback function as an argument.
2. The caller calls OMPI_LAZY_WAIT_FOR_COMPLETION(trk.active)
   to wait for trk->active to become false.
3. PMIx performs the registration on another thread, then calls the
   registration callback function, which sets trk->active
   to false.
4. The caller checks trk->status to determine whether registration
   succeeded.
The expected behavior of the registration callback functions therefore
is that trk->status is updated first, then trk->active is set to false.
However, on ARM-based systems, this ordering is not guaranteed
because ARM uses a relaxed memory model: the two stores may become
visible to the waiting thread in either order.
To address this issue, this patch adds a call to opal_atomic_wmb()
(write memory barrier) after trk->status is set, so that the store to
trk->status is visible before the store to trk->active.
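The ordering requirement can be illustrated with a minimal, standalone
sketch. It does not use the Open MPI or PMIx APIs: demo_reg_t,
demo_registration_callback(), demo_worker() and demo_register_and_wait()
are hypothetical names, a C11 release fence stands in for
opal_atomic_wmb(), and a plain spin loop stands in for
OMPI_LAZY_WAIT_FOR_COMPLETION().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tracker, modeled on the myreg_t usage described above. */
typedef struct {
    int status;
    size_t handler_id;
    atomic_bool active;
} demo_reg_t;

/* Callback run on the "registration" thread. */
static void demo_registration_callback(int status, size_t ref, void *cbdata)
{
    demo_reg_t *trk = (demo_reg_t *)cbdata;
    trk->status = status;
    trk->handler_id = ref;
    /* Assumption: this release fence plays the role of opal_atomic_wmb().
     * It keeps the stores above from being reordered after the store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&trk->active, false, memory_order_relaxed);
}

/* Stand-in for the thread that performs the registration.
 * The status 0 and handler id 42 are arbitrary demo values. */
static void *demo_worker(void *arg)
{
    demo_registration_callback(0, 42, arg);
    return NULL;
}

/* Caller side: start the registration, wait for completion, read status. */
static int demo_register_and_wait(demo_reg_t *trk)
{
    pthread_t tid;
    int status;

    trk->status = -1;
    trk->handler_id = 0;
    atomic_init(&trk->active, true);

    if (pthread_create(&tid, NULL, demo_worker, trk) != 0) {
        return -1;
    }
    /* The acquire load pairs with the release fence in the callback, so
     * once active reads false, status and handler_id are up to date. */
    while (atomic_load_explicit(&trk->active, memory_order_acquire)) {
        /* spin */
    }
    status = trk->status;
    pthread_join(tid, NULL);
    return status;
}
```

Without the fence, a weakly ordered CPU could make the store to active
visible first, and the waiter could read a stale status.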
Signed-off-by: Wei Zhang <wzam@amazon.com>
Open MPI doesn't support any transports on MacOS which require
memory manager hooks. The memory patcher component uses the
syscall interface, which has been deprecated in recent versions
of MacOS. Since we don't need it and it emits warnings about
deprecation, disable the memory patcher component on MacOS.
Fixes #5671
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 19e16d5fd0e3bc148b47d957b9b84a425c87777c)
Remove the pack/unpack pragma around net/if.h on MacOS, which
was added to fix a bug in MacOS X 10.4.x on 64-bit platforms.
The bug was fixed in Mac OS X 10.5.0 and, sometime in the last
11 years, compilers started emitting warnings about the fact
that the Apple header stomped over the pragma pack settings
from the workaround. We already don't support versions of MacOS
earlier than 10.5, so there's no point in keeping the workaround.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit a25df3f29e213c5ef094d66082b0e07e9d5a0759)
Add some "const"s that needed to be applied here on the v4.1.x branch,
effectively by cherry-picking part of b65ec273074 from master.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Exclude HAN, don't include it.
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit 33105b031bbc821a6c5d816c4801d62072347f9c)
Fixes #8195. This PR doesn't fix all the warnings from #8195, but
fixes many of them (e.g., I didn't get the "string might be truncated"
warnings on my Mac).
This is an adaptation of 14aa5fae3c42f14a1c6a259dede93d5ca7ecb82c from
master; it drops some things that aren't relevant here on the v4.1.x
branch and adds a few more warnings fixes that are relevant here on
v4.1.x that aren't relevant on master.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry-picked from 14aa5fae3c42f14a1c6a259dede93d5ca7ecb82c)
Thanks FX Coudert for reporting this issue and pointing
to a solution.
Refs. open-mpi/ompi#8218
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(back-ported from commit open-mpi/ompi@3f45ceda1b)
This is a one-off commit for the release branches that fixes
some typos introduced when backporting
open-mpi/ompi@35e7d86eb1
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
- fix path to getdate.sh
- do not prepend "date" to the revision
- support git worktree
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 930d3c469551eaa4d30b6105226018e0392152d7)
This has been shown to be more effective in achieving overlap
of inter- and intra-node communication and reduces the initial
delay before hitting the network.
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit 1cdc85564ed6c771f301c63d6bc6d8c1c8cf4a4c)
Also make coll/tuned the default for shared memory communication
as coll/sm has shown performance issues that need investigation.
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit 971d58c52454a6edecdbb1a44ebd037a86e69a69)
The selectable list is sorted with lowest to highest priority so the
user-defined preferences should be appended to the list.
The preference treatment should also maintain the order provided by the user
(first item has highest priority) so switch the loop order.
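A minimal sketch of the ordering rule described above, not the actual
component-selection code: demo_list_t, demo_find(), demo_move_to_end()
and demo_apply_preferences() are hypothetical, and the list is a plain
array of names sorted from lowest to highest priority.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX 16

/* Hypothetical selectable list: index 0 is lowest priority,
 * index count-1 is highest priority. */
typedef struct {
    const char *names[DEMO_MAX];
    size_t count;
} demo_list_t;

/* Return the index of name in the list, or -1 if it is not present. */
static int demo_find(const demo_list_t *l, const char *name)
{
    for (size_t i = 0; i < l->count; i++) {
        if (0 == strcmp(l->names[i], name)) {
            return (int)i;
        }
    }
    return -1;
}

/* Move the entry at idx to the end of the list (highest priority). */
static void demo_move_to_end(demo_list_t *l, size_t idx)
{
    const char *name = l->names[idx];
    assert(idx < l->count);
    for (size_t i = idx; i + 1 < l->count; i++) {
        l->names[i] = l->names[i + 1];
    }
    l->names[l->count - 1] = name;
}

/* Append user preferences. Walking prefs from last to first means the
 * first preference is appended last and so ends up with the highest
 * priority, preserving the user's order. */
static void demo_apply_preferences(demo_list_t *l,
                                   const char *const prefs[], size_t nprefs)
{
    for (size_t i = nprefs; i-- > 0; ) {
        int idx = demo_find(l, prefs[i]);
        if (idx >= 0) {
            demo_move_to_end(l, (size_t)idx);
        }
    }
}
```

For a list of basic, sm, tuned, han and the preferences "tuned, sm", the
result is basic, han, sm, tuned: tuned is last and therefore selected first.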
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit dd54af94508dc9ccee3e589276a9ede62fc8e409)
There are no manpages in v3.2.
Port of https://github.com/openpmix/openpmix/pull/1930
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 7b11693783429c43cb30475e4b54e691bf79529c)
The total size depends on number of ranks so the usual ranges don't work.
Thus, use the average across all ranks to make a decision.
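As a rough illustration of deciding on the per-rank average rather than
the total, here is a minimal sketch. The function names, the two-way
algorithm choice and the 1024-byte threshold are all hypothetical and are
not taken from the actual decision code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm choices and cutoff, for illustration only. */
typedef enum { DEMO_ALG_SMALL, DEMO_ALG_LARGE } demo_alg_t;
#define DEMO_AVG_THRESHOLD_BYTES 1024

/* Average number of bytes contributed per rank. */
static size_t demo_average_bytes(const int counts[], int nranks, size_t dsize)
{
    size_t total = 0;
    assert(nranks > 0);
    for (int i = 0; i < nranks; i++) {
        total += (size_t)counts[i] * dsize;
    }
    return total / (size_t)nranks;
}

/* Decide on the average, which does not grow with the number of ranks
 * the way the total size does. */
static demo_alg_t demo_choose(const int counts[], int nranks, size_t dsize)
{
    size_t avg = demo_average_bytes(counts, nranks, dsize);
    return (avg < DEMO_AVG_THRESHOLD_BYTES) ? DEMO_ALG_SMALL : DEMO_ALG_LARGE;
}
```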
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit f670364d764bf7409e03860bf539a0a2884ffab3)
MPI_Ialltoallw() and friends take a const MPI_Datatype types[] argument.
In order to be able to call OBJ_RELEASE(types[0]), we used to simply
drop the const modifier. This change does it properly by introducing the
OBJ_RELEASE_NO_NULLIFY(object) macro, which no longer sets object = NULL
when the object is freed.
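The difference between the two release flavors can be sketched with a
toy reference-counted type. This is not the OPAL object system:
demo_obj_t, DEMO_RELEASE, DEMO_RELEASE_NO_NULLIFY, demo_frees and
demo_release_first() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object. */
typedef struct {
    int refcount;
} demo_obj_t;

/* Counts how many objects have been freed, for demonstration. */
static int demo_frees = 0;

static demo_obj_t *demo_obj_new(void)
{
    demo_obj_t *obj = malloc(sizeof(*obj));
    assert(obj != NULL);
    obj->refcount = 1;
    return obj;
}

/* Release that also clears the caller's pointer. It assigns to obj,
 * so it cannot be applied to a const-qualified pointer. */
#define DEMO_RELEASE(obj)                     \
    do {                                      \
        if (0 == --(obj)->refcount) {         \
            free(obj);                        \
            demo_frees++;                     \
            (obj) = NULL;                     \
        }                                     \
    } while (0)

/* Release that never assigns to obj, so it also works on elements
 * of an array of const pointers. */
#define DEMO_RELEASE_NO_NULLIFY(obj)          \
    do {                                      \
        if (0 == --(obj)->refcount) {         \
            free(obj);                        \
            demo_frees++;                     \
        }                                     \
    } while (0)

/* The array elements are const pointers, as with a
 * const MPI_Datatype types[] argument. */
static void demo_release_first(demo_obj_t *const types[])
{
    DEMO_RELEASE_NO_NULLIFY(types[0]);
}
```

Using DEMO_RELEASE(types[0]) inside demo_release_first() would not compile
without casting away const, because the macro assigns to its argument.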
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit c49e5e5c4a6902a7bf462a98c24c9ebd86b4ab63)
Mark the node as "unusable" so it does not get included when computing
the number of procs for the case where the user does not specify -np.
Signed-off-by: Ralph Castain <rhc@pmix.org>
This commit removes the unnecessary call to `fi_getinfo()` when
initializing the MTL. `cq_data_size` is a domain attribute that will be
available to the MTL from the initial query itself. FI_DIRECTED_RECV is
a primary capability that has to be requested for a provider to enable
it, so add it to the initial requirements. The redundant query was
also overwriting the contents of the prov object, which already had the
include/exclude filtering and multi-NIC logic applied to it.
Signed-off-by: Raghu Raja <craghun@amazon.com>
(cherry picked from commit 6233dea68d8495e20746c8e8d8af8d9c03a20206)
These selections seem harmful in my measurements and don't seem to be
motivated by previous measurement data.
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit a15e5dc7f042f21f8adc08453b13bc7210bf2bac)