This commit syncs the ompio-related directories in v4.1.x to master. The effort to bring over the Lustre performance fixes and the support for the external32 data representation was too overwhelming when dealing with every single PR individually.
A few minor modifications had to be made for the sync:
- v4.1.x does not have opal/mutex.h
- v4.1.x does not have opal_atomic_int32_t datatype
- the io module structure has two fewer function pointers (related to info_set/get) compared to the version on master.
Tested so far with the ompio testsuite as well as hdf5-1.10.5 testsuite (testphdf5, t_shapesame, t_bigio) on an XFS file system.
More tests on Lustre and BeeGFS to follow.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Assign all CPUs on the node to the daemon
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 7bac7eed6ef423e47fe980b4c32eae36b8e1d4cb)
PMIx registration callback functions are used when registering a PMIx
event handler.
This patch adjusts two such callback functions:
model_registration_callback()
in ompi/interlib/interlib.c and
ompi_errhandler_registration_callback()
in ompi/errhandler/errhandler.c
Both of them employ the following code structure:
static void xxx_callback(int status,
                         size_t errhandler_ref,
                         void *cbdata)
{
    myreg_t *trk = (myreg_t*)cbdata;
    trk->status = status;
    interlibhandler_id = errhandler_ref;
    trk->active = false;
}
The workflow is:
1. The caller calls opal_pmix.register_evhandler() with the
callback function as an argument.
2. The caller calls OMPI_LAZY_WAIT_FOR_COMPLETION(trk.active)
to wait for trk->active to become false.
3. PMIx performs the registration on another thread, then calls the
registration callback function, which sets trk->active
to false.
4. The caller checks trk->status to determine whether the registration
succeeded.
The expected behavior of the registration callback functions therefore
is that trk->status be updated first, then trk->active be set to false.
However, on ARM-based systems, this ordering is not guaranteed
because ARM uses a relaxed memory model: the write to trk->active may
become visible to the waiting thread before the write to trk->status.
To address this issue, this patch adds a call to opal_atomic_wmb()
(write memory barrier) after trk->status is set, which enforces the
expected ordering.
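The ordering requirement can be illustrated with a minimal, self-contained sketch. It is not the Open MPI code: it assumes C11 atomics and POSIX threads, uses atomic_thread_fence(memory_order_release) as a stand-in for opal_atomic_wmb(), and the struct layout and the names registration_thread() and register_and_wait() are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the tracker object described above. */
typedef struct {
    int status;          /* result of the registration */
    atomic_bool active;  /* true while the registration is pending */
} myreg_t;

/* Plays the role of the registration callback running on another thread. */
static void *registration_thread(void *cbdata)
{
    myreg_t *trk = (myreg_t *)cbdata;

    trk->status = 0;                            /* 1. publish the result */
    atomic_thread_fence(memory_order_release);  /* 2. barrier (stand-in for opal_atomic_wmb()) */
    atomic_store_explicit(&trk->active, false,
                          memory_order_relaxed); /* 3. signal completion */
    return NULL;
}

/* Plays the role of the caller: start the registration, wait, read status. */
static int register_and_wait(void)
{
    myreg_t trk;
    pthread_t tid;
    int status;

    trk.status = -1;
    atomic_init(&trk.active, true);

    if (pthread_create(&tid, NULL, registration_thread, &trk) != 0) {
        return -1;
    }

    /* Spin until the callback clears the flag. */
    while (atomic_load_explicit(&trk.active, memory_order_relaxed)) {
        /* busy wait */
    }
    /* Pairs with the release fence so the status write is visible here. */
    atomic_thread_fence(memory_order_acquire);
    status = trk.status;

    pthread_join(tid, NULL);
    return status;
}
```

Without the fence in step 2, a weakly ordered CPU is free to make the flag update visible before the status update, so the waiter could read a stale status.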
Signed-off-by: Wei Zhang <wzam@amazon.com>
Open MPI doesn't support any transports on MacOS which require
memory manager hooks. The memory patcher component uses the
syscall interface, which has been deprecated in recent versions
of MacOS. Since we don't need it and it emits warnings about
deprecation, disable the memory patcher component on MacOS.
Fixes #5671
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 19e16d5fd0e3bc148b47d957b9b84a425c87777c)
Remove the pack/unpack pragma around net/if.h on MacOS, which
was added to fix a bug in MacOS X 10.4.x on 64-bit platforms.
The bug was fixed in Mac OS X 10.5.0 and, sometime in the last
11 years, compilers started emitting warnings about the fact
that the Apple header stomped over the pragma pack settings
from the workaround. We already don't support versions of MacOS
earlier than 10.5, so there's no point in keeping the workaround.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit a25df3f29e213c5ef094d66082b0e07e9d5a0759)
Add some "const"s that needed to be applied here on the v4.1.x branch,
effectively by cherry-picking part of b65ec273074 from master.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Exclude HAN, don't include it.
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit 33105b031bbc821a6c5d816c4801d62072347f9c)
Fixes #8195. This PR doesn't fix all the warnings from #8195, but
fixes many of them (e.g., I didn't get the "string might be truncated"
warnings on my Mac).
This is an adaptation of 14aa5fae3c42f14a1c6a259dede93d5ca7ecb82c from
master; it drops some things that aren't relevant here on the v4.1.x
branch and adds a few more warnings fixes that are relevant here on
v4.1.x that aren't relevant on master.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry-picked from 14aa5fae3c42f14a1c6a259dede93d5ca7ecb82c)
Thanks FX Coudert for reporting this issue and pointing
to a solution.
Refs. open-mpi/ompi#8218
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(back-ported from commit open-mpi/ompi@3f45ceda1b)
This is a one-off commit for the release branches that fixes
some typos introduced when backporting
open-mpi/ompi@35e7d86eb1
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
- fix path to getdate.sh
- do not prepend "date" to the revision
- support git worktree
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 930d3c469551eaa4d30b6105226018e0392152d7)
This has been shown to be more effective in achieving overlap
of inter- and intra-node communication and reduces the initial
delay before hitting the network.
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit 1cdc85564ed6c771f301c63d6bc6d8c1c8cf4a4c)
Also make coll/tuned the default for shared memory communication
as coll/sm has shown performance issues that need investigation.
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit 971d58c52454a6edecdbb1a44ebd037a86e69a69)
The selectable list is sorted with lowest to highest priority so the
user-defined preferences should be appended to the list.
The preference treatment should also maintain the order provided by the user
(first item has highest priority) so switch the loop order.
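The effect of the reversed loop can be sketched as follows. This is an illustration only, not the code touched by this commit: the list is modeled as a plain array of component names sorted from lowest to highest priority, and apply_preferences() is a hypothetical helper.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: `list` holds n component names sorted from lowest
 * to highest priority. Each name found in `prefs` is moved to the tail.
 */
static void apply_preferences(const char **list, size_t n,
                              const char *const *prefs, size_t nprefs)
{
    /* Walk the preferences from last to first, so that the first
     * preference is appended last and ends up with the highest priority. */
    for (size_t p = nprefs; p-- > 0; ) {
        for (size_t i = 0; i < n; ++i) {
            if (strcmp(list[i], prefs[p]) == 0) {
                const char *hit = list[i];
                /* Close the gap, then append the match at the tail. */
                memmove(&list[i], &list[i + 1],
                        (n - i - 1) * sizeof(list[0]));
                list[n - 1] = hit;
                break;
            }
        }
    }
}
```

With a list of basic, tuned, sm, han and the preferences sm, basic, the resulting order is tuned, han, basic, sm: the first preference is last in the list and therefore has the highest priority.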
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit dd54af94508dc9ccee3e589276a9ede62fc8e409)
There are no manpages in v3.2.
Port of https://github.com/openpmix/openpmix/pull/1930
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 7b11693783429c43cb30475e4b54e691bf79529c)
The total size depends on number of ranks so the usual ranges don't work.
Thus, use the average across all ranks to make a decision.
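As a rough sketch of the idea, assuming a per-rank array of element counts and a fixed datatype size: the function names and the threshold parameter below are hypothetical and do not come from the commit.

```c
#include <stddef.h>

/* Hypothetical helper: average number of bytes contributed per rank. */
static size_t average_bytes_per_rank(const int *counts, int comm_size,
                                     size_t type_size)
{
    size_t total = 0;

    if (comm_size <= 0) {
        return 0;
    }
    for (int i = 0; i < comm_size; ++i) {
        total += (size_t)counts[i] * type_size;
    }
    /* The total grows with the number of ranks; the average does not. */
    return total / (size_t)comm_size;
}

/* Hypothetical decision: compare the average, not the total, to a cutoff. */
static int use_large_message_algorithm(const int *counts, int comm_size,
                                       size_t type_size, size_t threshold)
{
    return average_bytes_per_rank(counts, comm_size, type_size) >= threshold;
}
```

Comparing the average against the cutoff keeps the decision independent of the communicator size, which the total size is not.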
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
(cherry picked from commit f670364d764bf7409e03860bf539a0a2884ffab3)