This option is problematic, and has never worked in an Open MPI v4.0.x
release tarball. Given that PMIx is now available elsewhere, it isn't
worth fixing this option.
See https://github.com/open-mpi/ompi/issues/6228 for more detail.
NOTE: This is a v4.0.x-specific commit because this option no longer
exists on master because we deleted the entire pmix3x component.
Hence, it's not possible to cherry-pick anything from master back to
the v4.0.x branch.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 447b14061880e218371f9eb0cbe427b8358d45b8)
1. Remove debug output in iallgather (I had forgotten to remove it).
2. Remove an incorrect comment in the description of ibcast.
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 64abd0f405)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
An implementation of R. Rabenseifner's algorithm for MPI_Iallreduce.
This algorithm is a combination of a reduce-scatter implemented with recursive vector halving
and recursive distance doubling, followed by an allgather.
Limitations:
-- count >= 2^{\floor{\log_2 p}}
-- commutative operations only
-- intra-communicators only
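
The count limitation presumably comes from the halving steps: the vector is split across 2^{\floor{\log_2 p}} ranks, so each of them needs at least one element. A minimal sketch of an applicability check under that assumption follows. The names pof2_floor and rabenseifner_applicable are hypothetical and are not taken from the libnbc source.

```c
#include <assert.h>

/* Largest power of two that is <= p, i.e. 2^{floor(log2 p)}.
 * Hypothetical helper; requires p >= 1. */
int pof2_floor(int p)
{
    assert(p >= 1);
    int v = 1;
    while (v <= p / 2) {
        v <<= 1;
    }
    return v;
}

/* Returns nonzero when all three limitations listed above hold.
 * The flags are plain ints supplied by the caller (illustrative only). */
int rabenseifner_applicable(int count, int comm_size,
                            int op_is_commutative, int comm_is_intra)
{
    return op_is_commutative && comm_is_intra &&
           count >= pof2_floor(comm_size);
}
```

For example, with 6 processes the threshold is 4 elements, so a count of 3 would have to fall back to another algorithm.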
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 73e048b62a)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Implements recursive doubling algorithm for MPI_Iallgather.
The algorithm can be used only for power-of-two number of processes.
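
To illustrate the exchange pattern, here is a sequential simulation of recursive doubling (not the libnbc code). It assumes at most MAX_PROCS ranks and only tracks which blocks each rank holds; all names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_PROCS 16

/* Simulates a recursive-doubling allgather over p ranks.
 * have[r][b] is set to 1 when rank r holds block b.
 * Returns the number of exchange steps, or -1 if p is not a
 * power of two (or exceeds MAX_PROCS). */
int allgather_recursive_doubling(int p, unsigned char have[MAX_PROCS][MAX_PROCS])
{
    if (p < 1 || p > MAX_PROCS || (p & (p - 1)) != 0) {
        return -1;
    }
    memset(have, 0, (size_t)MAX_PROCS * MAX_PROCS);
    for (int r = 0; r < p; r++) {
        have[r][r] = 1; /* each rank starts with its own block */
    }

    int steps = 0;
    for (int dist = 1; dist < p; dist <<= 1) {
        unsigned char before[MAX_PROCS][MAX_PROCS];
        memcpy(before, have, sizeof(before));
        for (int r = 0; r < p; r++) {
            int peer = r ^ dist; /* partner at distance dist */
            for (int b = 0; b < p; b++) {
                have[r][b] |= before[peer][b];
            }
        }
        steps++;
    }
    return steps;
}

/* Returns 1 if every rank holds every block, 0 otherwise. */
int all_ranks_complete(int p, unsigned char have[MAX_PROCS][MAX_PROCS])
{
    for (int r = 0; r < p; r++) {
        for (int b = 0; b < p; b++) {
            if (!have[r][b]) {
                return 0;
            }
        }
    }
    return 1;
}
```

With a power-of-two p, every partner r XOR dist is a valid rank, which is why this simple form needs no extra handling for missing peers.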
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit a7386c1e09)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
An implementation of R. Rabenseifner's algorithm for MPI_Ireduce.
This algorithm is a combination of a reduce-scatter implemented with recursive vector halving
and recursive distance doubling, followed by a gather.
Limitations:
-- count >= 2^{\floor{\log_2 p}}
-- commutative operations only
-- intra-communicators only
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 7bd63e79c8)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Remove dead code that was causing warnings about unused static
functions.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 2e24e6ec08)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Implements recursive doubling algorithm for MPI_Iexscan.
The algorithm preserves the order of operations, so it can be used with
both commutative and non-commutative operations.
The MCA parameter 'coll_libnbc_iexscan_algorithm' was added for dynamic
algorithm selection.
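
As an illustration of the order-preserving property, here is a sequential simulation of a recursive-doubling exclusive scan (a sketch, not the libnbc code). It uses string concatenation as a stand-in non-commutative operation, and all names and size limits are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PROCS 16
#define MAX_LEN 64

/* Non-commutative operation: out = left followed by right.
 * Uses a temporary so that out may alias left or right. */
static void concat(char *out, const char *left, const char *right)
{
    char tmp[MAX_LEN];
    snprintf(tmp, sizeof(tmp), "%s%s", left, right);
    strcpy(out, tmp);
}

/* Simulates an exclusive scan over p <= MAX_PROCS ranks.
 * On return, out[r] = in[0] op ... op in[r-1]; out[0] is "". */
void exscan_recursive_doubling(int p, char in[][MAX_LEN], char out[][MAX_LEN])
{
    char partial[MAX_PROCS][MAX_LEN] = {{0}};
    char before[MAX_PROCS][MAX_LEN];
    int have_result[MAX_PROCS] = {0};

    for (int r = 0; r < p; r++) {
        strcpy(partial[r], in[r]);
        out[r][0] = '\0';
    }
    for (int mask = 1; mask < p; mask <<= 1) {
        memcpy(before, partial, sizeof(partial));
        for (int r = 0; r < p; r++) {
            int peer = r ^ mask;
            if (peer >= p) {
                continue; /* no partner in this step */
            }
            if (r > peer) {
                /* Data from lower ranks goes on the left. */
                if (have_result[r]) {
                    concat(out[r], before[peer], out[r]);
                } else {
                    strcpy(out[r], before[peer]);
                    have_result[r] = 1;
                }
                concat(partial[r], before[peer], before[r]);
            } else {
                /* Data from higher ranks goes on the right. */
                concat(partial[r], before[r], before[peer]);
            }
        }
    }
}
```

The operands are always combined with the lower-ranked data on the left, so the result matches a left-to-right sequential scan even though the operation does not commute.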
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit dfe203e167)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Implements recursive doubling algorithm for MPI_Iscan. The algorithm preserves the order of operations, so it can be used with both commutative and non-commutative operations.
The MCA parameter coll_libnbc_iscan_algorithm was added for dynamic algorithm selection.
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 3d43ff0f32)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
GCC 8 identified hb_tree_csearch() as an infinite recursion, and it
turns out that we never call this function anyway. So just remove
it.
Fixes #5670.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 06c1bf73da)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
always initialize 'size'.
Only the a2a_sched_diss() alltoall algorithm is impacted,
and this algo is currently unused, so there is no need
to backport nor update the NEWS file for now.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit ff48e92864)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Calling MPI_Allgatherv with the sendbuf and sendtype parameters set to MPI_IN_PLACE and NULL, respectively, produces a segmentation fault.
The problem is that sendtype is used even when sendbuf is MPI_IN_PLACE, although according to the standard the sendtype and sendcount parameters should be ignored in this case.
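
The shape of the fix can be sketched as follows. This is a self-contained illustration with hypothetical stand-ins (SKETCH_IN_PLACE, sketch_datatype_t) instead of the real MPI types, and it is not the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MPI_IN_PLACE and an MPI datatype. */
static const char sketch_in_place_marker = 0;
#define SKETCH_IN_PLACE ((const void *)&sketch_in_place_marker)

typedef struct {
    size_t extent; /* bytes per element */
} sketch_datatype_t;

/* Returns the size in bytes of the caller's own contribution.
 * In the in-place case, sendcount and sendtype are never read,
 * so sendtype may safely be NULL. */
size_t local_contribution_bytes(const void *sendbuf, int sendcount,
                                const sketch_datatype_t *sendtype,
                                int my_recvcount,
                                const sketch_datatype_t *recvtype)
{
    if (sendbuf == SKETCH_IN_PLACE) {
        return (size_t)my_recvcount * recvtype->extent;
    }
    return (size_t)sendcount * sendtype->extent;
}
```

The point is only the ordering: check for the in-place marker first, and touch sendtype only on the other branch.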
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit b45e190e66)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
also add common verbose variable.
Note the verbosity thing is a little tricky owing to the way the MCA frameworks and components are registered and
initialized. The BTLs are registered/initialized prior to the MTL components even getting registered.
Here's the change in ofi mtl mca parameters. Before commit:
MCA mtl ofi: parameter "mtl_ofi_provider_include" (current value: "psm2", data source: environment, level: 1 user/basic, type: string)
Comma-delimited list of OFI providers that are considered for use (e.g., "psm,psm2"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.
MCA mtl ofi: parameter "mtl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string)
Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.
After commit:
MCA btl ofi: parameter "btl_ofi_provider_include" (current value: "", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_include)
Comma-delimited list of OFI providers that are considered for use (e.g., "psm,psm2"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.
MCA btl ofi: parameter "btl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_exclude)
Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.
MCA mtl ofi: parameter "mtl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_exclude)
Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.
MCA mtl ofi: parameter "mtl_ofi_verbose" (current value: "0", data source: default, level: 3 user/all, type: int, synonym of: opal_common_ofi_verbose)
related to #7755
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 9f1081a07a)
(cherry picked from commit 45b643d0cf)
This commit fixes an issue with non-debug builds where adding an
attachment to the attachment list doesn't actually happen. This
causes all MPI_Win_detach calls to fail. The call was within an
assert which is optimized out in optimized builds.
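
The pitfall can be reproduced in isolation. The sketch below uses a hypothetical RELEASE_ASSERT macro to mimic what assert() expands to when NDEBUG is defined, plus a toy attachment list; none of these names come from the osc code.

```c
#include <assert.h>

/* Mimics assert() in a build with NDEBUG defined: the argument is
 * discarded without being evaluated. */
#define RELEASE_ASSERT(expr) ((void)0)

typedef struct {
    int count; /* number of attachments recorded */
} attachment_list_t;

/* Toy insert: records one attachment and reports success (0). */
static int list_insert(attachment_list_t *list)
{
    list->count++;
    return 0;
}

/* Buggy pattern: the insert lives inside the assertion, so it
 * vanishes together with the assertion in optimized builds. */
void attach_buggy(attachment_list_t *list)
{
    RELEASE_ASSERT(list_insert(list) == 0);
}

/* Fixed pattern: perform the call unconditionally, then assert
 * on the saved return code. */
void attach_fixed(attachment_list_t *list)
{
    int rc = list_insert(list);
    RELEASE_ASSERT(rc == 0);
    (void)rc;
}
```

After attach_buggy() the list is still empty, which is why a later detach would find nothing to remove.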
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit 8ee80d8855)
This bug was first seen in a different product that's using the same
interception code as OMPI. But I think it's potentially in OMPI too.
In my vanilla build of OMPI master on RH8 if I "gdb libopen-pal.so" and
"disassemble intercept_brk", I'm seeing a suspicious extra instruction
in front of PATCHER_BEGIN:
0x00000000000d6778 <+40>: std r2,24(r1) // something gcc put in front
0x00000000000d677c <+44>: std r2,96(r1) // PATCHER_BEGIN's toc_save
0x00000000000d6780 <+48>: nop // NOPs from PATCHER_BEGIN
0x00000000000d6784 <+52>: nop // that get replaced
0x00000000000d6788 <+56>: nop // by instructions that
0x00000000000d678c <+60>: nop // change r2
0x00000000000d6790 <+64>: nop //
Later there are loads from that location like
0x000000000019e0e4 <+132>: ld r2,24(r1)
that make me nervous since that's the pre-updated value.
I believe this is the same thing Nathan is describing way back in a9bc692d
and his solution was to put a second call around each interception, where
the outer call is just
intercept_brk():
PATCHER_BEGIN
_intercept_brk()
PATCHER_END
and the inner call _intercept_brk() is where the bulk of the code goes.
What I'm seeing is that _intercept_brk() is being inlined and probably
negating Nathan's fix. So I want to add __opal_attribute_noinline__ to
restore the fix.
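
The two-level pattern with the inner function kept out of line can be sketched like this. SKETCH_NOINLINE is a hypothetical stand-in for __opal_attribute_noinline__, assumed here to map to the GCC/Clang noinline attribute, and the function bodies are placeholders.

```c
#include <assert.h>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

/* Inner function: holds the bulk of the work and must not be
 * inlined into its caller. */
static SKETCH_NOINLINE int inner_intercept(int request)
{
    return request * 2; /* placeholder for the real work */
}

/* Outer function: stays tiny so the compiler has no reason to
 * save registers before the patched region. */
int outer_intercept(int request)
{
    /* PATCHER_BEGIN would go here */
    int result = inner_intercept(request);
    /* PATCHER_END would go here */
    return result;
}
```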
With this commit in place, the disassembly of intercept_brk becomes tiny
because it's no longer inlining _intercept_brk() and the suspicious
early save of r2 is gone. I made the same fix to all the intercept_*
functions, although intercept_brk was the only one that had a suspicious
save of r2.
As far as empirical failures though, we only have those from the non-OMPI
product that's using the same patcher code. I'm not actually getting OMPI
to fail from the above suspicious data being saved in r1+24.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
(cherry picked from commit ddd1f578ec)
mtl_btl_ofi_rcache_init() initializes patcher, which should only take
place while things are single threaded. OFI providers may spawn threads,
so initialize the rcache before creating OFI objects to prevent races.
Authored-by: John L. Byrne <john.l.byrne@hpe.com>
Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
(cherry picked from commit f1b21cb776)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Added the flag OPAL_OFI_PCI_DATA_AVAILABLE to avoid accessing the nic
object in fi_info when the OFI version does not support that structure.
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit ae2a447b0e)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Update the OPAL_CHECK_OFI configury macro:
- Make it safe to call the macro multiple times:
- The checks only execute the first time it is invoked
- On subsequent invocations, it just emits a friendly "checking..."
message so that configure output is sensible/logical
- With the goal of ultimately removing opal/mca/common/ofi, rename the
output variables from OPAL_CHECK_OFI to be
opal_ofi_{happy|CPPFLAGS|LDFLAGS|LIBS}.
- Update btl/ofi, btl/usnic, and mtl/ofi for these new conventions.
- Also, don't use AC_REQUIRE to invoke OPAL_CHECK_OFI because that
causes the macro to be invoked at a fairly random time, which makes
configure stdout confusing / hard to grok.
- Remove a little left-over cruft in OPAL_CHECK_OFI, too (which
resulted in an indenting change, making the change to
opal_check_ofi.m4 look larger than it really is).
Thanks Alastair McKinstry for the report and initial fix.
Thanks Rashika Kheria for the reminder.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit f5e1a672cc)
NOTE: This patch was cherry-picked into the v4.0.x branch as 9ad871fc,
but the OFI BTL changes were skipped, because the OFI BTL was not in
the v4.0.x branch. This version of the cherry pick brings in the
changes to the OFI BTL.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Adds the capability to select a NIC based on hardware locality.
Creates a list of NICs that share the same cpuset as the process,
then selects the NIC based on the (local rank) % (number of NICs).
If no NICs are available that share the same cpuset, the selection process
will create a list of all available NICs and make a selection based on
(local rank) % (number of NICs).
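
The selection rule described above can be sketched as follows. The sketch_nic_t type and select_nic() function are hypothetical, and the cpuset comparison is reduced to a precomputed flag, so this is an illustration of the rule and not the OFI common code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int id;            /* illustrative NIC identifier */
    int shares_cpuset; /* nonzero if the NIC shares the process cpuset */
} sketch_nic_t;

/* Returns the id of the selected NIC, or -1 if there are none.
 * Prefers NICs that share the cpuset; falls back to all NICs. */
int select_nic(const sketch_nic_t *nics, size_t num_nics, unsigned local_rank)
{
    size_t num_near = 0;
    for (size_t i = 0; i < num_nics; i++) {
        if (nics[i].shares_cpuset) {
            num_near++;
        }
    }
    if (num_nics == 0) {
        return -1;
    }

    int use_near = num_near > 0;
    size_t pool = use_near ? num_near : num_nics;
    size_t target = local_rank % pool; /* (local rank) % (number of NICs) */

    for (size_t i = 0; i < num_nics; i++) {
        if (use_near && !nics[i].shares_cpuset) {
            continue; /* not part of the candidate list */
        }
        if (target == 0) {
            return nics[i].id;
        }
        target--;
    }
    return -1;
}
```

With two near NICs, local ranks 0, 1, 2 map to the first, second, and first near NIC again, which spreads processes on a node across the candidates.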
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit 167d75b42a)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Change ompi_mtl_ofi_get_endpoint() to call the active PML's
add_procs() rather than the OFI MTL add_procs() directly when
discovering a new process during operation.
Functionally, this has no impact on correct operation. However,
the current behavior means that the heterogeneous and active PML
checks are not being executed in the dynamic discovery case.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 64d70b3076)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>