1
1

3675 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
ac66f46dc7 mtl/ofi: check for FI_LOCAL_COMM+FI_REMOTE_COMM
Make sure to get an RDM provider that can provide both local and
remote communication.  We need this check because some providers could
be selected via RXD or RXM, but can't provide local communication, for
example.

Add OPAL_CHECK_OFI_VERSION_GE() m4 macro to check that the Libfabric
we're building against is >= a target version.  Use this check in two
places:

1. MTL/OFI: Make sure it is >= v1.5, because the FI_LOCAL_COMM /
   FI_REMOTE_COMM constants were introduced in Libfabric API v1.5.
2. BTL/usnic: It already had similar configury to check for Libfabric
   >= v1.1, but the usnic component was checking for >= v1.3.  So
   update the btl/usnic configury to use the new macro and check for
   >= v1.3.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 21bc9042e1ff9c3075ababda9eaf49ccc90f64db)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
guserav
fc74a25b7c common/ofi: Set HPE as owner of component
Signed-off-by: guserav <erik.zeiske@web.de>
(cherry picked from commit 56c3d9a2380b073508a6760599dbe6178cb8fdf9)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
guserav
d22d413baa common/ofi: Fix open-mpi/ompi#2519
As discussed in open-mpi/ompi#2519 the common component does not depend
on libfabric yet. This commit introduces this dependency by just calling
fi_version().

Signed-off-by: guserav <erik.zeiske@hpe.com>
(cherry picked from commit 8a67a95c993dbfc2e3fa652777cab6ee20a4a735)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
guserav
19771ada0c common/ofi: Fix check for OFI in build files
The changes made in f5e1a672ccd5db127e85e1e8f6bcfeb8a8b04527
have been done after the common/ofi component was removed and thus the
component doesn't reflect the changes made their.

Namely f5e1a672ccd5db127e85e1e8f6bcfeb8a8b04527 changed:
- How to call OPAL_CHECK_OFI (It sets opal_ofi_happy to yes now)
- Dropped the common part in the build flags for ofi

Signed-off-by: guserav <erik.zeiske@web.de>
(cherry picked from commit 0e25c95eaeabca67cbe5ad6370171e1cff47e52a)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
guserav
a3752b681b Revert "Remove opal/mca/common/ofi."
This reverts commit dd20174532928e0c9cdbe7b206868e6e4bea9d0b.

Signed-off-by: guserav <erik.zeiske@web.de>
(cherry picked from commit 4ad78aaa156f118c907451b4e72d13dfebcbccf2)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Jeff Squyres
9ad1c152dd btl/ofi/Makefile.am: down with tabs!
Replace all tabs with spaces.  No code or logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit b556cabfe937f84f30e8870e0a8c128b839e5dfd)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Aravind Gopalakrishnan
0d2a0b1568 btl/ofi: Fix valgrind complaints on uninitialized pointer use
It doesn't seem like the BTL was using uninitialized pointer. But simply
setting the rcache pointer to NULL after destroying it makes the valgrind
errors go away.

Fixes Issue #6345

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit 786e686d4347655b574e609c65626c8323bb49b2)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Thananon Patinyasakdikul
8ebd0d8f24 btl/ofi: fixed compiler warning on OSX.
This commit closes #6049

Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
(cherry picked from commit d9bd54c628f73b9c4ded3c5baa5c6454d8f173f1)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Thananon Patinyasakdikul
f9439c6d18 btl/ofi: Added 2 side communication support.
The 2 sided communication support is added for non-tagmatching provider
to take advantage of this BTL and PML OB1. The current state is
"functional" and not optimized for performance.

Two sided support is disabled by default and can be turned on by mca
parameter: "mca_btl_ofi_mode".

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
(cherry picked from commit 080115d44069e0c461a1af105cd41f28849cdffc)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Brian Barrett
e975c9975c Revert "Remove the OFI/BTL component"
This reverts commit 192f0f6fff4b4ba0f207054626c935941ff0b5d8.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Mark Allen
57c7d68233 adding op-codes for syscall ipc for shmat/shmdt
These op codes used to be in bits/ipc.h but were removed in glibc in 2015
with a comment saying they should be defined in internal headers:
https://sourceware.org/bugzilla/show_bug.cgi?id=18560
and when glibc uses that syscall it seems to do so from its own definitions:
https://github.com/bminor/glibc/search?q=IPCOP_shmat&unscoped_q=IPCOP_shmat

So I think using #ifndef and defining them if they're not already defined
using the values from glibc is the best option.

At IBM it was the testing on redhat 8 that found this as an issue
(the opcodes being undefined on the system made the #define HAS_SHMDT
evaluate to false so intercept_shmat / intercept_shmdt were
left undefined so shmat/shmdt memory events went unintercepted).

(cherry picked from commit e8fab058dac7300569cb54b08e5500115f8bab8f)
Signed-off-by: Mark Allen <markalle@us.ibm.com>
2020-06-04 14:24:17 -04:00
Joshua Hursey
7dba35f785 A slightly stronger check for LSF's libevent
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit 05e095a1eef737849750a5f4f649e72c393eeb74)
2020-05-18 15:10:17 -04:00
Joshua Hursey
886f41fe33 Move from legacy -levent to recommended -levent_core
* `libevent_core.so` contains the core functionality that we depend upon
   - `libevent.so` library has been identified as the legacy target.
   - `libevent_core.so` exists as far back as Libevent 2.0.5 (oldest supported by OMPI)
 * `libevent_pthreads.so` can work with either `-levent` or `-levent_core`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-05-09 14:55:39 -04:00
Joshua Hursey
fc4199e3ba Add checks for libevent.so conflict with LSF
* LSF ships a `libevent.so` that is no related to the `libevent.so`
   shipped with Libevent.
 * Add some checks to the configure logic to detect scenarios where this
   conflict can be detected, and provide the user with a descriptive
   warning message.
   - When detected by `event/external` this is just a warning since
     the internal component may be able to be used instead.
     - This happens when the user supplies the LSF path via the
       `LDFLAGS` envar instead of via `--with-lsf-libdir`.
   - When detected by a LSF component and LSF was explicitly requested
     then this becomes an error. Otherwise it will just print the warning
     and that component will fail to build.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-05-09 14:55:39 -04:00
Joshua Hursey
22d8fa197b event/external: Fix typo in LDFLAGS vs LIBS var before check
* This should have been `LDFLAGS` not `LIBS`. Either works, but
   `LDFLAGS` is more correct. We should also include `CPPFLAGS`
   just in case the header is important to the check.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-05-05 12:10:40 -04:00
Ralph Castain
e1543e37c6
Silence unnecessary error logs
Tracks https://github.com/openpmix/openpmix/pull/1669

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-09 07:13:02 -07:00
Joseph Schuchart
7b1beb0f6c Harmonize return values of progress callbacks
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
(cherry picked from commit 2c97187ee05e592346206a697ca3d9531d600fcc)
2020-03-30 18:58:57 +02:00
Ralph Castain
898b4f2210
Support hwloc retrieval using legacy key
Re-enable support for Slurm plugin using earlier PMIx_LOCAL_TOPO key

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-17 08:48:04 -07:00
Austen Lauria
f7979fbc82 Fix segv in btl/vader.
Keep track of the connected procs in vader_add_procs().
Otherwise, the same rank will reconnect the same shmem
segment (rank 0+...) multiple times instead of the next
one as intended.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
(cherry picked from commit f69c8d6819dcc14f471cea90b50dc8ca98de12d4)
2020-03-06 11:21:24 -05:00
Geoffrey Paulsen
11d79d1f6e Updating OMPI v4.0.x to PMIx v3.1.5 released
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2020-02-24 12:13:20 -05:00
Geoff Paulsen
a1259e6a14
Merge pull request #7356 from hppritcha/topic/pr7201_to_v40x
Topic/pr7201 to v40x
2020-02-12 14:06:31 -06:00
Brice Goglin
f136804c45 hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 1.11
Build was broken by mistake in commit d40662edc41a5a4d09ae690b640cfdeeb24e15a1

Fixes #7362

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
(cherry picked from commit 907ad854b46b42ae7cb1e9c87238691a5cc25e36)
2020-02-11 18:37:18 -06:00
Geoffrey Paulsen
81ad9bfdb6 Adding PMIx v3.1.5rc2
Adding PMIx v3.1.5rc2 from:
  https://github.com/openpmix/openpmix/releases/tag/v3.1.5rc2

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2020-02-10 17:05:53 -06:00
Brice Goglin
6702a4febb opal/hwloc: remove some unused variables when building with hwloc < 1.7
Refs #7362

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
(cherry picked from commit 329d4451a6cdd544e532a29f594f6e5ee63e06da)
2020-02-10 15:54:48 -06:00
Howard Pritchard
bed0ce70a7 fix a problem with opal_asprintf
not being defined.

related to #7201

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-02-03 14:22:59 -07:00
Brice Goglin
82567996f7 hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 2.0
Both opal_hwloc_base_get_relative_locality() and _get_locality_string()
iterate over hwloc levels to build the proc locality information.
Unfortunately, NUMA nodes are not in those normal levels anymore since 2.0.
We have to explicitly look a the special NUMA level to get that locality info.

I am factorizing the core of the iterations inside dedicated "_by_depth"
functions and calling them again for the NUMA level at the end of the loops.

Thanks to Hatem Elshazly for reporting the NUMA communicator split failure
at https://www.mail-archive.com/users@lists.open-mpi.org/msg33589.html

It looks like only the opal_hwloc_base_get_locality_string() part is needed
to fix that split, but there's no reason not to fix get_relative_locality()
as well.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
(cherry picked from commit ea80a20e108cb69efc67ad04ad968da7b85772af)
2020-02-03 13:29:41 -07:00
Howard Pritchard
0b2b9d7660
Merge pull request #7325 from hppritcha/topic/pr_7304_to_v4.0.x
btl/vader: modify how the max attachment address is determined
2020-01-24 08:00:36 -07:00
Nathan Hjelm
66684bbda3 btl/vader: modify how the max attachment address is determined
This PR removes the constant defining the max attachment address and
replaces it with the largest address that shows up in /proc/self/maps.
This should address issues found on AARCH64 where the max address
may differ based on the configuration.

Since the calculated max address may differ between processes the
max address is sent as part of the modex and stored in the endpoint
data.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit 728d51f9f3f2df6577e5f9729b9d6a0fe9441d37)
2020-01-19 15:05:41 -08:00
Nathan Hjelm
a64a7c8a0a btl/vader: fix issues with xpmem registration invalidation
This commit fixes an issue discovered in the XPMEM registration cache. It
was possible for a registration to be invalidated by multiple threads
leading to a double-free situation or re-use of an invalidated registration.

This commit fixes the issue by setting the INVALID flag on a registation
when it will be deleted. The flag is set while iterating over the tree
to take advantage of the fact that a registration can not be removed
from the VMA tree by a thread while another thread is traversing the VMA
tree.

References #6524
References #7030
Closes #6534

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit f86f805be1145ace46b570c5c518555b38e58cee)
2020-01-19 13:42:00 -08:00
Geoffroy Vallee
a479beeeae Add the missing code to check a return code
Signed-off-by: Geoffroy Vallee <geoffroy.vallee@gmail.com>
(cherry picked from commit de6f130b4a5b001ef85317fe3ebc2c8f8c8077e9)
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2020-01-08 15:27:38 -05:00
Jeff Squyres
814f3e9caa hwloc: clarify --with-hwloc behavior
Clarify in README what --with-hwloc does in its different use cases.

Also, ensure that the behavior when specifying `--with-hwloc` is the
same as if that option is not specified at all.  This is what we did
in Open MPI <= v3.x; looks like we inadvertantly caused `--with-hwloc`
to be synonymous with `--with-hwloc=external` in v4.0.0.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 18c3e1af5ef281e8b502a7ad778256889cc707d1)
2019-12-19 12:30:40 -08:00
Howard Pritchard
f6914ee35c
Merge pull request #7229 from tkordenbrock/topic/v4.0.x/portals4.fix.flowcontrol.bugs
v4.0.x: portals4: fix flow control bugs
2019-12-18 08:35:46 -07:00
Maxwell Coil
07a54b7025 memory/patcher: fix compiler warning
syscall() returns a long, but we are invoking shmat(), which returns
a void*.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
(cherry picked from commit 52a9cce6f3dcd87e9ae66177398b60b9317e9339)
2019-12-14 12:26:47 -05:00
Todd Kordenbrock
2c082b6c7c btl-portals4: fix a flow control configure bug
This commit fixes a configure bug that caused flow control to be
disabled regardless of the configure options used.

Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
(cherry picked from commit f7e74b6a3d19cd6e3edc3364783311e595f81b3d)
2019-12-12 08:43:20 -06:00
Jeff Squyres
91866789f2 mpool/base: fix basic mpool_base() function
The prior implementation was simply wrong.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 53ebea12aa98ba5440924a8ee914f6707f63608b)
2019-12-06 10:23:56 -08:00
Joseph Schuchart
a346756bf4 uGNI: Fix potential deadlock when processing outstanding transfers
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
(cherry picked from commit c09ca039b4703e26d6d7a0494e042dd27827e091)
2019-11-19 22:39:21 +01:00
Akshay Venkatesh
db3e563749 OPAL/MCA/BTL/OPENIB: Detect ConnectX-6 HCAs
Signed-off-by: Akshay Venkatesh <akvenkatesh@nvidia.com>
2019-11-18 17:04:54 -08:00
Howard Pritchard
59b24ab4f7 btl/uct: add UCT API version check to configury
related to #7128

The UCX crew is no longer guaranteeing that the UCT API is going to be frozen,
so this is kind of a whack-a-mole problem trying to keep the BTL UCT working
with various changing UCT APIs.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 9d345d9aa000233bec148540b071cecffc94438c)
2019-11-07 10:01:52 -07:00
Nathan Hjelm
55e01220cd btl/uct: fix compilation for UCX 1.7.0
Ref #7128

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit a3026c016a6a8be379f62585b6ddc070175c8106)
2019-11-07 10:00:33 -07:00
Nathan Hjelm
47ec3e4d2b btl/uct: add support for OpenUCX v1.8 API changes
OpenUCX broke the UCT API again in v1.8. This commit updates
btl/uct to fix compilation with current OpenUCX master
(future v1.8). Further changes will likely be needed for
the final release.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit 526775dfd7ad75c308532784de4fb3ffed25458f)
2019-11-07 10:00:21 -07:00
Jeff Squyres
c6592822c0 btl/usnic: set retrans_timeout back down to 5ms
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 3080033a8c4db64199b03b6058e18488f619088c)
2019-10-15 07:54:32 -07:00
Jeff Squyres
1565239506 btl/usnic: set ack_iteration_delay default to 4
It was previously accidentally set to 0.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 132e4cab3bc71df0da87368a332d6af0090a6977)
2019-10-15 07:54:31 -07:00
Jeff Squyres
22bc268e6e btl/usnic: properly size freelist items
Move the prefix area from the head to the body in relevant size
computations.  This fixes a problem in high traffic situations where
usNIC may have sent from unregistered memory.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit fe7f772f21627b01838c007db7cedbbb0ce8b536)
2019-10-04 16:47:19 -07:00
Jeff Squyres
58155bc760 btl/usnic: cap the number of resends per progress iteration
New MCA param: btl_usnic_max_resends_per_iteration.  This is the max
number of resends we'll do in a single pass through usNIC component
progress.  This prevents progress from getting stuck in an endless
loop of retransmissions (i.e., if more retransmissions are triggered
during the sending of retransmissions).  Specifically: we need to
leave the resend loop to allow receives to happen (which may ACK
messages we have sent previously, and therefore cause pending resends
to be moot).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 27e3040dfeba00a9a2615a217c164899f0009e59)
2019-10-04 16:47:13 -07:00
Jeff Squyres
8f929c68f1 btl/usnic: increase default retrans_timeout
Significantly increase the default retrans timeout.  If the
retrans timeout is too soon, we can end up in a retransmission storm
where the logic will continually re-transmit the same frames during a
single run through the usNIC progress function (because the timer for
a single frame expires before we have run through re-transmitting all
the frames pending re-transmission).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 3cc95d86b2123f38f392e56adca7ac8a1fef6454)
2019-10-04 16:47:11 -07:00
Jeff Squyres
b5cb03450c btl/usnic: clarifications and fixes regarding ACKs
New MCA parameter: btl_usnic_ack_iteration_delay.  Set this to the
number of times through the usNIC component progress function before
sending a standalone ACK (vs. piggy-backing the ACK on any other send
going to the target peer).

Use "ticks" language to clarify that we're really counting the number
of times through the usNIC component DATA_CHANNEL completion check (to
check for incoming messages) -- it has no relation to wall clock time
whatsoever.

Also slightly change the channel-checking scheme in usNIC component
progress: only check the PRIORITY channel once (vs. checking it once,
not finding anything, and then falling through the progress_2() where we
check PRIORITY again and then check the DATA channel).

As before, if our "progress" libevent fires, increment the tick
counter enough to guarantee that all endpoints that need an ACK will
get triggered to send standalone ACKs the next time through progress,
if necessary.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 968b1a51b59898877a8c7268d463d3d7d78d86a3)
2019-10-04 16:47:09 -07:00
Jeff Squyres
0839a9c313 btl/usnic: s/get_nsec/get_nticks/g
Rename "get_nsec()" to "get_ticks()" to more accurately reflect that
this function has no correlation to wall clock time at all.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit ce2910a28aea61043b81324c67999f3a47cfe7ac)
2019-10-04 16:47:08 -07:00
Adrian Reber
674655c641 Do not use CMA in user namespaces
Trying out to run processes via mpirun in Podman containers has shown
that the CMA btl_vader_single_copy_mechanism does not work when user
namespaces are involved.

Creating containers with Podman requires at least user namespaces to be
able to do unprivileged mounts in a container

Even if running the container with user namespace user ID mappings which
result in the same user ID on the inside and outside of all involved
containers, the check in the kernel to allow ptrace (and thus
process_vm_{read,write}v()), fails if the same IDs are not in the same
user namespace.

One workaround is to specify '--mca btl_vader_single_copy_mechanism none'
and this commit adds code to automatically skip CMA if user namespaces
are detected and fall back to MCA_BTL_VADER_EMUL.

Signed-off-by: Adrian Reber <areber@redhat.com>
(cherry picked from commit fc68d8a90fe86284e9dc730f878b55c0412f01d2)
2019-09-20 19:12:48 -07:00
Nathan Hjelm
5a945f668c btl/vader: when using single-copy emulation fragment large rdma
This commit changes how the single-copy emulation in the vader btl
operates. Before this change the BTL set its put and get limits
based on the max send size. After this change the limits are unset
and the put or get operation is fragmented internally.

References #6568

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit ae91b11de2314ab11a9842d9738cd14f8f1e393b)
2019-09-17 20:01:37 -06:00
Geoff Paulsen
893ea3f91f
Merge pull request #6929 from rhc54/cmr40/pmix314
Remove unnecessary error log
2019-08-30 14:10:36 -05:00