Commit Graph

3916 Commits

Author SHA1 Message Date
Gilles Gouaillardet
174e967dbc
Remove ORTE project
Will be replaced by PRRTE. Ensure that OMPI and OPAL layers build
without reference to ORTE. Setup opal/pmix framework to be static.
Remove support for all PMI-1 and PMI-2 libraries. Add support for
"external" pmix component as well as internal v4 one.

remove orte: misc fixes

 - UCX fixes
 - VPATH issue
 - oshmem fixes
 - remove useless definition
 - Add PRRTE submodule
 - Get autogen.pl to traverse PRRTE submodule
 - Remove stale orcm reference
 - Configure embedded PRRTE
 - Correctly pass the prefix to PRRTE
 - Correctly set the OMPI_WANT_PRRTE am_conditional
 - Move prrte configuration to the end of OMPI's configure.ac
 - Make mpirun a symlink to prun, when available
 - Fix makedist with --no-orte/--no-prrte option
 - Add a `--no-prrte` option which is the same as the legacy
   `--no-orte` option.
 - Remove embedded PMIx tarball. Replace it with new submodule
   pointing to OpenPMIx master repo's master branch
 - Some cleanup in PRRTE integration and add config summary entry
 - Correctly set the hostname
 - Fix locality
 - Fix singleton operations
 - Fix support for "tune" and "am" options

Signed-off-by: Ralph Castain <rhc@pmix.org>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-02-07 18:20:06 -08:00
Brice Goglin
329d4451a6 opal/hwloc: remove some unused variables when building with hwloc < 1.7
Refs #7362

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2020-02-04 22:56:46 +01:00
Brice Goglin
907ad854b4 hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 1.11
Build was broken by mistake in commit d40662edc41a5a4d09ae690b640cfdeeb24e15a1

Fixes #7362

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2020-02-04 22:56:46 +01:00
Austen Lauria
824dbcbcf3 Protect use of _Static_assert().
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-02-04 13:46:58 -05:00
Howard Pritchard
d2b68e6ecd
Merge pull request #7201 from bgoglin/master
hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 2.0
2020-02-03 11:23:42 -07:00
George Bosilca
72501f8f9c Consistent return from all progress functions.
This fix ensures that all progress functions return the number of
completed events.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-01-28 20:16:53 +01:00
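The convention described in 72501f8f9c (every progress function returns the number of completed events) can be illustrated with a minimal Python sketch. Open MPI itself is written in C; `run_progress`, `idle`, and `two_done` are hypothetical names used only for this illustration.

```python
from typing import Callable, Iterable


def run_progress(callbacks: Iterable[Callable[[], int]]) -> int:
    """Call every progress callback and sum the completed-event counts."""
    completed = 0
    for callback in callbacks:
        count = callback()
        # The sum is only meaningful if every callback follows the same
        # convention: return how many events it completed on this pass.
        if count > 0:
            completed += count
    return completed


def idle() -> int:
    return 0  # nothing completed on this pass


def two_done() -> int:
    return 2  # two events completed on this pass
```

A caller can then use the total to decide whether to keep polling or to yield, which only works when no callback returns an error code or a boolean in place of a count.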
Joseph Schuchart
2c97187ee0 Harmonize return values of progress callbacks
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-01-28 20:15:03 +01:00
Austen Lauria
1275766037
Merge pull request #7207 from devreal/remove_shmem_seg_hdr
Remove unused opal_shmem_seg_hdr_t to retain alignment
2020-01-28 13:57:55 -05:00
Aurelien Bouteiller
9f4365fef6
Merge pull request #6783 from abouteiller/export/macos-epipe
Prevent EPIPE on OSX.
2020-01-28 11:18:46 -05:00
Brian Barrett
d768d82231
Merge pull request #7167 from wckzhang/reachable_netlinks
Reachable documentation change
2020-01-27 15:39:14 -08:00
Brian Barrett
fc8c7a5869
Merge pull request #7134 from wckzhang/btl_tcp_interface_match
btl tcp: Use reachability and graph solving for global interface matching
2020-01-27 15:38:49 -08:00
Austen Lauria
10f6a77640
Merge pull request #7315 from abouteiller/export/tcp_errors_v2
Handle error cases in TCP BTL (v2)
2020-01-27 17:03:07 -05:00
Aurelien Bouteiller
76021e35ee
Adding a description of the FIN message for future reference.
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-01-27 13:32:34 -05:00
Aurelien Bouteiller
93846fd0ee
Remove the pending event when socket is TCP_FAILED
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-01-27 13:32:34 -05:00
Aurelien Bouteiller
6b3be224d4
Adding a FIN message to differentiate normal TCP closing from failures
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-01-27 13:32:34 -05:00
Aurelien Bouteiller
b7be64482a
Revert "Revert "Handle error cases in TCP BTL""
This reverts commit 5162011428.

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-01-27 13:31:06 -05:00
William Zhang
e958f3cf22 btl tcp: Use reachability and graph solving for global interface matching
Previously we used a fairly simple algorithm in
mca_btl_tcp_proc_insert() to pair local and remote modules. This was a
point in time solution rather than a global optimization problem (where
global means all modules between two peers). The selection logic would
often fail due to pairing interfaces that are not routable for traffic.
The complexity of the selection logic was Θ(n^n), which was expensive.
Due to poor scalability, this logic was only used when the number of
interfaces was less than MAX_PERMUTATION_INTERFACES (default 8). More
details can be found in this ticket:
https://svn.open-mpi.org/trac/ompi/ticket/2031 (The complexity estimates
in the ticket do not match what I calculated from the function)
As a fallback, when interfaces surpassed this threshold, a brute force
O(n^2) double for loop was used to match interfaces.

This commit solves two problems. First, the point-in-time solution is
turned into a global optimization solution. Second, the reachability
framework was used to create a more realistic reachability map. We
switched from using IP/netmask to using the reachability framework,
which supports route lookup. This will help many corner cases as well as
utilize any future development of the reachability framework.

The solution implemented in this commit has a complexity mainly derived
from the bipartite assignment solver. If the local and remote peer both
have the same number of interfaces (n), the complexity of matching will
be O(n^5).

With the decrease in complexity to O(n^5), I calculated and tested
that initialization costs would be 5000 microseconds with 30 interfaces
per node (Likely close to the maximum realistic number of interfaces we
will encounter). For additional datapoints, data up to 300 (a very
unrealistic number) of interfaces was simulated. Up until 150
interfaces, the matching costs will be less than 1 second, climbing to
10 seconds with 300 interfaces. Reflecting on these results, I removed
the suboptimal O(n^2) fallback logic, as it no longer seems necessary.

Data was gathered comparing the scaling of initialization costs with
ranks. For a low number of interfaces, the impact of initialization is
negligible. At an interface count of 7-8, the new code has slightly
faster initialization costs. At an interface count of 15, the new code
has slower initialization costs. However, all initialization costs
scale linearly with the number of ranks.

In order to use the reachable function, we populate local and remote
lists of interfaces. We then convert the interface matching problem
into a graph problem. We create a bipartite graph with the local and
remote interfaces as vertices and use negative reachability weights as
costs. Using the bipartite assignment solver, we generate the matches
for the graph. To ensure that both the local and remote process have
the same output, we ensure we mirror their respective inputs for the
graphs. Finally, we store the endpoint matches that we created earlier
in a hash table. This is stored with the btl_index as the key and a
struct mca_btl_tcp_addr_t* as the value. This is then retrieved during
insertion time to set the endpoint address.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-01-21 18:24:08 +00:00
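The graph formulation in e958f3cf22 (local and remote interfaces as the two vertex sets, negative reachability weights as costs) can be sketched in Python as follows. This is not the solver used by the commit: for brevity it enumerates every assignment, which is factorial in the number of interfaces and only suitable for tiny inputs. The function name and the weight matrix are assumptions made for illustration.

```python
from itertools import permutations


def match_interfaces(weights):
    """Pick one remote interface per local interface, minimizing total cost.

    weights[i][j] is the reachability weight between local interface i and
    remote interface j (higher is better). The cost of a pairing is the
    negated weight, so the minimum-cost assignment maximizes reachability.
    Returns a list where entry i is the remote index matched to local i.
    """
    n_local = len(weights)
    n_remote = len(weights[0]) if weights else 0
    if n_local > n_remote:
        raise ValueError("sketch assumes no more local than remote interfaces")

    best_cost, best_match = None, None
    for candidate in permutations(range(n_remote), n_local):
        cost = sum(-weights[i][j] for i, j in enumerate(candidate))
        if best_cost is None or cost < best_cost:
            best_cost, best_match = cost, list(candidate)
    return best_match
```

With weights `[[10, 8], [9, 0]]`, pairing local 0 with remote 0 first (a point-in-time choice) leaves local 1 with a weight of 0, for a total of 10. The global solution pairs local 0 with remote 1 and local 1 with remote 0, for a total of 17.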
Jeff Squyres
ed753afbc0 hwloc2: advance hwloc git submodule
Advance to hwloc-2.1.0rc2-33-g38433c0f, which includes a .gitignore
update that we want here in Open MPI.

Be warned; this is actually 33 commits beyond the hwloc v2.1.0 tag.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-01-21 09:42:39 -08:00
Nathan Hjelm
037b0bd9ee
Merge pull request #7304 from hjelmn/btl_vader_fix_max_address_on_aarch64
btl/vader: modify how the max attachment address is determined
2020-01-15 17:02:33 -07:00
Nathan Hjelm
728d51f9f3 btl/vader: modify how the max attachment address is determined
This PR removes the constant defining the max attachment address and
replaces it with the largest address that shows up in /proc/self/maps.
This should address issues found on AARCH64 where the max address
may differ based on the configuration.

Since the calculated max address may differ between processes the
max address is sent as part of the modex and stored in the endpoint
data.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-14 15:15:36 -08:00
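The approach in 728d51f9f3 (use the largest address listed in /proc/self/maps in place of a constant) can be sketched in Python. The parsing below assumes the usual Linux layout in which each line starts with `start-end` in hexadecimal; `max_mapped_address` is a hypothetical name, and the real code is C inside the vader BTL.

```python
def max_mapped_address(maps_text: str) -> int:
    """Return the highest end address found in /proc/<pid>/maps content."""
    highest = 0
    for line in maps_text.splitlines():
        fields = line.split()
        if not fields or "-" not in fields[0]:
            continue  # skip blank or unexpected lines
        _start, end = fields[0].split("-", 1)
        highest = max(highest, int(end, 16))
    return highest


def max_mapped_address_of_self() -> int:
    """Read the current process's own mappings (Linux only)."""
    with open("/proc/self/maps") as maps_file:
        return max_mapped_address(maps_file.read())
```

Because each process computes its own value, two peers can disagree, which is why the commit sends the value in the modex and stores it per endpoint.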
Nathan Hjelm
61f96b3d6d
Merge pull request #7283 from hjelmn/fix_issues_in_both_vader_and_opal_interval_tree_t_that_were_causing_issue_6524
Fix issues in both vader and opal interval tree t that were causing issue 6524
2020-01-14 14:54:51 -07:00
Jeff Squyres
25931ea8bf
Merge pull request #7200 from cpshereda/master-opal_gethostname-change
Fix unsafe use of gethostname()
2020-01-13 16:07:43 -05:00
Charles Shereda
cbc6feaab2 Created opal_gethostname() as safer gethostname substitute.
The opal_gethostname() function provides a more robust mechanism
to retrieve the hostname than gethostname(), which can return
results that are not null-terminated, and which can vary in its
behavior from system to system.

opal_gethostname() just returns the value in opal_process_info.nodename;
this is populated in opal_init_gethostname() inside opal_init.c.

-Changed all gethostname calls in opal subtree to opal_gethostname
-Changed all gethostname calls in orte subtree to opal_gethostname
-Changed all gethostname calls in ompi subdir to opal_gethostname
-Changed all gethostname calls in oshmem subdir to opal_gethostname
-Changed opal_if.c in test subdir to use opal_gethostname
-Changed opal_init.c to include opal_init_gethostname. This function
 returns an int and directly sets opal_process_info.nodename per
 jsquyres' modifications.

Relates to open-mpi#6801

Signed-off-by: Charles Shereda <cpshereda@lanl.gov>
2020-01-13 08:52:17 -08:00
Jeff Squyres
21bc9042e1 mtl/ofi: check for FI_LOCAL_COMM+FI_REMOTE_COMM
Make sure to get an RDM provider that can provide both local and
remote communication.  We need this check because some providers could
be selected via RXD or RXM, but can't provide local communication, for
example.

Add OPAL_CHECK_OFI_VERSION_GE() m4 macro to check that the Libfabric
we're building against is >= a target version.  Use this check in two
places:

1. MTL/OFI: Make sure it is >= v1.5, because the FI_LOCAL_COMM /
   FI_REMOTE_COMM constants were introduced in Libfabric API v1.5.
2. BTL/usnic: It already had similar configury to check for Libfabric
   >= v1.1, but the usnic component was checking for >= v1.3.  So
   update the btl/usnic configury to use the new macro and check for
   >= v1.3.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-01-13 08:19:53 -08:00
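The "is the library at least version X" test that 21bc9042e1 adds as an m4 macro comes down to comparing (major, minor) pairs numerically. A minimal Python sketch of that comparison, with hypothetical helper names, is:

```python
def parse_version(text: str) -> tuple:
    """Turn a string such as '1.10.1' into a (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def version_ge(have: tuple, want: tuple) -> bool:
    """True if version `have` is at least version `want`."""
    return tuple(have) >= tuple(want)
```

Comparing tuples of integers avoids the classic string-comparison mistake in which "1.10" sorts before "1.5".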
Brian Barrett
f853971bc1
Merge pull request #6821 from jsquyres/pr/make-hwloc201-tarball-a-submodule
hwloc v2.1.0: use a git submodule
2020-01-08 07:41:37 -08:00
Nathan Hjelm
f86f805be1 btl/vader: fix issues with xpmem registration invalidation
This commit fixes an issue discovered in the XPMEM registration cache. It
was possible for a registration to be invalidated by multiple threads
leading to a double-free situation or re-use of an invalidated registration.

This commit fixes the issue by setting the INVALID flag on a registration
when it will be deleted. The flag is set while iterating over the tree
to take advantage of the fact that a registration can not be removed
from the VMA tree by a thread while another thread is traversing the VMA
tree.

References #6524
References #7030
Closes #6534

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-07 22:50:52 -07:00
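The idea in f86f805be1 (mark a registration invalid during the traversal so that only one thread goes on to delete it) can be sketched in Python. A single lock stands in for the protection the VMA tree gives during traversal, and every class and method name here is hypothetical.

```python
import threading


class Registration:
    def __init__(self, base, length):
        self.base = base
        self.length = length
        self.invalid = False  # set once, by the thread that claims deletion


class RegistrationCache:
    def __init__(self):
        self._lock = threading.Lock()  # stand-in for the tree traversal lock
        self._regs = []

    def insert(self, reg):
        with self._lock:
            self._regs.append(reg)

    def invalidate_range(self, base, length):
        """Invalidate overlapping registrations; return how many this call freed."""
        claimed = []
        with self._lock:
            for reg in self._regs:
                overlaps = reg.base < base + length and base < reg.base + reg.length
                # Another thread may already have claimed this registration;
                # the flag makes the claim exclusive.
                if overlaps and not reg.invalid:
                    reg.invalid = True
                    claimed.append(reg)
        # Removal happens after the traversal, by the claiming thread only.
        for reg in claimed:
            with self._lock:
                self._regs.remove(reg)
        return len(claimed)
```

If several threads invalidate the same range concurrently, exactly one of them sees the flag unset and performs the removal, so the registration is never freed twice.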
Eisuke Kawashima
d26d4e1d63
Fix typo and update URLs (https, redirection) [skip ci]
Signed-off-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
2020-01-07 03:52:25 +09:00
Jeff Squyres
a2a9a9516b hwloc2: bump up to hwloc v2.1.0
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
c292e759da hwloc2: bump up to hwloc 2.0.4
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
e5722acc37 hwloc201: replace with "hwloc2" component+git submodule
Rename the component to be "hwloc2" (since it can now be any v2.x.y
version of hwloc), and make the embedded copy of hwloc be a git
submodule.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
18c3e1af5e hwloc: clarify --with-hwloc behavior
Clarify in README what --with-hwloc does in its different use cases.

Also, ensure that the behavior when specifying `--with-hwloc` is the
same as if that option is not specified at all.  This is what we did
in Open MPI <= v3.x; looks like we inadvertently caused `--with-hwloc`
to be synonymous with `--with-hwloc=external` in v4.0.0.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-19 08:38:57 -08:00
Todd Kordenbrock
1af6dbe277
Merge pull request #7066 from tkordenbrock/topic/master/portals4.fix.flowcontrol.bugs
portals4: fix flow control bugs
2019-12-11 06:31:26 -06:00
Maxwell Coil
52a9cce6f3 memory/patcher: fix compiler warning
syscall() returns a long, but we are invoking shmat(), which returns
a void*.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-08 13:56:00 -05:00
Jeff Squyres
53ebea12aa mpool/base: fix basic mpool_base() function
The prior implementation was simply wrong.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-05 18:57:37 -05:00
William Zhang
a471f8749f reachable: Update documentation on reachable function
We have decided to show interfaces that are identical to itself as
reachable. This is consistent with the previous netmask logic when
determining reachability.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-12-02 18:04:40 +00:00
William Zhang
ce40436895 reachable/netlink: Show an interface as reachable to itself
Due to the way netlinks detects reachability, it will not show an
interface as reachable to itself, even if it can pass through a loopback
interface. To maintain similar behavior with netmasks, we display an
interface as reachable to itself.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-12-02 18:04:40 +00:00
Joseph Schuchart
ee80babe5c Remove unused opal_shmem_seg_hdr_t to retain alignment
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-11-29 08:40:14 +01:00
Brice Goglin
ea80a20e10 hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 2.0
Both opal_hwloc_base_get_relative_locality() and _get_locality_string()
iterate over hwloc levels to build the proc locality information.
Unfortunately, NUMA nodes are not in those normal levels anymore since 2.0.
We have to explicitly look at the special NUMA level to get that locality info.

I am factorizing the core of the iterations inside dedicated "_by_depth"
functions and calling them again for the NUMA level at the end of the loops.

Thanks to Hatem Elshazly for reporting the NUMA communicator split failure
at https://www.mail-archive.com/users@lists.open-mpi.org/msg33589.html

It looks like only the opal_hwloc_base_get_locality_string() part is needed
to fix that split, but there's no reason not to fix get_relative_locality()
as well.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2019-11-27 12:41:33 +01:00
Geoffroy Vallee
de6f130b4a
Add the missing code to check a return code
Signed-off-by: Geoffroy Vallee <geoffroy.vallee@gmail.com>
2019-11-24 11:04:10 -05:00
Austen Lauria
edcd6d8aeb
Merge pull request #7146 from bgoglin/master
fix typos hlwoc->hwloc
2019-11-13 15:38:00 -05:00
Nathan Hjelm
09dd383f8b
Merge pull request #7108 from devreal/btl-ugni-deadlock
uGNI: Fix potential deadlock when processing outstanding transfers
2019-11-11 10:56:56 -08:00
Howard Pritchard
9d345d9aa0 btl/uct: add UCT API version check to configury
related to #7128

The UCX crew is no longer guaranteeing that the UCT API is going to be frozen,
so this is kind of a whack-a-mole problem trying to keep the BTL UCT working
with various changing UCT APIs.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2019-11-06 14:27:58 -07:00
Brice Goglin
5c6bd7ea4e fix typos hlwoc->hwloc
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2019-11-06 10:42:36 +01:00
Nathan Hjelm
a3026c016a btl/uct: fix compilation for UCX 1.7.0
Ref #7128

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2019-11-05 12:53:26 -08:00
George Bosilca
476562752f
Correctly report TCP connect errors.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-10-31 18:33:15 -04:00
Joseph Schuchart
c09ca039b4 uGNI: Fix potential deadlock when processing outstanding transfers
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-10-26 12:21:17 +02:00
Austen Lauria
aa8be9c12d
Merge pull request #6284 from devreal/ompi-rdma-memalign
Ensure proper alignment of memory provided by MPI
2019-10-25 12:27:58 -04:00
Nathan Hjelm
b1ef5a40fa
Merge pull request #7016 from hjelmn/fix_btl_uct_from_yet_another_unannounced_api_break_in_the_openucx_uct_layer
btl/uct: add support for OpenUCX v1.8 API changes
2019-10-17 06:27:18 -07:00
Jeff Squyres
b6c4d5c118
Merge pull request #7060 from jsquyres/pr/usnic-mca-updates
BTL usnic MCA updates
2019-10-15 10:48:10 -04:00
Stanislav Kirillov
0e0763e006
fix ipv6 btl connection bug
Signed-off-by: Stanislav Kirillov <staskirillof@yandex.ru>
2019-10-10 11:20:37 +00:00
Todd Kordenbrock
f7e74b6a3d btl-portals4: fix a flow control configure bug
This commit fixes a configure bug that caused flow control to be
disabled regardless of the configure options used.

Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
2019-10-09 17:12:56 -05:00
Geoff Paulsen
4e1e6f8972
Merge pull request #6993 from awlauria/fix_warnings_master
Fix miscellaneous compiler warnings.
2019-10-09 09:17:02 -05:00
Jeff Squyres
3080033a8c btl/usnic: set retrans_timeout back down to 5ms
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-08 11:17:54 -07:00
Jeff Squyres
132e4cab3b btl/usnic: set ack_iteration_delay default to 4
It was previously accidentally set to 0.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-08 11:17:30 -07:00
Jeff Squyres
fe7f772f21 btl/usnic: properly size freelist items
Move the prefix area from the head to the body in relevant size
computations.  This fixes a problem in high traffic situations where
usNIC may have sent from unregistered memory.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-04 14:40:56 -07:00
Jeff Squyres
27e3040dfe btl/usnic: cap the number of resends per progress iteration
New MCA param: btl_usnic_max_resends_per_iteration.  This is the max
number of resends we'll do in a single pass through usNIC component
progress.  This prevents progress from getting stuck in an endless
loop of retransmissions (i.e., if more retransmissions are triggered
during the sending of retransmissions).  Specifically: we need to
leave the resend loop to allow receives to happen (which may ACK
messages we have sent previously, and therefore cause pending resends
to be moot).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-04 13:05:51 -07:00
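The cap introduced in 27e3040dfe can be illustrated with a small Python sketch of one pass through a progress function. The function and parameter names are hypothetical; only the MCA parameter name `btl_usnic_max_resends_per_iteration` comes from the commit message.

```python
from collections import deque


def progress_resends(pending, send, max_resends_per_iteration):
    """Resend at most max_resends_per_iteration frames, then return.

    `pending` is a deque of frames awaiting retransmission and `send` is a
    callable that transmits one frame. Returns the number of frames sent.
    """
    sent = 0
    while pending and sent < max_resends_per_iteration:
        frame = pending.popleft()
        send(frame)  # may itself cause more frames to become pending
        sent += 1
    return sent
```

If `send` re-queues every frame it transmits (the worst case the commit describes), an uncapped loop never ends. With the cap, the function returns after a bounded amount of work and the caller gets a chance to process receives, including ACKs that make some pending resends unnecessary.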
Jeff Squyres
3cc95d86b2 btl/usnic: increase default retrans_timeout
Significantly increase the default retrans timeout.  If the
retrans timeout is too soon, we can end up in a retransmission storm
where the logic will continually re-transmit the same frames during a
single run through the usNIC progress function (because the timer for
a single frame expires before we have run through re-transmitting all
the frames pending re-transmission).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-04 13:05:51 -07:00
Jeff Squyres
968b1a51b5 btl/usnic: clarifications and fixes regarding ACKs
New MCA parameter: btl_usnic_ack_iteration_delay.  Set this to the
number of times through the usNIC component progress function before
sending a standalone ACK (vs. piggy-backing the ACK on any other send
going to the target peer).

Use "ticks" language to clarify that we're really counting the number
of times through the usNIC component DATA_CHANNEL completion check (to
check for incoming messages) -- it has no relation to wall clock time
whatsoever.

Also slightly change the channel-checking scheme in usNIC component
progress: only check the PRIORITY channel once (vs. checking it once,
not finding anything, and then falling through the progress_2() where we
check PRIORITY again and then check the DATA channel).

As before, if our "progress" libevent fires, increment the tick
counter enough to guarantee that all endpoints that need an ACK will
get triggered to send standalone ACKs the next time through progress,
if necessary.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-04 13:05:51 -07:00
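The tick-based ACK scheme in 968b1a51b5 can be sketched in Python under some simplifying assumptions: one counter incremented per pass through progress, and an endpoint that remembers the tick at which a standalone ACK becomes due. The class and method names are hypothetical.

```python
class Endpoint:
    def __init__(self, ack_iteration_delay):
        self.ack_iteration_delay = ack_iteration_delay
        self.ack_needed = False
        self.ack_due_tick = 0

    def on_receive(self, now_tick):
        """A message arrived; an ACK is owed after the configured delay."""
        if not self.ack_needed:
            self.ack_needed = True
            self.ack_due_tick = now_tick + self.ack_iteration_delay

    def on_send(self):
        """Any outgoing send to the peer carries the ACK (piggy-backing)."""
        self.ack_needed = False

    def needs_standalone_ack(self, now_tick):
        """True once the delay has elapsed without a piggy-backed ACK."""
        return self.ack_needed and now_tick >= self.ack_due_tick
```

The ticks count passes through the progress function, not elapsed time, so the delay scales with how often the application calls into the library.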
Jeff Squyres
ce2910a28a btl/usnic: s/get_nsec/get_nticks/g
Rename "get_nsec()" to "get_ticks()" to more accurately reflect that
this function has no correlation to wall clock time at all.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-04 13:05:51 -07:00
Jeff Squyres
f3429d7a44 btl/usnic: pack a wire data struct
Might as well save a few bytes when sending this struct across the
network via the __opal_attribute_packed__ attribute.

That being said, also re-order the elements in this struct so that
there's no holes to begin with.  Do this so that the compiler/runtime
won't effect (slow) unaligned reads/writes because of the
__opal_attribute_packed__ attribute.

The "packed" attribute is really more about defensive programming
(e.g., if we make a mistake and have a hole, "packed" will remove it
for us).

*** Do not bring this commit back to existing/already-released release
branches: it will cause incompatibility, since it effectively changes
the usNIC BTL wire protocol.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-10-04 13:05:51 -07:00
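The reasoning in f3429d7a44 (packing removes holes, while reordering fields avoids creating them) can be checked with Python's `struct` module, which models C layout rules. The field layouts below are made up for illustration and are not the usNIC wire struct; the sizes assume a platform where a 32-bit integer is 4-byte aligned.

```python
import struct


def hole_bytes(fields: str) -> int:
    """Bytes of padding that native alignment adds between the given fields.

    '@' applies native alignment (like an ordinary C struct), while '='
    uses the same field sizes with no padding (like a packed struct).
    """
    return struct.calcsize("@" + fields) - struct.calcsize("=" + fields)


# uint8, uint32, uint8: the uint32 must be 4-byte aligned, leaving a hole.
HOLEY = "BIB"

# uint32, uint16, uint8, uint8: largest first, so every field is already
# naturally aligned and packing changes nothing.
REORDERED = "IHBB"
```

Here `hole_bytes(HOLEY)` is 3 and `hole_bytes(REORDERED)` is 0. With the reordered layout the packed attribute costs nothing, because every field is already naturally aligned, and it only takes effect if a later change introduces a hole.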
Austen Lauria
0d4004cc3c Fix miscellaneous compiler warnings.
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2019-10-01 16:27:25 -04:00
Joseph Schuchart
c385c927fb Ensure proper alignment of memory provided by MPI
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-10-01 11:54:29 +02:00
Gilles Gouaillardet
1c4a3598d0 pmix/pmix4x: refresh to the latest open PMIx master
refresh to openpmix/openpmix@ea3b29b1a4

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-10-01 14:27:22 +09:00
Nathan Hjelm
8473a66466 btl/uct: fix bug when using a transport without zero-copy
This commit fixes a crash that can occur if a transport
is usable but doesn't have zero-copy support. In this
case do not attempt to use zero-copy and set the max
send size off the bcopy limit.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2019-09-27 17:26:37 -07:00
Nathan Hjelm
526775dfd7 btl/uct: add support for OpenUCX v1.8 API changes
OpenUCX broke the UCT API again in v1.8. This commit updates
btl/uct to fix compilation with current OpenUCX master
(future v1.8). Further changes will likely be needed for
the final release.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2019-09-27 12:34:48 -07:00
Jeff Squyres
8038fac8f9
Merge pull request #6844 from adrianreber/check_for_user_ns
Do not use CMA in user namespaces
2019-09-20 22:10:42 -04:00
Nathan Hjelm
ae91b11de2 btl/vader: when using single-copy emulation fragment large rdma
This commit changes how the single-copy emulation in the vader btl
operates. Before this change the BTL set its put and get limits
based on the max send size. After this change the limits are unset
and the put or get operation is fragmented internally.

References #6568

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2019-09-05 23:08:53 -07:00
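The internal fragmentation described in ae91b11de2 can be sketched as a generator that splits one large put or get into pieces no larger than a fixed size. This is a Python illustration with hypothetical names; the actual fragment size and bookkeeping in the vader BTL are not shown in the commit message.

```python
def fragment_transfer(offset, length, max_fragment):
    """Yield (offset, size) pieces that together cover the whole transfer."""
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    end = offset + length
    while offset < end:
        size = min(max_fragment, end - offset)
        yield offset, size
        offset += size
```

For example, `list(fragment_transfer(0, 10, 4))` produces three pieces of sizes 4, 4, and 2. Callers no longer need to respect a put or get limit themselves, which matches the commit's note that the limits are now unset.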
Adrian Reber
fc68d8a90f
Do not use CMA in user namespaces
Trying to run processes via mpirun in Podman containers has shown
that the CMA btl_vader_single_copy_mechanism does not work when user
namespaces are involved.

Creating containers with Podman requires at least user namespaces to be
able to do unprivileged mounts in a container.

Even if running the container with user namespace user ID mappings which
result in the same user ID on the inside and outside of all involved
containers, the check in the kernel to allow ptrace (and thus
process_vm_{read,write}v()), fails if the same IDs are not in the same
user namespace.

One workaround is to specify '--mca btl_vader_single_copy_mechanism none'
and this commit adds code to automatically skip CMA if user namespaces
are detected and fall back to MCA_BTL_VADER_EMUL.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-09-05 20:15:19 +02:00
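The fallback described in fc68d8a90f can be sketched in Python. The commit message does not say how user namespaces are detected, so the detection below is an assumption: it compares the inode of `/proc/<pid>/ns/user`, which on Linux identifies the user namespace a process belongs to. All function names and the string values are hypothetical.

```python
import os


def user_namespace_id(pid="self"):
    """Return an identifier for the process's user namespace, or None.

    Assumption: the inode of /proc/<pid>/ns/user is used as the identifier.
    """
    try:
        return os.stat(f"/proc/{pid}/ns/user").st_ino
    except OSError:
        return None  # not Linux, or namespaces not available


def choose_single_copy_mechanism(local_ns, peer_ns, requested="cma"):
    """Fall back to emulation when CMA is requested across user namespaces."""
    if requested == "cma" and local_ns != peer_ns:
        return "emulated"
    return requested
```

The manual workaround from the commit message, `--mca btl_vader_single_copy_mechanism none`, corresponds to passing `requested="none"`, which the sketch leaves unchanged.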
Ralph Castain
9a2902c047
Force use of pmix/preg/native component
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-03 17:54:44 -07:00
Ralph Castain
2ebc1fa1b7
Update to current PMIx 4.0.0 (master)
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-03 14:46:55 -07:00
Howard Pritchard
71e1fad4a9
Merge pull request #6855 from hkuno/hkuno/mmap_loop
Fix mmap infinite recurse in memory patcher
2019-08-26 14:28:53 -06:00
Howard Pritchard
5a3646fd1d
Merge pull request #6884 from guserav/reintroduce-common-ofi
Reintroduce common ofi
2019-08-23 13:12:25 -06:00
Sergey Oblomov
182023febb SPML/UCX: fixed hang in SHMEM_FINALIZE
- used MPI_Barrier to synchronize processes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-21 12:04:46 +03:00
guserav
56c3d9a238 common/ofi: Set HPE as owner of component
Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-20 10:13:02 -07:00
guserav
8a67a95c99 common/ofi: Fix open-mpi/ompi#2519
As discussed in open-mpi/ompi#2519 the common component does not depend
on libfabric yet. This commit introduces this dependency by just calling
fi_version().

Signed-off-by: guserav <erik.zeiske@hpe.com>
2019-08-12 16:17:59 -07:00
guserav
0e25c95eae common/ofi: Fix check for OFI in build files
The changes made in f5e1a672cc
have been done after the common/ofi component was removed and thus the
component doesn't reflect the changes made there.

Namely f5e1a672cc changed:
- How to call OPAL_CHECK_OFI (It sets opal_ofi_happy to yes now)
- Dropped the common part in the build flags for ofi

Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-12 16:15:23 -07:00
guserav
4ad78aaa15 Revert "Remove opal/mca/common/ofi."
This reverts commit dd20174532.

Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-09 10:33:21 -07:00
Yossi Itigin
ec9def1406
Merge pull request #6864 from hoopoepg/topic/ucx-ppn-hint
UCX: added PPN hint for UCX context
2019-08-07 13:45:38 +03:00
Brian Barrett
827a2bcc3d
Merge pull request #6852 from wckzhang/opalifnamesize
opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
2019-08-05 16:16:05 -07:00
Sergey Oblomov
43186e494b UCX: added PPN hint for UCX context
- added PPN hint for UCX context init

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-05 18:07:06 +03:00
Harumi Kuno
fca8436064 Fix mmap infinite recurse in memory patcher
This commit fixes issue #6853 by removing
MacOS/Darwin-specific logic from intercept_mmap.
It also opportunistically converts tabs to spaces.

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
2019-08-01 10:18:25 -07:00
William Zhang
4ebb37a26c opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
Due to IF_NAMESIZE being a reused and conditionally defined macro,
issues could arise from macro mismatches. In particular, in cases where
opal/util/if.h is included, but net/if.h is not, IF_NAMESIZE will be 32.
If net/if.h is included on Linux systems, IF_NAMESIZE will be 16. This
can cause a mismatch when using the same macro on a system. Thus
different parts of the code can have differing ideas on the size of a
structure containing a char name[IF_NAMESIZE]. To avoid this error case,
we avoid reusing the IF_NAMESIZE macro and instead define our own as
OPAL_IF_NAMESIZE.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-29 21:24:39 +00:00
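The hazard described in 4ebb37a26c (two parts of the code disagreeing on the size of `char name[IF_NAMESIZE]`) can be reproduced with Python's `struct` module. The record layout here, a name buffer followed by a 32-bit index, is invented for illustration and is not an actual Open MPI structure.

```python
import struct


def iface_layout(namesize: int) -> struct.Struct:
    """Layout of a record holding char name[namesize] and a uint32 index."""
    return struct.Struct(f"={namesize}sI")


# One part of the code was compiled with the name size at 32 ...
writer = iface_layout(32)
# ... and another part with the name size at 16.
reader = iface_layout(16)

record = writer.pack(b"eth0", 7)
name, index = reader.unpack_from(record)
# The reader looks for the index 16 bytes too early and finds padding.
```

The writer's record is 36 bytes and the reader expects 20, so the reader recovers the name but reads an index of 0 where 7 was written. Defining a single, unconditionally defined macro such as `OPAL_IF_NAMESIZE` removes the possibility of the two sides disagreeing.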
Ralph Castain
c5c93e3391
Update to PMIx master
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-07-29 12:20:20 -07:00
Ralph Castain
d202e10c14
Provide locality for all procs on node
Update PMIx to latest master to get supporting updates. For
connect/accept (part of comm_spawn as well), lookup locality for all
participating procs on the node and compute the relative locality so it
can be used for MPI operations.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-07-22 09:23:38 -07:00
Brian Barrett
41c2007af5
Merge pull request #6820 from wckzhang/cleanup
btl tcp: Fix error path memory leak
2019-07-16 15:50:32 -07:00
Gilles Gouaillardet
06c6325bc8
Merge pull request #6822 from ggouaillardet/topic/pmix_refresh
pmix/pmix4x: refresh to the latest PMIx master
2019-07-16 12:56:31 +09:00
Gilles Gouaillardet
4510711e95 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@03a8b5daab

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-07-16 11:31:20 +09:00
William Zhang
8c3b8a87c5 btl tcp: Fix error path memory leak
After the OPAL_MODEX_RECV call, remote_addrs was not freed in the error
path. Moved the free call into cleanup to ensure we always free this
memory before leaving the function.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-15 22:35:04 +00:00
William Zhang
c0c3e1b540 reachable: Update documentation on reachable function
Added information on the type of objects provided in the list as well as
the required fields for them.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-11 21:36:12 +00:00
William Zhang
c9214cc53e reachable: Change list name from *_if to *_ifs
The parameter names were misleading due to implying a single interface
instead of a list. This will provide more clarity in distinguishing the
list of interfaces from each individual interface.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-11 21:33:24 +00:00
Jeff Squyres
650dd3e4cf opal/if/linux_ipv6: unify gethostname() call behavior
Unfortunately, https://github.com/open-mpi/ompi/pull/6797 was merged
before all feedback was received (39b799d936).  This PR is a minor
addendum to that commit.

This PR simply removes a meaningless `= {0}` operation.

The use of gethostname() here -- and many other places in the code
base -- is technically unsafe.  See
https://github.com/open-mpi/ompi/issues/6801 for a further description
of the issue and a suggested fix.  But the risk is quite low;
real-world hostnames are usually much shorter than
OPAL_MAXHOSTNAMELEN.  Hence, this PR just removes the meaningless
operation and leaves a real fix for gethostname() usage to a potential
future PR.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-07-09 10:38:24 -07:00
Orivej Desh
39b799d936 Fix if_linux_ipv6 verbose output of interface addresses
Previously the verbose output of if_linux_ipv6_open looked like this:

    found interface ab c: 0ab: a b: abc: 0 0: a 0:abcd: 0 0 scope 0

This changes the output to:

    found interface eth0 inet6 ab0c:ab:a0b:abc:0:a00:abcd:0 scope 0

Signed-off-by: Orivej Desh <orivej@gmx.fr>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-07-09 05:45:15 -07:00
George Bosilca
4620c351ea
Prevent EPIPE on OSX.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-06-28 15:32:59 -04:00
Gilles Gouaillardet
63aa156bb0 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@99971222ce

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-27 09:35:49 +09:00
Gilles Gouaillardet
5679a88867 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@f67efc835c

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-24 10:17:23 +09:00
Ralph Castain
d4070d5f58
Fix finalize of flux component
Per patches from @SteVwonder and @garlick

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-06-18 21:14:04 -07:00
Gilles Gouaillardet
d9326ff2ca pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@186dca196c

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-10 15:17:43 +09:00
markalle
008ab98946
Merge pull request #6531 from markalle/patcher_additions
shmat/shmdt additions for patcher
2019-05-30 12:16:05 -05:00
Nathan Hjelm
b78066720c btl/uct: add support for UCX 1.6.x
This commit updates the uct btl to support the v1.6.x release of
UCX. This release breaks API.

Signed-off-by: Nathan Hjelm <hjelmn@cs.unm.edu>
2019-05-21 04:31:57 -06:00
Yossi Itigin
84ae05c7bc
Merge pull request #6675 from hoopoepg/topic/ucx-common-init-patcher-on-hooks-used-only
COMMON/UCX: init memhooks infra on external hooks only
2019-05-16 22:35:32 +03:00
Sergey Oblomov
ebc457baf5 COMMON/UCX: removed ucs stuff
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 20:56:30 +03:00
Sergey Oblomov
a0a9306066 COMMON/UCX: init memhooks infra on external hooks only
- initialize the memory hooks infrastructure only if
  external memory hooks are requested

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 20:13:16 +03:00
Nathan Hjelm
3e1dd36241 btl/uct: check for support before disabling UCX memory hooks
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2019-05-15 13:49:10 -06:00
Jeff Squyres
df5f7afb14 usnic: fix Coverity false positives
Add some Coverity inline notation to tell Coverity that these
functions never return.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-05-14 13:53:25 -07:00
Jeff Squyres
566e6f1ca3 btl/usnic: remove legacy code
Remove compatibility code for multiple versions of BTL_IN_OPAL,
BTL_VERSION, and RCACHE_VERSION.  This stuff was really only necessary
when we were actively swapping code between multiple release branches
that had large variations in core OMPI infrastructure.  These large
variations have now been around for quite a while, so the need for
this "compat" layer is significantly reduced.  It hasn't been removed
simply because a few of the "compat" names are slightly more friendly
than the real names (e.g., the SEND/RECV/PUT names).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-05-11 05:19:36 -07:00
Jeff Squyres
8a2441603f btl/usnic: remove all calls to abort()
Inspired by https://github.com/open-mpi/ompi/pull/5205, finally remove
all calls to abort() from the usnic BTL.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-05-11 05:17:29 -07:00
Nathan Hjelm
b82a08254f btl/ugni: fix 32-bit compare-and-swap atomics
This commit fixes an error in the 32-bit compare-and-swap atomic support
for Aries networks. The code was incorrectly using the non-fetching
version of cswap, which was causing the routine to return
OPAL_ERR_BAD_ARG.

Signed-off-by: Nathan Hjelm <hjelmn@cs.unm.edu>
2019-05-10 09:59:54 -06:00
Yossi Itigin
5d2200a7d6
Merge pull request #6605 from brminich/topic/shmem_all2all_put
SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
2019-05-01 12:00:21 +03:00
Mikhail Brinskii
2ef5bd8b36 SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either with atomic
operations such as shmem_atomic_fetch or with point-to-point synchronization
routines such as shmem_wait_until and shmem_test.

Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-26 14:47:58 +03:00
Gilles Gouaillardet
562809fca1 pmix/pmix4x: refresh to the latest PMIx master
refresh pmix4x to pmix/pmix@bde4a8a54f

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-04-23 09:31:43 +09:00
Ralph Castain
f4aa783848 Remove all the linkages back to libpmix in pmix components
This link-back seems to be breaking OMPI for some reason. I'm not sure we need it in PMIx anyway, but we'll investigate over there.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-04-16 13:30:20 -07:00
Mark Allen
bdd92a7a64 -cpu-set as a constraint rather than as a binding
The first category of issue I'm addressing is that recent code changes
seem to only consider -cpu-set as a binding option. E.g., a command like
this just round-robins over the --cpu-set list:
  % mpirun -np 2 --report-bindings --use-hwthread-cpus \
      --bind-to cpulist:ordered --map-by hwthread --cpu-set 6,7 hostname

Example output which seems fine to me:
> MCW rank 0: [..../..B./..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..../...B/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]

It should also be possible, though, to pass a --cpu-set to most other
map/bind options and have it act as a constraint on that binding. E.g.:
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by ppr:2:node:pe=2 --cpu-set 6,7,12,13 hostname

The first command above errors that
> Conflicting directives for mapping policy are causing the policy
> to be redefined:
>   New policy:   RANK_FILE
>   Prior policy:  BYHWTHREAD

The error check in orte_rmaps_rank_file_open() is likely too aggressive.
The intent seems to be that any option like "--map-by whatever" will
check to see if a rankfile is in use, and report that mapping via rmaps
and using an explicit rankfile is a conflict.

But the check has been expanded to not just check
    NULL != orte_rankfile
but also errors out if
    (NULL != opal_hwloc_base_cpu_list &&
    !OPAL_BIND_ORDERED_REQUESTED(opal_hwloc_binding_policy))
which seems to be only recognizing -cpu-set as a binding option and
ignoring -cpu-set as a constraint on other binding policies.

For now I've changed the
    NULL != opal_hwloc_base_cpu_list
to
    OPAL_BIND_TO_CPUSET == OPAL_GET_BINDING_POLICY(opal_hwloc_binding_policy)
so it hopefully only errors out if -cpu-set is being used as a binding
policy.  Whether I did that right or not it's enough to get to the next
stage of testing the example commands I have above.

Another place similar logic is used is hwloc_base_frame.c where it has
    /* did the user provide a slot list? */
    if (NULL != opal_hwloc_base_cpu_list) {
        OPAL_SET_BINDING_POLICY(opal_hwloc_binding_policy, OPAL_BIND_TO_CPUSET);
    }
where it used to (long ago) only do that if
    !OPAL_BINDING_POLICY_IS_SET(opal_hwloc_binding_policy)
I think the new code is making it impossible to use --cpu-set as anything
other than a binding policy.

That brings us past the error detection and into the real functionality, some of
which has been stripped out, probably in moving to hwloc-2:
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname
> MCW rank 0: [B.../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [.B../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]

The rank_by() function in rmaps_base_ranking.c makes an array out of objects
returned from
    opal_hwloc_base_get_obj_by_type(,,,i,)
which uses df_search().  That function changed quite a bit from hwloc-1 to 2
but it used to include a check for
    available = opal_hwloc_base_get_available_cpus(topo, start)
which is where the bitmask from --cpu-set goes.  And it used to skip objs that
had hwloc_bitmap_iszero(available).

So I restored that behavior in df_search() by adding a "constrained_cpuset" to
replace start->cpuset that it was otherwise processing.  With that change in
place the first command works:
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname
> MCW rank 0: [..../..B./..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..../...B/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
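
The filtering described above (intersect each object with the available
mask and skip objects that end up empty) can be sketched as follows. This
is only an illustration: plain 64-bit masks stand in for hwloc bitmaps, and
obj_t and nth_available_obj() are hypothetical names, not Open MPI code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an hwloc object: one bit per hardware thread. */
typedef struct {
    uint64_t cpuset;
} obj_t;

/* Return the index of the nth (0-based) object that still has at least one
 * CPU left after intersecting with `available`, or -1 if there is none. */
int nth_available_obj(const obj_t *objs, size_t nobjs,
                      uint64_t available, size_t nth)
{
    size_t seen = 0;

    for (size_t i = 0; i < nobjs; i++) {
        /* Constrain the object's cpuset by the --cpu-set style mask. */
        uint64_t constrained = objs[i].cpuset & available;

        if (0 == constrained) {
            continue; /* nothing usable in this object: skip it */
        }
        if (seen == nth) {
            return (int) i;
        }
        seen++;
    }
    return -1;
}
```

With eight single-thread objects and an available mask covering threads 6
and 7, the 0th and 1st available objects are objects 6 and 7, which matches
the bindings reported for ranks 0 and 1 above.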

The other command uses a different path though that still ignored the
available mask:
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by ppr:2:node:pe=2 --cpu-set 6,7,12,13 hostname
> MCW rank 0: [BB../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..BB/..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
In bind_generic() the code used to call
opal_hwloc_base_find_min_bound_target_under_obj() which used
opal_hwloc_base_get_ncpus(), and that's where it would
intersect objects with the available cpuset and skip over ones
that weren't available. To match the old behavior I added a few
lines in bind_generic() to skip over objects that don't intersect
the available mask. After that we get
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by ppr:2:node:pe=2 --cpu-set 6,7,12,13 hostname
> MCW rank 0: [..../..BB/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..../..../..../BB../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]

I think the above changes are improvements, but I don't feel like they're
comprehensive.  I only traced through enough code to fix the two specific
bugs I was dealing with.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2019-04-12 15:33:56 -04:00
Gilles Gouaillardet
9ce8d7b568 pmix/pmix4x: refresh to the latest PMIx
refresh pmix4x to pmix/pmix@2531c0c3d1

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-04-09 14:03:00 +09:00
KAWASHIMA Takahiro
77286a41aa opal/sys: Introduce OPAL_HAVE_SYS_TIMER_GET_FREQ macro
... to avoid using an architecture name macro in
`opal/mca/timer/linux/timer_linux_component.c`.

The function name `opal_sys_timer_freq` is also changed for
consistency with `opal_sys_timer_get_cycles`.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2019-04-04 11:48:02 +09:00
Mark Allen
eb888118e8 shmat/shmdt additions for patcher
This is mostly based off recent UCX additions to their patcher:
    https://github.com/openucx/ucx/pull/2703

They added triggers for
* mmap when (flags & MAP_FIXED) && (addr != NULL)
* shmat when (shmflg & SHM_REMAP) && (shmaddr != NULL)

Beyond that I noticed they already had a trigger for
* madvise when (advice == MADV_FREE)
that we didn't so I added that.

And the other main thing is we didn't really have shmat/shmdt
active for some systems because we only had a path for
syscall(SYS_shmdt, ) but we needed to also have a path for
syscall(SYS_ipc, IPCOP_shmdt, ) and same for shmat.
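
The trigger conditions listed above can be written as small predicates.
This is a Linux-specific sketch under stated assumptions: the function
names are hypothetical, and it does not show how the patcher itself
intercepts the calls.

```c
#define _GNU_SOURCE /* assumption: glibc, needed for MADV_FREE visibility */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/shm.h>

/* Hypothetical predicate: mmap() replaces an existing mapping in place. */
int mmap_is_trigger(const void *addr, int flags)
{
    return (flags & MAP_FIXED) && (NULL != addr);
}

/* Hypothetical predicate: shmat() takes over an already-mapped region. */
int shmat_is_trigger(const void *shmaddr, int shmflg)
{
    return (shmflg & SHM_REMAP) && (NULL != shmaddr);
}

/* Hypothetical predicate: madvise() lets the kernel reclaim the pages. */
int madvise_is_trigger(int advice)
{
    return MADV_FREE == advice;
}
```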

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2019-03-29 14:38:46 -04:00
bosilca
b54fdf5dd9
Merge pull request #6541 from bwbarrett/bugfix/enotconn
btl/tcp: Skip printing error message in racy cleanup path
2019-03-28 22:42:52 -04:00
Brian Barrett
d5360711fa btl/tcp: Skip printing error message in racy cleanup path
Avoid printing an error message about ENOTCONN return codes from
getpeername() when handling an incoming connection request.  At
this point in the receive state machine, the remote process has
been verified to be a valid OMPI instance.  In all-to-all startup
at 4k rank scale, we're seeing this error message when the remote
side drops the connection because it realizes it's the "loser"
in the connection race.  We were already doing all the right things,
other than printing a scary error message.  So skip the error
message and call it good.
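
A minimal sketch of the idea, assuming a plain POSIX socket; the helper
name peer_lookup_quiet() and its return convention are hypothetical and
are not the btl/tcp code:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: look up the peer address of socket sd.
 * Returns 0 on success, 1 if the peer is already gone (ENOTCONN, treated
 * as a benign lost connection race), and -1 on any other error, which is
 * the only case that prints a message. */
int peer_lookup_quiet(int sd, struct sockaddr_storage *addr)
{
    socklen_t len = sizeof(*addr);

    if (0 == getpeername(sd, (struct sockaddr *) addr, &len)) {
        return 0;
    }
    if (ENOTCONN == errno) {
        return 1; /* remote side dropped the connection: stay silent */
    }
    fprintf(stderr, "getpeername() failed: %s (%d)\n",
            strerror(errno), errno);
    return -1;
}
```

The caller can still tear down the connection on a return value of 1; the
only difference from the error path is that nothing is printed.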

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2019-03-28 23:12:35 +00:00
Gilles Gouaillardet
77060cad07 btl/vader: fix finalize sequence
free the component mpool in mca_btl_vader_component_close()
and after freeing some objects that depend on it such as
mca_btl_vader_component.vader_frags_user

Thanks Christoph Niethammer for reporting this.

Refs. open-mpi/ompi#6524

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-03-27 11:57:40 +09:00
Gilles Gouaillardet
e844f76725 pmix/pmix4x: refresh to the latest PMIx
refresh pmix4x to pmix/pmix@20cc9c041e

Fixes open-mpi/ompi#6513

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-03-25 13:33:18 +09:00
Joshua Ladd
9ab6ecba65
Merge pull request #6492 from janjust/oshmem-multiple-contexts-master
Oshmem multiple contexts
2019-03-22 17:34:46 -04:00
Xin Zhao
9c3d00b144 ompi/oshmem/spml/ucx: use lockfree array to optimize spml_ucx_progress/delete oshmem_barrier in shmem_ctx_destroy
ompi/oshmem/spml/ucx: optimize spml ucx progress

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e1c1ab0202 ompi/oshmem/spml/ucx: defer clean up shmem_ctx to shmem_finalize
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:37 +02:00
Josh Hursey
53cd31ed7e
Merge pull request #6504 from jjhursey/rm-hash-pmix4
Do not force 'hash' gds on direct modex in pmix4x
2019-03-19 20:35:12 -05:00
Ralph Castain
0f26d8c76b Silence warnings
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-19 10:27:39 -07:00
Ralph Castain
c4be211741 Sync to latest PMIx master
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-19 10:27:12 -07:00
Joshua Hursey
1314cf2640 Do not force 'hash' gds on direct modex in pmix4x
* Forcing the 'hash' gds component should not be necessary any more.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2019-03-19 11:53:26 -05:00
Joshua Hursey
c2581d0e33 Do not force 'hash' gds on direct modex
* Forcing the 'hash' gds component should not be necessary any more.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2019-03-18 21:52:32 -05:00
Josh Hursey
ad8c842e7d
Merge pull request #6477 from markalle/report_bindings_strlen
opal_hwloc_base_cset2str() off-by-1 in its strncat()
2019-03-14 12:42:50 -05:00
Sergey Oblomov
c319cf9ade COMMON/UCX: rewording of hooks suggestion
- also updated output macro

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-14 11:00:57 +02:00
Sergey Oblomov
d8e3562bae PML/SPML/UCX: added evaluation of mmap events
- a set of UCX-related issues was reported that were caused
  by mmap API hook conflicts. We added diagnostics for such
  problems to simplify debugging

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-12 21:14:27 +02:00
Mark Allen
30d60994d2 opal_hwloc_base_cset2str() off-by-1 in its strncat()
I think the strncat() calls here need to be of the form
    strncat(str, new_str_to_add, len - strlen(str) - 1);
since in the OMPI calls len is being used as total number of bytes
in str.

strncat(dest,src,n) on the other hand is documented as writing up to
n chars from the incoming string plus 1 for the null, for n+1 total
bytes it can write.
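
A minimal sketch of the bounded-append pattern, assuming len is the total
size of the destination buffer; append_bounded() is a hypothetical helper,
not a function in Open MPI:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to dst, where len is the TOTAL size of
 * the dst buffer in bytes, including room for the terminating NUL.
 * dst must already hold a NUL-terminated string. */
void append_bounded(char *dst, const char *src, size_t len)
{
    size_t used = strlen(dst);

    if (used + 1 >= len) {
        return; /* no room left for anything but the NUL */
    }
    /* strncat() copies at most n characters and then writes a NUL, so n
     * must be the remaining space minus one. */
    strncat(dst, src, len - used - 1);
}
```

Passing len itself as n would let strncat() write up to len + 1 bytes past
the current end of the string, which is the overflow described above.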

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2019-03-11 14:35:53 -04:00
Jeff Squyres
14563770a1 btl/usnic: amend Makefile.am fix from b4097626ab
Use $(AM_CPPFLAGS) in $(usnic_btl_run_tests_CPPFLAGS) so that we don't
have to replicate hard-coded values.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-03-05 09:30:21 -08:00
Gilles Gouaillardet
b4097626ab btl/usnic: fix usnic_btl_run_tests CPPFLAGS
Define the OMPI_LIBMPI_NAME macro via CPPFLAGS.
The issue occurs when Open MPI is configured with
--enable-opal-btl-usnic-unit-tests

Thanks George Marselis for reporting this issue

Refs. open-mpi/ompi#6441

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-03-05 09:57:55 +09:00
Ralph Castain
8fd6107987
Merge pull request #6418 from rhc54/topic/slurm
Update slurm pmi configury to account for pmix
2019-02-27 14:30:45 -08:00
Ben Menadue
17dcc7041a Hold off running hwloc:external feature tests until after we decide if we're using the internal or external component. This fixes #6430.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2019-02-25 16:58:11 +11:00
Ralph Castain
cd1b5641be Update slurm pmi configury to account for pmix
When Slurm is built against PMIx, some installations place a copy of the
PMIx library that Slurm is linking against in the Slurm PMI location.
Current configury ignores that location. The desired behavior is to look
for a PMIx lib in that location when --with-pmi is given. If the user
also specifies --with-pmix and gives a different location, then override
anything previously found and look for it where the user directed.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-02-21 11:33:35 -08:00
Artem Polyakov
13a8e42108
Merge pull request #6163 from artpol84/osc/mt_submission
Refactoring of osc/ucx component for MT
2019-02-20 09:41:27 -08:00
Jeff Squyres
170d5d119e
Merge pull request #6409 from dmitrygladkov/topic/btl/tcp
btl/tcp: Fix copy-paste misprint
2019-02-20 12:12:18 -05:00
Dmitry Gladkov
9920da4992 btl/tcp: Fix copy-paste misprint
Signed-off-by: Dmitry Gladkov <dmitrygla@mellanox.com>
2019-02-20 11:18:02 +02:00
Artem Polyakov
91d6115d99 opal/common/ucx: Adjust the thresholds for periodical flushes
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
3aadc2b5e1 opal/common/ucx: Fix periodical flush in the worker pool
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
84dfe1277c opal/common/ucx: Rename wpool recv_worker to dflt_worker
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
8a990c2b64 opal/common/ucx: Add comments clarifying data structures
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
19e2ae2efb opal/common/ucx: Switch to opal/tsd
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
7984d7d997 opal/common/ucx: Remove unused debugging macro
Will be reintroduced later if needed and after adaptation to the OMPI
infrastructure.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
43f16d8796 opal/common/ucx: Remove common_ucx_int.h
Place the content of common_ucx_int.h back to the common_ucx.h and
include common_ucx_wpool.h explicitly.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
bb7d360621 opal/common/ucx: add refcnt in tlocal_ctx_tbl entry to keep track of usage
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
101036651b opal/common/ucx: Fix the bug in wpool's periodical flush
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
bcb52ecade opal/common/ucx: add winfo ptr into req
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
33517428a1 opal/common/ucx: add periodical flush and counter to opal directory.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00