1
1
Граф коммитов

871 Коммитов

Автор SHA1 Сообщение Дата
Gilles Gouaillardet
b715dd2657 btl/uct: fix AC_CHECK_DECLS usage
AC_CHECK_DECLS take a comma separated list of macros/symbols,
so replace the whitespace separator with a comma.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-10-26 15:36:02 +09:00
Nathan Hjelm
1b37328ba8 btl/uct: update for UCT_CB_FLAG_SYNC removal
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2018-10-21 18:57:42 -06:00
Brian Barrett
457f058e73 btl tcp: Simplify modex address selection
Simplify selection of the address to publish for a given BTL TCP
module in the module exchange code.  Rather than looping through
all IP addresses associated with a node, looking for one that
matches the kindex of a module, loop over the modules and
use the address stored in the module structure.  This also
happens to be the address that the source will use to bind()
in a connect() call, so this should eliminate any confusion
(read: bugs) when an interface has multiple IPs associated with
it.

Refs #5818

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-18 02:23:21 +00:00
Brian Barrett
4f19221af2 btl tcp: Simplify module address storage
Today, a btl tcp module is associated with exactly one IP
address (IPv4 or IPv6).  There's no need to reserve space
for both an IPv4 and IPv6 address in the module structure,
since the module will only be associated with one or the
other.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-17 16:21:17 +00:00
Brian Barrett
2acc4b7e7f btl tcp: Add workaround for "dropped connection" issue
Work around a race condition in the TCP BTL's proc setup code.
The Cisco MTT results have been failing on TCP tests due to a
"dropped connection" message some percentage of the time.
Some digging shows that the issue happens in a combination of
multiple NICs and multiple threads.  The race is detailed in
https://github.com/open-mpi/ompi/issues/3035#issuecomment-429500032.

This patch doesn't fix the race, but avoids it by forcing
the MPI layer to complete all calls to add_procs across the
entire job before any process leaves MPI_INIT.  It also
reduces the scalability of the TCP BTL by increasing start-up
time, but better than hanging.

The long term fix is to do all endpoint setup in the first
call to add_procs for a given remote proc, removing the
race.  THis patch is a work around until that patch can
be developed.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-16 18:33:30 -07:00
Nathan Hjelm
37a3f32e47
Merge pull request #5913 from hjelmn/uct_btl_fixes
btl/uct: fix deadlock in connection code
2018-10-16 19:11:54 -06:00
Nathan Hjelm
707d35deeb btl/uct: fix deadlock in connection code
This commit fixes a deadlock that can occur when using a TL that
supports the connect to endpoint model. The deadlock was occurring
while processing an incoming connection requests. This was done from
an active-message callback. For some unknown reason (at this time)
this callback was sometimes hanging. To avoid the issue the connection
active-message is saved for later processing.

At the same time I cleaned up the connection code to eliminate
duplicate messages when possible.

This commit also fixes some bugs in the active-message send path:

 - Correctly set all fragment fields in prepare_src.

 - Fix bug when using buffered-send. We were not reading the return
   code correctly (which is in bytes). This resulted in a message
   getting sent multiple times.

 - Don't try to progress sends from the btl_send function when in an
   active-message callback. It could lead to deep recursion and an
   eventual crash if we get a trace like
   send->progress->am_complete->ob1_callback->send->am_complete...

Closes #5820
Closes #5821

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-16 18:28:47 -06:00
Brian Barrett
da1189d771
Merge pull request #5916 from bwbarrett/revert/6acebc4
Revert "Handle error cases in TCP BTL"
2018-10-16 13:54:18 -07:00
Jeff Squyres
cef6cf0ac5 opal: update some string handling
Make various string handling locations a little more robust.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-10-14 16:04:28 -07:00
Brian Barrett
5162011428 Revert "Handle error cases in TCP BTL"
This reverts commit 6acebc40a1.

This patch is causing numerous "Socket closed" messages which are
causing most of the failures on Cisco's MTT run.  See
https://github.com/open-mpi/ompi/issues/5849 for more information.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-12 15:01:54 -07:00
Nathan Hjelm
6ed68da870 btl/uct: use the correct tl interface attributes
It is apparently possible for different instances of the same UCT
transport to have different limits (max short put for example). To
account for this we need to store the attributes per TL context not
per TL. This commit fixes the issue.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-11 11:33:17 -06:00
Brian Barrett
902b57919c btl tcp: Print number of endpoints on error
While trying to debug #3035, it's not clear whether there is
an issue with the modex data or printing the address list.
Print the number of endpoints on the error, which will help
determine which case is happening to Cisco.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-10 08:29:48 -07:00
Brian Barrett
bd07cc6707 btl tcp: Improve initialization debugging
When creating TCP BTL modules, print more information about the
module's ethernet association, including the first address associated
with the device, as debug output.

Fix a flipped output string for IPv4 and IPv6 addresses in the
modex send code.

Add the addresses being published in the modex to the debugging
output in modex send, to help match failures in endpoint match.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-10 08:29:48 -07:00
Nathan Hjelm
39be6ec15c btl/uct: bug fixes and general improvements
This commit updates the uct btl to change the transports parameter
into a priority list. The dc_mlx5, rc_mlx5, and ud transports to the
priority list. This will give better out of the box performance for
multi-threaded codes beacuse the *_mlx5 transports can avoid the mlx5
lock inside libmlx5_rdmav2.

This commit also fixes a number of leaks and a possible deadlock when
using RDMA.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-09 15:15:45 -06:00
Brian Barrett
e9e4d2a4bc Handle asprintf errors with opal_asprintf wrapper
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error.  However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined.  Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
Nathan Hjelm
8291f6722d btl/vader: fix race condition in writing header
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2018-10-05 16:30:06 -06:00
Geoff Paulsen
56a55cdb4b
Merge pull request #5786 from jsquyres/pr/string-madness
Replace strcpy() and strncpy() with (new) opal_string_copy()
2018-10-04 16:12:46 -05:00
Nathan Hjelm
dfa8d3a81a btl/vader: work around Oracle compiler bug
This commit works around an Oracle C compiler bug in 5.15 (not sure
when it was introduced). The bug is triggered when we chain
assignments of atomic variables. Ex:

_Atomic intptr x, y;
intptr_t z = 0;

x = y = z;

Will produce a compiler error of the form:

operand cannot have void type: op "="
assignment type mismatch:
	long "=" void

To work around the issue we are removing the chain assignment and
setting the head and tail on different lines.

Fixes #5814

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-03 08:55:51 -06:00
Nathan Hjelm
66a7dc4c72 btl/vader: ensure the fast box tag is always read first
On some platfoms reading a 64-bit value is non-atomic and it is
possible that the two 32-bit values are read in the wrong order. To
ensure the tag is always read first this commit reads the tag before
reading the full 64-bit value.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-10-03 07:17:32 -06:00
George Bosilca
a3a492b42c
Small pedantic fixes.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-10-02 12:08:18 -04:00
George Bosilca
9164e26e2f
Provide the correct socklen to bind.
Get Brian's patch from #5825 and his log message:
Fix a failure in binding the initiating side of a connection
on MacOS. MacOS doesn't like passing the size of the storage
structure (sockaddr_storage) instead of the expected size of
the structure (sockaddr_in or sockaddr_in6), which was causing
bind() failures. This patch simply changes the structure size
to the expected size.

Add a more clear error message in debug mode.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-10-02 12:06:40 -04:00
Jeff Squyres
5dae086f7e btl/tcp: output the IP address correctly
Per
https://github.com/open-mpi/ompi/issues/3035#issuecomment-426085673,
it looks like the IP address for a given interface is being stashed in
two places: on the endpoint and on the module.

1. On the endpoint, it is storing the moral equivalent of a
   (struct sockaddr_in.sin_addr).
2. On the module, it is storing a full (struct sockaddr_storage).

The call to opal_net_get_hostname() expects a full (struct sockaddr*)
-- not just the stripped-down (struct sockaddr_in.sin_addr).  Hence,
when the original code was passing in the endpoint's (struct
sockaddr_in.sin_addr) and opal_net_get_hostname() was treating it
like a (struct sockaddr), hilarity ensued (i.e., we got the wrong
output).

This commit eliminates the call to opal_net_get_hostname() and just
calls inet_ntop() directly to convert the (struct
sockaddr_in.sin_addr) to a string.

NOTE: Per the github comment cited above, there can be a disparity
between the IP address cached on the endpoint vs. the IP address
cached on the module.  This only happens with interfaces that have
more than one IP address.  This commit does not fix that issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-10-01 16:12:57 -07:00
Jeff Squyres
28e51ad4d4 opal: convert from strncpy() -> opal_string_copy()
In many cases, this was a simple string replace.  In a few places, it
entailed:

1. Updating some comments and removing now-redundant foo[size-1]='\0'
   statements.
2. Updating passing (size-1) to (size) (because opal_string_copy()
   wants the entire destination buffer length).

This commit actually fixes a bunch of potential (yet quite unlikely)
bugs where we could have ended up with non-null-terminated strings.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-09-27 11:56:18 -07:00
Jeff Squyres
6bb356ab87 Squash a bunch of harmless compiler warnings.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-09-26 12:15:21 -07:00
Nathan Hjelm
c6230657a2 opal: fix warning
Fixes #5713

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2018-09-21 00:51:40 -06:00
Brian Barrett
9344afd485 openib: Disable CUDA async by default
Disable async receive for CUDA under OpenIB.  While a performance
optimization, it also causes incorrect results for transfers
larger than the GPUDirect RDMA limit.  This change has been validated
and approved by Akshay.

References #3972

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-09-20 07:58:10 -07:00
Howard Pritchard
dccf780546
Merge pull request #5737 from hppritcha/topic/remove_scif_support
SCIF: remove it
2018-09-19 11:38:27 -06:00
Howard Pritchard
b9ac3d8931 SCIF: remove it
KNC is effectively dead.  Remove corresponding SCIF
support in Open MPI.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-09-19 10:39:52 -06:00
bosilca
b0e2975f86
Merge pull request #5701 from ICLDisco/export/errors_ib
Adding error handling in OpenIB BTL
2018-09-19 09:45:58 -04:00
bosilca
464e1abbab
Merge pull request #5700 from ICLDisco/export/tcp_errors
Handle error cases in TCP BTL
2018-09-19 09:44:38 -04:00
Michael Kuron
17b0f1fcc3 Deal with EOPNOTSUPP returned from getsockopt()
This can be returned when running on QEMU user-mode emulation,
which does not support getsockopt with SO_RCVTIMEO.

Signed-off-by: Michael Kuron <mkuron@icp.uni-stuttgart.de>
2018-09-16 14:55:28 +02:00
Jeff Squyres
8f2620d3af misc: compiler warning fixes
A variety of small compiler warning fixes.  The 2 PMIx fixes are
already committed upstream.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-09-15 06:04:13 -07:00
Nathan Hjelm
6363889cd0
Merge pull request #5696 from hjelmn/vader_5375
btl/vader: ensure that the send tag is always written last
2018-09-14 12:33:35 -06:00
Nathan Hjelm
1071d72130
Merge pull request #5445 from hjelmn/asm_type
Update opal to use C11 atomics if available
2018-09-14 12:32:56 -06:00
Nathan Hjelm
000f9eed4d opal: add types for atomic variables
This commit updates the entire codebase to use specific opal types for
all atomic variables. This is a change from the prior atomic support
which required the use of the volatile keyword. This is the first step
towards implementing support for C11 atomics as that interface
requires the use of types declared with the _Atomic keyword.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:48:55 -06:00
Nathan Hjelm
850fbff441 btl/vader: ensure that the send tag is always written last
To ensure fast box entries are complete when processed by the
receiving process the tag must be written last. This includes a zero
header for the next fast box entry (in some cases). This commit fixes
two instances where the tag was written too early. In one case, on
32-bit systems it is possible for the tag part of the header to be
written before the size. The second instance is an ordering issue. The
zero header was being written after the fastbox header.

Fixes #5375, #5638

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:35:35 -06:00
Selvin Xavier
a53a6f7650 Add support for different Broadcom HCAs
Adds device ids of different Broadcom adapters from
BCM57XXX and BCM58XXX family of HCAs.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
2018-09-12 02:22:34 -07:00
Jeff Squyres
a1b879d176
Merge pull request #5651 from jsquyres/pr/dont-let-openib-yell-about-no-nics
btl/openib: don't complain about no NICs
2018-09-07 15:00:24 -04:00
Howard Pritchard
dc02e54320
Merge pull request #5516 from thananon/ofi_send
btl/ofi: Added 2 sided communication support.
2018-09-06 18:39:23 -06:00
Jeff Squyres
098ec55e37 btl/openib: don't complain about no NICs
Since openib is on its long, slow way out the door, don't let it
complain about not being able to find any NICs at run time.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-09-06 11:26:58 -07:00
Nathan Hjelm
36c206d2d6 btl/uct: add missing opal_mem_hooks_unregister_release call
This commit fixes a bug when using the UCT btl with the UCX memory
hooks disabled. We were misssing a call to
opal_mem_hooks_unregister_release to remove the btl memory hook
callback.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-05 13:01:45 -06:00
Jeff Squyres
fe0852bcb4 Miscellaneous compiler warning stomps.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-24 07:39:14 -07:00
Nathan Hjelm
c74cf666a9 btl/vader: clean up debuging and squash warning
References #5512

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-22 10:57:32 -06:00
Aurélien Bouteiller
e46c907468
Adding error handling in OpenIB BTL
bugfix: major: openib send credits returned correctly after a fault for pending frags to dead processes; also tweak the default IB retry timeouts tomake this happen faster

Make it compile in non-debug builds

Mark the IB endpoint as failed when invoking an error; this resolves UDCM connection deadlocks

Changing the default IB retry timeouts is not a good idea.
We'll need to find another way to speedup credit recovery in failure cases.

Remove ULFM specific cases

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2018-08-16 16:53:17 -04:00
Aurelien Bouteiller
6acebc40a1
Handle error cases in TCP BTL
When an error is returned by the socket operations, trigger the
appropriate error path in the PML to give an opportunity for
rerouting/error handling.

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2018-08-14 15:35:24 -04:00
Nathan Hjelm
dca3516765 btl/vader: move memory barrier to where it belongs
The write memory barrier was intended to precede setting a fast-box
header but instead follows it. This commit moves the memory barrier to
the intended location.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-13 10:14:34 -06:00
George Bosilca
6d11a45f44
Remove few warnings identified by @rhc in #5514.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-08-03 16:21:06 -04:00
Thananon Patinyasakdikul
080115d440 btl/ofi: Added 2 side communication support.
The 2 sided communication support is added for non-tagmatching provider
to take advantage of this BTL and PML OB1. The current state is
"functional" and not optimized for performance.

Two sided support is disabled by default and can be turned on by mca
parameter: "mca_btl_ofi_mode".

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-08-03 12:30:03 -07:00
Nathan Hjelm
47ed8e8830 btl/uct: fix compile warnings/errors
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-07-23 14:04:38 -06:00
Howard Pritchard
4447738098
Merge pull request #5414 from hppritcha/topic/iwarp_only_by_default
btl/openib: only look for iwarp/roce by default
2018-07-17 20:08:00 -06:00