1
1

67 Коммитов

Автор SHA1 Сообщение Дата
William Zhang
4ebb37a26c opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
Due to IF_NAMESIZE being a reused and conditionally defined macro,
issues could arise from macro mismatches. In particular, in cases where
opal/util/if.h is included, but net/if.h is not, IF_NAMESIZE will be 32.
If net/if.h is included on Linux systems, IF_NAMESIZE will be 16. This
can cause a mismatch when using the same macro on a system. Thus
different parts of the code can have differring ideas on the size of a
structure containing a char name[IF_NAMESIZE]. To avoid this error case,
we avoid reusing the IF_NAMESIZE macro and instead define our own as
OPAL_IF_NAMESIZE.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-29 21:24:39 +00:00
bosilca
b54fdf5dd9
Merge pull request #6541 from bwbarrett/bugfix/enotconn
btl/tcp: Skip printing error message in racy cleanup path
2019-03-28 22:42:52 -04:00
Brian Barrett
d5360711fa btl/tcp: Skip printing error message in racy cleanup path
Avoid printing an error message about ENOTCONN return codes from
getpeername() when handling an incoming connection request.  At
this point in the receive state machine, the remote process has
been verified to be a valid OMPI instance.  In all-to-all startup
at 4k rank scale, we're seeing this error message when the remote
side drops the connection because it realizes it's the "loser"
in the connection race.  We were already doing all the right things,
other than printing a scary error message.  So skip the error
message and call it good.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2019-03-28 23:12:35 +00:00
Dmitry Gladkov
9920da4992 btl/tcp: Fix copy-paste misprint
Signed-off-by: Dmitry Gladkov <dmitrygla@mellanox.com>
2019-02-20 11:18:02 +02:00
Brian Barrett
457f058e73 btl tcp: Simplify modex address selection
Simplify selection of the address to publish for a given BTL TCP
module in the module exchange code.  Rather than looping through
all IP addresses associated with a node, looking for one that
matches the kindex of a module, loop over the modules and
use the address stored in the module structure.  This also
happens to be the address that the source will use to bind()
in a connect() call, so this should eliminate any confusion
(read: bugs) when an interface has multiple IPs associated with
it.

Refs #5818

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-18 02:23:21 +00:00
Brian Barrett
4f19221af2 btl tcp: Simplify module address storage
Today, a btl tcp module is associated with exactly one IP
address (IPv4 or IPv6).  There's no need to reserve space
for both an IPv4 and IPv6 address in the module structure,
since the module will only be associated with one or the
other.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-17 16:21:17 +00:00
Brian Barrett
2acc4b7e7f btl tcp: Add workaround for "dropped connection" issue
Work around a race condition in the TCP BTL's proc setup code.
The Cisco MTT results have been failing on TCP tests due to a
"dropped connection" message some percentage of the time.
Some digging shows that the issue happens in a combination of
multiple NICs and multiple threads.  The race is detailed in
https://github.com/open-mpi/ompi/issues/3035#issuecomment-429500032.

This patch doesn't fix the race, but avoids it by forcing
the MPI layer to complete all calls to add_procs across the
entire job before any process leaves MPI_INIT.  It also
reduces the scalability of the TCP BTL by increasing start-up
time, but better than hanging.

The long term fix is to do all endpoint setup in the first
call to add_procs for a given remote proc, removing the
race.  THis patch is a work around until that patch can
be developed.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-16 18:33:30 -07:00
Brian Barrett
bd07cc6707 btl tcp: Improve initialization debugging
When creating TCP BTL modules, print more information about the
module's ethernet association, including the first address associated
with the device, as debug output.

Fix a flipped output string for IPv4 and IPv6 addresses in the
modex send code.

Add the addresses being published in the modex to the debugging
output in modex send, to help match failures in endpoint match.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-10 08:29:48 -07:00
Brian Barrett
e9e4d2a4bc Handle asprintf errors with opal_asprintf wrapper
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error.  However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined.  Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
George Bosilca
a3a492b42c
Small pedantic fixes.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-10-02 12:08:18 -04:00
Michael Kuron
17b0f1fcc3 Deal with EOPNOTSUPP returned from getsockopt()
This can be returned when running on QEMU user-mode emulation,
which does not support getsockopt with SO_RCVTIMEO.

Signed-off-by: Michael Kuron <mkuron@icp.uni-stuttgart.de>
2018-09-16 14:55:28 +02:00
Jeff Squyres
f200b866df btl/tcp: roll back parts of 40afd525f8
Some of the show_help() messages that were added in 40afd525f8 were
really normal / expected behavior (e.g., if 2 peers connect in the TCP
BTL more-or-less simultaneously, one of them will drop the connection
-- no need to show_help() about this; it's expected behavior).  Roll
back these messages to be opal_output_verbose() kinds of messages.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-04-07 12:28:10 -07:00
Jeff Squyres
8c419294a8 btl/tcp: fix CID 710596
sizeof(addrs[0].addr_inet)==16 (so that it can handle IPv6 addresses),
but the memory that we are copying from (my_ss->sin_addr) is only 4
bytes long.  Don't copy beyond the end of that source buffer.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:22 -07:00
Jeff Squyres
40afd525f8 btl/tcp: make error messages more specific
Convert some verbose messages to opal_show_help() messages.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-21 19:34:03 -07:00
Jordan Cherry
d7e7e3acb7 tcp btl: Fix multiple-link connection establishment.
Fix case where the btl_tcp_links MCA parameter is used to create multiple TCP connections between peers.
    Three issues were resulting in hangs during large message transfer:
      * The 2nd..btl_tcp_link connections were dropped during establishment because the per-process
        address check was binary, rather than a count
      * The accept handler would not skip a btl module that was already in use, resulting in all
        connections for a given address being vectored to a single btl
      * Multiple addresses in the same subnet caused connections to be
        stalled, as the receiver would always use the same (first) address
        found.  Binding the outgoing connection solves this issue
     *  Lastly fix race condition created by connections being started at the exact same time
        by accpeting connections not in the closed state, allowing endpoint_accept to resolve
        dispute

    Signed-off-by: Jordan Cherry <cherryj@amazon.com>
2018-02-27 16:36:44 +00:00
Mohan Gandhi
6d642e8d94 Btl tcp: Fix racing condition on simultaneous handshake
Their is racing condition in TCP connection establishment
during simultaneous handshake. This PR handles the fix for
it.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-10-03 13:13:43 -07:00
George Bosilca
d10522a01c
Set a hard limit on the TCP max fragment size.
Some OSes have hardcoded limits to prevent overflowing over an int32_t.
We can either detect this at configure (which might be a nicer but
incomplete solution), or always force the pipelined protocol over TCP.
As it only covers data larger than 1GB, no performance penalty is to be
expected.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
George Bosilca
50f471e31e
Cleanup a set of warnings reported by Ralph.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-22 23:00:18 -04:00
Mohan
fc32ae401e Btl Tcp: Updated tcp handshake methods
This commit has two changes

1. Adding magic string during handshake can cause
issue when used with older version of MPI. Hence set
RCVTIMEO paramter to 2 second
2. Using single call during handshake instead of
two calls

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-18 10:06:52 -07:00
Mohan
e3dfe11da9 Btl tcp: Improving verbose around tcp
As part of improvement towards tcp btl we
are improving verbose in general

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 17:22:16 -07:00
Mohan
4bc7b214dc Btl tcp: Improving verbose around IPV6
As part of improvement around tcp btl debugging
& verbose. we are improving verbose around IPV6

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
0741fad479 Btl tcp: BTL_ERROR to show_help & update func behaviour
As part of improvement towards tcp debugging
we are moving few BTL_ERROR to show_help and also
update the function behaviour of
mca_btl_tcp_endpoint_complete_connect to return
SUCCESS and ERROR cases.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
368f9f0dfc Btl tcp: Using magic string to verify mpi connection
As part of improvement towards handling failure case
in btl tcp we are using magic string to verify mpi
connection. In case if there is mismatch or missing
magic string we can identify that we are trying to
connect with someother process.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:13 -07:00
Brian Barrett
3b991498be btl tcp: Don't set socket buffer size by default
Set the default send and receive socket buffer size to 0,
which means Open MPI will not try to set a buffer size during
startup.

The default behavior since near day one of the TCP BTL has been
to set the send and receive socket buffer sizes to 128 KiB.  A
number that works great on 1 GbE, but not so great on 10 GbE
fabrics of any real size.  Modern TCP stacks, particularly on
Linux, have gotten much smarter about buffer sizes and are much
less efficient if a buffer size is set (even if set to something
large).

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-04-28 14:14:49 -07:00
Gilles Gouaillardet
c62498ab3d btl/tcp: remove reference to just removed tcp_local
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-07 09:32:09 +09:00
Gilles Gouaillardet
7e5da7382e btl/tcp: plug leaks when closing component
remove tcp_local from the tcp_procs table, and release it

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 11:35:59 +09:00
George Bosilca
d0dddef53d
Protect the tcp_endpoints list from concurrent accesses.
Thanks Gilles for your help.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2016-11-11 00:06:03 -05:00
Gilles Gouaillardet
a49422fe84 btl/tcp: get rid of the MCA_BTL_TCP_SUPPORT_PROGRESS_THREAD macro
since pthreads are now mandatory, the MCA_BTL_TCP_SUPPORT_PROGRESS_THREAD
is always true and hence can be safely removed

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-08 14:00:05 +09:00
Nathan Hjelm
a681837ba8 btl/tcp: fix double list remove
This commit fixes an abort during finalize because pending events were
removed from the list twice.

References #2030

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-13 09:23:12 -06:00
George Bosilca
3adff9d323 Fixes #1793.
Reshape the tearing down process (connection close) to prevent race
conditions between the main thread and the progress thread.

Minor cleanups.
2016-08-24 22:45:19 -04:00
George Bosilca
d2abff583e Fix race condition during BTL TCP tear-down.
bot🏷️bug
bot:assign:@hjelmn
2016-05-30 10:47:14 -05:00
Karol Mroz
ca6ddf3270 btl/tcp: autodetect bandwidth and latency if unset
Fixes open-mpi/ompi#120

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:52 +02:00
Karol Mroz
b9c6c43c6b btl/tcp: add default defines for bandwidth and latency
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:52 +02:00
Ralph Castain
7eca2f9650 Add missing include 2016-03-30 01:34:01 -07:00
Gilles Gouaillardet
e852d85cc1 btl/tcp: add missing mca_btl_tcp_dump() subroutine 2016-03-30 16:10:15 +09:00
Ralph Castain
f70c5c495b Tsk...tsk...replace references to ompi values with opal 2016-03-29 13:35:43 -07:00
George Bosilca
d0165818b3 Initialize all common symbols. 2016-03-29 16:08:27 -04:00
George Bosilca
f69eba1bc4 Update the copyright and cleanup the code.
Per @jsquyres suggestion remove all trailing spaces.
Credit to `sed -i.bak 's/ *$//' */[ch]`.
2016-03-28 14:41:01 -04:00
Thananon Patinyasakdikul
92062492b9 Enable Threading in the BTL TCP
Added mca parameter to turn progress thread on/off
Add a flag to check if we have btl progress thread.
Added macro for ob1 matching lock.
Update the AUTHORS file.
2016-03-28 14:41:01 -04:00
George Bosilca
32277db6ab Add support for async progress in the BTL TCP.
All BTL-only operations (basically all data movements
with the exception of the matching operation) can now
be handled for the TCP BTL by a progress thread.
2016-03-28 14:40:50 -04:00
Nathan Hjelm
6f8f2325ed btl: btls are now required to set the send flag if supported
This commit updates each non-compliant btl to send the
MCA_BTL_FLAGS_SEND flag in the btl_flags field if send is
supported. This fixes a problem identified after the latest bml/r2
update which excplicitly checks for the send flag.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:54 -06:00
Jeff Squyres
bc9e5652ff whitespace: purge whitespace at end of lines
Generated by running "./contrib/whitespace-purge.sh".
2015-09-08 09:47:17 -07:00
Ralph Castain
cf6137b530 Integrate PMIx 1.0 with OMPI.
Bring Slurm PMI-1 component online
Bring the s2 component online

Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways.

Bring the OMPI pubsub/pmi component online

Get comm_spawn working again

Ensure we always provide a cpuset, even if it is NULL

pmix/cray: adjust cray pmix component for pmix

Make changes so cray pmix can work within the integrated
ompi/pmix framework.

Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet

Cleanup comm_spawn - procs now starting, error in connect_accept

Complete integration
2015-08-29 16:04:10 -07:00
Ralph Castain
683efcb850 Rename the current opal_event_base to opal_sync_event_base in preparation for adding an async progress thread to opal. No functional changes made here - just a simple rename. 2015-07-11 10:08:19 -07:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Jeff Squyres
4b59be4e4c btl tcp: cosmetic changes and updates
No logic changes.

Update some stale/incorrect comments, fix some indenting and style.
2015-06-06 10:17:20 -07:00
Gilles Gouaillardet
e1cc931e1b btl/tcp: silence CID 710616 2015-03-05 14:20:08 +09:00
Rolf vandeVaart
e48bc77342 Fix the coverity fix 2015-02-27 12:49:44 -05:00
Gilles Gouaillardet
f33cd58ee9 btl/tcp: fix misc memory leaks
as reported by Coverity with CIDs 710615, 710616 and 710618
2015-02-27 19:16:22 +09:00
George Bosilca
778ba0317e Revert "Minor cleanups."
This reverts commit 3b4da0bda4c3a29e4b20639825ce1642df157b2a.
2015-02-26 17:53:58 -05:00