openmpi

Author	SHA1	Message	Date
William Zhang	8c3b8a87c5	btl tcp: Fix error path memory leak After the OPAL_MODEX_RECV call, remote_addrs was not freed in the error path. Moved the free call into cleanup to ensure we always free this memory before leaving the function. Signed-off-by: William Zhang <wilzhang@amazon.com>	2019-07-15 22:35:04 +00:00
bosilca	b54fdf5dd9	Merge pull request #6541 from bwbarrett/bugfix/enotconn btl/tcp: Skip printing error message in racy cleanup path	2019-03-28 22:42:52 -04:00
Brian Barrett	d5360711fa	btl/tcp: Skip printing error message in racy cleanup path Avoid printing an error message about ENOTCONN return codes from getpeername() when handling an incoming connection request. At this point in the receive state machine, the remote process has been verified to be a valid OMPI instance. In all-to-all startup at 4k rank scale, we're seeing this error message when the remote side drops the connection because it realizes it's the "loser" in the connection race. We were already doing all the right things, other than printing a scary error message. So skip the error message and call it good. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2019-03-28 23:12:35 +00:00
Dmitry Gladkov	9920da4992	btl/tcp: Fix copy-paste misprint Signed-off-by: Dmitry Gladkov <dmitrygla@mellanox.com>	2019-02-20 11:18:02 +02:00
Brian Barrett	a1e85b03aa	btl tcp: Fix compile error in IPv6 In 457f058 I broke the TCP BTL with --enable-ipv6. This patch fixes the compile error, so IPv6 works again. Fixed #5996 Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-30 12:31:04 -07:00
Brian Barrett	457f058e73	btl tcp: Simplify modex address selection Simplify selection of the address to publish for a given BTL TCP module in the module exchange code. Rather than looping through all IP addresses associated with a node, looking for one that matches the kindex of a module, loop over the modules and use the address stored in the module structure. This also happens to be the address that the source will use to bind() in a connect() call, so this should eliminate any confusion (read: bugs) when an interface has multiple IPs associated with it. Refs #5818 Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-18 02:23:21 +00:00
Brian Barrett	4f19221af2	btl tcp: Simplify module address storage Today, a btl tcp module is associated with exactly one IP address (IPv4 or IPv6). There's no need to reserve space for both an IPv4 and IPv6 address in the module structure, since the module will only be associated with one or the other. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-17 16:21:17 +00:00
Brian Barrett	2acc4b7e7f	btl tcp: Add workaround for "dropped connection" issue Work around a race condition in the TCP BTL's proc setup code. The Cisco MTT results have been failing on TCP tests due to a "dropped connection" message some percentage of the time. Some digging shows that the issue happens in a combination of multiple NICs and multiple threads. The race is detailed in https://github.com/open-mpi/ompi/issues/3035#issuecomment-429500032. This patch doesn't fix the race, but avoids it by forcing the MPI layer to complete all calls to add_procs across the entire job before any process leaves MPI_INIT. It also reduces the scalability of the TCP BTL by increasing start-up time, but better than hanging. The long term fix is to do all endpoint setup in the first call to add_procs for a given remote proc, removing the race. This patch is a workaround until that patch can be developed. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-16 18:33:30 -07:00
Brian Barrett	da1189d771	Merge pull request #5916 from bwbarrett/revert/6acebc4 Revert "Handle error cases in TCP BTL"	2018-10-16 13:54:18 -07:00
Jeff Squyres	cef6cf0ac5	opal: update some string handling Make various string handling locations a little more robust. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-10-14 16:04:28 -07:00
Brian Barrett	5162011428	Revert "Handle error cases in TCP BTL" This reverts commit 6acebc40a194c92ab38a28553c2c8b04eb391820. This patch is causing numerous "Socket closed" messages which are causing most of the failures on Cisco's MTT run. See https://github.com/open-mpi/ompi/issues/5849 for more information. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-12 15:01:54 -07:00
Brian Barrett	902b57919c	btl tcp: Print number of endpoints on error While trying to debug #3035, it's not clear whether there is an issue with the modex data or printing the address list. Print the number of endpoints on the error, which will help determine which case is happening to Cisco. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-10 08:29:48 -07:00
Brian Barrett	bd07cc6707	btl tcp: Improve initialization debugging When creating TCP BTL modules, print more information about the module's ethernet association, including the first address associated with the device, as debug output. Fix a flipped output string for IPv4 and IPv6 addresses in the modex send code. Add the addresses being published in the modex to the debugging output in modex send, to help match failures in endpoint match. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-10 08:29:48 -07:00
Brian Barrett	e9e4d2a4bc	Handle asprintf errors with opal_asprintf wrapper The Open MPI code base assumed that asprintf always behaved like the FreeBSD variant, where ptr is set to NULL on error. However, the C standard (and Linux) only guarantee that the return code will be -1 on error and leave ptr undefined. Rather than fix all the usage in the code, we use opal_asprintf() wrapper instead, which guarantees the BSD-like behavior of ptr always being set to NULL. In addition to being correct, this will fix many, many warnings in the Open MPI code base. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-08 16:43:53 -07:00
George Bosilca	a3a492b42c	Small pedantic fixes. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-10-02 12:08:18 -04:00
George Bosilca	9164e26e2f	Provide the correct socklen to bind. Get Brian's patch from #5825 and his log message: Fix a failure in binding the initiating side of a connection on MacOS. MacOS doesn't like passing the size of the storage structure (sockaddr_storage) instead of the expected size of the structure (sockaddr_in or sockaddr_in6), which was causing bind() failures. This patch simply changes the structure size to the expected size. Add a more clear error message in debug mode. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-10-02 12:06:40 -04:00
Jeff Squyres	5dae086f7e	btl/tcp: output the IP address correctly Per https://github.com/open-mpi/ompi/issues/3035#issuecomment-426085673, it looks like the IP address for a given interface is being stashed in two places: on the endpoint and on the module. 1. On the endpoint, it is storing the moral equivalent of a (struct sockaddr_in.sin_addr). 2. On the module, it is storing a full (struct sockaddr_storage). The call to opal_net_get_hostname() expects a full (struct sockaddr*) -- not just the stripped-down (struct sockaddr_in.sin_addr). Hence, when the original code was passing in the endpoint's (struct sockaddr_in.sin_addr) and opal_net_get_hostname() was treating it like a (struct sockaddr), hilarity ensued (i.e., we got the wrong output). This commit eliminates the call to opal_net_get_hostname() and just calls inet_ntop() directly to convert the (struct sockaddr_in.sin_addr) to a string. NOTE: Per the github comment cited above, there can be a disparity between the IP address cached on the endpoint vs. the IP address cached on the module. This only happens with interfaces that have more than one IP address. This commit does not fix that issue. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-10-01 16:12:57 -07:00
bosilca	464e1abbab	Merge pull request #5700 from ICLDisco/export/tcp_errors Handle error cases in TCP BTL	2018-09-19 09:44:38 -04:00
Michael Kuron	17b0f1fcc3	Deal with EOPNOTSUPP returned from getsockopt() This can be returned when running on QEMU user-mode emulation, which does not support getsockopt with SO_RCVTIMEO. Signed-off-by: Michael Kuron <mkuron@icp.uni-stuttgart.de>	2018-09-16 14:55:28 +02:00
Jeff Squyres	fe0852bcb4	Miscellaneous compiler warning stomps. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-08-24 07:39:14 -07:00
Aurelien Bouteiller	6acebc40a1	Handle error cases in TCP BTL When an error is returned by the socket operations, trigger the appropriate error path in the PML to give an opportunity for rerouting/error handling. Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2018-08-14 15:35:24 -04:00
Jeff Squyres	7b0dd03e92	tcp/btl: fix a cast The current cast is functional, but isn't really the way it should be done. This commit makes the cast the way it should be done. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-29 07:25:46 -07:00
Jeff Squyres	57bc657e7f	btl/tcp: fix hash map usage Fix two facepalms: 1. The "uint32" in the hash map functions refer to the key size, not the value size. The values are always 64 bits. 2. Pass the straight value to the "set" functions -- not the pointer to the value. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-28 15:29:41 -07:00
Ralph Castain	0ddbc75ce5	Merge pull request #4930 from kizill/fix-ipv6 fixed ipv6 OOB connection problems (fix issue #1585)	2018-06-26 09:13:53 -07:00
Jeff Squyres	3767ce27c0	btl/tcp: trivial whitespace clean No code/logic changes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-23 08:04:12 -07:00
Jeff Squyres	9034717876	btl/tcp: use a hash map for kernel IP interface indexes The giant size of the TCP proc struct is causing a problem in some environments (because it is allocated on the stack), and it was too big, anyway. Instead, use a hash map. That way, it starts small and can grow if it needs to. It also makes no assumptions about the values of the kernel interface indexes. Fixes #5292. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-23 08:03:30 -07:00
George Bosilca	6ff11267fb	Remove warnings identified by clang. Plus minor spacing and indentation issues. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-04-14 17:14:12 -04:00
Jeff Squyres	f200b866df	btl/tcp: roll back parts of 40afd525f8 Some of the show_help() messages that were added in 40afd525f8 were really normal / expected behavior (e.g., if 2 peers connect in the TCP BTL more-or-less simultaneously, one of them will drop the connection -- no need to show_help() about this; it's expected behavior). Roll back these messages to be opal_output_verbose() kinds of messages. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-04-07 12:28:10 -07:00
Jeff Squyres	8c419294a8	btl/tcp: fix CID 710596 sizeof(addrs[0].addr_inet)==16 (so that it can handle IPv6 addresses), but the memory that we are copying from (my_ss->sin_addr) is only 4 bytes long. Don't copy beyond the end of that source buffer. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:22 -07:00
Jeff Squyres	a17f4afdc7	btl/tcp: fix CID 1416634 Fix resource leak in the TCP BTL. Also add a little defensive programming. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	40afd525f8	btl/tcp: make error messages more specific Convert some verbose messages to opal_show_help() messages. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-21 19:34:03 -07:00
Stanislav Kirillov	c2bfca19ba	fix ipv6 missing copypaste Signed-off-by: Stanislav Kirillov <staskirillof@yandex.ru>	2018-03-21 02:25:04 +03:00
Jordan Cherry	d7e7e3acb7	tcp btl: Fix multiple-link connection establishment. Fix case where the btl_tcp_links MCA parameter is used to create multiple TCP connections between peers. Three issues were resulting in hangs during large message transfer: * The 2nd..btl_tcp_link connections were dropped during establishment because the per-process address check was binary, rather than a count * The accept handler would not skip a btl module that was already in use, resulting in all connections for a given address being vectored to a single btl * Multiple addresses in the same subnet caused connections to be stalled, as the receiver would always use the same (first) address found. Binding the outgoing connection solves this issue * Lastly fix race condition created by connections being started at the exact same time by accepting connections not in the closed state, allowing endpoint_accept to resolve dispute Signed-off-by: Jordan Cherry <cherryj@amazon.com>	2018-02-27 16:36:44 +00:00
Mohan Gandhi	6d642e8d94	Btl tcp: Fix racing condition on simultaneous handshake There is a race condition in TCP connection establishment during simultaneous handshake. This PR handles the fix for it. Signed-off-by: Mohan Gandhi <mohgan@amazon.com>	2017-10-03 13:13:43 -07:00
bosilca	ab68aced23	Merge pull request #3738 from bosilca/topic/tcp_event_count Fix the TCP performance impact when BTL not used	2017-09-19 23:08:58 -04:00
George Bosilca	d10522a01c	Set a hard limit on the TCP max fragment size. Some OSes have hardcoded limits to prevent overflowing over an int32_t. We can either detect this at configure (which might be a nicer but incomplete solution), or always force the pipelined protocol over TCP. As it only covers data larger than 1GB, no performance penalty is to be expected. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-09-01 18:52:48 -04:00
George Bosilca	c340da2586	A first cut at the large data problem with TCP. As long as the writev and readv support a sum larger than a uint32_t this version will work. For the other OSes a different patch is required. This patch is a slight modification of the one proposed by @ggouaillardet. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-09-01 18:52:48 -04:00
Josh Hursey	ad87aa2674	Merge pull request #4121 from jjhursey/explore/dlopen-local mca: Dynamic components link against project lib	2017-08-25 13:15:51 -05:00
Joshua Hursey	e1d079544b	mca: Dynamic components link against project lib * Resolves #3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la" ``` Note: The changes in this commit were automated by the script in the commit that proceeds it with the `libadd_mca_comp_update.py` script. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-08-24 11:56:16 -04:00
George Bosilca	50f471e31e	Cleanup a set of warnings reported by Ralph. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-08-22 23:00:18 -04:00
Mohan	fc32ae401e	Btl Tcp: Updated tcp handshake methods This commit has two changes 1. Adding magic string during handshake can cause issue when used with older version of MPI. Hence set RCVTIMEO parameter to 2 seconds 2. Using single call during handshake instead of two calls Signed-off-by: Mohan Gandhi <mohgan@amazon.com>	2017-08-18 10:06:52 -07:00
Mohan	e3dfe11da9	Btl tcp: Improving verbose around tcp As part of improvement towards tcp btl we are improving verbose in general Signed-off-by: Mohan Gandhi <mohgan@amazon.com>	2017-08-17 17:22:16 -07:00
Mohan	4bc7b214dc	Btl tcp: Improving verbose around IPV6 As part of improvement around tcp btl debugging & verbose. we are improving verbose around IPV6 Signed-off-by: Mohan Gandhi <mohgan@amazon.com>	2017-08-17 16:45:14 -07:00
Mohan	0741fad479	Btl tcp: BTL_ERROR to show_help & update func behaviour As part of improvement towards tcp debugging we are moving few BTL_ERROR to show_help and also update the function behaviour of mca_btl_tcp_endpoint_complete_connect to return SUCCESS and ERROR cases. Signed-off-by: Mohan Gandhi <mohgan@amazon.com>	2017-08-17 16:45:14 -07:00
Mohan	368f9f0dfc	Btl tcp: Using magic string to verify mpi connection As part of improvement towards handling failure case in btl tcp we are using magic string to verify mpi connection. In case if there is mismatch or missing magic string we can identify that we are trying to connect with some other process. Signed-off-by: Mohan Gandhi <mohgan@amazon.com>	2017-08-17 16:45:13 -07:00
Mohan	c30a42917c	Btl tcp: Refactoring non-blocking send/receive function Moving non-blocking send/receive function to btl_tcp will help reusing these function wherever needed. In this case we plan to reuse receive function to retrieve magic string to validate established connection is from mpi process. Signed-off-by: Mohan Gandhi <mohgan@amazon.com>	2017-08-17 16:45:13 -07:00
Gilles Gouaillardet	32606ad476	btl/tcp: fix heterogeneous support for put / large messages Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-07-12 10:27:45 +09:00
George Bosilca	bd5650d680	Fix the TCP performance impact when not used Based on an idea from Brian move the libevent trigger update to a later stage instead of the generic add/del procs. So, we are doing the increment/decrement when we register the recv handler for an endpoint, so basically when we create and connect a socket to a peer. The benefit is that as long as TCP is not used, there should be no impact on the performance of other BTLs. The drawback is that the first TCP connection will be slightly slower, but then once we have a peer connected over TCP things go back to normal. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-06-23 11:15:45 +02:00
Ralph Castain	a737d0f963	Merge pull request #3430 from bosilca/topic/tcp_hostname Use the OPAL function to get the hostname.	2017-05-03 06:42:02 -07:00
Brian Barrett	3b991498be	btl tcp: Don't set socket buffer size by default Set the default send and receive socket buffer size to 0, which means Open MPI will not try to set a buffer size during startup. The default behavior since near day one of the TCP BTL has been to set the send and receive socket buffer sizes to 128 KiB. A number that works great on 1 GbE, but not so great on 10 GbE fabrics of any real size. Modern TCP stacks, particularly on Linux, have gotten much smarter about buffer sizes and are much less efficient if a buffer size is set (even if set to something large). Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-04-28 14:14:49 -07:00


142 commits