openmpi

Author	SHA1	Message	Date
Dave Goodell	4875f48eaa	usnic: enable UDP support This commit decouples OMPI deployment from the version(s) of the lower layers of the stack by probing for UDP support. Verbs applications assume a 40-byte header (there is no current mechanism for querying payload offset). So to support a 42-byte UDP header without causing existing applications like ibv_ud_pingpong or older versions of OMPI to crash, we must inform libusnic_verbs that we are aware of the nonstandard payload offset. We do this by overriding the `transport_type` field of the device to be 42 before calling `ibv_open_device`. If the library resets it to something else, then we know the lower layers are UDP capable. Otherwise we use the older custom-L2 format. This necessitated some minor ugliness in common_verbs, but it's as tidy as Jeff and I know how to make it right now. This commit only adds support for UDP headers and connectivity over the same L2 network, it does not touch routing or interface pairing. Reviewed-by: Jeff Squyres <jsquyres@cisco.com> cmr=v1.7.5:ticket=trac:4253 This commit was SVN r30838. The following Trac tickets were found above: Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253	2014-02-26 07:44:35 +00:00
Dave Goodell	cadaa1c424	usnic: Shrink sequence numbers to 16 bits Authored-by: Reese Faucette <rfaucett@cisco.com> Reviewed-by: Jeff Squyres <jsquyres@cisco.com> cmr=v1.7.5:ticket=trac:4253 This commit was SVN r30834. The following Trac tickets were found above: Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253	2014-02-26 07:40:10 +00:00
Dave Goodell	707e594d13	usnic: Use INLINE flag more often, saving the DMA is useful. Authored-by: Reese Faucette <rfaucett@cisco.com> Reviewed-by: Jeff Squyres <jsquyres@cisco.com> cmr=v1.7.5:ticket=trac:4253 This commit was SVN r30833. The following Trac tickets were found above: Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253	2014-02-26 07:39:53 +00:00
Dave Goodell	82db913490	usnic: fix module_recv_buffers perf regression Cisco v1.6 git commit 913ec6c and upstream trunk r29593 (segfault fix) introduced a performance regression by inadvertently disabling the `module_recv_buffers` functionality. With those changes in place, the `btl_usnic_recv.c` logic would end up mallocing a buffer that should have otherwise come from a `module_recv_buffers` pool. It also resulted in a small, bounded memory leak (128 buffers at each power-of-two size interval). The new version just places the buffer after the free list item with a flexible array member. I bumped the pool to allocate all 128 elements up front because the deferred allocation was modestly impacting IMB Sendrecv performance at a few sizes. Reviewed-by: Reese Faucette <rfaucett@cisco.com> This commit was SVN r29631. The following SVN revision numbers were found above: r29593 --> open-mpi/ompi@1ed9b8ff43	2013-11-07 01:27:31 +00:00
Dave Goodell	73a943492c	usnic: pack via convertor on the fly If we need to use a convertor, go back to stashing that convertor in the frag and populating segments "on the fly" (in ompi_btl_usnic_module_progress_sends). Previously we would pack into a chain of chunk segments at prepare_src time, unnecessarily consuming additional memory. Reviewed-by: Jeff Squyres <jsquyres@cisco.com> Reviewed-by: Reese Faucette <rfaucett@cisco.com> This commit was SVN r29592.	2013-11-04 22:52:03 +00:00
Dave Goodell	825686a205	usnic: certain send frag members are immutable Ensure that they never are touched by checking in their destructors. Reviewed-by: Jeff Squyres <jsquyres@cisco.com> Reviewed-by: Reese Faucette <rfaucett@cisco.com> This commit was SVN r29589.	2013-11-04 22:51:24 +00:00
Reese Faucette	f35d9b50e3	Cisco CSCuj22803: fixes for Bsend changes required to support MPI_Bsend(). Introduces concept of attaching a buffer to a large segment that the PML can scribble into and we will send from. The reason we don't use a pinned buffer and send directly from that is that usnic_verbs does not (yet) support num_sge>1 for regular sends. This means the data gets copied twice, but that is unavoidable. Changed the logic in handle_large_send to be more sensible. Incorporated David's review comments. This commit was SVN r29184.	2013-09-17 07:27:39 +00:00
Reese Faucette	25b5c84d0f	Cisco CSCuj13135: Data corruption in MPI_Bsend_ator_c Do not assume that the "size" passed to alloc_send() will be the same as the size of the message the resulting fragment will hold when usnic_send() is called. This means usnic_send()/usnic_put() can never trust any pre-computed size values, and are only allowed to look at the lengths and pointers of the elements in the desc SG list. This commit was SVN r29183.	2013-09-17 07:25:05 +00:00
Dave Goodell	a669bd01e6	usnic: revamp convertor handling. The fix for the HPL SEGV was incorrect because it assumed the prepare_src() routine was always allowed to return "bytes processed" less than the requested "bytes to send". It turns out this is only true if the convertor is what limits the size; we are not allowed to limit the data sent for our own reasons, else we break logic in the upper layers. This means we need to learn the number of bytes out of the size requested the convertor will give us, no matter how big the size is. Unfortunately, this is a destructive test, and (currently) the only way to learn that number is to actually have the convertor copy the data out into buffers. This change implements this, copying the entire data out into a chain of send segments which are attached to the large send fragment. Now we can always return the proper size value to the PML. Fixes Cisco bug CSCuj08024 Authored-by: Reese Faucette <rfaucett@cisco.com> Should be included in usnic v1.7.3 roll-up CMR (refs trac:3760) This commit was SVN r29137. The following Trac tickets were found above: Ticket 3760 --> https://svn.open-mpi.org/trac/ompi/ticket/3760	2013-09-06 03:21:21 +00:00
Dave Goodell	9cab9777d9	usnic: properly destroy embedded small send frag Without this, an `--enable-debug` build would hit an assertion in the list code when run under valgrind with `--malloc-fill=0xff` or any other case where malloc returned non-zeroed buffers. Also allow the normal OBJ_ machinery to handle the constructor invocation ordering for us instead of doing it by hand (which could have led to future bugs). Reviewed-by: jsquyres@cisco.com cmr=v1.7.4 Depends on trunk functionality in r29095 and r29096. Refs trac:3740,#3741. This commit was SVN r29127. The following SVN revision numbers were found above: r29095 --> open-mpi/ompi@d1b5940e97 r29096 --> open-mpi/ompi@a552921171 The following Trac tickets were found above: Ticket 3740 --> https://svn.open-mpi.org/trac/ompi/ticket/3740	2013-09-04 20:59:12 +00:00
Jeff Squyres	4b6006402d	Use the RTE framework instead of calling ORTE directly. Brian (rightfully) hit me on the head with the don't-use-ORTE-use-the-rte-framework clue bat; the usnic BTL now nicely plays with the RTE framework. This commit was SVN r28907.	2013-07-22 17:28:23 +00:00
Jeff Squyres	194b285447	First commit of the Cisco usNIC BTL. This BTL accesses the Cisco usNIC Linux device via the Linux verbs API via Unreliable Datagram queue pairs. A few noteworthy points: * This BTL does most of its own fragmentation; it tells the PML that it has a very high max_send_size (much higher than the network MTU). * Since UD fragments are, by definition, unreliable, the usnic BTL handles all of its own reliability via a sliding window approach using the opal_hotel construct and many tricks stolen from the corpus of knowledge surrounding efficient TCP. * There is a fun PML latency-metric based optimization for NUMA awareness of short messages. * Note that this is ''not'' a generic UD verbs BTL; it is specific to the Cisco usNIC device. This commit was SVN r28879.	2013-07-19 22:13:58 +00:00

12 Commits
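
Note on the 16-bit sequence numbers (cadaa1c424) and the sliding-window reliability described in the first commit (194b285447): once sequence numbers are only 16 bits wide, they wrap around quickly, so ordering and window checks have to be done modulo 2^16 rather than with plain `<`. The sketch below illustrates that general technique. It is not taken from the usnic BTL sources; the names `seq_t`, `seq_lt`, and `seq_in_window` are hypothetical, and the window-size limit is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type name; the real BTL may use something different. */
typedef uint16_t seq_t;

/*
 * Returns true if a precedes b in wraparound order, i.e. if
 * (b - a) mod 2^16 lies in 1..32767. The explicit cast back to
 * uint16_t is needed because a and b are promoted to int first.
 */
static inline bool seq_lt(seq_t a, seq_t b)
{
    uint16_t d = (uint16_t)(b - a);
    return d != 0 && d < 0x8000u;
}

/*
 * Returns true if seq falls inside the window [base, base + window),
 * computed modulo 2^16. Assumption of this sketch: the window must be
 * smaller than half the sequence space, otherwise old and new
 * sequence numbers become ambiguous.
 */
static inline bool seq_in_window(seq_t seq, seq_t base, uint16_t window)
{
    assert(window > 0 && window < 0x8000u);
    return (uint16_t)(seq - base) < window;
}
```

With these helpers, sequence number 65535 is treated as preceding 0, and a window of 16 slots starting at 65530 contains sequence number 2.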