1
1

5 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
6003702a51 Minor improvements to the usnic BTL:
1. Fix ompi_info memory leak in usnic BTL: do not allocate memory in
    the component register function, because ompi_info only calls the
    component register function and then dlclose's the component -- it
    does not call component finalize.  Instead, defer parsing the MCA
    param (and alloc'ing memory) until the component init function so
    that any allocated memory can be freed in the component close
    function.
 1. Also add a new check to ensure that we actually have some part
    numbers to check.  Add a show_help message if we don't find any
    vendor part IDs to check.
 1. Add a verbose output if usnic disqualifies itself from selection
    because THREAD_MULTIPLE was specified.

cmr=v1.7.5:reviewer=dgoodell

This commit was SVN r30073.
2013-12-24 11:57:35 +00:00
Jeff Squyres
74d1278f48 btl_usnic_util.c:ompi_btl_usnic_util_abort() also passes in the strerror().
This commit was SVN r29188.
2013-09-17 12:35:51 +00:00
Reese Faucette
f35d9b50e3 Cisco CSCuj22803: fixes for Bsend
changes required to support MPI_Bsend().  Introduces concept of
attaching a buffer to a large segment that the PML can scribble into and
we will send from.  The reason we don't use a pinned buffer and send
directly from that is that usnic_verbs does not (yes) support num_sge>1
for regular sends.  This means the data gets copied twice, but that is
unavoidable.

changed the logic in handle_large_send to be more sensible

Incorporated David's review comments

This commit was SVN r29184.
2013-09-17 07:27:39 +00:00
Jeff Squyres
87910daf51 Fix a collection of bugs found by QA and Coverity, and make some minor
improvements:

* Fix minor memory leaks during component_init
* Ensure that an initialization loop does not underflow an unsigned int
* Improve mlock limit checking
* Fix set of BTL modules created during component_init when failing to
  get QP resources or otherwise excluding some (but not all) usnic
  verbs devices
* Fix/improve error messages to be consistent with other Cisco
  documentation
* Randomize the initial sliding window sequence number so that we
  silently drop incoming frames from previous jobs that still have
  existant processes in the middle of dying (and are still
  transmitting) 
* Ensure we don't break out of add_procs too soon and create an
  asymetrical view of what interfaces are available

This commit was SVN r28975.
2013-08-01 16:56:15 +00:00
Jeff Squyres
194b285447 First commit of the Cisco usNIC BTL.
This BTL accesses the Cisco usNIC Linux device via the Linux verbs
API via Unreliable Datagram queue pairs.  A few noteworthy points:

 * This BTL does most of its own fragmentation; it tells the PML that
   it has a very high max_send_size (much higher than the network
   MTU).
 * Since UD fragments are, by definition, unreliable, the usnic BTL
   handles all of its own reliability via a sliding window approach
   using the opal_hotel construct and many tricks stolen from the
   corpus of knowledge surrounding efficient TCP.
 * There is a fun PML latency-metric based optimization for NUMA
   awareness of short messages.
 * Note that this is ''not'' a generic UD verbs BTL; it is specific to
   the Cisco usNIC device.

This commit was SVN r28879.
2013-07-19 22:13:58 +00:00