The Cisco-maintained v1.6 port of the usnic BTL has diverged from the
upstream trunk and v1.7 branches. This commit adjusts the trunk to more
closely match the v1.6 branch to simplify future merging and
cherry-picking.
The usnic MCA parameters also need work on this side.
Should be included in usnic v1.7.3 roll-up CMR (refs trac:3760)
This commit was SVN r29138.
The following Trac tickets were found above:
Ticket 3760 --> https://svn.open-mpi.org/trac/ompi/ticket/3760
Authored-by: Reese Faucette <rfaucett@cisco.com>
Should be included in usnic v1.7.3 roll-up CMR (refs trac:3760)
This commit was SVN r29136.
Should be included in usnic v1.7.3 roll-up CMR (refs trac:3760)
This commit was SVN r29135.
Should be included in usnic v1.7.3 roll-up CMR (refs trac:3760)
This commit was SVN r29134.
- round segment buffer allocation to a cache-line boundary
- split some routines into an inline fast section and a called
  slower section
- introduce receive fastpath in component_progress that:
  o returns immediately if there is a packet available on the
    priority queue and fastpath is enabled
  o disables fastpath for one call after each use to provide
    fairness to other processing
  o defers receive buffer posting
  o defers bookkeeping for receive until the next call
    to usnic_component_progress
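
The fastpath behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the actual usnic BTL code: every name here (demo_module_t, demo_progress, demo_progress_slow) is hypothetical, and the priority queue is modelled as a simple counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-module state; not the real usnic structs. */
typedef struct {
    int  prio_pending;      /* packets waiting on the priority queue */
    int  deferred_recvs;    /* receives whose bookkeeping is still pending */
    bool fastpath_enabled;  /* cleared for one call after each fastpath hit */
    int  buffers_posted;    /* receive buffers reposted so far */
} demo_module_t;

/* Called slow section: finish deferred bookkeeping, repost buffers,
 * drain whatever is left, then re-arm the fastpath. */
static int demo_progress_slow(demo_module_t *m)
{
    int handled = 0;

    /* Work that the previous fastpath call postponed. */
    m->buffers_posted += m->deferred_recvs;
    m->deferred_recvs = 0;

    /* Normal processing of any remaining packets. */
    while (m->prio_pending > 0) {
        m->prio_pending--;
        m->buffers_posted++;
        handled++;
    }

    m->fastpath_enabled = true;
    return handled;
}

/* Inline fast section: hand back one packet immediately when allowed. */
static inline int demo_progress(demo_module_t *m)
{
    if (m->fastpath_enabled && m->prio_pending > 0) {
        m->prio_pending--;
        m->deferred_recvs++;          /* posting and bookkeeping deferred */
        m->fastpath_enabled = false;  /* next call takes the slow path */
        return 1;
    }
    return demo_progress_slow(m);
}
```

With two packets pending, the first call returns through the fast section and the second call goes through the slow section, which is where the deferred buffer posting catches up.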
Authored-by: Reese Faucette <rfaucett@cisco.com>
Should be included in usnic v1.7.3 roll-up CMR (refs trac:3760)
This commit was SVN r29133.
improvements:
* Fix minor memory leaks during component_init
* Ensure that an initialization loop does not underflow an unsigned int
* Improve mlock limit checking
* Fix set of BTL modules created during component_init when failing to
  get QP resources or otherwise excluding some (but not all) usnic
  verbs devices
* Fix/improve error messages to be consistent with other Cisco
  documentation
* Randomize the initial sliding window sequence number so that we
  silently drop incoming frames from previous jobs that still have
  processes in the middle of dying (and are still transmitting)
* Ensure we don't break out of add_procs too soon and create an
  asymmetrical view of what interfaces are available
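
As a rough illustration of why a randomized initial sequence number causes stale frames to be dropped, consider the sketch below. It is not taken from the usnic BTL: the 32-bit sequence width, the window size, the use of rand(), and the names demo_initial_seq and demo_seq_in_window are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_WINDOW_SIZE 4096u  /* hypothetical window size */

/* Pick a starting sequence number. rand() is used only to keep the
 * sketch short; a real implementation would want a better source. */
static uint32_t demo_initial_seq(unsigned int seed)
{
    srand(seed);
    return ((uint32_t) rand() << 16) ^ (uint32_t) rand();
}

/* True if seq lies in [next_expected, next_expected + DEMO_WINDOW_SIZE).
 * Unsigned subtraction keeps the test correct across wraparound. */
static bool demo_seq_in_window(uint32_t next_expected, uint32_t seq)
{
    return (uint32_t) (seq - next_expected) < DEMO_WINDOW_SIZE;
}
```

A frame from an earlier job carries a sequence number drawn from that job's own starting point, so under these assumptions it is very likely to fall outside the new job's window and fail the demo_seq_in_window test.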
This commit was SVN r28975.
Use the new sysfs files to check that there are enough VFs, QPs, and
CQs for all the MPI processes on this server.
Move the checking code into its own subroutine to make it smaller and
easier to read/grok.
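
A check of this kind might look something like the sketch below. The sysfs file names are not given in this log, so the path is left as a parameter. The helper names and the rule "available >= per-process need times local process count" are assumptions for illustration only.

```c
#include <stdio.h>

/* Read one non-negative integer from a text file such as a sysfs
 * attribute. Returns -1 if the file is missing or malformed. */
static long demo_read_sysfs_long(const char *path)
{
    FILE *fp = fopen(path, "r");
    long value = -1;

    if (NULL == fp) {
        return -1;
    }
    if (1 != fscanf(fp, "%ld", &value) || value < 0) {
        value = -1;
    }
    fclose(fp);
    return value;
}

/* Hypothetical rule: a resource is sufficient when the available count
 * covers per_proc units for each of num_local_procs processes.
 * Returns 1 if sufficient, 0 otherwise (including read errors). */
static int demo_have_enough(long available, long per_proc,
                            long num_local_procs)
{
    if (available < 0 || per_proc < 0 || num_local_procs < 0) {
        return 0;
    }
    return available >= per_proc * num_local_procs;
}
```

The same pair of helpers would be applied once per resource type (VFs, QPs, CQs), which is what keeps the checking subroutine short.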
This commit was SVN r28937.
Brian (rightfully) hit me on the head with the
don't-use-ORTE-use-the-rte-framework clue bat; the usnic BTL now
nicely plays with the RTE framework.
This commit was SVN r28907.
This BTL accesses the Cisco usNIC Linux device through the Linux verbs
API, using Unreliable Datagram (UD) queue pairs. A few noteworthy points:
* This BTL does most of its own fragmentation; it tells the PML that
  it has a very high max_send_size (much higher than the network
  MTU).
* Since UD fragments are, by definition, unreliable, the usnic BTL
  handles all of its own reliability via a sliding window approach
  using the opal_hotel construct and many tricks stolen from the
  corpus of knowledge surrounding efficient TCP.
* There is a fun PML latency-metric based optimization for NUMA
  awareness of short messages.
* Note that this is ''not'' a generic UD verbs BTL; it is specific to
  the Cisco usNIC device.
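
To make the fragmentation point concrete, here is a small sketch of how a large send could be split into MTU-sized pieces. It is an illustration only: the function names and the max_payload parameter are hypothetical, and the real BTL's header sizes and fragment bookkeeping are not modelled.

```c
#include <stddef.h>

/* Number of fragments needed to carry msg_len bytes when each fragment
 * holds at most max_payload bytes. Zero-length sends are not modelled
 * and yield 0. */
static size_t demo_num_fragments(size_t msg_len, size_t max_payload)
{
    if (0 == max_payload) {
        return 0;
    }
    return msg_len / max_payload + (msg_len % max_payload != 0);
}

/* Payload length of fragment number `index` (0-based), or 0 if the
 * index is past the end of the message. */
static size_t demo_fragment_len(size_t msg_len, size_t max_payload,
                                size_t index)
{
    size_t offset = index * max_payload;
    size_t remaining;

    if (0 == max_payload || offset >= msg_len) {
        return 0;
    }
    remaining = msg_len - offset;
    return remaining < max_payload ? remaining : max_payload;
}
```

For example, a 10000-byte message with a 4000-byte payload limit splits into three fragments of 4000, 4000, and 2000 bytes. Each fragment would then be tracked individually by the sliding-window reliability layer.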
This commit was SVN r28879.