needed instead of creating it and then canceling if it is not needed. Change
error handling during finalize so that it will not skip async thread
destruction. Otherwise async thread may segfault during openib module unloading.
This commit was SVN r16782.
to a pending queue of eager rdma QP instead of correct pending list. This patch
fixes this by getting reed of "eager rdma qp" notion. Packet is always send
over its order QP. The patch also adds two pending queues for high and low prio
packets. Only high prio packets are sent over eager RDMA channel.
This commit was SVN r16780.
main idea (except of cleanup) is to save on initialisation of unneeded fields
and to use C type checking system to catch obvious errors.
This commit was SVN r16779.
Each one of them has a field to store QP type, but this is redundant.
Store qp type only in one structure (the component one).
This commit was SVN r16272.
one HCA. Multiple ports, LMC, multiple BTLs per one LID. Having only one CQ for
all of them substantially reduce polling time.
This commit was SVN r15933.
/tmp/jms-modular-wireup branch):
* This commit moves all the openib BTL connection code out of
btl_openib_endpoint.c and into a connect "pseudo-component" area,
meaning that different schemes for doing OFA connection schemes can
be chosen via function pointer (i.e., MCA parameter) at run-time.
* The connect/connect.h file includes comments describing the
specific interface for the connect pseudo-component.
* Two pseudo-components are in this commit (more can certainly be
added).
* oob: use the same old oob/rml scheme for creating OFA connections
that we've had forever; this now just puts the logic into this
self-contained pseudo-component.
* rdma_cm: a currently-empty set of functions (that currently
return NOT_IMPLEMENTED) that will someday use the RDMA connection
manager to make OFA connections.
This commit was SVN r15786.
This mpool will have no btl module owner there was no btl created for
the HCA with no ports, but it will still be tracked in the mpool
framework (i.e., it's available).
If MPI_ALLOC_MEM is called by the app, one of two things will happen:
1. if there's an HCA on the host with some active ports, the openib
btl component will still be in the process space, and therefore
the "mpool with no btl" (MWNB) module will still be able to call
the reg/dereg functions, and all will be fine. However, if
MPI_FREE_MEM is never invoked to free the memory, bad things will
happen during MPI_FINALIZE. The pml is finalized, which finalizes
all the btls. The btls finalize all their mpools and all is fine.
But later we close down the mpool framework which then finalizes
any left over mpool modules, such as MWNB. However, the openib
BTL module functions that the MWNB was registered with are no
longer in the process space, and it segv's while trying deregister
the memory.
2. if there are *no* HCA's on the host with active ports, then the
openib btl will have been unloaded, and when the MWNM tries to
register the memory, the functions it tries to call (in the openib
btl) are no longer there, and we segv.
This commit was SVN r15735.
it to at least compile. If you actually get to the point of invoking
the openib btl progress thread, you'll get a big opal_output warning
that it is pretty much guaranteed not to work.
This commit was SVN r15628.
sender piggybacks a number of credit messages it received from a peer. A number
of outstanding credit messages is limited. This is needed to never ever fall
back to HW flow control.
This commit was SVN r15580.
eager RDMA receive path and checks internally from where it was called from to
perform different tasks. Leave only common code in there and move other code
to appropriate places.
This commit was SVN r15579.
1. Galen's fine-grain control of queue pair resources in the openib
BTL.
1. Pasha's new implementation of asychronous HCA event handling.
Pasha's new implementation doesn't take much explanation, but the new
"multifrag" stuff does.
Note that "svn merge" was not used to bring this new code from the
/tmp/ib_multifrag branch -- something Bad happened in the periodic
trunk pulls on that branch making an actual merge back to the trunk
effectively impossible (i.e., lots and lots of arbitrary conflicts and
artifical changes). :-(
== Fine-grain control of queue pair resources ==
Galen's fine-grain control of queue pair resources to the OpenIB BTL
(thanks to Gleb for fixing broken code and providing additional
functionality, Pasha for finding broken code, and Jeff for doing all
the svn work and regression testing).
Prior to this commit, the OpenIB BTL created two queue pairs: one for
eager size fragments and one for max send size fragments. When the
use of the shared receive queue (SRQ) was specified (via "-mca
btl_openib_use_srq 1"), these QPs would use a shared receive queue for
receive buffers instead of the default per-peer (PP) receive queues
and buffers. One consequence of this design is that receive buffer
utilization (the size of the data received as a percentage of the
receive buffer used for the data) was quite poor for a number of
applications.
The new design allows multiple QPs to be specified at runtime. Each
QP can be setup to use PP or SRQ receive buffers as well as giving
fine-grained control over receive buffer size, number of receive
buffers to post, when to replenish the receive queue (low water mark)
and for SRQ QPs, the number of outstanding sends can also be
specified. The following is an example of the syntax to describe QPs
to the OpenIB BTL using the new MCA parameter btl_openib_receive_queues:
{{{
-mca btl_openib_receive_queues \
"P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32"
}}}
Each QP description is delimited by ";" (semicolon) with individual
fields of the QP description delimited by "," (comma). The above
example therefore describes 4 QPs.
The first QP is:
P,128,16,4
Meaning: per-peer receive buffer QPs are indicated by a starting field
of "P"; the first QP (shown above) is therefore a per-peer based QP.
The second field indicates the size of the receive buffer in bytes
(128 bytes). The third field indicates the number of receive buffers
to allocate to the QP (16). The fourth field indicates the low
watermark for receive buffers at which time the BTL will repost
receive buffers to the QP (4).
The second QP is:
S,1024,256,128,32
Shared receive queue based QPs are indicated by a starting field of
"S"; the second QP (shown above) is therefore a shared receive queue
based QP. The second, third and fourth fields are the same as in the
per-peer based QP. The fifth field is the number of outstanding sends
that are allowed at a given time on the QP (32). This provides a
"good enough" mechanism of flow control for some regular communication
patterns.
QPs MUST be specified in ascending receive buffer size order. This
requirement may be removed prior to 1.3 release.
This commit was SVN r15474.
than just the PML/BTLs these days. Also clean up the code so that it
handles the situation where not all nodes register information for a given
node (rather than just spinning until that node sends information, like
we do today).
Includes r15234 and r15265 from the /tmp/bwb-modex branch.
This commit was SVN r15310.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r15234
r15265
branch:
* Support btl_openib_if_include and btl_openib_if_exclude MCA
parameters, similar to those supported by other BTLs. Each take a
comma-delimited lists of identifiers. Identifiers can be HCA
interface names (e.g., ipath0, mthca1, etc.) or an HCA interface
name and port numbers (e.g., ipath0:1, mthca1:2, etc.). It is an
error to specify both _include and _exclude. If you specify a
non-existant (or non-ACTIVE) HCA and/or port, you'll get a warning
unless you disable the warning by setting the MCA parameter
btl_openib_warn_nonexistent_if to 0.
* Start updating to use BEGIN_C_DECLS and END_C_DECLS
* A few other minor fixes that were picked up along the way.
This commit was SVN r15063.
Set bandwidth for all ports of mthca0:
--mca btl_openib_bandwidth_mthca0 1000
Set bandwidth for port 1 of mthca1:
--mca btl_openib_bandwidth_mthca1:1 1000
Set latency for port 2 lid 123 on mthca0:
--mca btl_openib_latency_mthca0:2:123 20
This commit was SVN r15041.
have the SRQ interface.
* Instead of setting AC_DEFINEs per MCA component, set per test. THe
answers can never be difference, and this will speed sed just a teeny
bit
This commit was SVN r14856.
finally brings in functionality that is already on the 1.2 branch, and
was developed and tested in the v1.2ofed branch (and other places).
Short version of new features:
* Support for ibv_fork_init()
* Automatically fill in the openib BTL bandwidth value by
querying the HCA port
* Installdirs functionality
* Fixes to always use -I in the Fortran wrapper compilers (#924)
* Gleb's mpool updates
* Remove some kruft in btl/openib/configure.m4, therefore
fixing the harmless warnings noted in #665
* Bunches of updates to the Linux RPM spec file
I.e., effectively the same thing that r14411 brought to the v1.2
branch.
Also effectively brought in r14432 and r14433 (some fixes on top of
the original r14411 commit to v1.2). Still need to bring in the moral
equivalent of r14445 after this commit (fixes to installdirs).
This commit was SVN r14449.
The following SVN revision numbers were found above:
r14411 --> open-mpi/ompi@83b31314ae
r14432 --> open-mpi/ompi@a48f160595
r14433 --> open-mpi/ompi@68f346d2bc
r14445 --> open-mpi/ompi@13d366b827
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.
This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.
This commit closes trac:158
More details to follow.
This commit was SVN r14051.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r13912
The following Trac tickets were found above:
Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158