mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and multiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1. If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).
This commit was SVN r10521.
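
The rule described in r10521 can be sketched roughly as follows. This is only an illustration, not the actual openib BTL code: the struct, the function name, and the warning text are hypothetical, and only the three parameter names come from the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the MCA parameters named in the commit message. */
struct openib_params {
    int  mpi_leave_pinned;              /* mpi_leave_pinned */
    int  max_btls;                      /* btl_openib_max_btls (-1 used here as "no limit", an assumption) */
    bool warn_leave_pinned_multi_port;  /* btl_openib_warn_leave_pinned_multi_port */
};

/* Clamp max_btls to 1 when leave_pinned is on and more than one HCA port
 * was found. Returns true if the clamp was applied. */
bool clamp_btls_for_leave_pinned(struct openib_params *p, int num_hca_ports)
{
    if (p->mpi_leave_pinned != 1 || num_hca_ports <= 1) {
        return false;   /* nothing to do */
    }
    p->max_btls = 1;
    if (p->warn_leave_pinned_multi_port) {
        /* Illustrative message only; the real warning text is not in the log. */
        fprintf(stderr, "warning: mpi_leave_pinned set with %d HCA ports; "
                        "btl_openib_max_btls forced to 1\n", num_hca_ports);
    }
    return true;
}
```

Setting warn_leave_pinned_multi_port to false in this sketch keeps the clamp but silences the message, which matches the intent stated above.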
explicitly enabled at run-time with the MCA parameter
io_romio_enable_parallel_optimizations set to something non-zero.
This will enable some magic flags in Panasas if the user didn't
set them (either on or off) and do some slightly better things
with strided collective writes.
This commit was SVN r10516.
standard). This macro allows us to specify the length of the fragment. We can now
see how the message is fragmented between the network devices or inside
the communication protocol.
This commit was SVN r10508.
specified, check that the PUT function is available for the BTL. The same safety check
applies to the GET function. At the end, make sure that at least one communication protocol is
specified; otherwise force the send flag.
This commit was SVN r10507.
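
A minimal sketch of the checks described in r10507. The flag values, struct, and function names are hypothetical stand-ins, not the real BTL definitions. Clearing a flag whose function is missing is also an assumption, since the log only says the availability is checked.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; the real BTL flag names and values may differ. */
#define FLAG_SEND 0x1u
#define FLAG_PUT  0x2u
#define FLAG_GET  0x4u

typedef int (*rdma_fn_t)(void);

/* Hypothetical, stripped-down view of a BTL module. */
struct btl_sketch {
    uint32_t  flags;
    rdma_fn_t put;   /* NULL if the BTL has no PUT function */
    rdma_fn_t get;   /* NULL if the BTL has no GET function */
};

/* Placeholder RDMA function, used only to build test modules. */
int dummy_rdma(void) { return 0; }

/* Return the flags after the sanity checks. */
uint32_t sanitize_btl_flags(const struct btl_sketch *btl)
{
    uint32_t flags = btl->flags;

    /* Assumption: an advertised protocol without a function is dropped. */
    if ((flags & FLAG_PUT) && NULL == btl->put) {
        flags &= ~FLAG_PUT;
    }
    if ((flags & FLAG_GET) && NULL == btl->get) {
        flags &= ~FLAG_GET;
    }
    /* At least one protocol must remain; otherwise force the send flag. */
    if (0 == (flags & (FLAG_SEND | FLAG_PUT | FLAG_GET))) {
        flags |= FLAG_SEND;
    }
    return flags;
}
```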
by the BTL (btl_max_rdma_size). Now the PUT protocol is pipelined even if there
is just one network between the 2 peers. Unfortunately, this problem is also present
in 1.1 (no pipeline for the PUT protocol).
This commit was SVN r10499.
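
To illustrate the pipelining in r10499: the start of that message is truncated, so this sketch assumes that btl_max_rdma_size caps the size of each PUT fragment. The function names are hypothetical and the code only shows the arithmetic of splitting a message, not the actual PML scheduling.

```c
#include <stddef.h>

/* Number of PUT fragments needed for a message of `length` bytes when each
 * fragment carries at most `max_rdma_size` bytes (assumed meaning of
 * btl_max_rdma_size). Returns 0 if max_rdma_size is 0. */
size_t put_fragment_count(size_t length, size_t max_rdma_size)
{
    if (0 == max_rdma_size) {
        return 0;
    }
    return length / max_rdma_size + (length % max_rdma_size != 0 ? 1 : 0);
}

/* Size of the fragment that starts at byte `offset` of the message. */
size_t put_fragment_length(size_t length, size_t offset, size_t max_rdma_size)
{
    size_t remaining = (offset < length) ? length - offset : 0;
    return (remaining < max_rdma_size) ? remaining : max_rdma_size;
}
```

With a single network between the peers, a pipelined PUT would still issue put_fragment_count() fragments one after another instead of one large transfer.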
SIGUSR2. This can be extended later if needed to include other
signals we should forward to the user processes (TSTP and CONT,
perhaps?)
* Since the signal handlers don't actually run in signal context, we
can use malloc/fprintf/etc. So clean up some of the signal handler
code so that we don't keep message buffers around for the life of
the process
This commit was SVN r10496.
time.
UD is connectionless, and as long as peers are statically assigned to QPs,
there is no reason to set up the addressing information lazily.
Lots of code was axed, as endpoints no longer have state. Removed a
number of other elements in the endpoint struct to make it as lightweight
as possible.
I was able to remove an entire function call/branch in the send path,
which I believe is the main contributor to a 2us drop in NetPIPE latency.
Some whitespace cleanups as well.
Passes the IBM test suite, and all but certain Intel tests that were already failing
before the change, over the ob1 PML.
This commit was SVN r10494.
Moved a lot of the module-specific init from the component init to the module init.
Tried keeping a pointer to reduce indexing; it didn't seem to help, but leaving it in place
for now.
This commit was SVN r10485.
Playing around with OPAL_LIKELY/UNLIKELY, no real gains yet.
Reworked progress() to process many WCs at a time, as well
as immediately repost groups of receive buffers.
This commit was SVN r10481.
was smaller than the CACHE_LINE_SIZE. Here is the version that works.
This works in two steps. First we set the element size to a
multiple of the desired alignment. Then, when we allocate memory, we compute
the total size and align each of the elements (we allocate
several of them at a time) to the CACHE_LINE_SIZE.
This commit was SVN r10479.
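
The two steps in r10479 can be sketched like this. It is a simplified illustration under stated assumptions, not the actual free-list code: the value of CACHE_LINE_SIZE, the function names, and the over-allocation by one cache line are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 128   /* assumed value, for illustration only */

/* Step 1: round a size up to the next multiple of `alignment`. */
size_t round_up(size_t value, size_t alignment)
{
    return ((value + alignment - 1) / alignment) * alignment;
}

/* Step 2: allocate `num` elements in one chunk and align the first one.
 * Because the stride is a multiple of CACHE_LINE_SIZE, every following
 * element is aligned as well. Returns the raw pointer to pass to free();
 * *first receives the aligned address of element 0. */
void *alloc_aligned_elements(size_t elem_size, size_t num,
                             size_t *stride, unsigned char **first)
{
    *stride = round_up(elem_size, CACHE_LINE_SIZE);

    /* Over-allocate by one cache line so the start can be shifted forward. */
    unsigned char *raw = malloc(num * (*stride) + CACHE_LINE_SIZE);
    if (NULL == raw) {
        return NULL;
    }

    uintptr_t addr    = (uintptr_t) raw;
    uintptr_t aligned = (addr + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE
                        * CACHE_LINE_SIZE;
    *first = raw + (aligned - addr);
    return raw;
}
```

Element i then lives at first + i * stride, and the caller releases the whole chunk with free() on the returned raw pointer.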
bytes). The simplest way to make sure they are aligned is to update
the size of the basic element to a multiple of the desired alignment.
It will use a little bit more memory, but the improvements on the SM BTL
seem quite interesting.
This commit was SVN r10478.
cannot include the PMPI_WTIME|WTICK functions in the external and
double precision statements because some compilers complain about
this. Instead, we need to use the macro that is defined by
configure.ac (MPIF_H_PMPI_W_FUNCS). This unfortunately means that we
need to generate mpif.h (in addition to mpif-config.h) because the
"external" statement is toxic to F90 compilers.
This commit was SVN r10464.
with the other methodology even if there are no choice buffers and no
special constants. But it keeps the Makefile.am simple and the
methodology consistent.
This commit was SVN r10462.