Moved a lot of the module-specific init from the component init to the module init.
Try keeping a pointer to reduce indexing, didn't seem to help - leaving in place
for now.
This commit was SVN r10485.
Playing around with OPAL_LIKELY/UNLIKELY, no real gains yet.
Reworked progress() to process many WC's at a time, as well
as immediately repost groups of receive buffers.
This commit was SVN r10481.
was smaller than the CACHE_LINE_SIZE. Here is the version that works.
In fact this works on 2 steps. First we set the element size to something
multiple of the desired alignment. Then when we allocate memory, we compute
the total size, and we will align each of the elements (we allocate
multiple of them every time) to the CACHE_LINE_SIZE.
This commit was SVN r10479.
bytes). The simplest way to make sure they are aligned is to update
the size of the basic element to a multiple of the desired alignment.
It will use a little bit more memory, but the improvements on the SM BTL
seems quite interesting.
This commit was SVN r10478.
cannot include the PMPI_WTIME|WTICK functions in the external and
double precision statements because some compilers complain about
this. Instead, we need to use the macro that is defined by
configure.ac (MPIF_H_PMPI_W_FUNCS). This unfortunately means that we
need to generate mpif.h (in addition to mpif-config.h) because the
"external" statement is toxic to F90 compilers.
This commit was SVN r10464.
with the other methodology even if there are no choice buffers and no
special constants. But it keeps the Makefile.am simple and the
methodology consistent.
This commit was SVN r10462.
was that declaring the type of MPI_WTICK and MPI_TIME in mpif-common.h
would allow the F90 bindings to call through to the back end f77
function and have the right return type. But upon reflection, that's
silly -- we were just declaring the variables MPI_WTICK and MPI_WTIME
that were of type double precision. Duh.
So add some fixed (non-generated) wrapper F90 functions to call the
back-end *C* MPI_WTICK and MPI_TIME functions (vs. the back end *F77*
functions). We have to call the back-end C functions because there's
a name conflict if we try to call the back-end F77 functions -- for
the same reasons that we can't "implicitly" define MPI_WTIME and
MPI_WTICK in the f90 module, we can't call such an implicitly-defined
function. So we had to add new back-end C functions that are directly
callable from Fortran, the easiest implementation of which was to
provide 4 one-line functions for each (rather than muck around with
weak symbols).
This commit was SVN r10448.
1. ompi/mca/btl/udapl/btl_udapl_proc.c should be including
btl_udapl_endpoint.h for mca_btl_udapl_proc_insert function.
2. btl_udapl_endpoint.c it looks like you are using
&endpoint->endpoint_lock when you should use &ep->endpoint_lock in a
OPAL_THREAD_LOCK call.
3. btl_udapl_frag.h has a couple opal_list_item_t's that should be
ompi_free_list_item_t in the _FRAG_ALLOC_{EAGER,MAX} macros.
This commit was SVN r10442.
call to the _UNPACK macro, in the case where the length of the received data
is zero. This might happens on the PUT protocol.
This commit was SVN r10431.
mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and ultiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1. If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).
This commit was SVN r10424.
send/receive queues if there is something available in the short message
rdma queues, then we have to poll *ALL* the rdma queues before exiting,
or we aren't fair about frag reception and fall into degenerate matching
cases.
This commit was SVN r10410.
where to look for registrations that were used in the alloc/free code don't
work (because the memory returned from malloc() -- whowever gets around to
calling it) might actually be registered already. So just call malloc
and free directly and avoid the whole issue when leave pinned is on. After
all, you have to pay the registration cost sometime, and if leave pinned
is on, you only have to pay it once. It makes things much simpler to
have that once be at first use rather than during ALLOC_MEM, and as far
as I can read, we're still standards conformant this way.
This commit was SVN r10406.
correctly compute the final checksum. This is not a bug in the case where
both the sender and the receiver execute EXACTLY the same checksum
computations but is definitively a problem if not (such as the buffered case).
This commit was SVN r10367.