it limits the number of circular buffers allocated between each pair of peers.
This allows tighter control of memory usage.
This commit was SVN r14120.
Queue_empty is determined by the reader, and is its local view.
However, the writer may continue writing to this queue. The decision
to go on to the next cb_fifo is done in an atomic region, checking the
writer's view. The writer also "changes its view" in an atomic
region protected by the same lock.
This commit was SVN r13968.
allocated from mpool memory (which is registered memory for RDMA transports)
This is not a problem for small jobs, but for a large number of ranks the
amount of wasted memory is large.
This commit was SVN r13921.
buffer fails. If a cb is already allocated but is full, and allocation of
an additional cb fails, we spin waiting for the receiver to free space in
the existing cb.
This commit was SVN r13635.
investigating #817:
* Remove use of legal_numbits member and always just use the full
size of the array. There was a corner case where legal_numbits was
not an even multiple of the number of bits in the array where bits
would not get freed properly, usually causing wasted Fortran
MPI handles, or, as in the case of #817, wasted attribute keyvals
(i.e., the user freed them, but the bitmap didn't reflect the
free).
* Re-order some error checks to ensure that we don't segv (we don't
currently trigger this problem anywhere; I just noticed it while
doing the other attribute keyval and legal_numbits work).
Since this change affects all Fortran MPI handles, I ran all the intel
and ibm tests and all still pass with this change.
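The legal_numbits corner case can be illustrated with a minimal sketch.
The names below (toy_bitmap_t and the toy_bitmap_* functions) are
hypothetical and are not the actual bitmap class API. The sketch only
assumes a byte-array bitmap where every bit of the array is usable, so
range checks are made against the full array size, and all error checks
happen before the array is touched.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy bitmap; not the real Open MPI class. */
typedef struct {
    unsigned char *bits;   /* backing array */
    size_t array_size;     /* number of bytes in the backing array */
} toy_bitmap_t;

/* The number of usable bits is always the full size of the array. */
size_t toy_bitmap_numbits(const toy_bitmap_t *bm)
{
    assert(NULL != bm);
    return bm->array_size * CHAR_BIT;
}

int toy_bitmap_init(toy_bitmap_t *bm, size_t min_bits)
{
    if (NULL == bm) {
        return -1;
    }
    /* Round the request up to a whole number of bytes. */
    bm->array_size = (min_bits + CHAR_BIT - 1) / CHAR_BIT;
    if (0 == bm->array_size) {
        bm->array_size = 1;
    }
    bm->bits = calloc(bm->array_size, 1);
    return (NULL == bm->bits) ? -1 : 0;
}

/* Check the arguments first, so a bad pointer or index never reaches
 * the array access. */
int toy_bitmap_set(toy_bitmap_t *bm, size_t bit)
{
    if (NULL == bm || NULL == bm->bits || bit >= toy_bitmap_numbits(bm)) {
        return -1;
    }
    bm->bits[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
    return 0;
}

int toy_bitmap_clear(toy_bitmap_t *bm, size_t bit)
{
    if (NULL == bm || NULL == bm->bits || bit >= toy_bitmap_numbits(bm)) {
        return -1;
    }
    bm->bits[bit / CHAR_BIT] &= (unsigned char)~(1u << (bit % CHAR_BIT));
    return 0;
}

/* Returns 1 if set, 0 if clear, -1 on error. */
int toy_bitmap_is_set(const toy_bitmap_t *bm, size_t bit)
{
    if (NULL == bm || NULL == bm->bits || bit >= toy_bitmap_numbits(bm)) {
        return -1;
    }
    return (bm->bits[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}
```

With this layout a request for 10 bits yields two full bytes, and a bit
in the last partial byte can be set and freed like any other.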
This commit was SVN r13561.
udapl/openib/vapi/gm mpools are deprecated. The rdma mpool has a parameter,
mpool_rdma_rcache_size_limit, that limits its size (default is 0, i.e. unlimited).
This commit was SVN r12878.
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
when it's a size_t on most of the UNIXes.
As these only happen on Windows, I think we're set for now :)
This commit was SVN r12215.
Just follow inc_num and you will understand. Now _resize will grow the list to match
the required number of elements as described in the comment in the .h file.
This commit was SVN r12074.
constrained:
* Make sure we always have a number of eager fragments available
that scales with the number of processes communicating with
a given proc over shared memory
* Use FREE_LIST_GET instead of FREE_LIST_WAIT to return an
error to the PML when resource exhaustion occurs
* Don't dereference the frag during alloc unless we're sure
it's not NULL
Reviewed by: Galen
Refs trac:413
This commit was SVN r12053.
The following Trac tickets were found above:
Ticket 413 --> https://svn.open-mpi.org/trac/ompi/ticket/413
then use broadcast in order to wake them up. If there is only one then use signal
(which is supposed to be faster) and of course if there are no threads
waiting then just continue.
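The wake-up rule described above can be sketched as follows. This is an
illustration only: toy_condition_t, its waiting counter and
toy_condition_wake are hypothetical names, and the sketch assumes the
counter is maintained by the waiters under the same mutex that the
caller holds.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef enum { WAKE_NONE, WAKE_SIGNAL, WAKE_BROADCAST } wake_action_t;

/* Hypothetical condition wrapper that tracks how many threads wait. */
typedef struct {
    pthread_cond_t cond;
    int waiting;   /* protected by the mutex associated with cond */
} toy_condition_t;

/* The caller must hold the mutex associated with c. */
wake_action_t toy_condition_wake(toy_condition_t *c)
{
    assert(NULL != c);
    if (c->waiting > 1) {
        /* Several waiters: wake all of them. */
        pthread_cond_broadcast(&c->cond);
        return WAKE_BROADCAST;
    }
    if (1 == c->waiting) {
        /* Exactly one waiter: a single signal is enough. */
        pthread_cond_signal(&c->cond);
        return WAKE_SIGNAL;
    }
    /* Nobody is waiting: nothing to do. */
    return WAKE_NONE;
}
```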
This commit was SVN r12049.
Keeping the cache misses as low as possible is always a good approach.
The opal_list_t is widely used, it should be a highly optimized class.
The same functionality can be reached with one sentinel instead
of the two currently used.
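A minimal sketch of the single-sentinel idea, assuming a circular
doubly-linked list. The toy_item_t and toy_list_t types are hypothetical
and do not reproduce the opal_list_t layout. The one sentinel plays both
roles: its next pointer is the head and its prev pointer is the tail.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_item {
    struct toy_item *next;
    struct toy_item *prev;
} toy_item_t;

typedef struct {
    toy_item_t sentinel;   /* sentinel.next = head, sentinel.prev = tail */
    size_t length;
} toy_list_t;

void toy_list_init(toy_list_t *l)
{
    assert(NULL != l);
    /* An empty list is a sentinel that points at itself. */
    l->sentinel.next = &l->sentinel;
    l->sentinel.prev = &l->sentinel;
    l->length = 0;
}

void toy_list_append(toy_list_t *l, toy_item_t *item)
{
    assert(NULL != l && NULL != item);
    item->prev = l->sentinel.prev;
    item->next = &l->sentinel;
    l->sentinel.prev->next = item;
    l->sentinel.prev = item;
    l->length++;
}

/* Returns NULL when the list is empty. */
toy_item_t *toy_list_remove_first(toy_list_t *l)
{
    assert(NULL != l);
    toy_item_t *item = l->sentinel.next;
    if (item == &l->sentinel) {
        return NULL;
    }
    l->sentinel.next = item->next;
    item->next->prev = &l->sentinel;
    item->next = NULL;
    item->prev = NULL;
    l->length--;
    return item;
}
```

Because head and tail live in the same structure, the insert and remove
paths need no special case for an empty list.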
I don't have anything against the STL version, but so far nothing can
compare with the Knuth algorithm. I replace the current implementation
with a modified version of the Knuth algorithm (the one described in
The Art of Computer Programming). As expected, the latency went down.
This commit was SVN r10776.
was smaller than the CACHE_LINE_SIZE. Here is the version that works.
In fact this works in 2 steps. First we set the element size to a
multiple of the desired alignment. Then when we allocate memory, we compute
the total size, and we will align each of the elements (we allocate
multiple of them every time) to the CACHE_LINE_SIZE.
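The two steps can be sketched as follows. This is not the free list
code itself: the function names are hypothetical, and the value 64 for
TOY_CACHE_LINE_SIZE is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_CACHE_LINE_SIZE 64   /* assumed value for illustration */

/* Step 1: round the element size up to a multiple of the alignment. */
size_t toy_round_up(size_t size, size_t alignment)
{
    assert(alignment > 0);
    return ((size + alignment - 1) / alignment) * alignment;
}

/* Step 2: allocate several elements at once, with enough slack to move
 * the first element to an aligned address. Since the padded element size
 * is a multiple of the alignment, every following element is aligned too.
 * On success *raw receives the pointer to pass to free() and
 * *padded_size the stride between consecutive elements. */
void *toy_alloc_aligned_elements(size_t elem_size, size_t num,
                                 size_t alignment,
                                 void **raw, size_t *padded_size)
{
    assert(NULL != raw && NULL != padded_size);
    size_t padded = toy_round_up(elem_size, alignment);
    size_t total = padded * num + alignment;
    unsigned char *base = malloc(total);
    if (NULL == base) {
        return NULL;
    }
    uintptr_t start = (uintptr_t)base;
    uintptr_t aligned = ((start + alignment - 1) / alignment) * alignment;
    *raw = base;
    *padded_size = padded;
    return base + (aligned - start);
}
```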
This commit was SVN r10479.
bytes). The simplest way to make sure they are aligned is to update
the size of the basic element to a multiple of the desired alignment.
It will use a little bit more memory, but the improvements on the SM BTL
seem quite interesting.
This commit was SVN r10478.
free list. It uses the size attached to the free list, and the internal
memory segments to find out all the items allocated by this free list.
This commit was SVN r9669.
The free list using atomic operations. I didn't want to completely
change the behavior, so we still use a mutex for the extreme cases (like
no more available items and we cannot allocate more). I tested it for a
while in a non-multi-threaded environment, but not enough on a multi-threaded
build.
This commit was SVN r9623.
flag, new flags to be included when convertor is initialized
- modified pml/btl module defs and added stub functions for diagnostic
output routines to dump state of queues / endpoints
- updates to data reliability pml
This commit was SVN r9329.