- Added some basic flow control to limit number of posted sends.
- Merged endpoint send/recv lock into single endpoint lock.
- Set the LMR triplet length in the send path, not at allocation time.
This has to be done because upper layers might send less than the
amount allocated.
- Alter the tie-breaker if statement protecting the second call
to dat_ep_connect(). The logic was reversed compared to the tie-
breaker for the first dat_ep_connect(), making it possible for
3 or more processes to form a deadlock loop.
- Some asserts were added for debugging purposes; leaving them
in place for now.
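
To illustrate the tie-breaker point, here is a minimal sketch, not the actual BTL code. It assumes each process has a unique integer id and that the lower id initiates; both the ids and the `should_initiate` helper are hypothetical. The only thing taken from the change above is that every call site must apply the rule in the same direction, so that exactly one side of any pair initiates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tie-breaker: of any two distinct peers, exactly one
 * initiates the connection. Which side wins (here: the lower id) is an
 * assumption; what matters is that every call site uses the same rule. */
static bool should_initiate(uint64_t my_id, uint64_t peer_id)
{
    assert(my_id != peer_id);   /* ids are assumed to be unique */
    return my_id < peer_id;
}

/* Counts how many of the two sides would initiate for a given pair.
 * A consistent rule always yields exactly 1; a rule that is reversed
 * at one call site can yield 0 or 2 for some pairs. */
static int initiators_for_pair(uint64_t a, uint64_t b)
{
    return (int)should_initiate(a, b) + (int)should_initiate(b, a);
}
```

Routing both connect attempts through one helper like this keeps the two tie-breakers from drifting apart again.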
This commit was SVN r10317.
The sender can receive and complete a PUT request before it gets the completion for the first rndv packet. The sendreq struct may then be reused for the next MPI_Send, and the unexpected completion messes things up. I sometimes got a SEGV and sometimes data corruption.
This commit was SVN r10301.
btl_openib_ib_mtu and btl_mvapi_ib_mtu MCA params by showing the valid
values and what they represent (got a question about this from Cisco
testing engineers).
This commit was SVN r10277.
UD is the Unreliable Datagram transport for InfiniBand, specifically OpenIB. This BTL is derived from the existing openib BTL, which is RC (Reliable Connection) based.
Still a work in progress, as there is a lot of work left to do. Specifically, performance, scalability, and flow control need to be addressed.
Currently I'm playing around with different methods for handling receive buffers, as well as profiling to figure out where the time is going.
This commit was SVN r10271.
1) don't need tree if memory is just malloc'd
2) fix memory and free list leak
3) deregister first and then free... doh
This commit was SVN r10251.
Added a tree to track memory allocations from MPI_Alloc_mem; this allows us to
free the registrations in a sane fashion. It should also be faster.
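
A minimal sketch of the idea, not the code from this commit: allocations are keyed by base address in a tree so that the matching registration can be found at free time. It uses the POSIX `tsearch`/`tfind`/`tdelete` routines as the tree; `alloc_rec_t`, `track_alloc`, `untrack_alloc`, and the integer `reg_handle` are hypothetical stand-ins for the real registration objects.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <search.h>   /* POSIX tsearch / tfind / tdelete */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record tying an allocation to its registration. */
typedef struct {
    void  *base;        /* address handed back to the user */
    size_t len;
    int    reg_handle;  /* stand-in for a real registration object */
} alloc_rec_t;

/* Order records by base address. */
static int cmp_base(const void *a, const void *b)
{
    uintptr_t x = (uintptr_t)((const alloc_rec_t *)a)->base;
    uintptr_t y = (uintptr_t)((const alloc_rec_t *)b)->base;
    return (x > y) - (x < y);
}

/* Record a new allocation. Returns 0 on success, -1 on failure
 * (out of memory, or the base address is already tracked). */
static int track_alloc(void **root, void *base, size_t len, int reg_handle)
{
    alloc_rec_t *rec = malloc(sizeof(*rec));
    if (rec == NULL) {
        return -1;
    }
    rec->base = base;
    rec->len = len;
    rec->reg_handle = reg_handle;

    void *node = tsearch(rec, root, cmp_base);
    if (node == NULL || *(alloc_rec_t **)node != rec) {
        free(rec);
        return -1;
    }
    return 0;
}

/* Remove the record for base and return its registration handle, or -1
 * if base is not tracked. The caller would deregister with the returned
 * handle first and only then free the memory. */
static int untrack_alloc(void **root, void *base)
{
    alloc_rec_t key = { base, 0, 0 };
    void *node = tfind(&key, root, cmp_base);
    if (node == NULL) {
        return -1;
    }
    alloc_rec_t *rec = *(alloc_rec_t **)node;
    int handle = rec->reg_handle;
    tdelete(&key, root, cmp_base);
    free(rec);
    return handle;
}
```

The lookup returning -1 for an untracked address is what lets the free path tell plain malloc'd memory apart from registered memory.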
This commit was SVN r10248.
support for progress threads, so we shouldn't build them or try to use
them when support for progress threads has been requested. The TCP, GM,
SELF, and SM BTLs should have progress thread support, so they aren't
disabled. The Portals BTL isn't compiled on platforms with threads,
so it doesn't need to be updated.
This commit was SVN r10156.
Do this rather than the my_list pointer because we need to do some
things that are somewhat special because we pre-pin eager fragments but
not send fragments. Also makes a couple ideas I have slightly easier to
play around with.
This commit was SVN r10127.
Instead of figuring out which free list a fragment belongs to based on its size,
we simply store in the fragment a pointer to the list it belongs to.
This was reviewed by Brian and should hit all the branches.
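
A minimal sketch of the owner-pointer approach, assuming a simple LIFO free list. The `free_list_t` and `frag_t` types and the helper names are hypothetical, not the BTL's real structures.

```c
#include <assert.h>
#include <stddef.h>

struct frag;

/* Hypothetical minimal free list: a LIFO stack of fragments. */
typedef struct free_list {
    struct frag *head;
    size_t       count;
} free_list_t;

typedef struct frag {
    struct frag *next;
    free_list_t *owner;  /* list this fragment must be returned to */
    size_t       size;
} frag_t;

/* Put n fragments of the given size on the list and stamp each one
 * with its owning list. */
static void list_init(free_list_t *l, frag_t *frags, size_t n, size_t size)
{
    l->head = NULL;
    l->count = 0;
    for (size_t i = 0; i < n; i++) {
        frags[i].size = size;
        frags[i].owner = l;
        frags[i].next = l->head;
        l->head = &frags[i];
        l->count++;
    }
}

/* Pop a fragment, or return NULL if the list is empty. */
static frag_t *frag_alloc(free_list_t *l)
{
    frag_t *f = l->head;
    if (f != NULL) {
        l->head = f->next;
        l->count--;
        f->next = NULL;
    }
    return f;
}

/* Return a fragment without inspecting its size: the owner pointer
 * already says where it goes. */
static void frag_return(frag_t *f)
{
    free_list_t *l = f->owner;
    f->next = l->head;
    l->head = f;
    l->count++;
}
```

Because `frag_return` never looks at `size`, two lists may hold fragments of the same size without any ambiguity on return.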
This commit was SVN r10072.
Trying to remember what I did here: eager/max messages should work now, no RDMA yet. A number of other fixes and cleanups.
I do know of two problems:
Bad stuff happens when flooded with send frags too quickly - the BTL doesn't handle flow control.
Certain IBM tests turn up a length assertion in the datatype engine - needs more investigation.
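
For the flow-control problem, one common approach is a send-credit counter that caps the number of posted but not yet completed sends. The sketch below is only an illustration of that technique under assumed names (`send_credits_t` and its helpers are hypothetical); it is not what this BTL currently does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical send-credit counter: caps the number of sends that are
 * posted but not yet completed. */
typedef struct {
    int available;  /* sends that may still be posted */
    int deferred;   /* sends waiting for a credit */
} send_credits_t;

static void credits_init(send_credits_t *c, int max_posted)
{
    assert(max_posted > 0);
    c->available = max_posted;
    c->deferred = 0;
}

/* Returns true if the caller may post now. Otherwise the send is
 * counted as deferred and the caller should queue it. */
static bool credits_try_post(send_credits_t *c)
{
    if (c->available > 0) {
        c->available--;
        return true;
    }
    c->deferred++;
    return false;
}

/* Called on each send completion. Returns true if a deferred send
 * should be posted now, reusing the credit that was just freed. */
static bool credits_on_completion(send_credits_t *c)
{
    if (c->deferred > 0) {
        c->deferred--;
        return true;
    }
    c->available++;
    return false;
}
```

With a cap of 2, a third send is deferred until the first completion arrives, at which point it takes over the freed credit.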
This commit was SVN r10070.