Galen Shipman
c9e0eda190
Initialize the completion queue to a reasonable size based on maximum number
...
of send/receives outstanding.
Use ibv_cq_resize if available after initial creation of completion queue if
cq_size is too small (based on number of peers).
This commit was SVN r11053.
2006-07-30 00:58:40 +00:00
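The sizing rule in r11053 can be illustrated with a small sketch. It assumes, hypothetically, that every peer may have a fixed number of sends and receives outstanding and that the device reports a maximum CQ depth. The names cq_entries_needed and cq_resize_target are invented for illustration and are not taken from the openib BTL. In libibverbs the resize call itself is ibv_resize_cq(cq, cqe); it is left out here so the example builds without the verbs headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: completion entries needed when each of n_peers can have
 * up to wqes_per_peer sends and receives outstanding at once. */
static uint32_t cq_entries_needed(uint32_t n_peers, uint32_t wqes_per_peer)
{
    return n_peers * wqes_per_peer;
}

/* Returns the CQ size to request from a resize, or 0 when the current
 * queue is already large enough. The request is capped at device_max. */
static uint32_t cq_resize_target(uint32_t current, uint32_t n_peers,
                                 uint32_t wqes_per_peer, uint32_t device_max)
{
    uint32_t needed = cq_entries_needed(n_peers, wqes_per_peer);

    assert(current <= device_max);
    if (needed <= current) {
        return 0;
    }
    return needed < device_max ? needed : device_max;
}
```

A caller would pass a non-zero result to the resize call and keep the existing queue otherwise.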
Gleb Natapov
72575d81d2
Create separate pool for control messages. It is unlimited, but the maximum number of elements allocated from it is limited by the number of connections.
...
This commit was SVN r11028.
2006-07-27 14:09:30 +00:00
Gleb Natapov
4b605295b3
remove unused field.
...
This commit was SVN r10965.
2006-07-24 06:12:16 +00:00
Gleb Natapov
3b34dc8df8
remove MCA_BTL_IB_FRAG_ALIGN. Alignment is handled in free_list_t.
...
This commit was SVN r10945.
2006-07-23 12:33:49 +00:00
Gleb Natapov
91f48f9a79
Merge with gleb-pml branch. Add out of resource handling support to PML layer.
...
If a resource is not available, the request is added to one of the pending lists and retried later.
This commit was SVN r10900.
2006-07-20 14:44:35 +00:00
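The out-of-resource handling described in r10900 follows a common pattern: queue the request when the resource is exhausted and restart it once something is released. The sketch below is only an illustration of that pattern. The types request, pending_list and pool are hypothetical stand-ins, not PML or BTL structures, and a single counter replaces the real resources.

```c
#include <assert.h>
#include <stddef.h>

struct request {
    int id;
    struct request *next;
};

struct pending_list {
    struct request *head;
    struct request *tail;
};

/* Stand-in for a limited resource such as send descriptors. */
struct pool {
    int available;
};

static void pending_append(struct pending_list *list, struct request *req)
{
    req->next = NULL;
    if (list->tail != NULL) {
        list->tail->next = req;
    } else {
        list->head = req;
    }
    list->tail = req;
}

/* Returns 0 when the request was started, -1 when it was queued. */
static int request_start(struct pool *pool, struct pending_list *list,
                         struct request *req)
{
    assert(pool->available >= 0);
    if (pool->available == 0) {
        pending_append(list, req);
        return -1;
    }
    pool->available--;
    return 0;
}

/* Called when resources are released: restarts queued requests in FIFO
 * order and returns how many were started. */
static int pending_retry(struct pending_list *list, struct pool *pool)
{
    int started = 0;

    while (list->head != NULL && pool->available > 0) {
        struct request *req = list->head;
        list->head = req->next;
        if (list->head == NULL) {
            list->tail = NULL;
        }
        pool->available--;
        started++;
    }
    return started;
}
```

FIFO order is a choice made for this sketch; the commit message does not say in which order pending requests are retried.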
Gleb Natapov
383694c68d
Add support for getting aligned buffers from free_list_t. Convert openib BTL to new interface.
...
This commit was SVN r10899.
2006-07-20 14:39:05 +00:00
Gleb Natapov
e05ec69dc4
print "flush error" only once.
...
This commit was SVN r10672.
2006-07-06 08:03:01 +00:00
Gleb Natapov
9b0807e547
Put pending fragment on the right waiting list.
...
This commit was SVN r10671.
2006-07-06 07:51:23 +00:00
Galen Shipman
7e079d20ab
fix for stupid casting.. addresses issue on PPC64 where sizes get set
...
improperly and badness ensues..
This commit was SVN r10574.
2006-06-29 21:58:50 +00:00
Gleb Natapov
c8f75c472a
remove modulo op from fast path. Improvement 0.02-0.04ms.
...
This commit was SVN r10538.
2006-06-28 12:00:47 +00:00
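The commit message for r10538 does not say how the modulo was removed. One common way, shown here purely as an assumption, is to replace the division in a circular index increment with a compare and reset. Both helper names are hypothetical.

```c
#include <assert.h>

/* Advance a circular index with a modulo (integer division on each call). */
static unsigned ring_next_mod(unsigned i, unsigned size)
{
    assert(size > 0 && i < size);
    return (i + 1) % size;
}

/* Same result with a compare and reset, no division. */
static unsigned ring_next_cmp(unsigned i, unsigned size)
{
    assert(size > 0 && i < size);
    i++;
    return i == size ? 0 : i;
}
```

The two functions agree for every valid index, so the second can replace the first wherever the index is known to stay below size.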
Gleb Natapov
e58a89ef3e
OMPI_ENABLE_DEBUG is always defined (to 0 or 1). Use #if and not #ifdef.
...
This commit was SVN r10537.
2006-06-28 11:25:09 +00:00
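The difference behind r10537 is easy to reproduce in isolation. In the sketch below DEMO_ENABLE_DEBUG is a hypothetical macro standing in for OMPI_ENABLE_DEBUG: because it is defined to 0, an #ifdef guard still takes the debug branch while an #if guard does not.

```c
#include <assert.h>

/* Hypothetical stand-in for a macro that is always defined, to 0 or 1. */
#define DEMO_ENABLE_DEBUG 0

/* #ifdef only asks whether the macro exists, so this returns 1 even
 * though debugging is switched off. */
static int debug_by_ifdef(void)
{
#ifdef DEMO_ENABLE_DEBUG
    return 1;
#else
    return 0;
#endif
}

/* #if evaluates the value, so this returns 0 when the macro is 0. */
static int debug_by_if(void)
{
#if DEMO_ENABLE_DEBUG
    return 1;
#else
    return 0;
#endif
}

/* Demonstrates the mismatch between the two guards. */
static void check_debug_guards(void)
{
    assert(debug_by_ifdef() == 1);
    assert(debug_by_if() == 0);
}
```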
Gleb Natapov
704a5eb645
Support for LMC (lid mask count) and multiple QPs per port.
...
This commit was SVN r10536.
2006-06-28 07:23:08 +00:00
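For context on r10536: with a LID mask count of lmc, an InfiniBand port answers to 2^lmc consecutive LIDs starting at its base LID, which gives that many distinct paths to the port. The sketch below computes those LIDs. How the BTL assigns QPs to paths is not stated in the commit message, so the round-robin mapping and both function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of LIDs (paths) a port answers to for a given LMC. */
static unsigned lmc_path_count(unsigned lmc)
{
    assert(lmc <= 7);          /* LMC is a 3-bit field */
    return 1u << lmc;
}

/* LID for the given path index, wrapping around the available paths
 * (hypothetical round-robin assignment). */
static uint16_t lmc_path_lid(uint16_t base_lid, unsigned lmc, unsigned path)
{
    return (uint16_t)(base_lid + path % lmc_path_count(lmc));
}
```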
Jeff Squyres
df45221a3e
Until a real fix for #142 is found, this workaround prohibits using
...
mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and multiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1. If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).
This commit was SVN r10521.
2006-06-27 10:43:03 +00:00
Gleb Natapov
52208d7bf9
We don't need to register zero sized frags.
...
This commit was SVN r10519.
2006-06-27 08:50:12 +00:00
Galen Shipman
8855e5b73a
Fixes for DR as well as better diagnostic..
...
Successfully passing the intel test suite with/without induced errors/drops.
This commit was SVN r10518.
2006-06-26 22:29:29 +00:00
Gleb Natapov
b7715395cb
Return descriptor before sending credits one more time. We may need it.
...
This commit was SVN r10495.
2006-06-26 07:05:58 +00:00
Jeff Squyres
1d27ca5d0a
Until a real fix for #142 is found, this workaround prohibits using
...
mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and multiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1. If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).
This commit was SVN r10424.
2006-06-20 11:32:46 +00:00
Jeff Squyres
600bf4295a
Update the help message to be slightly more concise and clear
...
This commit was SVN r10422.
2006-06-20 11:23:38 +00:00
Brian Barrett
3d027e57a8
* fix for ticket #141. If we are going to shortcut out of polling the
...
send/receive queues if there is something available in the short message
rdma queues, then we have to poll *ALL* the rdma queues before exiting,
or we aren't fair about frag reception and fall into degenerate matching
cases.
This commit was SVN r10410.
2006-06-17 21:32:25 +00:00
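The fairness rule from r10410 can be sketched as follows: the progress pass may skip the send/receive queues when an RDMA queue had data, but only after it has visited every RDMA queue. The rdma_queue structure and the function names are hypothetical and reduce each queue to a counter of waiting fragments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-peer short-message RDMA queue; `ready` counts
 * fragments waiting to be received. */
struct rdma_queue {
    int ready;
};

/* Takes at most one fragment from every queue, visiting all of them even
 * after the first hit. Returns the number of fragments received. */
static int poll_all_rdma(struct rdma_queue *queues, size_t n)
{
    int received = 0;

    for (size_t i = 0; i < n; i++) {
        assert(queues[i].ready >= 0);
        if (queues[i].ready > 0) {
            queues[i].ready--;
            received++;
        }
    }
    return received;
}

/* One progress pass: the send/receive queues are skipped only after
 * every RDMA queue has been polled. */
static int progress(struct rdma_queue *queues, size_t n, int *polled_sendrecv)
{
    int received = poll_all_rdma(queues, n);

    *polled_sendrecv = (received == 0);
    return received;
}
```

Returning at the first non-empty queue instead would let a busy peer early in the array starve the peers after it, which is the degenerate case the commit message describes.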
Galen Shipman
218a438509
finished the ompi_free_list_t class nightmare..
...
This commit was SVN r10314.
2006-06-12 22:09:03 +00:00
Jeff Squyres
a4030ad2d9
Improve the tremendously unhelpful MCA help message for the
...
btl_openib_ib_mtu and btl_mvapi_ib_mtu MCA params by showing the valid
values and what they represent (got a question about this from Cisco
testing engineers).
This commit was SVN r10277.
2006-06-09 18:02:45 +00:00
Galen Shipman
cc54b07aa0
add better error messages for vapi retry exceeded errors.
...
This commit was SVN r10219.
2006-06-06 02:04:56 +00:00
Galen Shipman
9e6e7575b9
doh... add the file..
...
This commit was SVN r10210.
2006-06-05 21:24:42 +00:00
Galen Shipman
f05dee0435
add help file to explain why things went south..
...
This commit was SVN r10209.
2006-06-05 21:23:45 +00:00
Galen Shipman
74c97fb784
cleanup error reporting.. use ompi_proc_t->proc_name if available this gives
...
us source/dest hostnames for communication errors..
This goes to 1.1 branch (reviewed by Brian)..
This commit was SVN r10200.
2006-06-05 20:02:41 +00:00
Galen Shipman
0344ae4ac5
Fix to allow eager limit and max send size to be any size (within resource limitations). Instead of storing the ompi_free_list_t * in the fragment, we use the frag type enum, this tells us where the frag came from and where it should return.. This could also be done in mvapi but is not a high priority moving forward..
...
Review by Brian, needs to hit the trunk + 1.1 release..
This commit was SVN r10157.
2006-06-01 02:32:18 +00:00
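The idea in r10157 of tagging each fragment with a type instead of a pointer to its free list can be sketched like this. All names are hypothetical, and the free lists are reduced to counters; the point is only that the enum value is enough to find the list a fragment came from.

```c
#include <assert.h>

/* Hypothetical fragment types, one per free list. */
enum frag_type { FRAG_EAGER, FRAG_MAX, FRAG_TYPE_COUNT };

struct free_list {
    int n_free;
};

/* The fragment records where it came from as a small enum rather than
 * as a pointer to the list. */
struct frag {
    enum frag_type type;
};

struct module {
    struct free_list lists[FRAG_TYPE_COUNT];
};

/* Returns the fragment to the list selected by its type. */
static void frag_return(struct module *m, const struct frag *f)
{
    assert(f->type < FRAG_TYPE_COUNT);
    m->lists[f->type].n_free++;
}
```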
Brian Barrett
5163f2b296
Fix for bug #36. The MX, MVAPI, and OpenIB components don't have
...
support for progress threads, so we shouldn't build them or try to use
them when support for progress threads has been requested. The TCP, GM,
SELF, and SM BTLs should have progress thread support, so they aren't
disabled. The Portals BTL isn't compiled on platforms with threads,
so it doesn't need to be updated.
This commit was SVN r10156.
2006-06-01 01:30:16 +00:00
Gleb Natapov
f590d8a190
fix eager RDMA on PPC64.
...
This commit was SVN r10059.
2006-05-25 11:05:12 +00:00
Gleb Natapov
0c34d5c9e6
fix endpoint matching in on demand connection establishment. This fix is in mvapi btl already.
...
This commit was SVN r9855.
2006-05-09 12:12:52 +00:00
Tim Woodall
6523c12e4b
- decrease eager limit to 12K (improves latency)
...
- trigger event library while setting up connections
This commit was SVN r9645.
2006-04-14 22:28:05 +00:00
Tim Woodall
c6489cb5aa
- turn on eager rdma by default
...
This commit was SVN r9641.
2006-04-14 21:11:14 +00:00
Gleb Natapov
98282a3567
fix spelling. threashold -> threshold.
...
This commit was SVN r9577.
2006-04-08 08:13:37 +00:00
Gleb Natapov
b6ab1f4262
fix compilation warnings.
...
This commit was SVN r9515.
2006-04-02 11:32:25 +00:00
Gleb Natapov
79bcfb096f
Add type to frag. Sometimes we need to know that a frag is from short rdma area.
...
I used a hack for this that doesn't work for mvapi, so I am changing it to something more sane.
This commit was SVN r9477.
2006-03-30 15:26:21 +00:00
Gleb Natapov
590c992a7e
fix recursive lock of openib_btl->ib_lock.
...
This commit was SVN r9427.
2006-03-26 15:02:43 +00:00
Gleb Natapov
01a119c3c5
fix compilation bug with --enable-mpi-threads
...
This commit was SVN r9426.
2006-03-26 13:24:10 +00:00
Gleb Natapov
a5a78b10cc
Implementation of short message RDMA. Endpoint registers a circular buffer and sends its address and rkey to the peer. Peer uses this buffer to eagerly RDMA small messages into it. Endpoint polls the buffer for message arrival before checking HP/LP QPs. Set btl_openib_use_eager_rdma to 1 to enable it.
...
This commit was SVN r9425.
2006-03-26 08:30:50 +00:00
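The polling side of the scheme in r9425 can be simulated in plain C. This is a sketch under strong assumptions: an ordinary memory copy stands in for the peer's RDMA write, the slot layout, sizes and names are invented, and registration, rkey exchange and memory ordering between the network adapter and the CPU are ignored entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 4
#define SLOT_BYTES 32

struct slot {
    unsigned char payload[SLOT_BYTES];
    uint32_t len;
    volatile uint8_t ready;   /* written last by the sender */
};

struct eager_ring {
    struct slot slots[RING_SLOTS];
    unsigned head;            /* next slot the receiver will poll */
};

/* Stand-in for the peer's RDMA write into slot idx. Returns 0 on success,
 * -1 if the slot is still in use or the message is too large. */
static int simulated_rdma_write(struct eager_ring *ring, unsigned idx,
                                const void *buf, uint32_t len)
{
    struct slot *s;

    assert(idx < RING_SLOTS);
    s = &ring->slots[idx];
    if (s->ready || len > SLOT_BYTES) {
        return -1;
    }
    memcpy(s->payload, buf, len);
    s->len = len;
    s->ready = 1;
    return 0;
}

/* Receiver side: check the head slot; on arrival copy the message out,
 * release the slot, and advance. Returns 1 if a message was taken. */
static int ring_poll(struct eager_ring *ring, void *out, uint32_t *len)
{
    struct slot *s = &ring->slots[ring->head];

    if (!s->ready) {
        return 0;
    }
    memcpy(out, s->payload, s->len);
    *len = s->len;
    s->ready = 0;
    ring->head = (ring->head + 1 == RING_SLOTS) ? 0 : ring->head + 1;
    return 1;
}
```

Polling costs one memory read per pass when nothing has arrived, which is why it is cheap enough to run before the queue pairs are checked.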
Tim Woodall
712468dbef
add diagnostic interface
...
This commit was SVN r9328.
2006-03-17 17:39:41 +00:00
Galen Shipman
440417e92c
Add max_btls option
...
This commit was SVN r9263.
2006-03-13 17:03:21 +00:00
Galen Shipman
e58b758031
standardize behavior of btl_alloc, if the size is larger than the max send
...
size, btl_alloc returns NULL.
This commit was SVN r9114.
2006-02-22 17:37:59 +00:00
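The contract standardized in r9114 amounts to a size check before allocation. The function below is a hypothetical stand-in, not the BTL's btl_alloc: it uses malloc in place of the BTL's fragment lists and takes the limit as a plain argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator with the same contract: requests larger than
 * max_send_size are refused with NULL instead of being truncated. */
static void *demo_btl_alloc(size_t size, size_t max_send_size)
{
    assert(max_send_size > 0);
    if (size > max_send_size) {
        return NULL;
    }
    return malloc(size);
}
```

Callers therefore have to treat NULL as "split the message or pick another path", not only as an out-of-memory condition.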
Brian Barrett
566a050c23
Next step in the project split, mainly source code re-arranging
...
- move files out of toplevel include/ and etc/, moving it into the
sub-projects
- rather than including config headers with <project>/include,
have them as <project>
- require all headers to be included with a project prefix, with
the exception of the config headers ({opal,orte,ompi}_config.h
mpi.h, and mpif.h)
This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Galen Shipman
c8045bf397
Fixup for ORTE datatype checkin,
...
- use appropriate header files
- change calls from orte_dps to orte_dss
This commit was SVN r8920.
2006-02-07 15:20:44 +00:00
Tim Woodall
a2fde48f2f
changes from release branch
...
This commit was SVN r8858.
2006-01-31 16:17:18 +00:00
Tim Woodall
bcd6c525f8
removed duplicate locks
...
This commit was SVN r8857.
2006-01-31 16:12:37 +00:00
Tim Woodall
e861158fcd
- removed debug code
...
- removed extraneous memset
This commit was SVN r8798.
2006-01-24 23:38:41 +00:00
Galen Shipman
d657052510
misc cleanup..
...
This commit was SVN r8731.
2006-01-18 16:20:50 +00:00
Galen Shipman
84a09e4f4e
use #if not #ifdef..
...
This commit was SVN r8720.
2006-01-17 21:07:34 +00:00
Galen Shipman
0c81c0a6ce
use ibv_get_device_list if present
...
(submitted from roland)
This commit was SVN r8712.
2006-01-17 16:23:35 +00:00
Tim Woodall
a584c60dbe
re-worked flow control logic to take into account the return
...
of credits from the peer prior to local completion, so that
we don't overrun the number of send wqes available.
This commit was SVN r8683.
2006-01-12 23:42:44 +00:00
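The problem addressed in r8683 is that a peer can hand back credits before the local send completion has been reaped, so credits alone do not prove that a send work queue entry is free. A minimal sketch of accounting that tracks both limits separately is shown below; the endpoint structure and function names are hypothetical, and the actual logic in the BTL may differ.

```c
#include <assert.h>

/* Hypothetical endpoint state: credits granted by the peer and send
 * work queue entries (WQEs) that are free locally. */
struct endpoint {
    int credits;
    int free_wqes;
};

/* A send needs both a credit and a free WQE. Returns 0 if posted,
 * -1 if the send has to wait. */
static int endpoint_post_send(struct endpoint *ep)
{
    assert(ep->credits >= 0 && ep->free_wqes >= 0);
    if (ep->credits == 0 || ep->free_wqes == 0) {
        return -1;
    }
    ep->credits--;
    ep->free_wqes--;
    return 0;
}

/* The peer returned credits; this does not free any WQE. */
static void endpoint_credits_returned(struct endpoint *ep, int n)
{
    ep->credits += n;
}

/* A local send completion was reaped; only now is the WQE reusable. */
static void endpoint_send_completed(struct endpoint *ep)
{
    ep->free_wqes++;
}
```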
Tim Woodall
63d0438991
merge in changes from release branch
...
This commit was SVN r8637.
2006-01-04 16:34:45 +00:00