1
1
Граф коммитов

139 Коммитов

Автор SHA1 Сообщение Дата
Gleb Natapov
9b93f48e22 fix compile warnings in previous commit.
This commit was SVN r11554.
2006-09-07 13:31:50 +00:00
Gleb Natapov
d0caffa0aa Consolidate receive buffers prepost code for HP/LP QPs.
This commit was SVN r11552.
2006-09-07 13:05:41 +00:00
Gleb Natapov
298c825592 Remove #if OMPI_MCA_BTL_OPENIB_HAVE_SRQ. Always compile SRQ.
This commit was SVN r11537.
2006-09-06 05:45:37 +00:00
Gleb Natapov
4f05da21a1 remove unused macro
This commit was SVN r11536.
2006-09-06 05:36:26 +00:00
Gleb Natapov
424e412391 Make eager rdma work with SRQ enabled.
This commit was SVN r11530.
2006-09-05 16:04:04 +00:00
Gleb Natapov
c13240a1d1 remove rdma_credits from openib BTL header. Use one field for regular and rdma credits.
This commit was SVN r11529.
2006-09-05 16:02:09 +00:00
Gleb Natapov
fe932ca7bf consolidate part of HP/LP fields.
This commit was SVN r11528.
2006-09-05 16:00:18 +00:00
Gleb Natapov
b6bac100b0 Move error path out of the way.
This commit was SVN r11527.
2006-09-05 15:59:02 +00:00
Gleb Natapov
ffe7051488 fix compilation warnings.
This commit was SVN r11524.
2006-09-05 09:16:22 +00:00
Jeff Squyres
7fe337ce3b Yoinks -- remove some debugging output.
This commit was SVN r11515.
2006-09-01 11:48:26 +00:00
Jeff Squyres
6c2e938d31 Update the comment to reflect that this can now be a comma-delimited list.
This commit was SVN r11507.
2006-08-30 20:28:48 +00:00
Jeff Squyres
91bdbc0673 This commit fixes a few things. It looks bigger than it is because a
bunch of code changed indenting level and some code got moved out of
one function and made into its own subroutine.

- Gleb pointed out that I wasn't taking into account values from the
  default section of the INI file (and not finding values in the INI
  file is not an error).
- I incorrectly thought that 0x5ad was Mellanox's vendor ID.  Turns
  out that 0x5ad is Cisco's ID, while 0x2c9 is Mellanox.
  Specifically, Cisco burns its own firmware into the HCA which
  replaces the vendor ID, although the part ID stays the same.  So
  it's Mellanox hardware with Cisco firmware.  And apparently several
  of us do that.  :-)  So I expanded the concept of the vendor_id in
  the INI file to allow for lists of vendor IDs.  
- Along with that, I updated the default INI file to list all the IB
  vendors (that I am aware of -- certainly open to putting more data
  in there from other vendors) who overwrite Mellanox's vendor_id with
  their own for the part numbers that we have on file.

This commit was SVN r11506.
2006-08-30 20:21:47 +00:00
Jeff Squyres
9e2488bfe9 George found a great way to avoid warnings from flex for that unused
function.  Woo hoo!

This commit was SVN r11469.
2006-08-28 13:44:37 +00:00
Gleb Natapov
c70eb43e43 Align eager RDMA buffer so that last byte of the buffer is on the last byte of
the CPU cache line. Improves zero byte latency a little bit because of L1 cache
miss reduction.

This commit was SVN r11465.
2006-08-28 11:03:56 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
Gleb Natapov
21e99cd334 init mtu parameter when no warn is set.
This commit was SVN r11388.
2006-08-24 10:42:42 +00:00
Galen Shipman
fbf7e9cf1c use int32_t's not size_t's (interface change in ORTE)..
This commit was SVN r11322.
2006-08-22 16:26:36 +00:00
Galen Shipman
e5c594c211 More updates for the async error handler for btl's
In order to provide backwards compatability the framework versions are bumped
and the handler registeration function is at the end of the btl struct.
Testing done on sm, openib, and gm.. 

This commit was SVN r11256.
2006-08-17 22:02:01 +00:00
Galen Shipman
3b49953ce2 Add error callback to the btl interface, this allows error to be delivered to
the upperlayer assynchronously although there are some issues with this.. such
as there are multiple consumers of the btl's.. who get's the

This commit was SVN r11232.
2006-08-16 20:21:38 +00:00
Jeff Squyres
474564a6b1 Bring over all the work from the /tmp/ib-hw-detect branch. In
addition to my design and testing, it was conceptually approved by
Gil, Gleb, Pasha, Brad, and Galen.  Functionally [probably somewhat
lightly] tested by Galen.  We may still have to shake out some bugs
during the next few months, but it seems to be working for all the
cases that I can throw at it.

Here's a summary of the changes from that branch: 

* Move MCA parameter registration to a new file (btl_openib_mca.c):
   * Properly check the retun status of registering MCA params
   * Check for valid values of MCA parameters
   * Make help strings better
   * Otherwise, the only default value of an MCA param that was
     changed was max_btls; it went from 4 to -1 (meaning: use all
     available)
 * Properly prototyped internal functions in _component.c
   * Made a bunch of functions static that didn't need to be public
   * Renamed to remove "mca_" prefix from static functions
   * Call new MCA param registration function
   * Call new INI file read/lookup/finalize functions
   * Updated a bunch of macros to be "BTL_" instead of "ORTE_"
   * Be a little more consistent with return values
   * Handle -1 for the max_btls MCA param
   * Fixed a free() that should have been an OBJ_RELEASE()
   * Some re-indenting
 * Added INI-file parsing
   * New flex file: btl_openib_ini.l
   * New default HCA params .ini file (probably to be expanded over
     time by other HCA vendors)
   * Added more show_help messages for parsing problems
   * Read in INI files and cache the values for later lookup
   * When component opens an HCA, lookup to see if any corresponding
     values were found in the INI files (ID'ed by the HCA vendor_id
     and vendor_part_id)
   * Added btl_openib_verbose MCA param that shows what the INI-file
     stuff does (e.g., shows which MTU your HCA ends up using)
   * Added btl_openib_hca_param_files as a colon-delimited list of INI
     files to check for values during startup (in order,
     left-to-right, just like the MCA base directory param).
   * MTU is currently the only value supported in this framework.
   * It is not a fatal error if we don't find params for the HCA in
     the INI file(s).  Instead, just print a warning.  New MCA param
     btl_openib_warn_no_hca_params_found can be used to disable
     printing the warning.
 * Add MTU to peer negotiation when making a connection
   * Exchange maximum MTU; select the lesser of the two

This commit was SVN r11182.
2006-08-14 19:30:37 +00:00
Galen Shipman
f7015abb92 set the inline_max to something.. doh..
This commit was SVN r11133.
2006-08-08 17:24:12 +00:00
Galen Shipman
c93711cfdb checking for max_inline_data == 0 as an error condition is not valid,, so
don't do it.. 

This commit was SVN r11132.
2006-08-08 16:53:47 +00:00
Brian Barrett
9c30aefff5 * constant is always defined -- use #if, not #ifdef
This commit was SVN r11089.
2006-08-02 18:37:41 +00:00
Galen Shipman
fb9210463f clarify assignment..
This commit was SVN r11065.
2006-07-31 20:54:54 +00:00
Galen Shipman
ce0b8d9b48 cleanup of cq/srq sizing..
This commit was SVN r11061.
2006-07-31 17:24:39 +00:00
Galen Shipman
c9e0eda190 Initialize the completion queue to a reasonable size based on maximum number
of send/receives outstanding.

Use ibv_cq_resize if available after initial creation of completion queue if
cq_size is too small (based on number of peers). 

This commit was SVN r11053.
2006-07-30 00:58:40 +00:00
Gleb Natapov
72575d81d2 Create separate pool for control messages. It is unlimited, but the maximum number of element that are allocated from it is limited by number of connections.
This commit was SVN r11028.
2006-07-27 14:09:30 +00:00
Gleb Natapov
4b605295b3 remove unused field.
This commit was SVN r10965.
2006-07-24 06:12:16 +00:00
Gleb Natapov
3b34dc8df8 remove MCA_BTL_IB_FRAG_ALIGN. Alignment is handled in free_list_t.
This commit was SVN r10945.
2006-07-23 12:33:49 +00:00
Gleb Natapov
91f48f9a79 Merge with gleb-pml branch. Add out of resource handling support to PML layer.
If resource is not available request is added to one of the pending list and retried later.

This commit was SVN r10900.
2006-07-20 14:44:35 +00:00
Gleb Natapov
383694c68d Add support to get alignemnt buffers from free_list_t. Convert openib BTL to new interface.
This commit was SVN r10899.
2006-07-20 14:39:05 +00:00
Gleb Natapov
e05ec69dc4 print "flush error" only once.
This commit was SVN r10672.
2006-07-06 08:03:01 +00:00
Gleb Natapov
9b0807e547 Put pending fragment on the right waiting list.
This commit was SVN r10671.
2006-07-06 07:51:23 +00:00
Galen Shipman
7e079d20ab fix for stupid casting.. addresses issue on PPC64 where sizes get set
improperly and badness ensues..

This commit was SVN r10574.
2006-06-29 21:58:50 +00:00
Gleb Natapov
c8f75c472a remove modulo op from fast path. Improvement 0.02-0.04ms.
This commit was SVN r10538.
2006-06-28 12:00:47 +00:00
Gleb Natapov
e58a89ef3e OMPI_ENABLE_DEBUG is always defined (to 0 or 1). Use #if and nto #ifdef.
This commit was SVN r10537.
2006-06-28 11:25:09 +00:00
Gleb Natapov
704a5eb645 Support for LMC (lid mask count) and multiple QPs per port.
This commit was SVN r10536.
2006-06-28 07:23:08 +00:00
Jeff Squyres
df45221a3e Until a real fix for #142 is found, this workaround prohibits using
mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and ultiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1.  If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).

This commit was SVN r10521.
2006-06-27 10:43:03 +00:00
Gleb Natapov
52208d7bf9 Whe don't need to register zero sized frags.
This commit was SVN r10519.
2006-06-27 08:50:12 +00:00
Galen Shipman
8855e5b73a Fixes for DR as well as better diagnostic..
Successfully passing the intel test suite with/without induced errors/drops. 

This commit was SVN r10518.
2006-06-26 22:29:29 +00:00
Gleb Natapov
b7715395cb Return descriptor before sending credits one more time. We may need it.
This commit was SVN r10495.
2006-06-26 07:05:58 +00:00
Jeff Squyres
1d27ca5d0a Until a real fix for #142 is found, this workaround prohibits using
mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and ultiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1.  If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).

This commit was SVN r10424.
2006-06-20 11:32:46 +00:00
Jeff Squyres
600bf4295a Update the help message to be slightly more concise and clear
This commit was SVN r10422.
2006-06-20 11:23:38 +00:00
Brian Barrett
3d027e57a8 * fix for ticket #141. If we are going to shortcut out of polling the
send/receive queues if there is something available in the short message
  rdma queues, then we have to poll *ALL* the rdma queues before exiting,
  or we aren't fair about frag reception and fall into degenerate matching
  cases.

This commit was SVN r10410.
2006-06-17 21:32:25 +00:00
Galen Shipman
218a438509 finished the ompi_free_list_t class nightmare..
This commit was SVN r10314.
2006-06-12 22:09:03 +00:00
Jeff Squyres
a4030ad2d9 Improve the tremendously unhelpful MCA help message for the
btl_openib_ib_mtu and btl_mvapi_ib_mtu MCA params by showing the valid
values what what they represent (got a question about this from Cisco
testing engineers).

This commit was SVN r10277.
2006-06-09 18:02:45 +00:00
Galen Shipman
cc54b07aa0 add better error messages for vapi retry exceeded errors.
This commit was SVN r10219.
2006-06-06 02:04:56 +00:00
Galen Shipman
9e6e7575b9 doh... add the file..
This commit was SVN r10210.
2006-06-05 21:24:42 +00:00
Galen Shipman
f05dee0435 add help file to explain why things went south..
This commit was SVN r10209.
2006-06-05 21:23:45 +00:00
Galen Shipman
74c97fb784 cleanup error reporting.. use ompi_proc_t->proc_name if available this gives
us source/dest hostnames for communication errors.. 

This goes to 1.1 branch (reviewed by Brian).. 

This commit was SVN r10200.
2006-06-05 20:02:41 +00:00