1
1
Граф коммитов

1170 Коммитов

Автор SHA1 Сообщение Дата
Jon Mason
3970c3ff6c Add Chelsio T3 to ompi/mca/btl/openib/mca-btl-openib-hca-params.ini
This commit was SVN r17101.
2008-01-09 22:14:18 +00:00
Jon Mason
597c7e68f1 Minor cleanups
This commit was SVN r17100.
2008-01-09 21:54:11 +00:00
Rolf vandeVaart
870fa8b1f1 Pad the sm btl header to double-word alignment. Preserves PML
header as double-word aligned and prevents bus errors on SPARC
based servers.  This is part of fix for #1148.

Refs trac:1148

This commit was SVN r17090.

The following Trac tickets were found above:
  Ticket 1148 --> https://svn.open-mpi.org/trac/ompi/ticket/1148
2008-01-09 18:50:51 +00:00
Gleb Natapov
25ce70bb92 Call mca_btl_openib_endpoint_post_send() holding endpoint lock and not holding
qp lock since this is what the function assumes.

This commit was SVN r17086.
2008-01-09 14:46:41 +00:00
Pavel Shamis
99f51482e3 Fixing openib finalization flow.
This commit was SVN r17085.
2008-01-09 12:36:30 +00:00
Gleb Natapov
51d6ca0cb6 Provide no lock version of mca_btl_openib_endpoint_post_rr(). On connection
creation we call it with endpoint lock already held.

This commit was SVN r17084.
2008-01-09 10:39:35 +00:00
Gleb Natapov
50af6b9e78 Rearrange functions order so that functions are defined before they are used. No
code changes here.

This commit was SVN r17083.
2008-01-09 10:27:15 +00:00
Gleb Natapov
621fa223c5 Create free lists of fragments per HCA, not per BTL. Saves memory in case of
multiple LMCs.

This commit was SVN r17082.
2008-01-09 10:26:21 +00:00
Gleb Natapov
5ce3213158 Rearrange functions order so that functions are defined before they are used. No
code changes here.

This commit was SVN r17081.
2008-01-09 10:05:41 +00:00
Pavel Shamis
fbf7bcd9a9 We need to prepost on srq/xrc before reply with ENDPOINT_XOOB_CONNECT_XRC_RESPONSE.
This commit was SVN r17066.
2008-01-08 10:30:16 +00:00
Rolf vandeVaart
0f0fde3490 Partial fix for #1148. Enable this for 32-bit sparc as well as 64-bit sparc.
This commit was SVN r17059.
2008-01-07 15:43:44 +00:00
Gleb Natapov
c3bbf69356 Set send_flags correctly in btl_openib_put. Otherwise we may reuse flags from
previous use of the buffer and they may be incorrect.

This commit was SVN r17058.
2008-01-07 10:19:07 +00:00
George Bosilca
48f5a26e8c Cast to keep VC happy (quiet).
This commit was SVN r17054.
2008-01-04 23:13:32 +00:00
Jeff Squyres
a234ba198a Remove superflous / unused -D from Makefile.am.
This commit was SVN r17030.
2008-01-02 18:00:20 +00:00
Jeff Squyres
c9bea80f8f Fix unbalanced parenthesees noticed by Paul Hargove.
This commit was SVN r17029.
2008-01-02 13:34:07 +00:00
Gleb Natapov
2fb6947f88 Destroy endpoints that use eager rdma communication before destroying SRQ. Do't
skip async event thread destruction if SRQ was not destroyed, or it will segfault
on module removal.

This commit was SVN r17025.
2007-12-23 13:58:31 +00:00
Gleb Natapov
b06d92bdab OpenIB BTL has three channels through which data can be received (eager rdma,
high prio QPs and low prio QPs) and because not all of them are polled each time
progrgess() is called (to save on latency) starvation is possible. The commit
fixes this. Now each channel is polled, but higher priority channels are polled
more often. Three new parameters are introduced that control polling ratios 
between different channels.

This commit was SVN r17024.
2007-12-23 12:29:34 +00:00
Brad Penoff
4c2571b54c fixed more 64 bit SCTP BTL warnings
This commit was SVN r17022.
2007-12-21 21:50:00 +00:00
Brad Penoff
195faa37b6 fixed send side of 64 bit compilation warnings
This commit was SVN r17019.
2007-12-21 19:11:50 +00:00
Jeff Squyres
558d179e2e Fix typo.
This commit was SVN r17012.
2007-12-21 14:25:48 +00:00
George Bosilca
906e8bf1d1 Replace the ompi_pointer_array with opal_pointer_array. The next step
(sometimes after the merge with the ORTE branch), the opal_pointer_array
will became the only pointer_array implementation (the orte_pointer_array
will be removed).

This commit was SVN r17007.
2007-12-21 06:02:00 +00:00
Tim Mattox
bbeef5b84b Change the MX BTL's exclusivity to MCA_BTL_EXCLUSIVITY_DEFAULT,
so that it is higher than the new TCP BTL exclusivity as of r16942.

The portals BTL maintainer may want to do the same...

This commit was SVN r16995.

The following SVN revision numbers were found above:
  r16942 --> open-mpi/ompi@80e9730100
2007-12-19 21:24:45 +00:00
Pavel Shamis
fcbca510d8 The ib_inline_max should be updated only when SEND qp is created.
This commit was SVN r16973.
2007-12-17 10:30:30 +00:00
Gleb Natapov
f79e344ea4 Fix bug in debug build.
This commit was SVN r16972.
2007-12-17 10:26:18 +00:00
Gleb Natapov
64a95f63cd Fix error reporting in openib if parameter value is out of range.
This commit was SVN r16971.
2007-12-16 14:04:36 +00:00
Gleb Natapov
8b511b969d Introduce a new BTL parameter btl_rndv_eager_limit which determines size of a
first fragment of rendezvous protocol. Remove no longer used btl_min_send_size
parameter.

This commit was SVN r16969.
2007-12-16 08:35:17 +00:00
Jeff Squyres
213b5d5c6e Per long threads on the mailing list and much confusion discussion
about linkers, have all OPAL, ORTE, and OMPI components '''not'' link
against the OPAL, ORTE, or OMPI libraries.

See ttp://www.open-mpi.org/community/lists/users/2007/10/4220.php for
details (or https://svn.open-mpi.org/trac/ompi/wiki/Linkers for a
better-formatted version of the same info).

This commit was SVN r16968.
2007-12-15 13:32:02 +00:00
Brad Penoff
540d483dd3 64 bit fix and initial Solaris support
This commit was SVN r16967.
2007-12-15 03:28:10 +00:00
Donald Kerr
d05d3afaed clean up and make consistent the reporting out from the udapl btl; report out readeable event string instead of just a number
This commit was SVN r16954.
2007-12-13 15:32:26 +00:00
Brad Penoff
ecd563b0fa reduced noise for SCTP BTL on RHEL4U4
This commit was SVN r16951.
2007-12-13 03:15:29 +00:00
Jeff Squyres
80e9730100 Per http://www.open-mpi.org/community/lists/devel/2007/12/2698.php and
this thread:
http://www.open-mpi.org/community/lists/devel/2007/12/2807.php, set
TCP's exclusivity to LOW+100 and SCTP's exclusivity to LOW.

This commit was SVN r16942.
2007-12-12 15:55:37 +00:00
Jon Mason
e05cd7b0e4 To modify the default connection method, a "btl_openib_connect <arg>"
should be passed via commandline.  However, there is a slight coding
bug in the openib connect code.  When registering the name of the
option, mca_base_param_reg_string will prepend the relevant info
("btl_openib_" in this case).  The existing code will require
"btl_openib_btl_openib_connect" instead of "btl_openib_connect".
This patch corrects this.

This commit was SVN r16937.
2007-12-11 20:36:36 +00:00
Galen Shipman
a04d21b459 Make CNL compile again..
This commit was SVN r16929.
2007-12-11 16:14:30 +00:00
Gleb Natapov
2a59b2a68f 1. Set segments length in prepare_src() after packing because actual size may be
smaller then allocated size.

2. If reserve zero don't allocate coalesced frag since it will be RDMAed, not
send.  The logic was other way around.

This commit was SVN r16928.
2007-12-11 13:10:52 +00:00
Jon Mason
df82fcb917 Slight word usage and grammar error in the openib btl help test. I
believe the change below is the intended meaning.

This commit was SVN r16921.
2007-12-10 21:50:48 +00:00
Donald Kerr
a604fca52c follow on change to r16901 and r16898; the interface change mca_btl_udapl_alloc() was not applied to two locations in this file
This commit was SVN r16918.

The following SVN revision numbers were found above:
  r16898 --> open-mpi/ompi@7364b7cf47
  r16901 --> open-mpi/ompi@e2e211f23b
2007-12-10 18:10:52 +00:00
Gleb Natapov
17611dafbe Fix pointer casting on 32bit machines.
This commit was SVN r16907.
2007-12-09 14:15:35 +00:00
Gleb Natapov
2f9c5b46cf Return OMPI_ERR_RESOURCE_BUSY from openib_btl_send() if fragment is not on wire.
This commit was SVN r16906.
2007-12-09 14:14:11 +00:00
Gleb Natapov
493951e09d Add heterogeneous support to message coalescing.
This commit was SVN r16903.
2007-12-09 14:10:25 +00:00
Gleb Natapov
b4698dc6df Use flags provided during allocation to coalesce to correct priority queue.
This commit was SVN r16902.
2007-12-09 14:08:55 +00:00
Gleb Natapov
e2e211f23b Add flags parameter to btl_alloc() and btl_prepare_src() functions. If BTL
knows at the time of allocation priority of a descriptor it may do some
optimizations.

This commit was SVN r16901.
2007-12-09 14:08:01 +00:00
Gleb Natapov
5313a2baa7 Message coalescing for openib BTL. If fragment is waiting to be transmitted in
a pending queue pack another message into it if there is enough space there.

This commit was SVN r16900.
2007-12-09 14:05:13 +00:00
Gleb Natapov
7302cd24eb Call btl_alloc() from btl_prepare_src() to have one point of frag allocation.
This commit was SVN r16899.
2007-12-09 14:02:32 +00:00
Gleb Natapov
7364b7cf47 Add endpoint parameter to btl_alloc() function. Enables various optimizations
inside BTL.

This commit was SVN r16898.
2007-12-09 14:00:42 +00:00
Gleb Natapov
de3761208a Send cm_seen by eager rdma channel. Encode qp index into credits filed. If
cm_seen is not send here non symmetric eager rdma connection may hang.

This commit was SVN r16896.
2007-12-09 13:56:13 +00:00
Tim Mattox
d188642715 Apparently the SCTP BTL has a btl_sctp_component.h file that needs to be
part of the "sources" list.  Hopefully this will clear of the nightly
tarball creation for the trunk.

This commit was SVN r16895.
2007-12-08 04:05:59 +00:00
Karl Mroz
71b54d8e4e Removed .ompi_ignore and .ompi_unignore from SCTP BTL.
This commit was SVN r16893.
2007-12-07 17:02:32 +00:00
Jon Mason
20294e7800 There is a double call to ompi_btl_openib_connect_base_open in
mca_btl_openib_mca_setup_qps().  It looks like someone just forgot to
clean-up the previous call when they added the check for the return
code.

I ran a quick IMB test over IB to verify everything is still working.

This commit was SVN r16870.
2007-12-06 17:25:38 +00:00
Pavel Shamis
e8aeadb11e XRC fixes:
- create separate xrc domain file for each hca
- return error if we failed to create xrc file.

This commit was SVN r16853.
2007-12-05 14:32:44 +00:00
Pavel Shamis
f60ca0e4e5 Removing unused mca_btl_openib_ib_address_status
This commit was SVN r16835.
2007-12-04 13:16:26 +00:00
Pavel Shamis
57728986f8 Fixing XRC multiport/multisubnet support.
This commit was SVN r16819.
2007-12-03 09:49:53 +00:00
Gleb Natapov
b2858236fb Use new free list interface.
This commit was SVN r16818.
2007-12-02 15:13:11 +00:00
Gleb Natapov
a774cd98f8 Put send completions to low prio CQ. Receive is more important.
This commit was SVN r16817.
2007-12-02 14:46:37 +00:00
Gleb Natapov
b17f5b7480 Change how default receive queues parameters are calculated. Current default
parameters don't make any sense. Credits are never piggybacked. Also make
default queue sizes to be calculated from eager_limit and max_send_size values.

This commit was SVN r16816.
2007-12-02 14:43:28 +00:00
Rich Graham
6e77414a68 changes to the ompi_free_list_ex - called ompi_free_list_ex_new, for now.
This commit was SVN r16803.
2007-11-29 21:18:37 +00:00
Jeff Squyres
8c0060701c Stub out the ibcm CPC.
This commit was SVN r16800.
2007-11-29 13:23:17 +00:00
Pavel Shamis
8aca6eb31b OFED 1.3 doesn't implement ibv_resize_cq for connectX.
On error exit from ibv_resize_cq we should to check if the function
is implemented.

This commit was SVN r16799.
2007-11-28 15:23:19 +00:00
Gleb Natapov
5f242c77f2 Post each recv wr not separately but in one call to ibv_post_recv().
This commit was SVN r16798.
2007-11-28 14:57:15 +00:00
Gleb Natapov
14cffee726 Uninline mca_btl_openib_post_srr() function.
This commit was SVN r16797.
2007-11-28 14:52:31 +00:00
Pavel Shamis
1c314ef4c3 If XRC qp was specified in btl_openib_receive_queues we automatically should
choose xoob connection module.

This commit was SVN r16796.
2007-11-28 10:33:32 +00:00
Pavel Shamis
488a508732 Removing comments from help file.
This commit was SVN r16795.
2007-11-28 10:16:08 +00:00
Pavel Shamis
3e2e4f6d2a Removing unused lid.
This commit was SVN r16794.
2007-11-28 10:06:57 +00:00
Pavel Shamis
aa79bdabc8 Removing port_touse - we don't really need it
This commit was SVN r16793.
2007-11-28 09:57:48 +00:00
Pavel Shamis
2ffbe8776a Fixing compilation problems in openib
This commit was SVN r16792.
2007-11-28 09:38:49 +00:00
Gleb Natapov
218adb2a96 Account for eager rdma credit fragments when creating send queue. Create XRC
receive QP with zero receive and send queue length. We don't going to use this
QP for send and receives a posted to SRQs.

This commit was SVN r16791.
2007-11-28 07:22:01 +00:00
Gleb Natapov
601952a952 Don't shared endpoint->qps array, only pointer to actual QP. Calculate send
queue size for shared QP based on all endpoints that want to use it.

This commit was SVN r16790.
2007-11-28 07:21:07 +00:00
Gleb Natapov
b46c9cc7bc Make xrc use srq_qp unions instead of the xrc_qp which is exactly like srq_qp.
This commit was SVN r16789.
2007-11-28 07:20:26 +00:00
Gleb Natapov
be0981fc07 Change a type of xrc_recv_qp to "struct ibv_qp".
This commit was SVN r16788.
2007-11-28 07:19:36 +00:00
Gleb Natapov
bd47da4699 Initial XRC support by Mellanox.
This commit was SVN r16787.
2007-11-28 07:18:59 +00:00
Gleb Natapov
b49788c499 Receive queue is not used in case of SRQ QP, so don't create one.
This commit was SVN r16786.
2007-11-28 07:17:22 +00:00
Gleb Natapov
923666b75c Process pending put/get frags on endpoint connection establishment.
This commit was SVN r16785.
2007-11-28 07:16:52 +00:00
Gleb Natapov
e502402470 Fix endpoint destructor to not skip closed endpoints.
This commit was SVN r16784.
2007-11-28 07:15:54 +00:00
Gleb Natapov
5a4e953aaa Allow share the same qp for different buffer sizes. Needed for XRC support.
This commit was SVN r16783.
2007-11-28 07:15:20 +00:00
Gleb Natapov
b123696d57 Fix async thread creation and destruction. Create async thread only when it is
needed instead of creating it and then canceling if it is not needed. Change
error handling during finalize so that it will not skip async thread
destruction. Otherwise async thread may segfault during openib module unloading.

This commit was SVN r16782.
2007-11-28 07:14:34 +00:00
Gleb Natapov
5463eb892c Send all explicit credits for PP QPs of all orders over smallest PP qp.
This commit was SVN r16781.
2007-11-28 07:13:34 +00:00
Gleb Natapov
a9f864d15c If there is an eager rdma credit, but there is no WQE to send a packet we add it
to a pending queue of eager rdma QP instead of correct pending list. This patch
fixes this by getting reed of "eager rdma qp" notion. Packet is always send
over its order QP. The patch also adds two pending queues for high and low prio
packets. Only high prio packets are sent over eager RDMA channel.

This commit was SVN r16780.
2007-11-28 07:12:44 +00:00
Gleb Natapov
6a2d210b7d Use OMPI object system to make fragment hierarchy more object oriented. The
main idea (except of cleanup) is to save on initialisation of unneeded fields
and to use C type checking system to catch obvious errors.

This commit was SVN r16779.
2007-11-28 07:11:14 +00:00
Gleb Natapov
267cd2342a Cleanup. Remove unused functions.
This commit was SVN r16778.
2007-11-28 07:08:56 +00:00
Ron Brightwell
924414f92f Added support for Accelerated Portals for the btl.
This commit was SVN r16771.
2007-11-21 21:34:17 +00:00
Brad Penoff
fb5536f11d conforming SCTP BTL to Open MPI naming conventions and IP requirements
This commit was SVN r16764.
2007-11-21 10:13:41 +00:00
Andrew Friedley
c50f2aa74c fix warning
This commit was SVN r16759.
2007-11-20 16:55:12 +00:00
Brad Penoff
ede8a6a7a1 adjusting for Linux when sctp_recvmsg returns 0 for remote close
This commit was SVN r16742.
2007-11-20 06:02:08 +00:00
Tim Prins
f42fcd36db make the mx btl compile again after the free list changes
This commit was SVN r16735.
2007-11-19 19:41:22 +00:00
Brad Penoff
f34ddfef80 for SCTP BTL, added Mac OS X support for systems using SCTP NKE (Network Kernel Extension)
This commit was SVN r16729.
2007-11-17 02:56:27 +00:00
Brad Penoff
5abd2d8064 initial SCTP BTL commit
This commit was SVN r16723.
2007-11-13 23:39:16 +00:00
Jeff Squyres
a4d571f8ad Fix typo that broke the build.
This commit was SVN r16635.
2007-11-02 09:19:55 +00:00
Rich Graham
27a748e7eb change all instances of ompi_free_list_init to ompi_free_list_init_new. Header
and payload data are specified separately at this stage.

This commit was SVN r16633.
2007-11-01 23:38:50 +00:00
Andrew Friedley
46516d98e1 Update MCA params -- sd_num_peer is no longer used, change rd_num_init to rd_num
This commit was SVN r16601.
2007-10-29 22:56:30 +00:00
Andrew Friedley
8273b61471 Bugfix for hangs in certain communication patterns, particularly alltoall.
This commit was SVN r16600.
2007-10-29 21:51:28 +00:00
George Bosilca
d67c0eefb4 Remove a compilation warning about using uninitialized variables.
This commit was SVN r16589.
2007-10-26 20:15:28 +00:00
George Bosilca
b1b5cb6453 Looks like SO_REUSEPORT it's not defined on some platforms. Switch
to the conventional SO_REUSEADDR instead.

This commit was SVN r16588.
2007-10-26 19:56:21 +00:00
George Bosilca
337f78a4a8 Restrict the port range for the OOB and the BTL. Each protocols (v4 and v6)
has his own range which is defined by a min value and a range. By default
there is no limitation on the port range, which is exactly the same
behavior as before.

This commit was SVN r16584.
2007-10-26 16:36:51 +00:00
Gleb Natapov
3a63eb6c17 Cleanup macro definitions.
This commit was SVN r16554.
2007-10-23 13:33:19 +00:00
Gleb Natapov
d836f3dbbe Remove unused macro.
This commit was SVN r16552.
2007-10-23 13:18:10 +00:00
Gleb Natapov
18ed60edeb Revert previous commit. There was no memory leak, the pointer is saved inside
free list for future use. This patch moves BTL initialization into separate
function too.

This commit was SVN r16551.
2007-10-23 12:57:45 +00:00
Gleb Natapov
657e544e02 Fix memory leak. Define init_data on a stack instead of allocation it each time.
This commit was SVN r16550.
2007-10-23 11:10:52 +00:00
Gleb Natapov
9e2d5acf8e Remove unused filed from openib fragment structure.
This commit was SVN r16549.
2007-10-23 07:38:29 +00:00
Gleb Natapov
63dde87076 If SM BTL cannot send fragment because the cyclic buffer is full put the
fragment on the pending list and send it later instead of spinning on
opal_progress().

This commit was SVN r16537.
2007-10-22 12:07:22 +00:00
Jeff Squyres
b7eeae0a74 Remove the mvapi BTL. Woo hoo!
This commit was SVN r16483.
2007-10-17 14:08:03 +00:00
Jeff Squyres
94b1e9cff9 Update to use BTL_VERBOSE and BTL_ERROR instead of opal_output'ing to
the mca_btl_base_output stream directly (and relying on it to be -1 if
we didn't want any output).

This commit was SVN r16449.
2007-10-15 17:53:02 +00:00
Rolf vandeVaart
3dd5196338 Remove the --mca btl_base_debug flag and clean up
the use of the --mca btl_base_verbose flag.  The
btl framework now matches all the other frameworks.
Slightly modify error messages for clarity.

This commit was SVN r16443.
2007-10-15 13:10:20 +00:00
George Bosilca
436b0f2a5b Way to many numbers in this uint32_t.
This commit was SVN r16437.
2007-10-12 13:11:55 +00:00
Jeff Squyres
3500376d9e Remove a warning about an unused label.
This commit was SVN r16429.
2007-10-11 16:38:37 +00:00
Galen Shipman
6a25a635de that shouldn't have slipped through..
This commit was SVN r16411.
2007-10-09 19:07:23 +00:00
Galen Shipman
6b051e255e already checked size.. no need to do it again..
This commit was SVN r16409.
2007-10-09 18:59:10 +00:00
Nysal Jan
b51d85fb3f Fix assertion failure "assert( 0 == btl_endpoint->endpoint_cache_length )" while executing mt_coll testcase.
This commit was SVN r16408.
2007-10-09 18:00:01 +00:00
Josh Hursey
7437f37e96 This commit contains the following:
* Fix some missing includes in a few places.
 * Add the cr_request() functionality to the BLCR CRS component.
   We are now dependent upon the 0.6.* series of BLCR.
 * Made the CR notification mechanism a registered function.
   This way we can have an OPAL-only version and it can be replaced at
   runtime with the ORTE version.
 * Add a 'opal_cr_allow_opal_only' parameter that will enable OPAL-only
   CR functionality when the user wants it. Default: Disabled.
 * Fix the placement of a checkpoint request check in MPI_Init
 * Pull the OPAL notification mechanism into the SnapC framework.
   * We no longer fork/exec the 'opal-checkpoint' command for local
   checkpointing, the Local coordinator in the orted does this directly.
   * The Local and Application coordinator talk together bypassing the OPAL
   notifiation mechanism.
   * Optimized the Local <-> App Coordinator communication.
   * Improved the structure used to track vpid_snapshots in the local coord.
 * Fix a race condition in which an application under heavy communication load
   may produce an inconsistent global checkpoint.

This commit was SVN r16389.
2007-10-08 20:53:02 +00:00
Jeff Squyres
f92154fc72 Gah -- ompi_info doesn't setup the connect pseudo component, so it'll
be NULL.  Ensure to protect for this.

This commit was SVN r16333.
2007-10-04 18:03:56 +00:00
Jeff Squyres
13fa7ae93e It's not necessary to link against all 3 libs (in fact, we shouldn't
do it -- let libtool pull them in via the .la file if it needs to)

This commit was SVN r16332.
2007-10-04 18:01:30 +00:00
Jeff Squyres
80ce974291 Fixes trac:1156: ensure to finalize the "connect" sub-component.
This commit was SVN r16330.

The following Trac tickets were found above:
  Ticket 1156 --> https://svn.open-mpi.org/trac/ompi/ticket/1156
2007-10-04 17:36:12 +00:00
Andrew Friedley
5be7f5e2dc fixes trac:1154
Check if an exclusion string (i.e. '-mca btl ^sm) was provided; if so OFUD just disables itself.

This commit was SVN r16307.

The following Trac tickets were found above:
  Ticket 1154 --> https://svn.open-mpi.org/trac/ompi/ticket/1154
2007-10-02 20:37:16 +00:00
Gleb Natapov
60af46d541 We have QP description in component structure, module structure and endpoint.
Each one of them has a field to store QP type, but this is redundant.
Store qp type only in one structure (the component one).

This commit was SVN r16272.
2007-09-30 16:14:17 +00:00
Gleb Natapov
9c04b127f5 Forget to put this fix in previous commit.
This commit was SVN r16271.
2007-09-30 15:33:20 +00:00
Gleb Natapov
3a15d645be Remove lcl_qp_attr from endpoint qp description. It is used during init only.
This commit was SVN r16270.
2007-09-30 15:29:35 +00:00
Gleb Natapov
c7105eadc7 Update Voltaire copyright.
This commit was SVN r16189.
2007-09-24 10:11:52 +00:00
Jeff Squyres
33955a0ed0 Oops -- when converted from uint to int, -1 (the default value,
meaning "infinite") is no longer larger than the minimum required
size.  So put in an appropriate test to ensure that "infinite" was not
requested. 

This commit was SVN r16142.
2007-09-17 19:28:21 +00:00
Jeff Squyres
130a272cec Fix some compiler warnings about signed/unsigned comparisons.
This commit was SVN r16139.
2007-09-17 13:08:45 +00:00
Jeff Squyres
6004e177e0 Fixes trac:1133: if you specify a max freelist size that is too small,
you'll get a helpful error message and the openib BTL will deactivate
itself.

This commit was SVN r16133.

The following Trac tickets were found above:
  Ticket 1133 --> https://svn.open-mpi.org/trac/ompi/ticket/1133
2007-09-14 21:42:56 +00:00
George Bosilca
617ff3a413 Add a MCA parameter for the ELAN MAP ID file.
Fix small memory bugs, and track the final segfault. Still some ork to do.

This commit was SVN r16117.
2007-09-12 21:25:35 +00:00
Shiqing Fan
a0660f4deb - Just some type casts.
This commit was SVN r16100.
2007-09-12 15:29:58 +00:00
Rainer Keller
a3b30749b0 - Only lock/unlock when using threads.
Basically revert this part of r16015.

This commit was SVN r16029.

The following SVN revision numbers were found above:
  r16015 --> open-mpi/ompi@435e7d80e9
2007-08-31 12:34:48 +00:00
Rainer Keller
9c1c345c07 - head_lock is an opal_atomic_lock_t...
This commit was SVN r16028.
2007-08-31 12:20:21 +00:00
Shiqing Fan
80fdd5e2a4 - Need to be exported.
This commit was SVN r16021.
2007-08-30 14:16:03 +00:00
Gleb Natapov
435e7d80e9 Remove rc parameter from MCA_BTL_SM_FIFO_WRITE() macro. It cannot fail in
current implementation.

This commit was SVN r16015.
2007-08-30 13:21:52 +00:00
Brian Barrett
59b22533f2 Enable RDMA for heterogeneous situations. Currently done by overloading
the ompi_convertor_need_buffers function to only return 0 if the convertor
is homogeneous (which it never does on the trunk, but does to on v1.2, but
that's a different issue).  Only enable the heterogeneous rdma code for
a btl if it supports it (via a flag), as some btls need some work for this
to work properly.  Currently only TCP and OpenIB extensively tested

This commit was SVN r15990.
2007-08-28 21:23:44 +00:00
Rich Graham
bc97d22182 remove tabs. Remove old code that was commented out.
This commit was SVN r15975.
2007-08-28 03:08:36 +00:00
Rich Graham
4d58f9aed7 Add comments. Move temporary receive object from a free list object to
a stack object.

This commit was SVN r15971.
2007-08-27 21:41:04 +00:00
Gleb Natapov
33196d972b post_send() function is called without endpoint lock held from explicit credits
update function so eager_rdma_remote.head have to be updated in a thread safe
manner.

This commit was SVN r15966.
2007-08-27 11:37:01 +00:00
Gleb Natapov
32a61c3bf2 Credit fragment is not protected properly from concurrent access. There is a
race that can prevent further explicit credits update from been sent. Fix the
race.

This commit was SVN r15965.
2007-08-27 11:34:59 +00:00
Brad Benton
ccda5c9c74 Modified the MCA_BTL_TCP_CONNECTED case in mca_btl_tcp_endpoint_send_handler()
to always first check for a NULL frag pointer before trying to send the
fragment.  This avoids an issue in multi-threaded execution in which 
multiple threads working on the same endpoint can result in a thread 
finding itself here with nothing to send.

This commit was SVN r15963.
2007-08-26 23:40:02 +00:00
Gleb Natapov
becf4aa9c9 ompi_pointer_array_get_size doesn't return how much elements are actually in an
array, so count them by ourselves.

This commit was SVN r15943.
2007-08-22 09:31:12 +00:00
Gleb Natapov
d8f3063895 Create only one CQ for all BTLs on the same HCA. Many BTLs can be created for
one HCA. Multiple ports, LMC, multiple BTLs per one LID. Having only one CQ for
all of them substantially reduce polling time.

This commit was SVN r15933.
2007-08-20 12:28:25 +00:00
Brad Benton
1ddba9ec65 Lock the endpoint before doing endpoint_state processing. This ensures
that the subsequent unlock is valid.

This commit was SVN r15890.
2007-08-16 18:11:29 +00:00
Tim Prins
5a795128af Change it so that different components in orte use unique rml tags
This commit was SVN r15881.
2007-08-16 14:02:35 +00:00
Jeff Squyres
d7c5fea096 * Fix problem caused by r15848: the test parser was looking for
semicolons but the new specitifcation string used colons.  The text
   parser now looks for colons.
 * Changed all opal_output() error messages to
   much-more-helpful/descriptive opal_show_help() messages.
 * A few minor style/indenting fixes

This commit was SVN r15850.

The following SVN revision numbers were found above:
  r15848 --> open-mpi/ompi@dd30597f39
2007-08-14 14:46:13 +00:00
Jeff Squyres
dd30597f39 Change the default receive_queues value per
http://www.open-mpi.org/community/lists/devel/2007/08/2100.php.

This commit was SVN r15848.
2007-08-13 21:51:05 +00:00
Jeff Squyres
50bae9c603 Bring in the modular-wireup stuff for the openib BTL (from
/tmp/jms-modular-wireup branch):

 * This commit moves all the openib BTL connection code out of
   btl_openib_endpoint.c and into a connect "pseudo-component" area,
   meaning that different schemes for doing OFA connection schemes can
   be chosen via function pointer (i.e., MCA parameter) at run-time.
 * The connect/connect.h file includes comments describing the
   specific interface for the connect pseudo-component.
 * Two pseudo-components are in this commit (more can certainly be
   added).
   * oob: use the same old oob/rml scheme for creating OFA connections
     that we've had forever; this now just puts the logic into this
     self-contained pseudo-component.
   * rdma_cm: a currently-empty set of functions (that currently
     return NOT_IMPLEMENTED) that will someday use the RDMA connection
     manager to make OFA connections.

This commit was SVN r15786.
2007-08-06 23:40:35 +00:00
Jeff Squyres
0fb8cf65a8 If you have an HCA with no active ports, we still create an mpool.
This mpool will have no btl module owner there was no btl created for
the HCA with no ports, but it will still be tracked in the mpool
framework (i.e., it's available).

If MPI_ALLOC_MEM is called by the app, one of two things will happen:

 1. if there's an HCA on the host with some active ports, the openib
    btl component will still be in the process space, and therefore
    the "mpool with no btl" (MWNB) module will still be able to call
    the reg/dereg functions, and all will be fine.  However, if
    MPI_FREE_MEM is never invoked to free the memory, bad things will
    happen during MPI_FINALIZE.  The pml is finalized, which finalizes
    all the btls.  The btls finalize all their mpools and all is fine.
    But later we close down the mpool framework which then finalizes
    any left over mpool modules, such as MWNB.  However, the openib
    BTL module functions that the MWNB was registered with are no
    longer in the process space, and it segv's while trying deregister
    the memory.
 2. if there are *no* HCA's on the host with active ports, then the
    openib btl will have been unloaded, and when the MWNM tries to
    register the memory, the functions it tries to call (in the openib
    btl) are no longer there, and we segv.

This commit was SVN r15735.
2007-08-01 20:53:34 +00:00
Gleb Natapov
758f932aa6 Handle credit in a thread safe manner. I am sure more work will have to be done
in this are.

This commit was SVN r15721.
2007-08-01 12:15:43 +00:00
Gleb Natapov
9c20d67301 1) Return IB header to it's previous size by using char for cm_seen field.
2) Allow to specify rd_win/rd_rsv parameters by user, but make them optional.

This commit was SVN r15719.
2007-08-01 12:10:56 +00:00
Gleb Natapov
2d9669a69d mca_btl_openib_endpoint_post_send() is called with endpoint lock held.
No need to call lock() in btl_openib_acquire_send_resources().

This commit was SVN r15678.
2007-07-30 09:03:08 +00:00
Jeff Squyres
cae00d1854 Passing NULL to pthread_exit() is verbotten.
This commit was SVN r15661.
2007-07-27 01:06:36 +00:00
Jeff Squyres
015fc08ff4 Remove the ib_static_rate MCA parameter; it will be replaced with a
dynamic mechanism to adjust the rate only if necessary (e.g., two
ports of differing speeds are connected).

This commit was SVN r15653.
2007-07-26 21:10:51 +00:00
Gleb Natapov
cce6bb478c Process message before reposting buffers. This way rd_posted should be
calculated properly.

This commit was SVN r15635.
2007-07-26 13:56:07 +00:00
Pavel Shamis
bda6f1a5cf Fixing compilation problem in openib btl progress thread.
This commit was SVN r15631.
2007-07-26 11:35:15 +00:00
Gleb Natapov
1f18b060ce If eager_rdma_local in not initialized credits and rd_win are zero and the
comparison is always true.

This commit was SVN r15629.
2007-07-26 07:53:35 +00:00
Jeff Squyres
e36038bb17 We know that --enable-progress-threads doesn't work. But this allows
it to at least compile.  If you actually get to the point of invoking
the openib btl progress thread, you'll get a big opal_output warning
that it is pretty much guaranteed not to work.

This commit was SVN r15628.
2007-07-26 00:58:56 +00:00
Galen Shipman
514811c50b cleanup btl.h comments
document the btl interface a bit better

This commit was SVN r15618.
2007-07-25 17:26:23 +00:00
Galen Shipman
438a56e0d7 update copyrights for ib_multifrag commit
This commit was SVN r15612.
2007-07-25 15:03:34 +00:00
Galen Shipman
325c184fb4 remove debugging "abort()"
fix a debugging assert

This commit was SVN r15611.
2007-07-25 14:51:19 +00:00
Jeff Squyres
f4b117957d Add MCA parameter to enable/disable Nagle's algorithm on the TCP BTL.
This commit was SVN r15606.
2007-07-25 12:21:00 +00:00
Donald Kerr
be0bf9c27d add a missing subroutine prototype
This commit was SVN r15590.
2007-07-24 21:07:57 +00:00
Jeff Squyres
f2a2b2c0f9 A little more error checking; clean up the invalid MCA help message
This commit was SVN r15589.
2007-07-24 20:57:40 +00:00
Gleb Natapov
5b7d3faedc Implement "credit management for credit messages" protocol. On each message a
sender piggybacks a number of credit messages it received from a peer. A number
of outstanding credit messages is limited. This is needed to never ever fall
back to HW flow control.

This commit was SVN r15580.
2007-07-24 15:19:51 +00:00
Gleb Natapov
45a7a0650b btl_openib_handle_incoming() is called from regular receive path and from
eager RDMA receive path and checks internally from where it was called from to
perform different tasks. Leave only common code in there and move other code
to appropriate places.

This commit was SVN r15579.
2007-07-24 13:23:08 +00:00
George Bosilca
0486e8949e Remove all warnings.
This commit was SVN r15570.
2007-07-23 21:06:25 +00:00
Donald Kerr
2df5576d1d add support for if_include/if_exclude mca parameter to allow selection of udapl registry interface adapters; reviewed by rolf van de vaart
This commit was SVN r15565.
2007-07-23 19:49:34 +00:00
George Bosilca
21a7670390 Update the elan BTL. Now we support the following protocols: send, put
and partially get.

This commit was SVN r15564.
2007-07-23 19:07:13 +00:00
Brian Barrett
5b9fa7e998 reapply r15517 and r15520, which were removed in r15527 so that I could get
the RML/OOB merge in slightly easier

This commit was SVN r15530.

The following SVN revision numbers were found above:
  r15517 --> open-mpi/ompi@41977fcc95
  r15520 --> open-mpi/ompi@9cbc9df1b8
  r15527 --> open-mpi/ompi@2d17dd9516
2007-07-20 02:34:29 +00:00
Brian Barrett
2d17dd9516 temporarily back our r15517 and 15520 so that I can get the RML / OOB changes
to cleanly apply

This commit was SVN r15527.

The following SVN revision numbers were found above:
  r15517 --> open-mpi/ompi@41977fcc95
2007-07-20 01:10:34 +00:00
Ralph Castain
41977fcc95 Remove the cellid field from the orte_process_name_t structure. This only affects a handful of files in itself, but...
Cleanup ALL instances of output involving the printing of orte_process_name_t structures using the ORTE_NAME_ARGS macro so that the number of fields and type of data match. Replace those values with a new macro/function pair ORTE_NAME_PRINT that outputs a string (using the new thread safe data capability) so that any future changes to the printing of those structures can be accomplished with a change to a single point.

Note that I could not possibly find outputs that directly print the orte_process_name_t fields, but only dealt with those that used ORTE_NAME_ARGS. Hence, you may still have a few outputs that bark during compilation. Also, I could only verify those that fall within environments I can compile on, so other environments may yield some minor warnings.

This commit was SVN r15517.
2007-07-19 20:56:46 +00:00
Pavel Shamis
d837f1446b It is work around for Ticket #1092.
It will prevent the error failure in openib finalize
but it doesn't resolve the actual issue. I guess that
oneside tests some how allocates memory (mpool?) and doesn't 
release it. Need to check it.

This commit was SVN r15488.
2007-07-18 18:02:13 +00:00
Gleb Natapov
45fcb45e31 Remove debug checks that produce lots of warnings during compilation.
This commit was SVN r15479.
2007-07-18 13:49:15 +00:00
Gleb Natapov
30b2183314 Remove debug output from a hot path.
This commit was SVN r15478.
2007-07-18 12:48:34 +00:00
Jeff Squyres
3bc940ac27 Fix three things from r15474 (thanks to Brian for noticing):
* bml.h had a change that introduced a variable named "_order" to
   avoid a conflict with a local variable.  The namespace starting
   with _ belongs to the os/compiler/kernel/not us.  So we can't start
   symbols with _.  So I replaced it with arg_order, and also updated
   the threaded equivalent of the macro that was modified.
 * in btl_openib_proc.c, one opal_output accidentally had its string
   reverted from "ompi_modex_recv..." to
   "mca_pml_base_modex_recv....".  This was fixed.
 * The change to ompi/runtime/ompi_preconnect.c was entirely
   reverted; it was an artifact of debugging.

This commit was SVN r15475.

The following SVN revision numbers were found above:
  r15474 --> open-mpi/ompi@8ace07efed
2007-07-18 11:38:06 +00:00
Jeff Squyres
8ace07efed This commit brings in two major things:
1. Galen's fine-grain control of queue pair resources in the openib
   BTL.
1. Pasha's new implementation of asychronous HCA event handling.

Pasha's new implementation doesn't take much explanation, but the new
"multifrag" stuff does.  

Note that "svn merge" was not used to bring this new code from the
/tmp/ib_multifrag branch -- something Bad happened in the periodic
trunk pulls on that branch making an actual merge back to the trunk
effectively impossible (i.e., lots and lots of arbitrary conflicts and
artifical changes).  :-(

== Fine-grain control of queue pair resources ==

Galen's fine-grain control of queue pair resources to the OpenIB BTL
(thanks to Gleb for fixing broken code and providing additional
functionality, Pasha for finding broken code, and Jeff for doing all
the svn work and regression testing).

Prior to this commit, the OpenIB BTL created two queue pairs: one for
eager size fragments and one for max send size fragments.  When the
use of the shared receive queue (SRQ) was specified (via "-mca
btl_openib_use_srq 1"), these QPs would use a shared receive queue for
receive buffers instead of the default per-peer (PP) receive queues
and buffers.  One consequence of this design is that receive buffer
utilization (the size of the data received as a percentage of the
receive buffer used for the data) was quite poor for a number of
applications.

The new design allows multiple QPs to be specified at runtime.  Each
QP can be setup to use PP or SRQ receive buffers as well as giving
fine-grained control over receive buffer size, number of receive
buffers to post, when to replenish the receive queue (low water mark)
and for SRQ QPs, the number of outstanding sends can also be
specified.  The following is an example of the syntax to describe QPs
to the OpenIB BTL using the new MCA parameter btl_openib_receive_queues:

{{{
-mca btl_openib_receive_queues \
     "P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32"
}}}

Each QP description is delimited by ";" (semicolon) with individual
fields of the QP description delimited by "," (comma).  The above
example therefore describes 4 QPs.

The first QP is:

    P,128,16,4

Meaning: per-peer receive buffer QPs are indicated by a starting field
of "P"; the first QP (shown above) is therefore a per-peer based QP.
The second field indicates the size of the receive buffer in bytes
(128 bytes).  The third field indicates the number of receive buffers
to allocate to the QP (16).  The fourth field indicates the low
watermark for receive buffers at which time the BTL will repost
receive buffers to the QP (4).

The second QP is:

    S,1024,256,128,32

Shared receive queue based QPs are indicated by a starting field of
"S"; the second QP (shown above) is therefore a shared receive queue
based QP.  The second, third and fourth fields are the same as in the
per-peer based QP.  The fifth field is the number of outstanding sends
that are allowed at a given time on the QP (32).  This provides a
"good enough" mechanism of flow control for some regular communication
patterns.

QPs MUST be specified in ascending receive buffer size order.  This
requirement may be removed prior to 1.3 release.

This commit was SVN r15474.
2007-07-18 01:15:59 +00:00
George Bosilca
c839694fb8 Dont print anything when the user requested a specific MX interface.
This commit was SVN r15426.
2007-07-14 00:04:50 +00:00
Galen Shipman
06b97cb267 fix template btl
This commit was SVN r15413.
2007-07-13 20:06:22 +00:00
Josh Hursey
d4d5a351c1 Silence a compiler warning when not using IPV6.
Also convert a few statements to conform to coding standard for Open MPI.

This commit was SVN r15407.
2007-07-13 16:38:36 +00:00
Josh Hursey
021249fa65 Use the new MCA metadata flag instead of 'false' for the newly added components
This commit was SVN r15400.
2007-07-13 14:39:17 +00:00
George Bosilca
8643f38adf Don't allow the BTL to be closed before the end of the process. Count the
number of times the BTLs are opened, and then don't remove them until
close was called the same number of times.

This commit was SVN r15376.
2007-07-11 22:21:04 +00:00
Brian Barrett
1f2942cf2a * Provide flag if the BTL can do RDMA, but requires a prepare_{src,dst}
that exactly describes the buffer to be used as the target of the
    operation
  * Use the above flag to disable components setting the flag from being
    used for real RDMA operations for the one-sided component (the
    BTLs will still be used for RDMA transfers for the PML and for
    send/receive communication for the OSC component) 

This commit was SVN r15375.
2007-07-11 21:21:40 +00:00
Jeff Squyres
8aa8a667da Use the OMPI version number for the component number, like all other
btl components.

This commit was SVN r15363.
2007-07-11 15:45:25 +00:00
Donald Kerr
88c9dfdf9f improve message to user when dat_ia_open fails
This commit was SVN r15362.
2007-07-11 15:20:35 +00:00
Andrew Friedley
87dd4bbd47 No idea how I did this.. thanks again to Jeff.
This commit was SVN r15345.
2007-07-10 20:37:42 +00:00
Brian Barrett
1d02b9e7b5 Fix a bunch of issues exposed by Ken Cain in getting Open MPI to work with
VxWorks.  Still some issues remaining, I'm sure.

Refs trac:1010

This commit was SVN r15320.

The following Trac tickets were found above:
  Ticket 1010 --> https://svn.open-mpi.org/trac/ompi/ticket/1010
2007-07-10 03:46:57 +00:00
George Bosilca
1200fa4ac5 The first version of the Elan BTL.
This commit was SVN r15319.
2007-07-09 21:03:13 +00:00
Jeff Squyres
cee9c214c7 Update the vendor ID list to include HP (0x1708). Thanks to Peter
Kjellstrom for pointing this out.

This commit was SVN r15316.
2007-07-09 20:09:31 +00:00
Brian Barrett
8b9e8054fd Move modex from pml base to general ompi runtime, sicne it's used by more
than just the PML/BTLs these days.  Also clean up the code so that it
handles the situation where not all nodes register information for a given
node (rather than just spinning until that node sends information, like
we do today).

Includes r15234 and r15265 from the /tmp/bwb-modex branch.

This commit was SVN r15310.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r15234
  r15265
2007-07-09 17:16:34 +00:00
Andrew Friedley
b212cf4dae Fix a signedness warning reported by Jeff/MTT.
This commit was SVN r15309.
2007-07-09 15:30:29 +00:00
Andrew Friedley
77038b65a8 Bring the UD BTL over to the trunk, named 'ofud'.
This commit was SVN r15298.
2007-07-05 23:42:54 +00:00
Sven Stork
21f12f29f8 - fix a sm bug that causes segfaults in the case of threaded builds.
The problem is that in the case of threaded builds for every fifo
  a head and tail lock will be allocated inside the shared memory
  segment and the ptr is stored inside the fifo. In the case that the sm backend
  file will be mapped in all processes at the same address (mostly the
  case for non-thread builds) this is fine, but in the cases when the
  processes map the file at different addresses this addresses cause big
  trouble in other processes than the one that allocted the locks. 
  Therefore the send lock addresses have to be recalculated to match
  the local mapping of the processes that use them.

This commit was SVN r15291.
2007-07-05 14:26:32 +00:00
Brian Barrett
41afd4ebee Clean up the MX configure test a bit. Use AC macros instead of hand
writing them.  Better tests, less code, and caching.  Update the code
to match changes in configure defines.

This commit was SVN r15287.
2007-07-04 22:07:30 +00:00
George Bosilca
dfa5ae34e1 Per a discussion with Kees Verstoep and Reese Faucette add one more
argument to the query for the line speed. This function is still not
documented, and it really look strange that we have to respecify the
nic_id (it's already attached to the endpoint).

This commit was SVN r15241.
2007-06-28 20:58:00 +00:00
Brian Barrett
f8fb1e9720 Fix some compile failures on Solaris 9 because it doesn't have V6ONLY.
This commit was SVN r15237.
2007-06-28 18:52:15 +00:00
George Bosilca
aec0b00f29 Get some hints about the network and propagate them to the upper level.
This commit was SVN r15236.
2007-06-28 18:51:48 +00:00
Sven Stork
428f697542 - addition to r15198. Update also the prepare destintation functions.
This commit was SVN r15199.

The following SVN revision numbers were found above:
  r15198 --> open-mpi/ompi@f63dd902cb
2007-06-26 12:07:30 +00:00
Sven Stork
f63dd902cb - bring the order changes of r14768 also to the mvapi btl
This commit was SVN r15198.

The following SVN revision numbers were found above:
  r14768 --> open-mpi/ompi@3401bd2b07
2007-06-26 09:34:44 +00:00
Jeff Squyres
022bd30558 Back out r15158 because it apparently breaks with recent versions of
flex (which, incidentally, emit ''more'' warnings than earlier
versions).  Grumble.

This commit was SVN r15166.

The following SVN revision numbers were found above:
  r15158 --> open-mpi/ompi@57d09c10f7
2007-06-21 21:14:10 +00:00
Jeff Squyres
57d09c10f7 Avoid some compiler warnings that come up ''every day'' in MTT (and
have been for eons): make a symbol be used in a dumb but harmless way.

This commit was SVN r15158.
2007-06-21 15:42:06 +00:00
Gleb Natapov
b88b7dedfe Rename btl_rdma_offset to btl_pipeline_send_length.
This commit was SVN r15153.
2007-06-21 07:12:40 +00:00
Jeff Squyres
84487f5c4b Update and correct the help messages for the generic BTL MCA
parameters.  Hopefully, they now make more sense to the mostly naieve
user...

This commit was SVN r15147.
2007-06-20 16:37:50 +00:00
Jeff Squyres
930a9b7682 Make the help messages for if_include/if_exclude a little better.
This commit was SVN r15134.
2007-06-19 13:38:58 +00:00
Gleb Natapov
643037907f Convert all #ifdef OMPI_ENABLE_DEBUG to #if.
This commit was SVN r15117.
2007-06-17 07:14:47 +00:00
George Bosilca
ceb8abe9c1 OMPI_ENABLE_DEBUG require an #if not an #ifdef
This commit was SVN r15107.
2007-06-15 19:22:19 +00:00
Josh Hursey
6cdfefad87 Fix portals BTL and cnos RML.
Both were failing due to interface changes that were never 
applied to them properly.

This commit was SVN r15082.
2007-06-14 18:49:41 +00:00
Jeff Squyres
2399b9a535 Ensure to initialize the variable so that we don't segv.
This commit was SVN r15078.
2007-06-14 13:59:28 +00:00
Gleb Natapov
7b9ae49fe1 This time correctly calculate local BTL rank among all BTLs in a subnet.
This commit was SVN r15073.
2007-06-14 10:27:11 +00:00
Jeff Squyres
1e18265c16 Bring over the functionality from the /tmp/jnysal-openib-wireup
branch:

 * Support btl_openib_if_include and btl_openib_if_exclude MCA
   parameters, similar to those supported by other BTLs.  Each take a
   comma-delimited lists of identifiers.  Identifiers can be HCA
   interface names (e.g., ipath0, mthca1, etc.)  or an HCA interface
   name and port numbers (e.g., ipath0:1, mthca1:2, etc.).  It is an
   error to specify both _include and _exclude.  If you specify a
   non-existant (or non-ACTIVE) HCA and/or port, you'll get a warning
   unless you disable the warning by setting the MCA parameter
   btl_openib_warn_nonexistent_if to 0.
 * Start updating to use BEGIN_C_DECLS and END_C_DECLS
 * A few other minor fixes that were picked up along the way.

This commit was SVN r15063.
2007-06-14 01:59:25 +00:00
Gleb Natapov
8164723014 Allow to configure bandwidth and latency with finer granularity.
Set bandwidth for all ports of mthca0:
--mca btl_openib_bandwidth_mthca0 1000

Set bandwidth for port 1 of mthca1:
--mca btl_openib_bandwidth_mthca1:1 1000

Set latency for port 2 lid 123 on mthca0:
--mca btl_openib_latency_mthca0:2:123 20

This commit was SVN r15041.
2007-06-13 12:47:38 +00:00
Gleb Natapov
5c3f511451 Properly determine btl's rank among all btls withing the same subnet.
This commit was SVN r15038.
2007-06-13 11:15:58 +00:00
Brian Barrett
27ad954265 Fix a couple of problems with the way we were using orte_process_name_t
structures in the system.  Instead of using memcmp, use the ns function.
This won't cause a problem as long as all three elements of the name are
ints, but if they have different sizes, alignment and padding rules
can cause memcmp() to compare padding space, which rarely holds a sane
value.

This commit was SVN r14998.
2007-06-11 19:12:11 +00:00
George Bosilca
e2dd0a50fc A better version alowing for multi-rails or clusters of clusters. A lot of cleanups.
This commit was SVN r14963.
2007-06-08 20:37:20 +00:00
George Bosilca
c66cf32ee2 Cleaning up. Removing all unused variables and fields in the MX BTL and
component structures.

This commit was SVN r14957.
2007-06-07 21:02:18 +00:00
Tim Prins
06bf4c3f3b fix some printf warnings
This commit was SVN r14934.
2007-06-06 22:37:26 +00:00
George Bosilca
6a5e039466 Allow smart connection to be setup. Each peer now has attached to it thea unique
id based on the last half of the mapper MAC. This allow us to figure out how
to connect peers. This allow the MX BTL to be used in a cluster of cluster 
configuration where each cluster have MX internally as well as on a multi
rail MX system.

This commit was SVN r14932.
2007-06-06 21:42:11 +00:00
Galen Shipman
5340f5e320 Try to cleanup the flow control logic a bit
Renamed a few variables 
Inialize the reserve receive buffers to 1, prior to this they were initialized
to zero. 

This commit was SVN r14919.
2007-06-06 18:51:09 +00:00
Gleb Natapov
de58336c45 Let rdma_pipeline_offset to be set to zero.
This commit was SVN r14900.
2007-06-06 11:54:25 +00:00
Donald Kerr
8ecbc71ed2 add support for connection private data, off by default
This commit was SVN r14878.
2007-06-05 19:29:50 +00:00
Gleb Natapov
ac1e8f81af Lets be real. TCP latency is slightly worse then mx/openib.
This commit was SVN r14865.
2007-06-05 12:22:57 +00:00
Gleb Natapov
fbd033b162 Cut&Paste error in r14795. Fix.
This commit was SVN r14862.

The following SVN revision numbers were found above:
  r14795 --> open-mpi/ompi@6b0d8c0858
2007-06-05 10:07:06 +00:00
Brian Barrett
508da4e959 OS X apparently really doesn't like shared libraries with unresolvable
symbols in them and environ is defined only in the final application
(probably in crt1.o).  Apple provides a function for getting at the
environment, so use that instead if it's available.

This commit was SVN r14857.
2007-06-05 03:03:59 +00:00
Brian Barrett
a446af5b6b * Remove unneeded SRQ test -- we no longer support OFED builds that don't
have the SRQ interface.
  * Instead of setting AC_DEFINEs per MCA component, set per test.  THe
    answers can never be difference, and this will speed sed just a teeny
    bit

This commit was SVN r14856.
2007-06-05 01:49:26 +00:00
Gleb Natapov
6b0d8c0858 TCP BTL ignores btl_tcp_bandwidth parameter. Fix it.
This commit was SVN r14795.
2007-05-30 14:12:05 +00:00
Donald Kerr
91c9b7b6f9 don't call dat_evd_resize if new value is less than or equal to current because ofed stack does not return DAT_INVALID_STATE
This commit was SVN r14792.
2007-05-29 20:08:16 +00:00
Gleb Natapov
f191834e56 No need for MCA_BTL_FLAGS_NEED_ACK any more. As of commit r14768 this is the
default behaviour.

This commit was SVN r14782.

The following SVN revision numbers were found above:
  r14768 --> open-mpi/ompi@3401bd2b07
2007-05-27 11:25:39 +00:00
Gleb Natapov
444762456e Don't dereference NULL pointer. Fix bug introduced in r14768.
This commit was SVN r14781.

The following SVN revision numbers were found above:
  r14768 --> open-mpi/ompi@3401bd2b07
2007-05-27 09:24:56 +00:00
Galen Shipman
3401bd2b07 Add optional ordering to the BTL interface.
This is required to tighten up the BTL semantics. Ordering is not guaranteed,
but, if the BTL returns a order tag in a descriptor (other than
MCA_BTL_NO_ORDER) then we may request another descriptor that will obey
ordering w.r.t. to the other descriptor.


This will allow sane behavior for RDMA networks, where local completion of an
RDMA operation on the active side does not imply remote completion on the
passive side. If we send a FIN message after local completion and the FIN is
not ordered w.r.t. the RDMA operation then badness may occur as the passive
side may now try to deregister the memory and the RDMA operation may still be
pending on the passive side. 

Note that this has no impact on networks that don't suffer from this
limitation as the ORDER tag can simply always be specified as
MCA_BTL_NO_ORDER.

This commit was SVN r14768.
2007-05-24 19:51:26 +00:00
Jeff Squyres
81df632e29 Clarification to MCA parameter help messages
This commit was SVN r14765.
2007-05-24 19:18:29 +00:00
George Bosilca
7459ab45f1 This is the complete commit for the TCP header issue. Jeff commit a partial
fix (r14749) and then backed it out (r14753).

As we are unable to send more than a 32 bits length over TCP in one go, there
is no reason to have an uint64 length in the header. This reduce the size
of the TCP header.

This commit was SVN r14755.

The following SVN revision numbers were found above:
  r14749 --> open-mpi/ompi@48c026ce6b
  r14753 --> open-mpi/ompi@28ed850b4c
2007-05-24 16:40:49 +00:00
Jeff Squyres
28ed850b4c Back out r14749; it wasn't quite ready for prime time yet...
This commit was SVN r14753.

The following SVN revision numbers were found above:
  r14749 --> open-mpi/ompi@48c026ce6b
2007-05-24 15:46:15 +00:00
Jeff Squyres
48c026ce6b Commit a patch from George (reviewed by Brian): reduce the size of the
mca_btl_tcp_hdr_t struct and remove the need for the heterogeneous
padding by changing the type of the "size" member to be uint32_t
(vs. uint64_t).  The value would never be greater than 32 bits anyway,
so having the type be uint64_t was wasteful.

This commit was SVN r14749.
2007-05-24 15:08:57 +00:00
Brian Barrett
075389f67d fix some printf warnings
This commit was SVN r14740.
2007-05-23 21:19:26 +00:00
George Bosilca
b2e805db61 Nothing relevant. Indentation, typos, change PTL to BTL.
This commit was SVN r14727.
2007-05-23 14:03:52 +00:00
Pavel Shamis
5ceaa605d7 Adding new vendor_part_id for Mellanox Hermon HCA
This commit was SVN r14705.
2007-05-21 13:33:54 +00:00
Donald Kerr
23280bd7da remove an assignment which is not required
This commit was SVN r14692.
2007-05-18 01:33:02 +00:00
Donald Kerr
588d5bd6a9 clean up compile warnings
This commit was SVN r14691.
2007-05-17 23:37:47 +00:00
George Bosilca
7738079ab9 Remove unused variable.
This commit was SVN r14689.
2007-05-17 20:01:30 +00:00
Gleb Natapov
b2c8fcdbab Forget to add file in r14681.
This commit was SVN r14682.

The following SVN revision numbers were found above:
  r14681 --> open-mpi/ompi@3ebaff8dfe
2007-05-17 08:41:01 +00:00
Gleb Natapov
3ebaff8dfe Implement new BTL parameters:
We eagerly send data up to btl_*_eager_limit with the match
Upon ACK of the MATCH we start using send/receives of size
btl_*_max_send_size up to the btl_*_rdma_pipeline_offset
After the btl_*_rdma_pipeline_offset we begin using RDMA writes of
size btl_*_rdma_pipeline_frag_size.

Now, on a per message basis we only use the above protocol if the
message is larger than btl_*_min_rdma_pipeline_size

btl_*_eager_limit - > same
btl_*_max_send_size -> same
btl_*_rdma_pipeline_offset -> btl_*_min_rdma_size
btl_*_rdma_pipeline_frag_size -> btl_*_max_rdma_size


btl_*_min_rdma_pipeline_size is new..

This patch also moves all BTL common parameters initialisation into
btl_base_mca.c file.

This commit was SVN r14681.
2007-05-17 07:54:27 +00:00
Brian Barrett
33a5758521 Some IPv6 improvements:
* Move ipv6comat.h code into opal_config_bottom.h and change into some
    more intelligent testing of structures
  * Change opal's if interface to use sockaddr instead of sockaddr_storage,
    as the RFCs suggest we do
  * Move the networking code in opal that isn't directly related to if
    detection into net.h
  * Add quicky function to get the port out of either a sockaddr_in
    or sockaddr_in6, saving a bunch of code in the oob.
  * Update TCP oob and btl with new interface

This commit was SVN r14679.
2007-05-17 01:17:59 +00:00
Donald Kerr
c40307fd27 add user warning message to inform when udapl btl is no longer able to register memory
This commit was SVN r14678.
2007-05-16 21:04:50 +00:00
Brian Barrett
7708c4f887 Don't complain about unsupported protocols. Needs to be made better,
but this will quit the whining from platforms where the kernel doesn't
have IPv6 support.

This commit was SVN r14676.
2007-05-16 20:11:47 +00:00
Sven Stork
22af6d38e6 - UNexport symbols that shouldn't be needed outside the libraries
- replace #if/#endif with BEGIN/END_C_DECLS
- reformating

This commit was SVN r14669.
2007-05-16 15:46:52 +00:00
Gleb Natapov
61e889a1d9 Fix breakage of GM by r13921. On receive GM provides only buffer pointer
without any context so we need to save a context somewhere so it can be
retrieved given only buffer pointer. This patch saves context (pointer to
frag) just before start of a buffer so it can be be easily retrieved. 

This commit was SVN r14664.

The following SVN revision numbers were found above:
  r13921 --> open-mpi/ompi@90fb58de4f
2007-05-16 12:20:58 +00:00
Donald Kerr
2ed72bf2e2 break evd_qlen into individual qlens (async,dto,conn); add checks based on udapl limits and number of peers
This commit was SVN r14659.
2007-05-15 17:47:00 +00:00
Pavel Shamis
cd87b05711 Added check for IBV_EVENT_CLIENT_REREGISTER async
event that was not exists in old openib gen2 versions
(Ticket #1025)

This commit was SVN r14658.
2007-05-15 13:53:49 +00:00
Brian Barrett
21e00f6f0c Clean up a couple of configure things:
* Require Autoconf 2.60 or higher and remove some cruft
    required for AC 2.59 or the AC 2.59 / AC 2.60 mix
  * Remove a bunch of now unnecessary AC_SUBST calls
  * Use the libtool-provided variables for the -I and
    library to use when compiling against ltdl

Fixes trac:1000

This commit was SVN r14652.

The following Trac tickets were found above:
  Ticket 1000 --> https://svn.open-mpi.org/trac/ompi/ticket/1000
2007-05-15 04:23:48 +00:00
Jeff Squyres
92090967b1 Add definitions for Hemon/ConnectX Mellanox HCA
This commit was SVN r14639.
2007-05-10 12:27:51 +00:00
Donald Kerr
436d370d51 latency improvements: use ompi_free_list_init_ex, create optimal alignment parameter, remove rdma guarantee path, replace dat_lmt_sync_rdma with use of volatile
This commit was SVN r14634.
2007-05-09 19:41:25 +00:00
Pavel Shamis
e2d0e27111 Adding:
* openib_finalize flow for openib btl
* async event handler for openib btl

This commit was SVN r14623.
2007-05-08 21:47:21 +00:00
Terry Dontje
f864348f97 Put an ifdef to conditionalize the use of memcpy for sparcv9 platforms to
avoid alignmment issues.  This commit fixes trac:1009.

This commit was SVN r14608.

The following Trac tickets were found above:
  Ticket 1009 --> https://svn.open-mpi.org/trac/ompi/ticket/1009
2007-05-08 17:17:34 +00:00
Jeff Squyres
ecf5a3b8dd Fix compiler warning
This commit was SVN r14604.
2007-05-08 13:12:50 +00:00
Adrian Knoth
d63d125a88 I guess we only need this when IPv6 is enabled.
This commit was SVN r14551.
2007-04-29 16:38:34 +00:00
Adrian Knoth
5765ecc22e This patch reverts r14549 while retaining IPv6 support.
Re #1008

This commit was SVN r14550.

The following SVN revision numbers were found above:
  r14549 --> open-mpi/ompi@386baed55b
2007-04-29 16:23:11 +00:00
Adrian Knoth
386baed55b Hotfix for IPv6 support. Closes trac:1008
This commit was SVN r14549.

The following Trac tickets were found above:
  Ticket 1008 --> https://svn.open-mpi.org/trac/ompi/ticket/1008
2007-04-29 13:46:45 +00:00
George Bosilca
46265db0a9 Update the TCP BTL in order to bring back some of the functionalities lost
during the IPv6 patch. The most important is the multi BTL support. There
was a quite interesting bug. Instead of setting up the multiple connections
over different physical devices, based on the time when these connections
were created most of the time they were all using the same physical network.
Which, of course, was not the intended goal, as we top at the maximum
bandwidth available over one device instead of gathering all available
bandwidth from all devices.

Second, the IPv6 RFC suggest to use sockaddr_storage as a holder for the
IP information, but use a sockaddr* when we pass it to functions. This is
only partially corrected by this patch.

Some other minor cleanups.

This commit was SVN r14544.
2007-04-28 19:13:47 +00:00
Sven Stork
8d92773067 - export required symbol
This commit was SVN r14536.
2007-04-27 11:38:45 +00:00
Rainer Keller
1aceece03f - Add a few comments for elements for structs, a few spelling fixes.
No functional change.

This commit was SVN r14534.
2007-04-26 21:03:38 +00:00
Rainer Keller
ce32b918da - Fixes for for unlocking the mutex in case of error in functions
mca_btl_openib_post_srr and
     btl_openib_endpoint_post_rr

This commit was SVN r14530.
2007-04-26 13:33:02 +00:00
Adrian Knoth
e3d35258b4 Cosmetics. Brian fixes my crappy code and I fix the curly braces.
That's teamwork, right? ;)

This commit was SVN r14517.
2007-04-25 20:17:19 +00:00
Brian Barrett
4b8bb70afb A couple cleanups for the IPv6 support:
- make opal_sockaddr2str() take a sockaddr_storage instead of a sockaddr_in6
    so that it works for IPv4 and IPv6 addresses, and remove a whole bunch
    of #ifs in the OOOB code.
  - Fix a compiler warning in the TCP BTL due to run-time determined
    array size by making it a dynamicly allocated array.
  - Fix the unpacking code of IPv4 addresses when using IPv6 support, so
    that the address is in the correct location (instead of in an IPv6
    structure, use an IPv4 structure).  Refs trac:1005.

This commit was SVN r14514.

The following Trac tickets were found above:
  Ticket 1005 --> https://svn.open-mpi.org/trac/ompi/ticket/1005
2007-04-25 19:08:07 +00:00
Adrian Knoth
d1ce39de4f Move mca_btl_tcp_addr_isipv4public to opal_addr_isipv4public
This commit was SVN r14512.
2007-04-25 18:06:06 +00:00
Donald Kerr
80d984441f change so that we only check connection queue when expecting a connection; create a mca parameter that controls frequency at which the async queue is checked
This commit was SVN r14511.
2007-04-25 17:46:25 +00:00
Jeff Squyres
c4c68e666a Merge in the ipv6 work from /tmp/ipv6-merge.
This commit was SVN r14503.
2007-04-25 01:55:40 +00:00
Donald Kerr
cae24fcde1 move mca parameter registration into own .c and .h files
This commit was SVN r14493.
2007-04-24 18:34:16 +00:00
Donald Kerr
3f428af7b8 couple of minor changes to fix #973 and seperated eager rdma fragments into structure only and data only area
This commit was SVN r14470.
2007-04-23 17:41:34 +00:00
Sharon Melamed
cf3f41288b Add pkey value MCA parameter. if this param is used,
only ports with the actual pkey value will be initiate.

This commit was SVN r14463.
2007-04-22 10:22:12 +00:00
Jeff Squyres
0ba47105ed Merge the /tmp/jms-installdirs-trunk branch into the trunk. This
finally brings in functionality that is already on the 1.2 branch, and
was developed and tested in the v1.2ofed branch (and other places).

Short version of new features:

 * Support for ibv_fork_init() 
 * Automatically fill in the openib BTL bandwidth value by 
   querying the HCA port 
 * Installdirs functionality 
 * Fixes to always use -I in the Fortran wrapper compilers (#924) 
 * Gleb's mpool updates 
 * Remove some kruft in btl/openib/configure.m4, therefore 
   fixing the harmless warnings noted in #665 
 * Bunches of updates to the Linux RPM spec file 

I.e., effectively the same thing that r14411 brought to the v1.2
branch.

Also effectively brought in r14432 and r14433 (some fixes on top of
the original r14411 commit to v1.2).  Still need to bring in the moral
equivalent of r14445 after this commit (fixes to installdirs).

This commit was SVN r14449.

The following SVN revision numbers were found above:
  r14411 --> open-mpi/ompi@83b31314ae
  r14432 --> open-mpi/ompi@a48f160595
  r14433 --> open-mpi/ompi@68f346d2bc
  r14445 --> open-mpi/ompi@13d366b827
2007-04-21 00:15:05 +00:00
Josh Hursey
8f119d9063 Closes trac:977
Fix for memory corruption in the restarted process stack. This stemed from 
the brute force method we were previously using. This commit fixes this by
using a lighter weight solution focused in the r2 BML instead of above the PML.
This is a more efficient and flexible solution, and it solves the original
problem.

In the process I pulled out the ft_event function in the tcp BTL and r2 BML
into a set of *_ft.[c|h] files just to keep any updates to these code paths
as isolated as possible to make merging easier on everyone.

This commit was SVN r14371.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855

The following Trac tickets were found above:
  Ticket 977 --> https://svn.open-mpi.org/trac/ompi/ticket/977
2007-04-14 02:06:05 +00:00
Jeff Squyres
51f286d737 Just like r14289 on the ORTE trunk:
Per discussions with Brian and Ralph, make a slight correction in
where components are installed. Use $pkglibdir, not $libdir/openmpi,
so that when compiled in the orte trunk, components are installed to
the right directory (because the component search patch is checking
$pkglibdir).

This commit was SVN r14345.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r14289
2007-04-12 11:19:42 +00:00
Gleb Natapov
d41ca417e8 Delete declaration of non-existent functions and no longer relevant comment.
This commit was SVN r14341.
2007-04-12 08:12:31 +00:00
George Bosilca
20f0ec584a A tricky optimization. On my test machine it improve the bandwidth by about 3Mb/s out of 580Mb/s. But
the real interest is for small to middle size unexpected messages. The unexpected messages are copied
by the PML in it's own unexpected buffers. Therefore, there is no reason to make a first copy in the
TCP BTL. The BTL can handle to the PML it's own buffer, and can be sure that once the callback
completed it can reuse the buffer, no matter what happened with the fragment.

This commit was SVN r14320.
2007-04-12 04:52:29 +00:00
Li-Ta Lo
ec8a859a44 fixed typo
This commit was SVN r14207.
2007-04-03 17:21:54 +00:00
George Bosilca
667bda0fef Rework the code a little bit to make things simpler.
This commit was SVN r14203.
2007-04-03 16:05:51 +00:00
George Bosilca
120cf76ad8 Remove some warnings.
This commit was SVN r14196.
2007-04-02 19:11:06 +00:00
George Bosilca
8273c5eeba Correct an error introduced by commit r14180.
This commit was SVN r14191.

The following SVN revision numbers were found above:
  r14180 --> open-mpi/ompi@1cb26e3b9c
2007-04-02 02:59:23 +00:00
Tim Prins
80e047b843 make the mx btl compile again...
This commit was SVN r14183.
2007-04-01 02:49:23 +00:00
George Bosilca
1cb26e3b9c Finally the convertor export a convenience function to allow a consistent
computation of the current location on the pack/unpack process. This can
be used both for retrieving the pointer to the first byte (in the special
case of the cached RDMA protocol) and for getting the current
position (for the pipelined protocol).

I modified all BTLs, but most of them are still untested.

This commit was SVN r14180.
2007-03-30 22:02:45 +00:00
Gleb Natapov
e5450613b5 Add new SM BTL parameter btl_sm_cb_max_num. If set to value greater then zero
it limits the number of circular buffers allocated between each pair of peers.
This allows for more tight memory usage control.

This commit was SVN r14120.
2007-03-22 12:21:42 +00:00
Gleb Natapov
efe0323d35 Initialize fifos at SM BTL init time instead of waiting for first send. This
waist slightly more memory, but prevents problem when fifo cannot be allocated
later during a job run when memory resource is exhausted.

This commit was SVN r14119.
2007-03-22 12:18:44 +00:00
Gleb Natapov
c389c47d79 Fix SM connectivity calculations.
This commit was SVN r14109.
2007-03-21 13:29:19 +00:00
Gleb Natapov
a1a14aa4c3 Add memory barriers during SM btl initialization.
This commit was SVN r14099.
2007-03-21 10:25:10 +00:00
Gleb Natapov
435565590f Don't relay on opcode to decide how to progress pending message.
This commit was SVN r14098.
2007-03-21 07:59:59 +00:00
Brian Barrett
464d536928 remove debugging printf
This commit was SVN r14088.
2007-03-20 21:28:28 +00:00
George Bosilca
8c9e4baa47 Add multi-link capabilities to the TCP BTL. This is useful for systems where the
latency is high and the network relatively fast. This will allow for more kernel
level buffering, which allow overlap between system calls and communications.
Somehow, even on fast clusters there is an improvement (non significant).

This patch create multiple modules for the same device, which in turn will
create multiple sockets between the peers. By default the number of BTL by
device is set to 1, so there is no fundamental difference with the current
version. Change the value of btl_tcp_links to enable multiple links between
peers.

This commit was SVN r14076.
2007-03-20 11:50:17 +00:00
George Bosilca
4332295b32 Typos.
This commit was SVN r14074.
2007-03-20 11:18:05 +00:00
Gleb Natapov
e551c5f1a3 Get rid of separate sm BTL for different shared memory base addresses. Now,
when we precalculate most of the addresses there is no point to have separate
BTL for this. The sm_progress() code become much more simple as a result.

This commit was SVN r14071.
2007-03-20 08:15:58 +00:00
Josh Hursey
e1a18fa149 Patch from Gleb
Always set opcode appropriately before calling ibv_post_send.

This commit was SVN r14056.
2007-03-18 13:33:15 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Gleb Natapov
1dc1ee3998 Send control credit message over "eager rdma" channel if possible.
This commit was SVN r14032.
2007-03-14 14:38:56 +00:00
Gleb Natapov
1f3ac2d7ae Hold pointers to free_max/free_eager lists in array indexed by priority.
This eliminates couple of ifs from fast path.

This commit was SVN r14031.
2007-03-14 14:36:03 +00:00
Gleb Natapov
8607957df9 Get rid of remaining _hp/_lp stuff. Consolidate HP/LP QP creation code.
This commit was SVN r14030.
2007-03-14 14:33:24 +00:00
Galen Shipman
8253d83410 make btl template compile again
This commit was SVN r13990.
2007-03-08 21:58:26 +00:00
Galen Shipman
67ba5264f6 ORTE_NAME_ARGS casts to long, not unsigned long.
This commit was SVN r13988.
2007-03-08 21:42:29 +00:00
Galen Shipman
8072dd344c use %ld instead of %d as ORTE_NAME_ARGS does casting to long not unsigned long
This commit was SVN r13987.
2007-03-08 21:41:39 +00:00
Bill D'Amico
53d434d6ab Fix warnings when building with UDAPL - minor formatting errors.
This commit was SVN r13971.
2007-03-08 18:39:40 +00:00
Gleb Natapov
40501f8274 Amend IB parameter checking.
This commit was SVN r13936.
2007-03-06 13:05:12 +00:00
Gleb Natapov
be018944d2 Clean up circular buffer implementation. Get rid of _same_base_address()
functions by pre-calculating everything in advance.

This commit was SVN r13923.
2007-03-05 14:27:26 +00:00
Gleb Natapov
8078ae5977 Optimize sm communication. Pass message type (MCA_BTL_SM_FRAG_ACK/
MCA_BTL_SM_FRAG_SEND) and status success/fail in low bits of pointers we
are passing through circular buffer. The rank that receives ACK doesn't need
to look into data it received and this is a big win since this data is not in
the cache of the rank's CPU. (Note that we can use low bits of pointers because
free_list always return pointers aligned at least to cache line size).

This commit was SVN r13922.
2007-03-05 14:24:09 +00:00
Gleb Natapov
90fb58de4f When frags are allocated from mpool by free_list the frag structure is also
allocated from mpool memory (which is registered memory for RDMA transports)
This is not a problem for a small jobs, but for a big number of ranks an
amount of waisted memory is big.

This commit was SVN r13921.
2007-03-05 14:17:50 +00:00
Rich Graham
e932d9a695 macro variable has same name as one of the parameters passed to the
macro.
Typo - most likely cut and paste error.

This commit was SVN r13918.
2007-03-04 23:31:07 +00:00
Josh Hursey
0404444dbe * Added 2 new MCA parameters
- mca_base_param_file_prefix
     (Default: NULL)
     This is the fullname of the "-am" mpirun option. Used to specify a ':'
     separated list of AMCA parameter set files.
  - mca_base_param_file_path
     (Default: $SYSCONFDIR/amca-param-sets/:$CWD)
     The path to search for AMCA files with relative paths. A warning will be
     printed if the AMCA file cannot be found.

* Added a new function "mca_base_param_recache_files" the re-reads the file
configurations. This is used internally to help bootstrap the MCA system.

* Added a new orterun/mpirun command line option '-am' that aliases for the
mca_base_param_file_prefix MCA parameter

* Exposed the opal_path_access function as it is generally useful in other
places in the code.

* New function "opal_cmd_line_make_opt_mca" which will allow you to append a
new command line option with MCA parameter identifiers to set at the same
time. Previously this could only be done at command line declaration time.

* Added a new directory under the $pkgdatadir named "amca-param-sets" where all
the 'shipped with' Open MPI AMCA parameter sets are placed. This is the first
place to search for AMCA sets with relative paths.

* An example.conf AMCA parameter set file is located in
contrib/amca-param-sets/.

* Jeff Squyres contributed an OpenIB AMCA set for benchmarking.

Note: You will need to autogen with this commit as it adds a configure param.
  Sorry :(

This commit was SVN r13867.
2007-03-01 13:39:20 +00:00
Gleb Natapov
2b6cbd6299 Separate frag lists for RDMA descriptors to two, one for src descriptors
and another for dst descriptors. This provide partial solution to OB1 protocol
deadlock problem. We can limit number of RDMA descriptors (by setting
btl_openib_free_list_max to something different from -1) and if we will be
lucky to hit this limit before we fail to register more memory the protocol
will not deadlock. When we had only one list for src/dst descriptors we
deadlocked when we reached max limit for the list.

This commit was SVN r13844.
2007-02-28 13:43:38 +00:00
Sven Stork
d8a369936e - Fix more symbols that should be exported.
This commit was SVN r13824.
2007-02-27 15:17:17 +00:00
Josh Hursey
c573171b7d Mostly a cleanup commit.
- Implement the BML/r2 finialize funciton
- Cleanup the btl close routine
- Wire up a pml_base_verbose MCA parameter so you can actually watch the PML selection logic if you really want to.
- Fix a potental segfault in the selection logic.
  ompi_pointer_array_get_item() may return NULL, so we have to check for it

This commit was SVN r13734.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2007-02-21 16:18:43 +00:00
Jeff Squyres
f820e44112 Remove a gcc-ism from the code (defining an anonymous union in the
middle of a struct).  Now we properly define and name the union
outside the struct and simply create an instance of it inside the
struct. 

This commit was SVN r13709.
2007-02-19 18:21:57 +00:00
George Bosilca
020b8ade70 A slightly better fix for the data mismatch compiler complaints.
This commit was SVN r13695.
2007-02-17 05:23:57 +00:00
George Bosilca
04138c23af No more warnings.
This commit was SVN r13683.
2007-02-16 16:25:58 +00:00
Pavel Shamis
edeab0e912 Adding Mellanox Technologies copyright to files touched by Mellanox.
This commit was SVN r13669.
2007-02-15 18:03:20 +00:00