openmpi

Author	SHA1	Message	Date
Gleb Natapov	32a61c3bf2	Credit fragment is not protected properly from concurrent access. There is a race that can prevent further explicit credit updates from being sent. Fix the race. This commit was SVN r15965.	2007-08-27 11:34:59 +00:00
Gleb Natapov	becf4aa9c9	ompi_pointer_array_get_size doesn't return how many elements are actually in an array, so count them ourselves. This commit was SVN r15943.	2007-08-22 09:31:12 +00:00
Gleb Natapov	d8f3063895	Create only one CQ for all BTLs on the same HCA. Many BTLs can be created for one HCA. Multiple ports, LMC, multiple BTLs per one LID. Having only one CQ for all of them substantially reduces polling time. This commit was SVN r15933.	2007-08-20 12:28:25 +00:00
Jeff Squyres	50bae9c603	Bring in the modular-wireup stuff for the openib BTL (from /tmp/jms-modular-wireup branch): * This commit moves all the openib BTL connection code out of btl_openib_endpoint.c and into a connect "pseudo-component" area, meaning that different schemes for doing OFA connection schemes can be chosen via function pointer (i.e., MCA parameter) at run-time. * The connect/connect.h file includes comments describing the specific interface for the connect pseudo-component. * Two pseudo-components are in this commit (more can certainly be added). * oob: use the same old oob/rml scheme for creating OFA connections that we've had forever; this now just puts the logic into this self-contained pseudo-component. * rdma_cm: a currently-empty set of functions (that currently return NOT_IMPLEMENTED) that will someday use the RDMA connection manager to make OFA connections. This commit was SVN r15786.	2007-08-06 23:40:35 +00:00
Jeff Squyres	0fb8cf65a8	If you have an HCA with no active ports, we still create an mpool. This mpool will have no btl module owner because there was no btl created for the HCA with no ports, but it will still be tracked in the mpool framework (i.e., it's available). If MPI_ALLOC_MEM is called by the app, one of two things will happen: 1. if there's an HCA on the host with some active ports, the openib btl component will still be in the process space, and therefore the "mpool with no btl" (MWNB) module will still be able to call the reg/dereg functions, and all will be fine. However, if MPI_FREE_MEM is never invoked to free the memory, bad things will happen during MPI_FINALIZE. The pml is finalized, which finalizes all the btls. The btls finalize all their mpools and all is fine. But later we close down the mpool framework which then finalizes any left over mpool modules, such as MWNB. However, the openib BTL module functions that the MWNB was registered with are no longer in the process space, and it segv's while trying to deregister the memory. 2. if there are no HCA's on the host with active ports, then the openib btl will have been unloaded, and when the MWNB tries to register the memory, the functions it tries to call (in the openib btl) are no longer there, and we segv. This commit was SVN r15735.	2007-08-01 20:53:34 +00:00
Gleb Natapov	cce6bb478c	Process message before reposting buffers. This way rd_posted should be calculated properly. This commit was SVN r15635.	2007-07-26 13:56:07 +00:00
Pavel Shamis	bda6f1a5cf	Fixing compilation problem in openib btl progress thread. This commit was SVN r15631.	2007-07-26 11:35:15 +00:00
Jeff Squyres	e36038bb17	We know that --enable-progress-threads doesn't work. But this allows it to at least compile. If you actually get to the point of invoking the openib btl progress thread, you'll get a big opal_output warning that it is pretty much guaranteed not to work. This commit was SVN r15628.	2007-07-26 00:58:56 +00:00
Galen Shipman	438a56e0d7	update copyrights for ib_multifrag commit This commit was SVN r15612.	2007-07-25 15:03:34 +00:00
Galen Shipman	325c184fb4	remove debugging "abort()" fix a debugging assert This commit was SVN r15611.	2007-07-25 14:51:19 +00:00
Gleb Natapov	5b7d3faedc	Implement "credit management for credit messages" protocol. On each message a sender piggybacks a number of credit messages it received from a peer. A number of outstanding credit messages is limited. This is needed to never ever fall back to HW flow control. This commit was SVN r15580.	2007-07-24 15:19:51 +00:00
Gleb Natapov	45a7a0650b	btl_openib_handle_incoming() is called from regular receive path and from eager RDMA receive path and checks internally from where it was called from to perform different tasks. Leave only common code in there and move other code to appropriate places. This commit was SVN r15579.	2007-07-24 13:23:08 +00:00
Gleb Natapov	30b2183314	Remove debug output from a hot path. This commit was SVN r15478.	2007-07-18 12:48:34 +00:00
Jeff Squyres	8ace07efed	This commit brings in two major things: 1. Galen's fine-grain control of queue pair resources in the openib BTL. 2. Pasha's new implementation of asynchronous HCA event handling. Pasha's new implementation doesn't take much explanation, but the new "multifrag" stuff does. Note that "svn merge" was not used to bring this new code from the /tmp/ib_multifrag branch -- something Bad happened in the periodic trunk pulls on that branch making an actual merge back to the trunk effectively impossible (i.e., lots and lots of arbitrary conflicts and artificial changes). :-( == Fine-grain control of queue pair resources == Galen's fine-grain control of queue pair resources to the OpenIB BTL (thanks to Gleb for fixing broken code and providing additional functionality, Pasha for finding broken code, and Jeff for doing all the svn work and regression testing). Prior to this commit, the OpenIB BTL created two queue pairs: one for eager size fragments and one for max send size fragments. When the use of the shared receive queue (SRQ) was specified (via "-mca btl_openib_use_srq 1"), these QPs would use a shared receive queue for receive buffers instead of the default per-peer (PP) receive queues and buffers. One consequence of this design is that receive buffer utilization (the size of the data received as a percentage of the receive buffer used for the data) was quite poor for a number of applications. The new design allows multiple QPs to be specified at runtime. Each QP can be set up to use PP or SRQ receive buffers as well as giving fine-grained control over receive buffer size, number of receive buffers to post, when to replenish the receive queue (low water mark) and for SRQ QPs, the number of outstanding sends can also be specified. 
The following is an example of the syntax to describe QPs to the OpenIB BTL using the new MCA parameter btl_openib_receive_queues: {{{ -mca btl_openib_receive_queues \ "P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32" }}} Each QP description is delimited by ";" (semicolon) with individual fields of the QP description delimited by "," (comma). The above example therefore describes 4 QPs. The first QP is: P,128,16,4 Meaning: per-peer receive buffer QPs are indicated by a starting field of "P"; the first QP (shown above) is therefore a per-peer based QP. The second field indicates the size of the receive buffer in bytes (128 bytes). The third field indicates the number of receive buffers to allocate to the QP (16). The fourth field indicates the low watermark for receive buffers at which time the BTL will repost receive buffers to the QP (4). The second QP is: S,1024,256,128,32 Shared receive queue based QPs are indicated by a starting field of "S"; the second QP (shown above) is therefore a shared receive queue based QP. The second, third and fourth fields are the same as in the per-peer based QP. The fifth field is the number of outstanding sends that are allowed at a given time on the QP (32). This provides a "good enough" mechanism of flow control for some regular communication patterns. QPs MUST be specified in ascending receive buffer size order. This requirement may be removed prior to 1.3 release. This commit was SVN r15474.	2007-07-18 01:15:59 +00:00
Brian Barrett	8b9e8054fd	Move modex from pml base to general ompi runtime, since it's used by more than just the PML/BTLs these days. Also clean up the code so that it handles the situation where not all nodes register information for a given node (rather than just spinning until that node sends information, like we do today). Includes r15234 and r15265 from the /tmp/bwb-modex branch. This commit was SVN r15310. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r15234 r15265	2007-07-09 17:16:34 +00:00
Jeff Squyres	2399b9a535	Ensure to initialize the variable so that we don't segv. This commit was SVN r15078.	2007-06-14 13:59:28 +00:00
Jeff Squyres	1e18265c16	Bring over the functionality from the /tmp/jnysal-openib-wireup branch: * Support btl_openib_if_include and btl_openib_if_exclude MCA parameters, similar to those supported by other BTLs. Each takes a comma-delimited list of identifiers. Identifiers can be HCA interface names (e.g., ipath0, mthca1, etc.) or an HCA interface name and port number (e.g., ipath0:1, mthca1:2, etc.). It is an error to specify both _include and _exclude. If you specify a nonexistent (or non-ACTIVE) HCA and/or port, you'll get a warning unless you disable the warning by setting the MCA parameter btl_openib_warn_nonexistent_if to 0. * Start updating to use BEGIN_C_DECLS and END_C_DECLS * A few other minor fixes that were picked up along the way. This commit was SVN r15063.	2007-06-14 01:59:25 +00:00
Gleb Natapov	8164723014	Allow to configure bandwidth and latency with finer granularity. Set bandwidth for all ports of mthca0: --mca btl_openib_bandwidth_mthca0 1000 Set bandwidth for port 1 of mthca1: --mca btl_openib_bandwidth_mthca1:1 1000 Set latency for port 2 lid 123 on mthca0: --mca btl_openib_latency_mthca0:2:123 20 This commit was SVN r15041.	2007-06-13 12:47:38 +00:00
Galen Shipman	5340f5e320	Try to clean up the flow control logic a bit. Renamed a few variables. Initialize the reserve receive buffers to 1; prior to this they were initialized to zero. This commit was SVN r14919.	2007-06-06 18:51:09 +00:00
Brian Barrett	a446af5b6b	* Remove unneeded SRQ test -- we no longer support OFED builds that don't have the SRQ interface. * Instead of setting AC_DEFINEs per MCA component, set per test. The answers can never be different, and this will speed sed just a teeny bit. This commit was SVN r14856.	2007-06-05 01:49:26 +00:00
Pavel Shamis	cd87b05711	Added check for IBV_EVENT_CLIENT_REREGISTER async event, which did not exist in old openib gen2 versions (Ticket #1025). This commit was SVN r14658.	2007-05-15 13:53:49 +00:00
Pavel Shamis	e2d0e27111	Adding: * openib_finalize flow for openib btl * async event handler for openib btl This commit was SVN r14623.	2007-05-08 21:47:21 +00:00
Rainer Keller	1aceece03f	- Add a few comments for elements for structs, a few spelling fixes. No functional change. This commit was SVN r14534.	2007-04-26 21:03:38 +00:00
Sharon Melamed	cf3f41288b	Add pkey value MCA parameter. If this param is used, only ports with the actual pkey value will be initialized. This commit was SVN r14463.	2007-04-22 10:22:12 +00:00
Jeff Squyres	0ba47105ed	Merge the /tmp/jms-installdirs-trunk branch into the trunk. This finally brings in functionality that is already on the 1.2 branch, and was developed and tested in the v1.2ofed branch (and other places). Short version of new features: * Support for ibv_fork_init() * Automatically fill in the openib BTL bandwidth value by querying the HCA port * Installdirs functionality * Fixes to always use -I in the Fortran wrapper compilers (#924) * Gleb's mpool updates * Remove some kruft in btl/openib/configure.m4, therefore fixing the harmless warnings noted in #665 * Bunches of updates to the Linux RPM spec file I.e., effectively the same thing that r14411 brought to the v1.2 branch. Also effectively brought in r14432 and r14433 (some fixes on top of the original r14411 commit to v1.2). Still need to bring in the moral equivalent of r14445 after this commit (fixes to installdirs). This commit was SVN r14449. The following SVN revision numbers were found above: r14411 --> open-mpi/ompi@83b31314ae r14432 --> open-mpi/ompi@a48f160595 r14433 --> open-mpi/ompi@68f346d2bc r14445 --> open-mpi/ompi@13d366b827	2007-04-21 00:15:05 +00:00
Gleb Natapov	435565590f	Don't rely on opcode to decide how to progress pending message. This commit was SVN r14098.	2007-03-21 07:59:59 +00:00
Josh Hursey	dadca7da88	Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD). This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158 More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912 The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158	2007-03-16 23:11:45 +00:00
Gleb Natapov	1dc1ee3998	Send control credit message over "eager rdma" channel if possible. This commit was SVN r14032.	2007-03-14 14:38:56 +00:00
Gleb Natapov	1f3ac2d7ae	Hold pointers to free_max/free_eager lists in array indexed by priority. This eliminates couple of ifs from fast path. This commit was SVN r14031.	2007-03-14 14:36:03 +00:00
Gleb Natapov	90fb58de4f	When frags are allocated from mpool by free_list the frag structure is also allocated from mpool memory (which is registered memory for RDMA transports). This is not a problem for small jobs, but for a large number of ranks the amount of wasted memory is big. This commit was SVN r13921.	2007-03-05 14:17:50 +00:00
Gleb Natapov	2b6cbd6299	Separate frag lists for RDMA descriptors into two, one for src descriptors and another for dst descriptors. This provides a partial solution to the OB1 protocol deadlock problem. We can limit the number of RDMA descriptors (by setting btl_openib_free_list_max to something different from -1) and if we are lucky enough to hit this limit before we fail to register more memory the protocol will not deadlock. When we had only one list for src/dst descriptors we deadlocked when we reached the max limit for the list. This commit was SVN r13844.	2007-02-28 13:43:38 +00:00
Pavel Shamis	edeab0e912	Adding Mellanox Technologies copyright to files touched by Mellanox. This commit was SVN r13669.	2007-02-15 18:03:20 +00:00
Jeff Squyres	c9fe68c406	Better patch from Gleb to do the per-port (endpoint) specification of whether to use eager RDMA or not This commit was SVN r13262.	2007-01-23 22:40:59 +00:00
Jeff Squyres	3389a523e9	Arrgh. That printf should not have been in there! This commit was SVN r13243.	2007-01-22 18:52:49 +00:00
Jeff Squyres	a24f3c0886	Move the "use eager RDMA" flag to the individual openib BTL modules, not the component. This potentially allows for a mix of HCAs that support eager RDMA and those who do not on a port-by-port basis. This commit was SVN r13242.	2007-01-22 18:49:32 +00:00
Galen Shipman	2097d174f6	heterogeneous fixes to the OpenIB BTL. This includes work by nysal, brian and I. This commit was SVN r13106.	2007-01-12 23:14:45 +00:00
Galen Shipman	df099a4731	call it what it is... we are looking at subnet_id's and we are counting active ports per subnet. move subnet count out of procs loop,, no need to do it there... This commit was SVN r13105.	2007-01-12 22:42:20 +00:00
Gleb Natapov	d3ac56272a	Prevent access to openib_btl after free(). This commit was SVN r13052.	2007-01-09 09:07:32 +00:00
Brian Barrett	48ec0b2071	Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix for now... This commit was SVN r12997. The following SVN revision numbers were found above: r12974 --> open-mpi/ompi@27cea44a9c	2007-01-04 22:07:37 +00:00
Galen Shipman	d207a6c988	endpoint should use a uint64_t for subnet, as everyone else does.. makes bad things happen when packing into a 64 bit buffer... Misc cleanup.. This commit was SVN r12993.	2007-01-04 20:25:28 +00:00
Galen Shipman	f12bbe0591	Handle different subnets correctly and multiple nic endpoint negotiation. This is somewhat limited currently; for example, if you have 3 ports on Node A and 5 ports on Node B then the peers will use 3 ports to communicate with each other. This is on a subnet basis, so for any pair of nodes we take the intersection of the available ports within a subnet. We use subnets to determine reachability for lazy connection establishment. So if Node A and Node B each have two HCA's (on separate networks) then the subnets must be distinct, otherwise we will try to wire up HCA's on separate networks. This commit was SVN r12978.	2007-01-03 22:35:41 +00:00
Brian Barrett	27cea44a9c	Fix a number of issues with the ompi_ptr_t: * Make sure that the pval always writes to the correct portion of the lval. This only matters on 32 bit big endian machines. * On 32 bit machines when assigning to pval, the other 4 bytes of lval weren't being written, which could lead to bogus data We use macros so that there aren't casts all over the code and the pval assignment can occur to the correct 4 bytes. Refs trac:587 This commit was SVN r12974. The following Trac tickets were found above: Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587	2007-01-03 19:47:48 +00:00
Gleb Natapov	484c6a2c1a	Use OPAL_ALIGN() macro to align length. Return address from mpool_alloc is now properly aligned so no need to align it once more. This commit was SVN r12899.	2006-12-19 08:34:48 +00:00
Gleb Natapov	190e7a27cd	Merge with gleb-mpool branch. All RDMA components use the same mpool now (rdma). udapl/openib/vapi/gm mpools are deprecated. rdma mpool has a parameter that allows limiting its size, mpool_rdma_rcache_size_limit (default is 0 - unlimited). This commit was SVN r12878.	2006-12-17 12:26:41 +00:00
Jeff Squyres	0ca8cb35b7	Fixes trac:366 Add ability for ini files to recognize "use_eager_rdma" flag. Set the default to "no" (because we should assume that HCAs cannot support the property necessary for using RDMA for eager messages -- that the last byte of the message is guaranteed to be written to memory last -- unless proven otherwise. For example, iWARP cards apparently do not provide this guarantee), and then set all Mellanox and IBM HCAs to override the default to enable this behavior on these cards. This commit was SVN r12851. The following Trac tickets were found above: Ticket 366 --> https://svn.open-mpi.org/trac/ompi/ticket/366	2006-12-14 15:52:13 +00:00
Pavel Shamis	f08bc818c4	Cleaning mca_btl_openib_progress_thread from unused variables. This commit was SVN r12709.	2006-11-30 18:28:45 +00:00
Gleb Natapov	b4fd2d7d50	Fix warnings from progress thread patch. This commit was SVN r12434.	2006-11-06 12:34:56 +00:00
Pavel Shamis	566667ac61	Adding progress thread support to OpenIB BTL. Reviewed by Gleb. This commit was SVN r12411.	2006-11-02 16:15:21 +00:00
Gleb Natapov	4c784b6403	As Andrew Friedley pointed out, my previous patch may cause deadlock if mca_btl_openib_endpoint_connect_eager_rdma() is called recursively. He also noticed that orte_pointer_array_add() can't fail because we allocate max number of elements at init time. So just remove error handling and locking. No locking - no deadlocks. This commit was SVN r12388.	2006-11-01 15:53:33 +00:00
Gleb Natapov	aac695a51f	eager_rdma_buffers update is not atomic. A buffer is added to the array and if something is going wrong down in the code it is removed from the array. So add mutex to prevent concurrent access to the array from different threads. This commit was SVN r12385.	2006-11-01 07:27:32 +00:00
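
The btl_openib_receive_queues syntax described in commit 8ace07efed (SVN r15474) can be illustrated with a small parser. This is only a sketch based on the wording of that commit message, not code from Open MPI: the names `QPSpec` and `parse_receive_queues` are hypothetical, and the field counts (four for "P", five for "S") are assumed from the examples in the message; the real BTL may accept other forms.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QPSpec:
    """One queue pair description (hypothetical type, for illustration only)."""
    kind: str                                 # "P" = per-peer, "S" = shared receive queue
    buffer_size: int                          # receive buffer size in bytes
    num_buffers: int                          # number of receive buffers to allocate
    low_watermark: int                        # repost receive buffers at this level
    max_pending_sends: Optional[int] = None   # only given for "S" entries


def parse_receive_queues(spec: str) -> List[QPSpec]:
    """Parse a string such as "P,128,16,4;S,1024,256,128,32".

    QPs are separated by ";" and fields by ",", as the commit message states.
    """
    qps: List[QPSpec] = []
    for entry in spec.split(";"):
        fields = [f.strip() for f in entry.split(",")]
        kind = fields[0]
        if kind == "P" and len(fields) == 4:
            size, count, low = (int(f) for f in fields[1:])
            qps.append(QPSpec("P", size, count, low))
        elif kind == "S" and len(fields) == 5:
            size, count, low, sends = (int(f) for f in fields[1:])
            qps.append(QPSpec("S", size, count, low, sends))
        else:
            raise ValueError(f"malformed QP description: {entry!r}")

    # The commit message requires ascending receive buffer size order.
    sizes = [qp.buffer_size for qp in qps]
    if sizes != sorted(sizes):
        raise ValueError("QPs must be listed in ascending receive buffer size order")
    return qps


if __name__ == "__main__":
    example = "P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32"
    for qp in parse_receive_queues(example):
        print(qp)
```

Running the sketch on the example string from the commit message yields four entries: one per-peer QP followed by three shared-receive-queue QPs.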


143 Commits