openmpi

Author	SHA1	Message	Date
George Bosilca	e19777e910	A more consistent version. As we now share the send and receive queue, we have to construct/destruct only once. Therefore, the construction will happen before digging for a PML, while the destruction happens just before finalizing the component. Add some OPAL_LIKELY/OPAL_UNLIKELY. This commit was SVN r15347.	2007-07-10 23:45:23 +00:00
George Bosilca	433f8a7694	This patch brings full support for message queues in Open MPI. Now the send and receive queues are shared among all PMLs, they are declared in the base PML, and the selected PML is in charge of initializing and releasing them. The CM PML is slightly different compared with OB1 or DR. Internally it uses 2 different types of requests: light and heavy. However, now with this patch both types of requests are stored in the same queue, and cast appropriately on the allocation macro. This means we might use less memory than we allocate, but in exchange we get full support for most of the parallel debuggers. Another thing with this patch is that now for all PMLs (CM included) the basic PML requests start with the same fields, and they are declared in the same order in the request structure. Moreover, the fields have been moved in such a way that only one volatile/atomic will exist per line of cache (hopefully). This commit was SVN r15346.	2007-07-10 22:16:38 +00:00
Tim Prins	f3ac4ac20e	Fix order of function arguments This commit was SVN r15304.	2007-07-08 16:37:51 +00:00
Rainer Keller	cff1b6a71b	- PERUSE_COMM_REQ_XFER_BEGIN should be emitted for the first fragment of a larger message as well. This commit was SVN r15299.	2007-07-06 15:02:36 +00:00
George Bosilca	c435094639	Only trigger the PERUSE_COMM_REQ_XFER_BEGIN event on the initial fragment. This commit was SVN r15252.	2007-07-01 16:19:13 +00:00
Gleb Natapov	54b40aef91	Schedule SEND traffic of the pipeline protocol between BTLs in accordance with the relative bandwidths of each BTL. Precalculate what part of a message should be sent via each BTL in advance instead of doing it during scheduling. This commit was SVN r15248.	2007-07-01 11:34:23 +00:00
Rainer Keller	ca09aae2cc	- Get PERUSE to compile again with the latest RDMA changes in r14768/r14842. This commit was SVN r15042. The following SVN revision numbers were found above: r14768 --> open-mpi/ompi@3401bd2b07 r14842 --> open-mpi/ompi@10266fb467	2007-06-13 12:47:47 +00:00
Gleb Natapov	423f404c34	Shut up compiler warning. Ugly, but I can't see a better way except changing the convertor to use uint64_t (ssize_t?) for offset. This commit was SVN r14950.	2007-06-07 11:33:28 +00:00
Gleb Natapov	10266fb467	Fix deadlock in OB1 protocol by sending memory by copying if registration fails. This commit was SVN r14842.	2007-06-03 08:31:58 +00:00
Gleb Natapov	a25e1e7b15	Implement new function mca_pml_ob1_send_requst_copy_in_out(req, offset, len) that allows sending any range of a request by send/recv instead of RDMA, and use it to send data from the end of a request in the pipeline protocol. This commit was SVN r14841.	2007-06-03 08:30:07 +00:00
Galen Shipman	3401bd2b07	Add optional ordering to the BTL interface. This is required to tighten up the BTL semantics. Ordering is not guaranteed, but if the BTL returns an order tag in a descriptor (other than MCA_BTL_NO_ORDER) then we may request another descriptor that will obey ordering w.r.t. the other descriptor. This will allow sane behavior for RDMA networks, where local completion of an RDMA operation on the active side does not imply remote completion on the passive side. If we send a FIN message after local completion and the FIN is not ordered w.r.t. the RDMA operation then badness may occur, as the passive side may now try to deregister the memory while the RDMA operation is still pending on the passive side. Note that this has no impact on networks that don't suffer from this limitation, as the ORDER tag can simply always be specified as MCA_BTL_NO_ORDER. This commit was SVN r14768.	2007-05-24 19:51:26 +00:00
Gleb Natapov	2562253678	Do more work at RDMA frag preparation time and less work at RDMA frag sending time. This commit was SVN r14627.	2007-05-09 12:11:51 +00:00
Gleb Natapov	78fda79630	Use size_t instead of uint64_t in call to convertor cloning. This commit was SVN r14626.	2007-05-09 10:02:06 +00:00
Gleb Natapov	8029893489	In a multithreaded application, sending the initial portion of a request may overlap with RDMAing the rest of it. Also, more than one RDMA write can be performed simultaneously by different threads. To make this code thread safe, this patch clones the original request convertor for each RDMA fragment. This commit was SVN r14574.	2007-05-03 09:13:17 +00:00
George Bosilca	bb481273a6	Typos. This commit was SVN r14546.	2007-04-28 19:15:53 +00:00
Galen Shipman	d7e428909e	Two fixes, one mine, the other Gleb's; I'm committing for Gleb due to the time difference... 1) The PML makes an assumption on local/remote completion semantics of the BTL which the Self BTL does not obey, nor should it, so we fix the PML 2) The Get protocol must handle the case when sender and receiver do not agree on whether the data is contiguous This commit was SVN r14313.	2007-04-11 22:03:06 +00:00
Gleb Natapov	1033002595	Fix memory leak. Free allocated descriptor if operation cannot proceed. This commit was SVN r13610.	2007-02-12 09:47:51 +00:00
Gleb Natapov	4c7dbd36c7	Balance RDMA operations in round robin fashion between all available RDMA BTLs. OB1 always uses the first element from the array of BTLs available for RDMA. The patch changes the array creation algorithm: it puts a different BTL in the first element in round robin fashion. This commit was SVN r13174.	2007-01-18 09:15:18 +00:00
Brian Barrett	8900d3ae43	Second take at fixing the issues with using ompi_ptr_t. Add helper functions for converting from .pval to .lval and vice-versa. Users of ompi_ptr_t types should only use one of the fields in the union unless using the helper conversion functions. For the BTLs, local pointers will always be stored in the .pval field and remote pointers always stored in the .lval field. George wrote the initial patch, I extended it slightly and am responsible for all bugs found. Refs trac:587 This commit was SVN r13023. The following Trac tickets were found above: Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587	2007-01-07 01:48:57 +00:00
Brian Barrett	48ec0b2071	Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix for now... This commit was SVN r12997. The following SVN revision numbers were found above: r12974 --> open-mpi/ompi@27cea44a9c	2007-01-04 22:07:37 +00:00
Galen Shipman	931a389c4f	fix deadlock on rendezvous protocol.. This commit was SVN r12982.	2007-01-04 03:46:11 +00:00
Brian Barrett	27cea44a9c	Fix a number of issues with the ompi_ptr_t: * Make sure that the pval always writes to the correct portion of the lval. This only matters on 32 bit big endian machines. * On 32 bit machines when assigning to pval, the other 4 bytes of lval weren't being written, which could lead to bogus data We use macros so that there aren't casts all over the code and the pval assignment can occur to the correct 4 bytes. Refs trac:587 This commit was SVN r12974. The following Trac tickets were found above: Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587	2007-01-03 19:47:48 +00:00
Gleb Natapov	a6127fd8ce	Increase req_bytes_delivered atomically. This commit was SVN r12971.	2007-01-03 15:19:34 +00:00
Gleb Natapov	79202561f6	Don't check req_pipeline_depth on frag completion. Checking of req_bytes_delivered should be enough. This commit was SVN r12967.	2007-01-03 14:44:20 +00:00
Gleb Natapov	1ad6c41735	Sender can start scheduling send fragments immediately after receiving ACK. No need to wait for RNDV completion. This commit was SVN r12965.	2007-01-03 12:37:11 +00:00
George Bosilca	0b5d879a63	ompi_convertor_pack does not return errors (all checks are done when the convertor is created). This commit was SVN r12940.	2006-12-29 07:40:02 +00:00
Gleb Natapov	190e7a27cd	Merge with gleb-mpool branch. All RDMA components use the same mpool now (rdma). udapl/openib/vapi/gm mpools are deprecated. The rdma mpool has a parameter that allows limiting its size: mpool_rdma_rcache_size_limit (default is 0 - unlimited). This commit was SVN r12878.	2006-12-17 12:26:41 +00:00
Gleb Natapov	30ca7457b4	Some BTLs (e.g. TCP) can report put/get completion before the data actually hits the buffer on the other side. For this kind of BTL we need to send FIN through the same BTL the PUT was performed with, so the network will handle ordering for us. If we use another BTL, the receiver can get FIN before the data hits the buffer and complete the request prematurely. We mark such problematic BTLs with the MCA_BTL_FLAGS_FAKE_RDMA flag (this kind of RDMA is really fake, because the real one guarantees that the sender will see the completion only after the receiver's NIC has confirmed that all the data was received). This commit was SVN r12732.	2006-12-03 10:12:09 +00:00
Gleb Natapov	39c930b160	The bug fixing part of r12720 introduces a much more serious bug than it fixes. It calls mca_pml_ob1_send_fin_btl(), which may fail, and doesn't check the return code. This breaks all RDMA transports even when only one BTL is used. Revert it for now, I am working on a real fix for the problem (I hope). This commit was SVN r12731. The following SVN revision numbers were found above: r12720 --> open-mpi/ompi@3e3689320b	2006-12-03 08:55:59 +00:00
George Bosilca	3e3689320b	Some indentations and one BIG fix. Avoid race conditions on the PUT RDMA protocol when multiple NICs are available between 2 peers. The fix forces the FIN message to take exactly the same path as the fragment it describes (i.e. same path means same BTL). Otherwise, the FIN can be received by the peer before the RDMA completes and the request will get freed too early. This commit was SVN r12720.	2006-12-01 21:52:07 +00:00
Gleb Natapov	8ef5b6a589	Change tabs to spaces to be consistent with the rest of the file. This commit was SVN r12345.	2006-10-29 08:12:44 +00:00
George Bosilca	a9c6ae8f15	Minimize the number of branches, and force the correct prediction for the most usual one. Most of the time we expect the functions which allocate requests to succeed. This commit was SVN r12344.	2006-10-27 23:16:13 +00:00
George Bosilca	126a68dc9a	Big datatype commit. Remove all unused features of the datatype engine. As the memory allocation logic is completely done outside the data-type engine (in the PML) there is no need for any special case inside the data-type engine. There are fewer arguments for ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is not required anymore as there is no memory allocated in the engine itself). This change affects all components using datatypes. I tested most of them, but it might happen that I missed some ... If that's the case please let me know (don't shoot the pianist!!). This commit was SVN r12331.	2006-10-26 23:11:26 +00:00
Gleb Natapov	90be664b9f	Some process_pending() functions get bml_btl on which resource was freed as a parameter. For optimisation purpose only this BTL is used to send packet through instead of trying to send packets through all BTLs. But actually the code was wrong. It simply used provided bml_btl and it may represent different endpoint from packet's destination. The fixed code checks if packet's destination is reachable through the BTL, finds appropriate bml_btl and only then tries to send it through correct bml_btl. This commit was SVN r12319.	2006-10-26 13:21:47 +00:00
Sven Stork	f3f39e003e	- Increment the pipeline depth before we trigger the send function. As mentioned in the comment, the completion/callback of the triggered send operation can happen before the call returns. If this happens, and if the pipeline depth is 0 before we triggered the send operation and this is the last send operation of the request, then the completion detection code will decrement the pipeline depth and check it for equality to 0. Because (0-1) != 0 the pml completion function for this request will not be called. This is part 2 of the fix for ticket #246. This commit was SVN r12292.	2006-10-25 08:52:39 +00:00
George Bosilca	8852c00c36	Looks like a big commit but in fact it addresses only one issue: the way we're working with size and displacement of data-types. After this patch all data can contain size_t bytes and the displacements are defined as ptrdiff_t. All of the files I was able to compile have been modified to match this requirement. This commit was SVN r12146.	2006-10-17 20:20:58 +00:00
George Bosilca	8d2a8229bb	We don't use the send and receive request destructor. This commit was SVN r11880.	2006-09-28 23:57:49 +00:00
George Bosilca	7f2fd41ace	Make sure we trigger the PERUSE event before releasing the request. This commit was SVN r11879.	2006-09-28 23:54:38 +00:00
George Bosilca	688a16ea78	A long-awaited patch. Get rid of the comm->c_pml_procs. It was (and that was long ago) supposed to be used as a cache for accessing the PML procs. But in all of the PMLs the PML proc contains only one field, i.e. a pointer to the ompi_proc. This pointer can be accessed using the c_remote_group easily. Therefore, there is no point in keeping the PML procs around. Slim fast commit ... This commit was SVN r11730.	2006-09-20 22:14:46 +00:00
Rainer Keller	40cb5d3e30	- Fix peruse compilation This commit was SVN r11685.	2006-09-18 07:41:09 +00:00
Gleb Natapov	fa17445384	fix compilation warning. This commit was SVN r11601.	2006-09-10 06:17:33 +00:00
Gleb Natapov	e7650ff48a	Bad things happen if min_rdma_size is smaller than the data delivered in the RNDV packet. Fix this. This commit was SVN r11548.	2006-09-07 10:42:35 +00:00
George Bosilca	3f0a7cad9e	The last patch for Windows support. Mostly casting and conversion to C++ friendly headers. This commit was SVN r11400.	2006-08-24 16:38:08 +00:00
Gleb Natapov	91f48f9a79	Merge with gleb-pml branch. Add out of resource handling support to the PML layer. If a resource is not available, the request is added to one of the pending lists and retried later. This commit was SVN r10900.	2006-07-20 14:44:35 +00:00
George Bosilca	9f927dc7c1	Minor cleanups. On the OB1 PML the endpoint is not used => remove it from the build. There was some old code regarding the convertor which does not have to be there (the problem was corrected a while ago). In the PML we already know how the progress function is defined, so call the BML progress instead, which will save one function call. The macro MCA_PML_OB1_COMPUTE_SEGMENT_LENGTH is already defined in the pml_ob1.h so it should not be in the endpoint.h. Remove a double definition of the mca_pml_ob1_progress function in the pml_ob1.h. This commit was SVN r10775.	2006-07-13 00:07:13 +00:00
George Bosilca	476c9e64df	Don't keep multiple copies of the datatype and count. The only one we really need is the one provided by the user. For the buffered send the real datatype used for the communication is always MPI_BYTE and the count can be retrieved from the req_bytes_packed field. This will decrease the size of the request by one pointer and one size_t (8 bytes or 16 bytes depending on the architecture). This commit was SVN r10680.	2006-07-06 17:58:25 +00:00
George Bosilca	01a59d68da	Do not generate the XFER_BEGIN and XFER_END events if the length of the data is zero, for both the receives and the sends. This commit was SVN r10670.	2006-07-05 23:39:13 +00:00
George Bosilca	940dbff0fa	Add a new PERUSE macro. This is for the CONTINUE event (the one we added to the standard). This macro allows us to specify the length of the fragment. Now we are able to know how the message is fragmented between the network devices or inside the communication protocol. This commit was SVN r10508.	2006-06-26 20:08:33 +00:00
George Bosilca	c43b9821e7	Generate the PERUSE XFER_CONTINUE event. This commit was SVN r10501.	2006-06-26 19:01:22 +00:00
George Bosilca	dee2a7a08d	On this branch the rdma_offset should be set. The send_offset is anyway already set in the _START macro. This commit was SVN r10429.	2006-06-20 14:12:32 +00:00


117 commits
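
Note on f3f39e003e (SVN r12292): the ordering problem described in that message can be reproduced with a small stand-alone sketch. The C below is illustrative only and is not Open MPI code. The names demo_request_t, demo_frag_complete, demo_send_inline and the two demo_schedule_* functions are hypothetical, and the transport is assumed to run its completion callback before the send call returns, which is the case the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a send request (not the real OB1 structure). */
typedef struct {
    int  pipeline_depth;   /* fragments currently in flight */
    bool completed;        /* set once request-level completion has fired */
} demo_request_t;

/* Fragment completion: decrement the depth and complete the request
 * only if the depth is exactly zero afterwards. */
static void demo_frag_complete(demo_request_t *req)
{
    req->pipeline_depth--;
    if (req->pipeline_depth == 0) {
        req->completed = true;
    }
}

/* Assumed transport behaviour: the send completes, and the callback
 * runs, before the send call itself returns. */
static void demo_send_inline(demo_request_t *req)
{
    demo_frag_complete(req);
}

/* Order before the fix: send first, count the fragment afterwards. */
bool demo_schedule_buggy(void)
{
    demo_request_t req = { 0, false };
    demo_send_inline(&req);   /* callback sees 0, decrements to -1, no completion */
    req.pipeline_depth++;     /* back to 0, but nothing re-checks it */
    return req.completed;
}

/* Order after the fix: count the fragment, then trigger the send. */
bool demo_schedule_fixed(void)
{
    demo_request_t req = { 0, false };
    req.pipeline_depth++;     /* depth is 1 while the send is in flight */
    demo_send_inline(&req);   /* callback decrements to 0 and completes */
    return req.completed;
}
```

Calling demo_schedule_buggy() leaves the request uncompleted because the callback compares -1 against 0, while demo_schedule_fixed() completes it. A real implementation would also have to consider concurrent updates to the counter, which this single-threaded sketch leaves out.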