Commit graph

123 commits

Author SHA1 Message Date
Gleb Natapov
690fb95bda Cleanup send scheduling code.
This commit was SVN r16014.
2007-08-30 12:10:04 +00:00
Gleb Natapov
0b0f9d14aa Mark send request complete on PML level only when absolutely sure there is
no more work associated with this request. No more outstanding completions or
packets and send scheduling isn't running in another thread.

This commit was SVN r16013.
2007-08-30 12:08:33 +00:00
Gleb Natapov
eac2674f66 The inner voice tells me this is a typo.
This commit was SVN r16004.
2007-08-29 13:28:47 +00:00
Brian Barrett
59b22533f2 Enable RDMA for heterogeneous situations. Currently done by overloading
the ompi_convertor_need_buffers function to only return 0 if the convertor
is homogeneous (which it never does on the trunk, but does on v1.2, but
that's a different issue).  Only enable the heterogeneous rdma code for
a btl if it supports it (via a flag), as some btls need some work for this
to work properly.  Currently only TCP and OpenIB have been extensively tested.

This commit was SVN r15990.
2007-08-28 21:23:44 +00:00
Gleb Natapov
627d9bc8ed Delay freeing of a send request if the scheduling function is running in another
thread.

This commit was SVN r15722.
2007-08-01 12:19:16 +00:00
Gleb Natapov
21dd061696 Init req_send_range_lock. Found by Terry Dontje.
This commit was SVN r15677.
2007-07-30 08:21:52 +00:00
George Bosilca
e19777e910 A more consistent version. As we now share the send and receive queue, we
have to construct/destruct only once. Therefore, the construction will
happen before digging for a PML, while the destruction happens just before
finalizing the component.

Add some OPAL_LIKELY/OPAL_UNLIKELY.

This commit was SVN r15347.
2007-07-10 23:45:23 +00:00
George Bosilca
433f8a7694 This patch brings full support for message queues in Open MPI. Now the send and
receive queues are shared among all PMLs, they are declared in the base PML,
and the selected PML is in charge of initializing and releasing them. 

The CM PML is slightly different compared with OB1 or DR. Internally it uses
2 different types of requests: light and heavy. However, now with this patch
both types of requests are stored in the same queue, and cast appropriately
on the allocation macro. This means we might use less memory than we allocate,
but in exchange we got full support for most of the parallel debuggers.

Another thing with this patch is that now for all PMLs (CM included) the basic
PML requests start with the same fields, and they are declared in the same order
in the request structure. Moreover, the fields have been moved in such a way
that only one volatile/atomic will exist per line of cache (hopefully).

This commit was SVN r15346.
2007-07-10 22:16:38 +00:00
Tim Prins
f3ac4ac20e Fix order of function arguments
This commit was SVN r15304.
2007-07-08 16:37:51 +00:00
Rainer Keller
cff1b6a71b - PERUSE_COMM_REQ_XFER_BEGIN should be emitted for first fragment
of larger message as well.

This commit was SVN r15299.
2007-07-06 15:02:36 +00:00
George Bosilca
c435094639 Only trigger the PERUSE_COMM_REQ_XFER_BEGIN event on the initial fragment.
This commit was SVN r15252.
2007-07-01 16:19:13 +00:00
Gleb Natapov
54b40aef91 Schedule SEND traffic of pipeline protocol between BTLs in accordance with
relative bandwidths of each BTL. Precalculate what part of a message should
be sent via each BTL in advance instead of doing it during scheduling.

This commit was SVN r15248.
2007-07-01 11:34:23 +00:00
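
The precalculation described in 54b40aef91 can be illustrated with a small sketch. This is not the OB1 code: `split_by_bandwidth` and its parameters are hypothetical names, and it assumes each BTL's relative bandwidth is available as a non-negative weight.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Open MPI): divide `total` bytes among `n`
 * transports in proportion to their non-negative bandwidth weights.
 * The last transport takes whatever remains so the shares always sum to
 * `total`, even after truncation. */
void split_by_bandwidth(size_t total, const double *weight, size_t n,
                        size_t *share)
{
    double sum = 0.0;
    size_t assigned = 0;
    size_t i;

    if (n == 0) {
        return;
    }
    for (i = 0; i < n; i++) {
        sum += weight[i];
    }
    for (i = 0; i + 1 < n; i++) {
        size_t part = 0;
        if (sum > 0.0) {
            part = (size_t)((double)total * (weight[i] / sum));
        }
        /* Guard against floating-point rounding pushing us past `total`. */
        if (part > total - assigned) {
            part = total - assigned;
        }
        share[i] = part;
        assigned += part;
    }
    share[n - 1] = total - assigned;
}
```

Computing the shares once, up front, means the scheduling loop only has to consume a fixed byte budget per BTL rather than recompute proportions for every fragment.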
Rainer Keller
ca09aae2cc - Get PERUSE to compile again with latest RDMA changes in r14768/r14842.
This commit was SVN r15042.

The following SVN revision numbers were found above:
  r14768 --> open-mpi/ompi@3401bd2b07
  r14842 --> open-mpi/ompi@10266fb467
2007-06-13 12:47:47 +00:00
Gleb Natapov
423f404c34 Shut up compiler warning. Ugly, but I can't see a better way except changing
converter to use uint64_t(ssize_t?) for offset.

This commit was SVN r14950.
2007-06-07 11:33:28 +00:00
Gleb Natapov
10266fb467 Fix deadlock in OB1 protocol by sending memory by copying if registration
fails.

This commit was SVN r14842.
2007-06-03 08:31:58 +00:00
Gleb Natapov
a25e1e7b15 Implement new function mca_pml_ob1_send_requst_copy_in_out(req, offset, len)
that allows sending any range of a request by send/recv instead of RDMA
and use it to send data from the end of a request in pipeline protocol. 

This commit was SVN r14841.
2007-06-03 08:30:07 +00:00
Galen Shipman
3401bd2b07 Add optional ordering to the BTL interface.
This is required to tighten up the BTL semantics. Ordering is not guaranteed,
but, if the BTL returns an order tag in a descriptor (other than
MCA_BTL_NO_ORDER) then we may request another descriptor that will obey
ordering w.r.t. the other descriptor.


This will allow sane behavior for RDMA networks, where local completion of an
RDMA operation on the active side does not imply remote completion on the
passive side. If we send a FIN message after local completion and the FIN is
not ordered w.r.t. the RDMA operation then badness may occur as the passive
side may now try to deregister the memory and the RDMA operation may still be
pending on the passive side. 

Note that this has no impact on networks that don't suffer from this
limitation as the ORDER tag can simply always be specified as
MCA_BTL_NO_ORDER.

This commit was SVN r14768.
2007-05-24 19:51:26 +00:00
Gleb Natapov
2562253678 Do more work at RDMA frag preparation time and less work at RDMA frag sending
time.

This commit was SVN r14627.
2007-05-09 12:11:51 +00:00
Gleb Natapov
78fda79630 Use size_t instead of uint64_t in call to convertor cloning.
This commit was SVN r14626.
2007-05-09 10:02:06 +00:00
Gleb Natapov
8029893489 In a multithreaded application, sending the initial portion of a request may overlap
with RDMAing the rest of it. Also, more than one RDMA write can be performed
simultaneously by different threads. To make this code thread safe this patch
clones original request convertor for each RDMA fragment.

This commit was SVN r14574.
2007-05-03 09:13:17 +00:00
George Bosilca
bb481273a6 Typos.
This commit was SVN r14546.
2007-04-28 19:15:53 +00:00
Galen Shipman
d7e428909e two fixes, one mine, the other gleb's, I'm committing for gleb due to
time difference...  

1) The PML makes an assumption on local/remote completion semantics of the BTL
which Self BTL does not obey, nor should it, so we fix the PML
2) The Get protocol must handle the case when sender and receiver do not agree
on whether the data is contiguous

This commit was SVN r14313.
2007-04-11 22:03:06 +00:00
Gleb Natapov
1033002595 Fix memory leak. Free allocated descriptor if operation cannot proceed.
This commit was SVN r13610.
2007-02-12 09:47:51 +00:00
Gleb Natapov
4c7dbd36c7 Balance RDMA operation in round robin fashion between all available RDMA BTLs.
OB1 always uses the first element from the array of BTLs available for RDMA. The patch
changes the array creation algorithm: it puts a different BTL in the first element
in round robin fashion.

This commit was SVN r13174.
2007-01-18 09:15:18 +00:00
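
A minimal sketch of the rotation idea in 4c7dbd36c7, under the assumption that the consumer always picks slot 0 of the array. The function name, the `int` element type, and the `cursor` parameter are illustrative only and do not come from the OB1 source.

```c
#include <stddef.h>

/* Hypothetical sketch: copy the `n` entries of `in` into `out`, starting at a
 * rotating offset, so that a different entry occupies slot 0 on each call.
 * `cursor` carries the rotation state between calls. */
void fill_round_robin(const int *in, size_t n, int *out, size_t *cursor)
{
    size_t start, i;

    if (n == 0) {
        return;
    }
    start = *cursor % n;
    for (i = 0; i < n; i++) {
        out[i] = in[(start + i) % n];
    }
    /* Advance so the next call starts one entry later. */
    *cursor = (start + 1) % n;
}
```

Because only the array construction changes, code that always uses the first element spreads its traffic across all entries without being modified itself.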
Brian Barrett
8900d3ae43 Second take at fixing the issues with using ompi_ptr_t. Add helper functions for converting from .pval to .lval and vice-versa. Users of ompi_ptr_t types should only use one of the fields in the union unless using the helper conversion functions. For the BTLs, local pointers will always be stored in the .pval field and remote pointers always stored in the .lval field.
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.

Refs trac:587

This commit was SVN r13023.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-07 01:48:57 +00:00
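
The helper-function approach mentioned in 8900d3ae43 might look roughly like the sketch below. `demo_ptr_t` and both helpers are hypothetical stand-ins, not the real ompi_ptr_t definition or its conversion functions; the sketch only shows why converting through an integer type avoids the partially written union problem on 32-bit machines.

```c
#include <stdint.h>

/* Hypothetical stand-in for a pointer/integer union (not the real ompi_ptr_t). */
typedef union {
    uint64_t lval;  /* 64-bit integer view, e.g. for a remote address */
    void    *pval;  /* native pointer view, meaningful only locally */
} demo_ptr_t;

/* Local pointer -> 64-bit value. Converting through uintptr_t yields a fully
 * defined 64-bit result whatever the native pointer width is, instead of
 * relying on which bytes of the union a pointer store happens to touch. */
static inline uint64_t demo_pval_to_lval(void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* 64-bit value -> local pointer (only valid for values produced locally). */
static inline void *demo_lval_to_pval(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

With helpers like these, a caller writes exactly one field of the union and never reads the other one directly, which matches the usage rule stated in the commit message.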
Brian Barrett
48ec0b2071 Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix
for now...

This commit was SVN r12997.

The following SVN revision numbers were found above:
  r12974 --> open-mpi/ompi@27cea44a9c
2007-01-04 22:07:37 +00:00
Galen Shipman
931a389c4f fix deadlock on rendezvous protocol..
This commit was SVN r12982.
2007-01-04 03:46:11 +00:00
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
  * Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
Gleb Natapov
a6127fd8ce Increase req_bytes_delivered atomically.
This commit was SVN r12971.
2007-01-03 15:19:34 +00:00
Gleb Natapov
79202561f6 Don't check req_pipeline_depth on frag completion. Checking of
req_bytes_delivered should be enough.

This commit was SVN r12967.
2007-01-03 14:44:20 +00:00
Gleb Natapov
1ad6c41735 Sender can start scheduling send fragments immediately after receiving ACK. No
need to wait for RNDV completion.

This commit was SVN r12965.
2007-01-03 12:37:11 +00:00
George Bosilca
0b5d879a63 ompi_convertor_pack does not return errors (all checks are done when the
convertor is created).

This commit was SVN r12940.
2006-12-29 07:40:02 +00:00
Gleb Natapov
190e7a27cd Merge with gleb-mpool branch. All RDMA components use same mpool now (rdma).
udapl/openib/vapi/gm mpools are deprecated. The rdma mpool has a parameter,
mpool_rdma_rcache_size_limit, that limits its size (default is 0, meaning unlimited).

This commit was SVN r12878.
2006-12-17 12:26:41 +00:00
Gleb Natapov
30ca7457b4 Some BTLs (e.g. TCP) can report put/get completion before data actually
hits the buffer on the other side. For this kind of BTL we need to send
FIN through the same BTL that the PUT was performed with, so the network will handle
ordering for us. If we use another BTL, the receiver can get FIN before
data hits the buffer and complete the request prematurely. We mark such
problematic BTLs with the MCA_BTL_FLAGS_FAKE_RDMA flag (this kind of RDMA
is really fake, because the real one guarantees that the sender will see the
completion only after the receiver's NIC confirmed that all the data was
received).

This commit was SVN r12732.
2006-12-03 10:12:09 +00:00
Gleb Natapov
39c930b160 The bug fixing part of r12720 introduces a much more serious bug than it fixes.
It calls mca_pml_ob1_send_fin_btl() which may fail and doesn't check the return
code. This breaks all RDMA transports even when only one BTL is used. Revert
it for now, I am working on a real fix for the problem (I hope).

This commit was SVN r12731.

The following SVN revision numbers were found above:
  r12720 --> open-mpi/ompi@3e3689320b
2006-12-03 08:55:59 +00:00
George Bosilca
3e3689320b Some indentations and one BIG fix. Avoid race conditions on the PUT RDMA
protocol when multiple NICs are available between 2 peers. The fix forces
the FIN message to take exactly the same path as the fragment it describes
(i.e. same path means same BTL). Otherwise, the FIN can be received by
the peer before the RDMA completes and the request will get freed
too early.

This commit was SVN r12720.
2006-12-01 21:52:07 +00:00
Gleb Natapov
8ef5b6a589 Change tabs to spaces to be consistent with the rest of the file.
This commit was SVN r12345.
2006-10-29 08:12:44 +00:00
George Bosilca
a9c6ae8f15 Minimize the number of branches, and force the correct prediction for the
most usual one. Most of the time we expect the functions which allocate
requests to succeed.

This commit was SVN r12344.
2006-10-27 23:16:13 +00:00
George Bosilca
126a68dc9a Big datatype commit. Remove all unused features of the datatype engine. As the memory
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There are fewer arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affects all components using datatypes. I tested most of them, but it might happen that I
missed some ... If that's the case please let me know (don't shoot the pianist!!).

This commit was SVN r12331.
2006-10-26 23:11:26 +00:00
Gleb Natapov
90be664b9f Some process_pending() functions get the bml_btl on which a resource was freed as a
parameter. As an optimisation, only this BTL is used to send the packet
instead of trying to send packets through all BTLs. But the
code was wrong: it simply used the provided bml_btl, which may represent a different
endpoint from the packet's destination. The fixed code checks if the packet's
destination is reachable through the BTL, finds the appropriate bml_btl and only
then tries to send it through the correct bml_btl.

This commit was SVN r12319.
2006-10-26 13:21:47 +00:00
Sven Stork
f3f39e003e - Increment the pipeline depth before we trigger the send function. As
  mentioned in the comment, the completion/callback of the triggered
  send operation can happen before the call returns. If this happens and
  if the pipeline depth is 0 before we triggered the send operation and
  this is the last send operation of the request, then the completion detection
  code will decrement the pipeline depth and check it for equality to 0.
  Because (0-1) != 0 the pml completion function for this request will
  *not* be called.
  This is part 2 of the fix for ticket #246.

This commit was SVN r12292.
2006-10-25 08:52:39 +00:00
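
The ordering problem described in f3f39e003e can be reproduced in a few lines. This is a single-threaded sketch with hypothetical names (`demo_request_t`, `demo_send`, and so on); the real code works on PML send requests and would need atomic updates. It assumes a transport whose completion callback runs before the send call returns.

```c
/* Hypothetical request state, reduced to what the race needs. */
typedef struct {
    int pipeline_depth;  /* fragments currently in flight */
    int last_fragment;   /* nonzero once the final fragment is scheduled */
    int completed;       /* set when request completion is detected */
} demo_request_t;

/* Completion callback for one fragment: decrement, then test for zero. */
static void demo_frag_done(demo_request_t *req)
{
    req->pipeline_depth--;
    if (req->last_fragment && req->pipeline_depth == 0) {
        req->completed = 1;
    }
}

/* Stand-in transport that completes synchronously, before send returns. */
static void demo_send(demo_request_t *req)
{
    demo_frag_done(req);
}

/* Old order: the callback sees 0 - 1 = -1, so completion is never detected. */
void demo_start_buggy(demo_request_t *req)
{
    req->last_fragment = 1;
    demo_send(req);
    req->pipeline_depth++;
}

/* Fixed order: count the fragment first, then trigger the send. */
void demo_start_fixed(demo_request_t *req)
{
    req->last_fragment = 1;
    req->pipeline_depth++;
    demo_send(req);
}
```

With the buggy order the depth ends at 0 again after the late increment, but the only check for completion has already run and failed, which is why the request would hang.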
George Bosilca
8852c00c36 Looks like a big commit but in fact it addresses only one issue: the way we're working with
the size and displacement of data-types. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.

This commit was SVN r12146.
2006-10-17 20:20:58 +00:00
George Bosilca
8d2a8229bb We don't use the send and receive request destructor.
This commit was SVN r11880.
2006-09-28 23:57:49 +00:00
George Bosilca
7f2fd41ace Make sure we trigger the PERUSE event before releasing the request.
This commit was SVN r11879.
2006-09-28 23:54:38 +00:00
George Bosilca
688a16ea78 A long-awaited patch. Get rid of the comm->c_pml_procs. It was (and that was
long ago) supposed to be used as a cache for accessing the PML procs. But in
all of the PMLs the PML proc contains only one field, i.e. a pointer to the ompi_proc.
This pointer can be accessed using the c_remote_group easily. Therefore, there is no
point in keeping the PML procs around. Slim fast commit ...

This commit was SVN r11730.
2006-09-20 22:14:46 +00:00
Rainer Keller
40cb5d3e30 - Fix peruse compilation
This commit was SVN r11685.
2006-09-18 07:41:09 +00:00
Gleb Natapov
fa17445384 fix compilation warning.
This commit was SVN r11601.
2006-09-10 06:17:33 +00:00
Gleb Natapov
e7650ff48a Bad things happen if min_rdma_size is smaller than data delivered in the RNDV
packet. Fix this.

This commit was SVN r11548.
2006-09-07 10:42:35 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
Gleb Natapov
91f48f9a79 Merge with gleb-pml branch. Add out of resource handling support to PML layer.
If a resource is not available, the request is added to one of the pending lists and retried later.

This commit was SVN r10900.
2006-07-20 14:44:35 +00:00