1
1
Граф коммитов

104 Коммитов

Автор SHA1 Сообщение Дата
Gleb Natapov
097b17d30e Prevent a receive request from been freed while other thread holds a reference
to it or there is an outstanding completion for the request.

This commit was SVN r16153.
2007-09-18 16:18:47 +00:00
Brian Barrett
59b22533f2 Enable RDMA for heterogeneous situations. Currently done by overloading
the ompi_convertor_need_buffers function to only return 0 if the convertor
is homogeneous (which it never does on the trunk, but does to on v1.2, but
that's a different issue).  Only enable the heterogeneous rdma code for
a btl if it supports it (via a flag), as some btls need some work for this
to work properly.  Currently only TCP and OpenIB extensively tested

This commit was SVN r15990.
2007-08-28 21:23:44 +00:00
Gleb Natapov
fa69c5cc10 If a memory on a sender's size is not registered don't register it on a receive
side too. Otherwise a content of the recvreq->req_rdma array is replaced later
without freeing previous content and refcount on registration in mpool become
wrong.

This commit was SVN r15978.
2007-08-28 07:43:06 +00:00
Gleb Natapov
e1a1d9d90e Receive request converter can be accessed in parallel by a thread that receives
data and a thread that run RDMA schedule function. Protect access to the
converter by a lock.

This commit was SVN r15967.
2007-08-27 11:41:42 +00:00
Gleb Natapov
065d04dfde Do not free recvreq while schedule function is running in another thread.
This commit was SVN r15964.
2007-08-27 11:31:40 +00:00
Rainer Keller
1b5fa48a29 - Add missing PERUSE_COMM_REQ_REMOVE_FROM_POSTED_Q when matching
from the posted generic_recv-queue.
 - Move the PERUSE_COMM_MSG_MATCH_POSTED_REQ from
   MCA_PML_OB1_RECV_REQUEST_MATCHED to
   mca_pml_ob1_recv_frag_match() as suggested by Terry Dontje
   Only post, if this is not a probe/iprobe request.
 - Do not post PERUSE_COMM_REQ_MATCH_UNEX for probes / iprobes and
   do in correct order before PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q

This commit was SVN r15947.
2007-08-23 07:09:43 +00:00
Gleb Natapov
afac5eb93f Guard recv request with lock against simultaneous access from different
threads.

This commit was SVN r15681.
2007-07-30 12:50:38 +00:00
George Bosilca
e19777e910 A more consistent version. As we now share the send and receive queue, we
have to construct/destruct only once. Therefore, the construction will
happens before digging for a PML, while the destruction just before
finalizing the component.

Add some OPAL_LIKELY/OPAL_UNLIKELY.

This commit was SVN r15347.
2007-07-10 23:45:23 +00:00
George Bosilca
433f8a7694 This patch bring full support for message queues in Open MPI. Now the send and
receive queues are shared among all PMLs, they are declared in the base PML,
and the selected PML is in charge of initializing and releasing them. 

The CM PML is slightly different compared with OB1 or DR. Internally it use
2 different types of requests: light and heavy. However, now with this patch
both types of requests are stored in the same queue, and cast appropriately
on the allocation macro. This means we might use less memory than we allocate,
but in exchange we got full support for most of the parallel debuggers.

Another thing with this patch, is that now for all PML (CM included) the basic
PML requests start with the same fields, and they are declared in the same order
in the request structure. Moreover, the fields have been moved in such a way
that only one volatile/atomic will exist per line of cache (hopefully).

This commit was SVN r15346.
2007-07-10 22:16:38 +00:00
George Bosilca
11656e20aa Remove few warnings.
This commit was SVN r15250.
2007-07-01 16:16:05 +00:00
Gleb Natapov
77e54ebc7e Schedule RDMA op on the last BTL that got completion.
This commit was SVN r15249.
2007-07-01 11:35:55 +00:00
Gleb Natapov
e74aa6b295 Schedule RDMA traffic between BTLs in accordance with relative bandwidths of
each BTL. Precalculate what part of a message should be send via each BTL in
advance instead of doing it during scheduling.

This commit was SVN r15247.
2007-07-01 11:31:26 +00:00
Gleb Natapov
b88b7dedfe Rename btl_rdma_offset to btl_pipeline_send_length.
This commit was SVN r15153.
2007-06-21 07:12:40 +00:00
Gleb Natapov
10266fb467 Fix deadlock in OB1 protocol by by sending memory by copying if registration
fails.

This commit was SVN r14842.
2007-06-03 08:31:58 +00:00
Gleb Natapov
a25e1e7b15 Implement new function mca_pml_ob1_send_requst_copy_in_out(req, offset, len)
that allows to send any range of a request by send/recv instaed of RDMA
and use it to send data from the end of a request in pipeline protocol. 

This commit was SVN r14841.
2007-06-03 08:30:07 +00:00
Galen Shipman
3401bd2b07 Add optional ordering to the BTL interface.
This is required to tighten up the BTL semantics. Ordering is not guaranteed,
but, if the BTL returns a order tag in a descriptor (other than
MCA_BTL_NO_ORDER) then we may request another descriptor that will obey
ordering w.r.t. to the other descriptor.


This will allow sane behavior for RDMA networks, where local completion of an
RDMA operation on the active side does not imply remote completion on the
passive side. If we send a FIN message after local completion and the FIN is
not ordered w.r.t. the RDMA operation then badness may occur as the passive
side may now try to deregister the memory and the RDMA operation may still be
pending on the passive side. 

Note that this has no impact on networks that don't suffer from this
limitation as the ORDER tag can simply always be specified as
MCA_BTL_NO_ORDER.

This commit was SVN r14768.
2007-05-24 19:51:26 +00:00
Gleb Natapov
3ebaff8dfe Implement new BTL parameters:
We eagerly send data up to btl_*_eager_limit with the match
Upon ACK of the MATCH we start using send/receives of size
btl_*_max_send_size up to the btl_*_rdma_pipeline_offset
After the btl_*_rdma_pipeline_offset we begin using RDMA writes of
size btl_*_rdma_pipeline_frag_size.

Now, on a per message basis we only use the above protocol if the
message is larger than btl_*_min_rdma_pipeline_size

btl_*_eager_limit - > same
btl_*_max_send_size -> same
btl_*_rdma_pipeline_offset -> btl_*_min_rdma_size
btl_*_rdma_pipeline_frag_size -> btl_*_max_rdma_size


btl_*_min_rdma_pipeline_size is new..

This patch also moves all BTL common parameters initialisation into
btl_base_mca.c file.

This commit was SVN r14681.
2007-05-17 07:54:27 +00:00
Gleb Natapov
2562253678 Do more work at RDMA frag preparation time and less work at RDMA frag sending
time.

This commit was SVN r14627.
2007-05-09 12:11:51 +00:00
Galen Shipman
d7e428909e two fixes, one mine, the other gleb's, I'm committing for gleb due to
time difference...  

1) The PML makes an assumption on local/remote completion semantics of the BTL
which Self BTL does not obey, nor should it, so we fix the PML
2) The Get protocol must handle the case when sender and reciever do not agree
on wheter the data is contiguous 

This commit was SVN r14313.
2007-04-11 22:03:06 +00:00
George Bosilca
1cb26e3b9c Finally the convertor export a convenience function to allow a consistent
computation of the current location on the pack/unpack process. This can
be used both for retrieving the pointer to the first byte (in the special
case of the cached RDMA protocol) and for getting the current
position (for the pipelined protocol).

I modified all BTLs, but most of them are still untested.

This commit was SVN r14180.
2007-03-30 22:02:45 +00:00
Galen Shipman
a78672be2b fix mpi_leave_pinned case for arbitrary datatypes
George will be streamlining this with a new convertor function soon... 

This commit was SVN r14174.
2007-03-30 02:06:08 +00:00
Brian Barrett
8900d3ae43 Second take at fixing the issues with using ompi_ptr_t. Add helper functions for converting from .pval to .lval and vice-versa. Users of ompi_ptr_t types should only use one of the fields in the union unless using the helper conversion functions. For the BTLs, local pointers will always be stored in the .pval field and remote pointers always stored in the .lval field.
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.

Refs trac:587

This commit was SVN r13023.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-07 01:48:57 +00:00
Brian Barrett
48ec0b2071 Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix
for now...

This commit was SVN r12997.

The following SVN revision numbers were found above:
  r12974 --> open-mpi/ompi@27cea44a9c
2007-01-04 22:07:37 +00:00
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
* Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
Gleb Natapov
1ad6c41735 Sender can start scheduling send fragments immediately after receiving ACK. No
need to wait for RNDV completion.

This commit was SVN r12965.
2007-01-03 12:37:11 +00:00
Gleb Natapov
190e7a27cd Merge with gleb-mpool branch. All RDMA components use same mpool now (rdma).
udapl/openib/vapi/gm mpools a deprecated. rdma mpool has parameter that allows
to limit its size mpool_rdma_rcache_size_limit (default is 0 - unlimited).

This commit was SVN r12878.
2006-12-17 12:26:41 +00:00
Brian Barrett
441432950f Merge in changes from the bwb-heterogeneous temp branch (r12491 -
r12714) for supporting compilers / architectures with different
padding rules.

This commit was SVN r12749.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r12491
  r12714
2006-12-04 20:11:42 +00:00
Gleb Natapov
30ca7457b4 Some BTLs (e.g TCP) can report put/get completion before data actually
hits the buffer on the other side. For this kind of BTLs we need to send
FIN through the same BTL, PUT was performed with so network will handle
ordering for us. If we will use another BTL, receiver can get FIN before
data will hit the buffer and complete request prematurely. We mark such
problematic BTLs with MCA_BTL_FLAGS_FAKE_RDMA flag (this kind of RDMA
is really fake, because the real one guaranties that sender will see the
completion only after receiver's NIC confirmed that all the data was
received).

This commit was SVN r12732.
2006-12-03 10:12:09 +00:00
Gleb Natapov
65d7ad4581 The "bug fix" from the r12721 reverts part of the r12433 that fixed
regresion from v1.1 was reviewed and put to v1.2 branch. So revert this part
of r12721 back.

This commit was SVN r12730.

The following SVN revision numbers were found above:
  r12433 --> open-mpi/ompi@82f7c0dd69
  r12721 --> open-mpi/ompi@3edd850d2e
2006-12-03 08:29:55 +00:00
George Bosilca
3edd850d2e Some indentation and code arrangement. However, there is a bug fix. Force the PUT
protocol to always obey to the btl_max_rdma_size.

This commit was SVN r12721.
2006-12-01 22:26:14 +00:00
George Bosilca
eab1776e9a Explicit casts for our friendly Windows environment...
This commit was SVN r12496.
2006-11-08 17:02:46 +00:00
Gleb Natapov
82f7c0dd69 Fix regression from v1.1.
1) make the code do what comment says
2) if memory is prepinned don't send multiple PUT messages.

This commit was SVN r12433.
2006-11-06 12:00:17 +00:00
George Bosilca
8852c00c36 Look like a big commit but in fact it address only one issue. The way we're working with
size and diplacement of data-type. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.

This commit was SVN r12146.
2006-10-17 20:20:58 +00:00
George Bosilca
ef66afe45c Another inner loop optimization. Only check for num_fails when prev_bytes is
equal to num_bytes.

This commit was SVN r12133.
2006-10-17 04:38:38 +00:00
George Bosilca
8d2a8229bb We don't use the send and receive request destructor.
This commit was SVN r11880.
2006-09-28 23:57:49 +00:00
George Bosilca
7f2fd41ace Make sure we trigger the PERUSE event before releasing the request.
This commit was SVN r11879.
2006-09-28 23:54:38 +00:00
George Bosilca
688a16ea78 A long time waiting patch. Get rid of the comm->c_pml_procs. It was (and that was
long ago) supposed to be used as a cache for accessing the PML procs. But in
all of the PMLs the PML proc contain only one field i.e. a pointer to the ompi_proc.
This pointer can be accessed using the c_remote_group easily. Therefore, there is no
meaning of keeping the PML procs around. Slim fast commit ...

This commit was SVN r11730.
2006-09-20 22:14:46 +00:00
George Bosilca
6f3782bbd7 When we succesfully cancel a request we have to set it's pml_complete flag to true
if we want to be able to reuse the request. If not, the request will never be freed
even if the user call MPI_Request_free.

This commit was SVN r11717.
2006-09-19 18:04:09 +00:00
Gleb Natapov
03cda61302 Fix hang in receiving into MPI_alloced area.
This code hangs with openib BTL:

int size = 4000000;
sbuf = malloc(size);  
MPI_Alloc_mem(size, MPI_INFO_NULL, &rbuf);

if (rank == 0)
{
    MPI_Recv(rbuf, size, MPI_CHAR, 1, 1, MPI_COMM_WORLD, &stat);
}else{
    MPI_Send(sbuf, size, MPI_CHAR, 0, 1, MPI_COMM_WORLD);
}

This commit was SVN r11613.
2006-09-11 12:18:59 +00:00
Gleb Natapov
e7650ff48a Bad things happen if min_rdma_size is smaller then data delivered in the RNDV
packet. Fix this.

This commit was SVN r11548.
2006-09-07 10:42:35 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
George Bosilca
6afa4c6c64 Windows friendly version. We have to split the OMPI_DECLSPEC in at least 3
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.

This commit was SVN r11270.
2006-08-20 15:54:04 +00:00
Gleb Natapov
91f48f9a79 Merge with gleb-pml branch. Add out of resource handling support to PML layer.
If resource is not available request is added to one of the pending list and retried later.

This commit was SVN r10900.
2006-07-20 14:44:35 +00:00
George Bosilca
01a59d68da Do not generate the XFER_BEGIN and XFER_END events if the length of
the data is zero, for both the receives and the sends.

This commit was SVN r10670.
2006-07-05 23:39:13 +00:00
Brian Barrett
47725c9b02 * Add new PML (CM) and network drivers (MTL) for high speed
interconnects that provide matching logic in the library.
  Currently includes support for MX and some support for
  Portals
* Fix overuse of proc_pml pointer on the ompi_proc structuer, 
  splitting into proc_pml for pml data and proc_bml for
  the BML endpoint data
* bug fixes in bsend init code, which wasn't being used by
  the OB1 or DR PMLs...

This commit was SVN r10642.
2006-07-04 01:20:20 +00:00
George Bosilca
940dbff0fa Add a new PERUSE macro. This is for the CONTINUE event (the one we added to the
standard). This macro allow us to specify the length of the fragment. Now we are
able to know how the message is fragmented between the network devices or inside
the communication protocol.

This commit was SVN r10508.
2006-06-26 20:08:33 +00:00
George Bosilca
a514cdc068 Always limit the size of the RDMA transfer to the maximum amount supported
by the BTL (btl_max_rdma_size). Now the PUT protocol is pipelined even if there
is just one network between the 2 peers. Unfortunately, this problem is present
the 1.1 (no pipeline for the PUT protocol).

This commit was SVN r10499.
2006-06-26 19:00:07 +00:00
George Bosilca
31365fa799 Use the RDMA limit not the eager one when we schedule a receive (for the
PUT protocol).

This commit was SVN r10456.
2006-06-21 15:51:56 +00:00
George Bosilca
ec28040c58 Remove all useless assignment (now they are done inside the macro). Protect one
call to the _UNPACK macro, in the case where the length of the received data
is zero. This might happens on the PUT protocol.

This commit was SVN r10431.
2006-06-20 14:16:52 +00:00
Brian Barrett
c9e8dbc10e * fix for multi-nic case with put protocol -- index will be 1 for the first
put request if we have more than one nic

This commit was SVN r10397.
2006-06-16 22:25:04 +00:00