1
1

254 Коммитов

Автор SHA1 Сообщение Дата
George Bosilca
2e042c91cf Once we compute the local offset use it (instead of the global one).
This commit was SVN r13634.
2007-02-13 09:34:04 +00:00
Gleb Natapov
1033002595 Fix memory leak. Free allocated descriptor if operation cannot proceed.
This commit was SVN r13610.
2007-02-12 09:47:51 +00:00
Brian Barrett
041beeb1b6 Share currently selected PML in the modex information, then check whenever
adding new procs that the remote proc's pml is the same as our local pml.
Turns the hangs from mismatched PMLs into an abort, which is better,
I think.

This commit was SVN r13582.
2007-02-09 16:38:16 +00:00
Galen Shipman
ec610a9e65 spread priorities out a bit..
This commit was SVN r13487.
2007-02-04 00:55:25 +00:00
Galen Shipman
a94101fa62 mostly another hack around for PML selection, allows CM be select itself if an
MTL is available, if not OB1 is used. Still prevents DR and OB1 from stomping
on each other though. 

This commit was SVN r13481.
2007-02-03 02:01:18 +00:00
George Bosilca
0ff2115964 Other warnings are now silenced.
This commit was SVN r13462.
2007-02-02 06:47:35 +00:00
Rainer Keller
ca35881cd0 - Minor bugfixes and removed compiler warnings
This commit was SVN r13343.
2007-01-28 19:52:09 +00:00
Jeff Squyres
52ca6cf86c The mpi_leave_pinned and mpi_leave_pinned_pipeline MCA parameters were
needlessly registered in multiple different places, and none of them
had a good help string.  There was also an inconsistent check for
setting both mpi_leave_pinned and mpi_leave_pinned_pipeline (i.e., it
was only in ob1).  This commit moves the registration of these params
to one central place (ompi/runtime/ompi_mpi_params.c, with all other
mpi_* MCA params) and uses globals to propagate the values as
relevant.  The error check was also moved to the central location to
ensure that we can consistency everywhere.

This commit was SVN r13226.
2007-01-21 14:02:06 +00:00
Gleb Natapov
4c7dbd36c7 Balance RDMA operation in round robin fashion between all available RDMA BTLs.
OB1 always use first element from array of BTLs available for RDMA. The patch
change the array creation algorithm, it puts different BTL in the first element
in round robin fashion.

This commit was SVN r13174.
2007-01-18 09:15:18 +00:00
Brian Barrett
a34e67d743 Remove unneeded PARAM_INIT_FILE variable in configure.params files used by
components that use configure.m4 for configuration or are always built. 
The macro has not been needed since moving to configure types other than
configure.stub

Fixes trac:590

This commit was SVN r13031.

The following Trac tickets were found above:
  Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
2007-01-08 03:44:22 +00:00
Brian Barrett
8900d3ae43 Second take at fixing the issues with using ompi_ptr_t. Add helper functions for converting from .pval to .lval and vice-versa. Users of ompi_ptr_t types should only use one of the fields in the union unless using the helper conversion functions. For the BTLs, local pointers will always be stored in the .pval field and remote pointers always stored in the .lval field.
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.

Refs trac:587

This commit was SVN r13023.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-07 01:48:57 +00:00
Brian Barrett
48ec0b2071 Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix
for now...

This commit was SVN r12997.

The following SVN revision numbers were found above:
  r12974 --> open-mpi/ompi@27cea44a9c
2007-01-04 22:07:37 +00:00
Galen Shipman
931a389c4f fix deadlock on rendezvous protocol..
This commit was SVN r12982.
2007-01-04 03:46:11 +00:00
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
* Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
Gleb Natapov
a6127fd8ce Increase req_bytes_delivered atomically.
This commit was SVN r12971.
2007-01-03 15:19:34 +00:00
Gleb Natapov
79202561f6 Don't check req_pipeline_depth on frag completion. Checking of
req_bytes_delivered should be enough.

This commit was SVN r12967.
2007-01-03 14:44:20 +00:00
Gleb Natapov
1ad6c41735 Sender can start scheduling send fragments immediately after receiving ACK. No
need to wait for RNDV completion.

This commit was SVN r12965.
2007-01-03 12:37:11 +00:00
George Bosilca
0b5d879a63 ompi_convertor_pack do not return errors (all checkings are done when the
convertor is created).

This commit was SVN r12940.
2006-12-29 07:40:02 +00:00
George Bosilca
d8db9e49f3 Set the bml_btl to NULL or segfault !!!
This commit was SVN r12939.
2006-12-29 07:38:24 +00:00
Brian Barrett
2ab65eb521 Remove some debugging output that was #if 0'ed out but shouldn't have been
committed into the trunk anyway

This commit was SVN r12897.
2006-12-19 02:34:41 +00:00
Gleb Natapov
190e7a27cd Merge with gleb-mpool branch. All RDMA components use same mpool now (rdma).
udapl/openib/vapi/gm mpools a deprecated. rdma mpool has parameter that allows
to limit its size mpool_rdma_rcache_size_limit (default is 0 - unlimited).

This commit was SVN r12878.
2006-12-17 12:26:41 +00:00
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
Brian Barrett
441432950f Merge in changes from the bwb-heterogeneous temp branch (r12491 -
r12714) for supporting compilers / architectures with different
padding rules.

This commit was SVN r12749.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r12491
  r12714
2006-12-04 20:11:42 +00:00
Gleb Natapov
30ca7457b4 Some BTLs (e.g TCP) can report put/get completion before data actually
hits the buffer on the other side. For this kind of BTLs we need to send
FIN through the same BTL, PUT was performed with so network will handle
ordering for us. If we will use another BTL, receiver can get FIN before
data will hit the buffer and complete request prematurely. We mark such
problematic BTLs with MCA_BTL_FLAGS_FAKE_RDMA flag (this kind of RDMA
is really fake, because the real one guaranties that sender will see the
completion only after receiver's NIC confirmed that all the data was
received).

This commit was SVN r12732.
2006-12-03 10:12:09 +00:00
Gleb Natapov
39c930b160 The bug fixing part of r12720 introduce much more serious bug that it fixes.
It calls mca_pml_ob1_send_fin_btl() which may fail and doesn't check return
code. This breaks all RDMA transports event when only one BTL is used. Revert
it for now, I am working on a real fix for the problem (I hope).

This commit was SVN r12731.

The following SVN revision numbers were found above:
  r12720 --> open-mpi/ompi@3e3689320b
2006-12-03 08:55:59 +00:00
Gleb Natapov
65d7ad4581 The "bug fix" from the r12721 reverts part of the r12433 that fixed
regresion from v1.1 was reviewed and put to v1.2 branch. So revert this part
of r12721 back.

This commit was SVN r12730.

The following SVN revision numbers were found above:
  r12433 --> open-mpi/ompi@82f7c0dd69
  r12721 --> open-mpi/ompi@3edd850d2e
2006-12-03 08:29:55 +00:00
George Bosilca
3edd850d2e Some indentation and code arrangement. However, there is a bug fix. Force the PUT
protocol to always obey to the btl_max_rdma_size.

This commit was SVN r12721.
2006-12-01 22:26:14 +00:00
George Bosilca
3e3689320b Some indentations and one BIG fix. Avoid race conditions on the PUT RDMA
protocol when multiple NICS are available between 2 peers. The fix force
the FIN message to take exactly the same path as the fragment it describe
(i.e. same path means same BTL). Otherwise, the FIN can be received by
the peer before the RDMA complete and the request will get freed
too early.

This commit was SVN r12720.
2006-12-01 21:52:07 +00:00
Andrew Friedley
a4bdcb4faa Fix a segfault that turned up in more MPI_THREAD_MULTIPLE testing.
Same sort of problem and fix as described in r12323 - mca_pml_ob1_recv_frag_progress() was segfaulting due to a NULL req_proc pointer.  The path leading to this was through the mca_pml_ob1_check_cantmatch_for_match() function, where we can match a frag using the same macros as mca_pml_ob1_frag_match() and never initialize the req_proc pointer.

This commit was SVN r12582.

The following SVN revision numbers were found above:
  r12323 --> open-mpi/ompi@c752502dee
2006-11-13 20:12:51 +00:00
George Bosilca
eab1776e9a Explicit casts for our friendly Windows environment...
This commit was SVN r12496.
2006-11-08 17:02:46 +00:00
George Bosilca
915d748d72 Initialize the convertor on _START not on _INIT. This allow us to
set it up before the match when we know the peer, saving some
time on the critical path. If the receive is ANY_SOURCE then
we initialize the convertor on _MATCHED. Anyway, we will set it
up only once per receive.

This commit was SVN r12484.
2006-11-08 05:42:29 +00:00
Gleb Natapov
82f7c0dd69 Fix regression from v1.1.
1) make the code do what comment says
2) if memory is prepinned don't send multiple PUT messages.

This commit was SVN r12433.
2006-11-06 12:00:17 +00:00
Gleb Natapov
7b39039cd6 Add comments to process_pending functions.
This commit was SVN r12346.
2006-10-29 09:12:24 +00:00
Gleb Natapov
8ef5b6a589 Change tabs to spaces to be consistent with the rest of the file.
This commit was SVN r12345.
2006-10-29 08:12:44 +00:00
George Bosilca
a9c6ae8f15 Minimize the number of branches, and orce the correct prediction for the
most usual one. Most of the time we expect the functions which allocate
requests to succeed.

This commit was SVN r12344.
2006-10-27 23:16:13 +00:00
George Bosilca
44f3dd81b4 Update the comment to reflect what's inside the code.
This commit was SVN r12343.
2006-10-27 23:09:37 +00:00
George Bosilca
3472d19d4d Do not modify the convertor if there is no data to be send across the network. The
req_bytes_packed field is initialized in the BASE_INIT macro, so it is set for
all requests at this stage.

This commit was SVN r12342.
2006-10-27 23:03:15 +00:00
Jeff Squyres
020efdf1f9 Refs trac:250
This commit essentially caches the invoking comm/win/file on the
ompi_request_t. This, paired with the req_type field, allows us to
retrieve the invoking MPI object and invoke the proper errhandler.

The patch is missing most updates for the MPI-2 one-sided stuff (i.e.,
the patch mainly fixes comms and files); I didn't really understand
that code and didn't want to hazard trying to figure it out when Brian
can probably do it much more quickly.

So #250 will still stay open, pending MPI-2 one-sided updates for this
stuff.

This commit was SVN r12339.

The following Trac tickets were found above:
  Ticket 250 --> https://svn.open-mpi.org/trac/ompi/ticket/250
2006-10-27 12:35:27 +00:00
George Bosilca
126a68dc9a Big datatype commit. Remove all unused features of the datatype engine. As the memory
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There is less arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affect all components using datatypes. I test most of them, but it might happens that I
miss some ... If it's the case please let me know (don't shoot the pianist!!).

This commit was SVN r12331.
2006-10-26 23:11:26 +00:00
Andrew Friedley
c752502dee Fix for a common race condition when running the Sandia mt_send_recv.cc test.
A segfault would occur in mca_pml_ob1_recv_request_progress() when trying to prepare the convertor for unpacking, because the request's req_proc field was NULL.

Turns out that we weren't setting the req_proc field in the MCA_PML_OB1_CHECK_SPECIFIC_AND_WILD_RECEIVES_FOR_MATCH macro.  Instead of just setting it there I removed the other place req_proc was being set correctly, and instead took care of all the cases at once in mca_pml_ob1_recv_frag_match().

This commit was SVN r12323.
2006-10-26 19:09:39 +00:00
Gleb Natapov
90be664b9f Some process_pending() functions get bml_btl on which resource was freed as a
parameter. For optimisation purpose only this BTL is used to send packet
through instead of trying to send packets through all BTLs. But actually the 
code was wrong. It simply used provided bml_btl and it may represent different
endpoint from packet's destination. The fixed code checks if packet's
destination is reachable through the BTL, finds appropriate bml_btl and only
then tries to send it through correct bml_btl.

This commit was SVN r12319.
2006-10-26 13:21:47 +00:00
Sven Stork
f3f39e003e - Increment the pipeline depth before we trigger the send function. As
mentioned in the comment the completion/callback of the triggered 
  send operation can happen before the call returns. If this happens and
  if the pipeline depth is 0 before we triggered the send operation and 
  this is the last send operation of the request then the completion detection
  code will decrement the pipeline depth and check it for equality to 0.
  Because (0-1) != 0 the pml completion function for this request will 
  *not* be called.
  This part 2 of the fix for ticket #246.

This commit was SVN r12292.
2006-10-25 08:52:39 +00:00
George Bosilca
06563b5dec Last set of explicit conversions. We are now close to the zero warnings on
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
  a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
  when it's a size_t on most of the UNIXes.
As these only happens on Windows, so I think we're set for now :)

This commit was SVN r12215.
2006-10-20 03:57:44 +00:00
Rainer Keller
47b24a0603 - Now the branch is done, linearize access regarding
request handling.  Buys a little bit on IMB, no
   functional change, otherwise.

This commit was SVN r12165.
2006-10-18 16:11:50 +00:00
George Bosilca
8852c00c36 Look like a big commit but in fact it address only one issue. The way we're working with
size and diplacement of data-type. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.

This commit was SVN r12146.
2006-10-17 20:20:58 +00:00
George Bosilca
ed83927025 Don't reset the convertor when a persistent request complete. Instead reset it
next time then request is used. This will keep the execution path on the default
case (not persistent) shorter.

This commit was SVN r12134.
2006-10-17 05:01:47 +00:00
George Bosilca
ef66afe45c Another inner loop optimization. Only check for num_fails when prev_bytes is
equal to num_bytes.

This commit was SVN r12133.
2006-10-17 04:38:38 +00:00
George Bosilca
7c8c8d6a46 Keep the critical path as short as possible.
This commit was SVN r11881.
2006-09-28 23:59:24 +00:00
George Bosilca
8d2a8229bb We don't use the send and receive request destructor.
This commit was SVN r11880.
2006-09-28 23:57:49 +00:00
George Bosilca
7f2fd41ace Make sure we trigger the PERUSE event before releasing the request.
This commit was SVN r11879.
2006-09-28 23:54:38 +00:00