Commit graph

140 commits

Author SHA1 Message Date
George Bosilca
688a16ea78 A long-awaited patch. Get rid of the comm->c_pml_procs. It was (and that was
long ago) supposed to be used as a cache for accessing the PML procs. But in
all of the PMLs the PML proc contains only one field, i.e. a pointer to the ompi_proc.
This pointer can be accessed easily using the c_remote_group. Therefore, there is no
point in keeping the PML procs around. Slim fast commit ...

This commit was SVN r11730.
2006-09-20 22:14:46 +00:00
Andrew Friedley
e776b01811 This assert fails if -mca pml_dr_enable_csum 0 is set, which isn't what we want..
This commit was SVN r11719.
2006-09-19 19:57:33 +00:00
George Bosilca
e33c35112b Correct the conversion between int and bool. Apply it on all files except
the one that will be modified by Ralph for the ORTE 2.0. The missing ones
are in the rsh PLS.

This commit was SVN r11476.
2006-08-28 18:59:16 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
George Bosilca
6afa4c6c64 Windows friendly version. We have to split the OMPI_DECLSPEC in at least 3
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.

This commit was SVN r11270.
2006-08-20 15:54:04 +00:00
Galen Shipman
7473d04a9a Simple failover is working.. ;-)
This commit was SVN r11237.
2006-08-16 22:32:18 +00:00
Galen Shipman
e809a442e7 add the error handler registration to OB1..
This commit was SVN r11234.
2006-08-16 20:56:22 +00:00
Galen Shipman
3b49953ce2 Add error callback to the btl interface, this allows errors to be delivered to
the upper layer asynchronously although there are some issues with this.. such
as there are multiple consumers of the BTLs.. who gets the

This commit was SVN r11232.
2006-08-16 20:21:38 +00:00
Galen Shipman
84e7b90a19 Fix DR PML after the great MTL crusade.. Added a bit of debugging while I was
in there trying to track things down.. 

This commit was SVN r11208.
2006-08-15 21:44:55 +00:00
Brian Barrett
f6e7e11ee6 Fixes truncate error (ticket #172) for the DR PML and therefore closes trac:172.
We now set truncation error if we received more than we delivered for both
the OB1 and DR PMLs (the CM PML doesn't need such a fix, as the condition
is set at the MTL level)

This commit was SVN r10812.

The following Trac tickets were found above:
  Ticket 172 --> https://svn.open-mpi.org/trac/ompi/ticket/172
2006-07-14 19:45:51 +00:00
George Bosilca
476c9e64df Don't keep multiple copies of the datatype and count. The only one we really need
is the one provided by the user. For the buffered send the real datatype used
for the communication is always MPI_BYTE and the count can be retrieved from
the req_bytes_packed field. This will decrease the size of the request by
one pointer and one size_t (8 bytes or 16 bytes depending on the architecture).

This commit was SVN r10680.
2006-07-06 17:58:25 +00:00
Brian Barrett
47725c9b02 * Add new PML (CM) and network drivers (MTL) for high speed
  interconnects that provide matching logic in the library.
  Currently includes support for MX and some support for
  Portals
* Fix overuse of proc_pml pointer on the ompi_proc structure,
  splitting into proc_pml for pml data and proc_bml for
  the BML endpoint data
* Bug fixes in bsend init code, which wasn't being used by
  the OB1 or DR PMLs...

This commit was SVN r10642.
2006-07-04 01:20:20 +00:00
Galen Shipman
e6cd8db0e5 DR will now checksum on a per btl basis (see MCA_BTL_FLAGS_NEED_CSUM). We
still always send ACK's, teasing apart completion for ACK/no ACK looks like a
pain in the .. 

This commit was SVN r10530.
2006-06-27 20:23:47 +00:00
Galen Shipman
8855e5b73a Fixes for DR as well as better diagnostic..
Successfully passing the intel test suite with/without induced errors/drops. 

This commit was SVN r10518.
2006-06-26 22:29:29 +00:00
George Bosilca
27000ef7d6 More compact and readable code. Otherwise, no big difference with the
previous version.

This commit was SVN r10389.
2006-06-16 03:07:42 +00:00
George Bosilca
3f96f39e46 If the goal of this code was to copy the iovec and skip the first offset
bytes then it was not correct.

This commit was SVN r10388.
2006-06-16 03:06:30 +00:00
George Bosilca
93afe59226 It is not required to initialize the csum.
This commit was SVN r10387.
2006-06-16 03:05:20 +00:00
George Bosilca
1f96768b76 For zero length persistent request do not reposition the convertor as
it is not initialized.

This commit was SVN r10386.
2006-06-16 03:04:41 +00:00
George Bosilca
3727fa2ae6 Nothing relevant. I add some more output in the case we have a checksum error.
Just to be able to know more information about the failure.

This commit was SVN r10337.
2006-06-13 19:36:38 +00:00
Galen Shipman
218a438509 finished the ompi_free_list_t class nightmare..
This commit was SVN r10314.
2006-06-12 22:09:03 +00:00
Galen Shipman
18dda70fd0 make ompi_free_list_item_t a class..
This will go to the 1.1 branch but will probably require a few changes as
ompi_free_list_t is different in the branch.. 

This commit was SVN r10306.
2006-06-12 16:44:00 +00:00
Galen Shipman
84479d0b5a potential fix for iprobe test, tested with openib.. will have andy try ud..
This commit was SVN r10232.
2006-06-06 22:10:41 +00:00
Brian Barrett
c70fff6ed0 * Fix for bug #44 for the trunk -- remove a bunch of warnings from the DR
  PML when compiling on Solaris.  Patch won't apply cleanly to the v1.1
  branch, so a diff for that is coming up soon.

This commit was SVN r10173.
2006-06-01 18:58:38 +00:00
George Bosilca
b8ef0cc749 Minor cleanups.
This commit was SVN r10001.
2006-05-21 05:55:21 +00:00
Galen Shipman
9165882c07 fixes for failover...
This commit was SVN r9998.
2006-05-20 02:39:05 +00:00
Tim Woodall
d8ff8010f3 track whether the vfrag is being retransmitted
This commit was SVN r9817.
2006-05-04 17:30:58 +00:00
Tim Woodall
1b26caa95b first cut at btl failover - seems to be working for simple test case
This commit was SVN r9816.
2006-05-04 16:16:26 +00:00
Galen Shipman
ba0aa46220 make csum's optional in pml dr, on by default, see mca param
pml_dr_enable_csum

This commit was SVN r9608.
2006-04-10 21:54:46 +00:00
Galen Shipman
c29db49198 return out if we ack a duplicate matched rendezvous from matched receives
sequence tracker and the communicator is null.. 

This commit was SVN r9521.
2006-04-03 21:04:51 +00:00
Galen Shipman
1d67917b69 must handle header validation correctly for each case, not enough in common
for the MACRO 

This commit was SVN r9486.
2006-03-30 21:27:21 +00:00
Tim Woodall
9a73fe8beb check for valid sequence number before attempting to use communicator
This commit was SVN r9482.
2006-03-30 19:36:15 +00:00
Galen Shipman
641fa6c0d2 more fixes, reset state on completion..
This commit was SVN r9469.
2006-03-29 22:21:35 +00:00
Galen Shipman
5271948ec0 --- opal object changes
add object size to opal class
no longer need the size when allocating a new object as this is stored in
the class structure

--- dr changes 
Previous rev. maintained state on the communicator used for acking duplicate
fragments, but the communicator may be destroyed prior to successful
delivery of an ack to the peer. We must therefore maintain this state
globally on a per peer, not a per peer, per communicator basis. 
This requires that we use a global rank on the wire and translate this as
appropriate to a local rank within the communicator. 

This commit was SVN r9454.
2006-03-29 16:19:17 +00:00
George Bosilca
5d465cf118 Call the constructor on the DR lock.
This commit was SVN r9438.
2006-03-28 07:34:02 +00:00
Graham Fagg
19906e66dc missing lock?
This commit was SVN r9436.
2006-03-28 06:15:48 +00:00
Tim Woodall
c724e4c804 - removed unused flags
- updated copyrights

This commit was SVN r9430.
2006-03-27 22:44:26 +00:00
Galen Shipman
1677ca1cd4 continue to debug retransmission of incorrect offset,
only occurs on vfrag timeout.. 

This commit was SVN r9421.
2006-03-24 22:28:43 +00:00
Tim Woodall
2e376e0ee8 misc cleanup
This commit was SVN r9410.
2006-03-24 06:49:45 +00:00
Tim Woodall
1aaad721e8 clear state on rndv ack
This commit was SVN r9404.
2006-03-23 23:36:07 +00:00
Galen Shipman
19732d4c7c add length to frag_ack
This commit was SVN r9403.
2006-03-23 23:06:19 +00:00
Tim Woodall
0fa49f1297 set requests vfrag id when matched
This commit was SVN r9402.
2006-03-23 23:04:20 +00:00
Galen Shipman
3595cd8956 use hdr_match..
This commit was SVN r9401.
2006-03-23 22:21:15 +00:00
Galen Shipman
bec2ee346c use correct ack for rendezvous from seq tracker
This commit was SVN r9400.
2006-03-23 22:18:09 +00:00
Tim Woodall
996a1b56df more tweaking
This commit was SVN r9399.
2006-03-23 22:08:59 +00:00
Galen Shipman
c38fd90e63 need state to ack sync send retransmits, even after the recvreq is gone..
This commit was SVN r9397.
2006-03-23 22:02:59 +00:00
Galen Shipman
754b424266 set vf_mask_pending when retransmitting so completion will occur before
the request is completed.. 

This commit was SVN r9394.
2006-03-23 20:28:52 +00:00
Galen Shipman
e01cf0a166 Separate out sequence tracking list as a standalone class.
This commit was SVN r9391.
2006-03-23 17:02:17 +00:00
Tim Woodall
d9dc534c08 fix bogus comment
This commit was SVN r9388.
2006-03-23 16:41:37 +00:00
Tim Woodall
28fa260404 for frag case don't use retrans flag, simply
retransmit all segments of vfrag that have not been acked

This commit was SVN r9387.
2006-03-23 16:36:13 +00:00
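The retransmission rule in the entry above (resend every segment of a vfrag that has not been acked) can be sketched roughly as follows. This is an illustration only, not code from the DR PML: it assumes acknowledgements are tracked as a bitmask in which bit i set means segment i was acked, and the names `unacked_segments`, `num_segments`, and `ack_mask` are hypothetical.

```python
def unacked_segments(num_segments, ack_mask):
    """Return the indices of segments whose ack bit is not set.

    Hypothetical helper: bit i of ack_mask set means segment i was acked.
    """
    return [i for i in range(num_segments) if not (ack_mask >> i) & 1]


# Example: four segments, segments 0 and 2 acked, so 1 and 3 are resent.
to_resend = unacked_segments(4, 0b0101)
```

With such a mask, no separate per-segment "retransmit" flag is needed; the set of segments to resend is derived from the ack state alone.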
Tim Woodall
dc125cf7d5 misc corrections
This commit was SVN r9380.
2006-03-23 15:11:06 +00:00
Galen Shipman
70cf1ce562 more work in progress..
This commit was SVN r9369.
2006-03-22 23:06:18 +00:00
Tim Woodall
0f6161c6da reorg
This commit was SVN r9366.
2006-03-22 15:02:36 +00:00
Galen Shipman
bcb23dc762 rework rndv and eager data timeout/retrans
This commit was SVN r9358.
2006-03-21 21:23:33 +00:00
Tim Woodall
7a1ad5b6fb corrections to scheduling logic
This commit was SVN r9354.
2006-03-21 14:30:54 +00:00
Tim Woodall
797a6b2887 don't compute checksum over header - data only
This commit was SVN r9343.
2006-03-20 23:08:14 +00:00
Galen Shipman
fc42320ea6 check retry counts on NAK retrans as well as timeouts
This commit was SVN r9342.
2006-03-20 22:11:23 +00:00
Galen Shipman
ca13833e95 more dr work
This commit was SVN r9340.
2006-03-20 21:57:30 +00:00
Galen Shipman
5600932c2f fix misc warnings
This commit was SVN r9339.
2006-03-20 15:41:45 +00:00
Tim Woodall
bd870519fd - modified convertor copy_and_prepare routines to accept an additional
  flag, new flags to be included when convertor is initialized
- modified pml/btl module defs and added stub functions for diagnostic
  output routines to dump state of queues / endpoints
- updates to data reliability pml

This commit was SVN r9329.
2006-03-17 18:46:48 +00:00
Galen Shipman
a465047e97 enable timeouts and retransmissions
This commit was SVN r9322.
2006-03-16 22:33:08 +00:00
George Bosilca
229f26dc55 First split of the datatype. More files and a cleaner distribution of functions
in the corresponding files. There are few others changes to come ...

This commit was SVN r9319.
2006-03-16 21:04:34 +00:00
Galen Shipman
3c9ce06f59 Use new csum routines
This commit was SVN r9318.
2006-03-16 20:26:33 +00:00
Galen Shipman
ff75de8c52 more dr work, add destination check on all receives, misc
This commit was SVN r9317.
2006-03-16 19:38:21 +00:00
Tim Woodall
178d8ea905 use consistent macros for csum
This commit was SVN r9294.
2006-03-16 00:20:43 +00:00
George Bosilca
612570134f The request management framework has been redesigned. The main idea is
to let the PML (or io, more generally the low level request manager)
have its own release function (what was before the req_fini). This
function will only be called from the low level while the req_free will
be called from the upper level (MPI layer) in order to mark the request
as not used by the user anymore.

From the request point of view the requests will be marked as inactive
every time we read their status (true for persistent as well). As
MPI_REQUEST_NULL is already marked as inactive, the test and wait functions
are simpler. The drawback is that now we have to change in the
ompi_request_{test|wait} the req_status of the request once we get its
status.

This commit was SVN r9290.
2006-03-15 22:53:41 +00:00
Tim Woodall
92c5e26758 correct scheduling
This commit was SVN r9277.
2006-03-14 18:25:25 +00:00
Galen Shipman
5531baaec6 fix warnings, generalize acked datastructure, allows for easier external
testing. 

This commit was SVN r9212.
2006-03-06 23:18:26 +00:00
George Bosilca
1d0e378df3 icc complains about a missing return.
This commit was SVN r9211.
2006-03-06 21:42:07 +00:00
Tim Woodall
d350232c04 work in progress
This commit was SVN r9209.
2006-03-06 19:30:37 +00:00
Tim Woodall
0ef924769a minor edits
This commit was SVN r9205.
2006-03-06 16:32:36 +00:00
Tim Woodall
274ee03df6 work in progress
This commit was SVN r9192.
2006-03-04 00:36:16 +00:00
Galen Shipman
4e430b0428 fix warnings, other misc
This commit was SVN r9190.
2006-03-03 04:01:10 +00:00
Galen Shipman
84d3055db5 Make sure everything is immediately acked, even if not matched
Buffer first descriptor on the sendreq until positive ACK
Set bytes delivered only after positive ACK, removed num_acks, etc, in general
trying to remove as much state as possible so that rolling things back isn't
such a nightmare

This commit was SVN r9187.
2006-03-01 22:37:10 +00:00
Galen Shipman
d9fd35d399 add acked items to datastructure,
fix compile issue. 

This commit was SVN r9178.
2006-02-28 01:07:35 +00:00
Galen Shipman
c6b4cc4417 Add data structure to track ACK's
This commit was SVN r9177.
2006-02-27 22:56:43 +00:00
Galen Shipman
db6b1db548 use pml level datatype, someone else already cleaned this up in ob1.
This commit was SVN r9174.
2006-02-27 18:20:49 +00:00
Galen Shipman
2aa7b129a6 don't use ptl datatypes!
This commit was SVN r9173.
2006-02-27 18:07:38 +00:00
Brian Barrett
285581dff2 More endian-related cleanups:
  - moved hton64 and ntoh64 from the bunch of places they had been copied
    into one header file
  - properly set and use the btl_tcp's nbo option to put things in
    network byte order on the wire if both sides don't have the same
    endianness
  - Put the OB1 PML's headers (with a couple exceptions I need to discuss
    with Tim) in network byte order on the wire if both sides don't have
    the same endianness
  - since it was needed for the TCP BTL, move the orte_process_name_t
    HTON and NTOH macros from the TCP OOB to ns_types.h

This commit was SVN r9145.
2006-02-26 00:45:54 +00:00
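The 64-bit byte-order conversion mentioned in the entry above can be illustrated with a short sketch. This is not Open MPI's implementation: the helper names `hton64_bytes` and `ntoh64_bytes` are hypothetical, and the sketch always encodes in network (big-endian) order instead of converting only when the two peers differ in endianness.

```python
import struct


def hton64_bytes(value):
    """Encode an unsigned 64-bit integer in network (big-endian) byte order."""
    return struct.pack("!Q", value)


def ntoh64_bytes(wire):
    """Decode 8 bytes in network byte order back into an integer."""
    (value,) = struct.unpack("!Q", wire)
    return value


# Round trip: the value survives regardless of the host's native byte order.
wire = hton64_bytes(0x0102030405060708)
decoded = ntoh64_bytes(wire)
```

Keeping both directions in one place mirrors the intent of the commit, which gathers copies of the same conversion into a single header.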
Galen Shipman
05140c5f8f Rework the data reliability PML, still needs quite a bit of work,
working on creating a uniform retransmission mechanism otherwise each type of
send ends up needing a special case for retransmission. 
Removed NACK for individual transmissions, we just aggregate these and send
them at the end of a vfrag 

This commit was SVN r9141.
2006-02-24 17:08:14 +00:00
Brian Barrett
2eb76ff0cd * finish the TEG/UNIQ/PTL removal
This commit was SVN r9118.
2006-02-23 00:39:01 +00:00
Galen Shipman
0bc3cbf0db Corrections to pml_dr, now passes intel test suite (p2p_c).
Note, the checksums are not enabled currently, setting to zero as the
convertor is not ready for checksums yet. 

Also, we can't call unpack/pack on convertor with 0 bytes, otherwise it
crashes. 

This commit was SVN r9062.
2006-02-16 16:15:16 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
  - move files out of toplevel include/ and etc/, moving them into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Rainer Keller
60c2ae768b - Change the spacing preventing finding the struct from script.
This commit was SVN r8819.
2006-01-26 11:55:00 +00:00
George Bosilca
d247436bea Make the opal_atomic happy by using a signed int instead of an unsigned one.
This commit was SVN r8759.
2006-01-19 19:54:51 +00:00
Tim Woodall
0c57e2d091 correct typo
This commit was SVN r8593.
2005-12-22 14:28:13 +00:00
Jeff Squyres
efe84971ce Correct copyrights and some typos
This commit was SVN r8588.
2005-12-22 05:37:28 +00:00
Tim Woodall
1f9a559245 missing include
This commit was SVN r8579.
2005-12-21 14:26:56 +00:00
Tim Woodall
8c1027d974 first cut at ack/retrans protocol
This commit was SVN r8570.
2005-12-20 21:42:58 +00:00
George Bosilca
c6eb429a9a Windows work:
 - remove windows socket initialization (it's already in the TCP component)
 - protect all used header files
 - remove the unused ones.

This commit was SVN r8434.
2005-12-10 21:38:48 +00:00
Tim Woodall
b06335abe2 start of a pml for data reliability
This commit was SVN r8236.
2005-11-22 17:24:47 +00:00