Aurelien Bouteiller
e11237aadb
Introduction of the "progress" sender_based method to replace the slow isend-self method.
This commit was SVN r17998.
2008-03-27 21:19:45 +00:00
Aurelien Bouteiller
93db01871e
This is part of the previous patch.
This commit was SVN r17997.
2008-03-27 21:06:14 +00:00
Aurelien Bouteiller
f8bf6f2c6a
Code cleanup.
sender_based.h is now split in two files, to solve cyclic .h files inclusion.
Most macros are now inline functions.
Variable names have been changed in various places.
Various other small things...
This commit was SVN r17996.
2008-03-27 21:05:44 +00:00
George Bosilca
be4b153f0d
Another patch for thread safety in the TCP BTL (thanks to Pierre).
This commit was SVN r17993.
2008-03-27 18:36:08 +00:00
Gleb Natapov
cf40674369
Decide if sends should be throttled at the receiver and pass this to the sender
in an ACK message. The decision can't be made reliably at the sender.
This commit was SVN r17987.
2008-03-27 08:56:43 +00:00
Rich Graham
e2ad9c4be2
adjust to change in orte_process_info.
This commit was SVN r17986.
2008-03-27 01:25:28 +00:00
Rich Graham
441fb9fb9e
checkpoint.
This commit was SVN r17985.
2008-03-27 01:16:32 +00:00
Ralph Castain
90107f3c14
Fix an issue with comm_spawn over who sent/recv first in the modex. The modex assumes that the first name on the list is the "root" that will serve as the allgather collector/distributor. The dpm was putting that entity last, which forced us to pre-inform the parent procs of the child proc's contact info since the parent was trying to send to the child.
Clarify the setting of send_first in the mpi bindings (trivial, I know, but helpful)
Remove the extra xcast of child contact info to the parent job.
This commit was SVN r17952.
2008-03-25 14:57:34 +00:00
Ralph Castain
cca449e379
Move an OMPI RML tag to the OMPI layer
This commit was SVN r17950.
2008-03-25 13:30:48 +00:00
Jeff Squyres
5320c91ab3
Oops -- fix the constructor to also use opal_object_t instead of
opal_list_item_t.
This commit was SVN r17945.
2008-03-25 11:59:50 +00:00
Galen Shipman
0116041133
BTL shouldn't own the passive side's descriptor in the PML get protocol. The BTL
doesn't know when to free it on the passive side.
This commit was SVN r17943.
2008-03-25 01:43:41 +00:00
Jeff Squyres
ebfdd133f5
AFAICT, we never put endpoints on a list.
This commit was SVN r17940.
2008-03-24 18:32:55 +00:00
Ralph Castain
dc7f45dafd
Remove the obsolete and largely unused orte_system_info structure. The only fields that were used in that struct were nodeid and nodename - these have been transferred to the orte_process_info structure.
Only one place used the user name field - session_dir, when formulating the name of the top-level directory. Accordingly, the code for getting the user's id has been moved to the session_dir code.
This commit was SVN r17926.
2008-03-23 23:10:15 +00:00
Rich Graham
a7c836a2b0
fix location of the restrict keyword.
Make the tag in the fan-in/fan-out algorithm be fragment based.
This commit was SVN r17903.
2008-03-21 01:40:36 +00:00
Rich Graham
2c66d396b7
take care of some bit-rot with the fanin-fanout method.
This commit was SVN r17902.
2008-03-21 01:08:49 +00:00
Rich Graham
b9520e61dc
get the sm optimized allreduce working for all but user defined
operations. Added to the reduction operations a set of reduction
functions that take 2 input buffers and one output buffer to avoid
some extra memory copies. These can't be used with user defined
operations. The intel c collective suite passes both original, and
new (new, not the user defined operations).
This commit was SVN r17901.
2008-03-20 23:51:16 +00:00
Galen Shipman
dcac824f59
Fix problem in releasing fragments during GET_END event (didn't check that
portals btl has ownership and therefore didn't free the frag as it should) this
causes leakage and hangs in MPI_Finalize.
Also added a bit more debugging.
This commit was SVN r17900.
2008-03-20 22:46:32 +00:00
George Bosilca
efa89bfa3f
Revert r17857. The context should be set in one case ... when we call prepare_{src|dst}
without calling a get or put. So, just keep it here until a better solution is
found.
This commit was SVN r17872.
The following SVN revision numbers were found above:
r17857 --> open-mpi/ompi@d460ccfbf9
2008-03-18 19:01:27 +00:00
George Bosilca
8943ae0b4e
Cleanup plus some typos.
This commit was SVN r17858.
2008-03-18 03:03:33 +00:00
George Bosilca
d460ccfbf9
No need to check for NULL there. The bml_btl is set correctly
on the upper level.
This commit was SVN r17857.
2008-03-18 03:02:31 +00:00
George Bosilca
39353ebb44
Cleanup.
This commit was SVN r17855.
2008-03-18 02:56:50 +00:00
George Bosilca
76deec135e
The .h file is not used anymore (it contained the descriptor cache). Update the
Makefile.am file as well.
This commit was SVN r17854.
2008-03-18 02:50:24 +00:00
George Bosilca
1d04ec4ded
Correct the connection logic for TCP. Now we have not only a cleaner
connection, but a more thread safe one. Thanks to Pierre for his
help on this.
This commit was SVN r17853.
2008-03-18 02:42:16 +00:00
Jeff Squyres
61290c0e51
Remove a useless file.
This commit was SVN r17852.
2008-03-18 01:50:47 +00:00
Ralph Castain
be7d0a8a4d
Fix a problem introduced by the conversion of orte_pointer_array to opal_pointer_array. We used to derive the app context's index from the returned index of the orte_pointer_array_add function - this parameter was lost in the transition to opal_pointer_array_add. As a result, we no longer knew the index of the app_context, so everything is launched with app0.
This commit was SVN r17851.
2008-03-17 23:48:10 +00:00
Edgar Gabriel
570bbea5e0
fixing the allgather problem reported on the mailing list. The problem was
that at one location we had the local-size instead of the remote size as a
receive argument.
This commit was SVN r17849.
2008-03-17 19:42:18 +00:00
Gleb Natapov
9b6db25182
Fix compilation warning.
This commit was SVN r17839.
2008-03-17 13:37:57 +00:00
Pavel Shamis
54ad8d7446
The issue was reported/fixed by Jon Mason one month ago but the fix was not committed. So I'm committing it now.
This commit was SVN r17835.
2008-03-17 11:13:06 +00:00
Brad Penoff
be13b86fc5
Clarifying and fixing SCTP btl_sctp_if_11 parameter
This commit was SVN r17834.
2008-03-17 09:18:31 +00:00
Gleb Natapov
f488b94899
More SM BTL initialization cleanups.
This commit was SVN r17833.
2008-03-16 10:01:56 +00:00
Rich Graham
27182afb67
get the timers in correctly.
This commit was SVN r17832.
2008-03-16 03:25:16 +00:00
Rich Graham
afcd1016fd
move temp buffer allocation out of the iteration loop - i.e. always use the
same temp loop. The algorithm is rather synchronous already...
This commit was SVN r17831.
2008-03-16 03:20:46 +00:00
Rich Graham
a1766b29f6
fix some barrier addressing errors.
This commit was SVN r17830.
2008-03-15 22:46:19 +00:00
Rich Graham
0453e7d2f4
bug in management memory allocation - too much memory allocated.
This commit was SVN r17829.
2008-03-15 18:12:20 +00:00
Rich Graham
3c2f1eb8bf
reduce the number of temp buffers used.
This commit was SVN r17828.
2008-03-15 17:23:04 +00:00
Rich Graham
0f9d642d51
temp buffer pointers are computed when they are set up. A bit more
efficient, but more important, it is much easier to play around with
memory layout now.
This commit was SVN r17827.
2008-03-15 16:36:35 +00:00
Rich Graham
e3e336b5ab
check point
This commit was SVN r17826.
2008-03-15 13:31:21 +00:00
Jeff Squyres
6c77c995c2
Add missing dependencies in the static build case.
This commit was SVN r17825.
2008-03-15 12:11:36 +00:00
George Bosilca
5e229fe688
Thanks Ma for the patch. Correct the multi-rail support and
rename some fields to something more clear.
This commit was SVN r17824.
2008-03-14 19:17:28 +00:00
George Bosilca
ecebd5ae77
Update the Elan BTL to take into account multiple networks, and correctly deal
with the node position in the network.
This commit was SVN r17822.
2008-03-14 17:32:35 +00:00
Gleb Natapov
772772b944
Remove unneeded include.
This commit was SVN r17813.
2008-03-12 10:01:20 +00:00
Gleb Natapov
90c70e37b9
Clean up SM btl startup code. Remove no longer needed code leftovers from two
BTL times. Remove old and no longer correct comment.
This commit was SVN r17805.
2008-03-11 14:39:10 +00:00
Gleb Natapov
3a9652ffc4
Endpoint array may not exist if in add_proc() we failed to find suitable
btl for communication with a proc. Don't segfault in this case.
This commit was SVN r17804.
2008-03-11 08:13:37 +00:00
Gleb Natapov
ffa09c44fd
Pass correct pointer to mpool_base function.
This commit was SVN r17795.
2008-03-09 13:22:12 +00:00
Gleb Natapov
b0b21c68b4
Remove trailing spaces from SM BTL.
This commit was SVN r17794.
2008-03-09 13:17:13 +00:00
Rich Graham
ebcf928c24
add some diagnostics.
This commit was SVN r17789.
2008-03-07 22:27:41 +00:00
Rich Graham
9131461511
move some test code to another machine.
This commit was SVN r17785.
2008-03-07 19:18:02 +00:00
Rich Graham
c230b65543
fix a couple of bugs. Recursive doubling seems to be working.
This commit was SVN r17777.
2008-03-07 02:51:38 +00:00
Rich Graham
70157166f9
checkpoint - compiles, now need to debug.
This commit was SVN r17775.
2008-03-07 00:39:59 +00:00
Ralph Castain
b110a247be
Fix comm_spawn (maybe).
Comm_spawn was sticking during spawn_multiple because of a problem in the dpm - the modex there is asking processes to talk to each other in an allgather_list operation, but the procs don't have the required contact info to do so. The solution here was to ensure that all parent procs have full contact info for procs in the child job.
Admittedly, this isn't the long-term answer. We would like to have the contact info given to only the parent procs that were involved in the comm_spawn. There is a way to do that, but this will suffice to keep things working until that can be implemented and tested.
This commit was SVN r17772.
2008-03-06 21:56:00 +00:00