1
1

34 Коммитов

Автор SHA1 Сообщение Дата
George Bosilca
cc65814969 And set the message size before the first use too.
This commit was SVN r14159.
2007-03-28 18:01:13 +00:00
George Bosilca
b540545fa7 Set the communicator size before using it.
This commit was SVN r14158.
2007-03-28 17:59:21 +00:00
Jelena Pjesivac-Grbovic
d6402b6898 Adding in-order binary tree algorithm for non-commutative reduce operations.
I tested algorithm with intel and ibm tests and it passed again - so it should work.

This commit was SVN r14068.
2007-03-19 21:03:57 +00:00
Jelena Pjesivac-Grbovic
0c07654c30 Updating reduce_scatter decision function based on MX results up to 64 nodes and both 1ppn and 2ppn
configurations.

This commit was SVN r13945.
2007-03-07 00:38:33 +00:00
Jelena Pjesivac-Grbovic
e5ed167a6e Adding tuned version of reduce_scatter implementation.
Currently 3 algorithms are available:
- non-overlapping, reduce + scatterv, (works for non-commutative operations)
- recursive halving algorithm (copied from basic module)
- ring algorithm  (similar to allreduce ring, for large messages)

This commit was SVN r13929.
2007-03-05 20:40:39 +00:00
Li-Ta Lo
196e2a86bb addes binomial tree based scatter, passed IBM and intel tests
This commit was SVN r13906.
2007-03-02 23:19:02 +00:00
Li-Ta Lo
c5d8c221b0 added binomial tree based Gather alogrithm, passed IBM and Intel tests
This commit was SVN r13835.
2007-02-28 01:11:01 +00:00
Jelena Pjesivac-Grbovic
627533fe4a Adding segmented ring algorithm for Allreduce for commutative operations.
Algorithm allows user to specify the segment size to be used for computation/communication overlap.
The additional memory requirement for the algorithm is 2 x segment size.
It performed well for (really) large message sizes over MX and it passed intel Allreduce_c and Allreduce_loc_c tests.

This commit was SVN r13832.
2007-02-27 20:32:30 +00:00
George Bosilca
bec20422ee Remove the warnings about printf data-type mismatch.
This commit was SVN r13804.
2007-02-26 22:20:35 +00:00
Jelena Pjesivac-Grbovic
d2d02642ca Removing compilation warnings about the output format.
This commit was SVN r13693.
2007-02-16 23:32:47 +00:00
Jelena Pjesivac-Grbovic
b52dc9e427 Modifying fixed decision function for reduce to utilize linear algorithm only for really small communicator sizes.
This commit was SVN r13597.
2007-02-10 00:31:10 +00:00
Jelena Pjesivac-Grbovic
afbd032ff9 Removing compiler warnings about comparison of unsigned values to signed ones, and
unused variables.

This commit was SVN r13258.
2007-01-23 21:10:07 +00:00
Jelena Pjesivac-Grbovic
568477ade8 Adding new Allreduce algorithms, updating allreduce decision function, and cleaning up util.
- Allreduce algorithms:
  - Recursive doubling is used for small messages (up to 10KB) and can be used for 
    both commutative and non-commutative operations.  
	 Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and
	 Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with 
	 MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes.
  - Ring algorithms performs well for larger messages but cannot be used for 
    non-commutative operations.  It passed the same tests as recursive doubling, except
	 some of the non-commutative tests in Intel benchmarks Allreduce_loc_c and Allreduce_user_c
	 (which was expected).
- MPI_Allreduce with new decision function passed all of the tests mentioned above.
- Cleaning up coll_tuned_util.  Moving isendrecv to static inline just like sendrecv. 

This commit was SVN r13252.
2007-01-23 01:19:11 +00:00
Jelena Pjesivac-Grbovic
ccc3ee0b6b Minor changes to allgather implementation with some clean-up of util code.
- in allgather algorithms I replaces irecv-isend-waitall sequence with 
  call to ompi_coll_tuned_sendrecv
- most of the functions in util code and allgather decision function conform to 80 character line width.
- 

This commit was SVN r13069.
2007-01-10 21:56:59 +00:00
Jelena Pjesivac-Grbovic
eae3df4904 Updated broadcast decision function based on MX results up to 64 nodes.
(The previous decision function did not consider binomial algorithm (since we did not have it at the time)).

This commit was SVN r13007.
2007-01-06 00:37:40 +00:00
Jelena Pjesivac-Grbovic
3494e1bb05 - Updated decision function for Alltoall collective.
Fixes "jump" for intermediate sizes message on 24+ number of nodes
    (at least on Grig cluster).

This commit was SVN r12920.
2006-12-22 19:59:17 +00:00
George Bosilca
b1725e02d4 No more warnings plus some code reordering.
This commit was SVN r12919.
2006-12-21 22:42:15 +00:00
Jelena Pjesivac-Grbovic
f1aec23507 Adding tuned allgather implementation.
It contains four algorithms: 
Bruck (ciel(logP) steps), Recursive Doubling (log(P) for power-of-2 processes), Ring (P-1 steps),
and Neighbor Exchange (P/2 steps for even number of processes).

All algorithms passed occ, IMB-2.3, and intel verification tests from ompi-tests/ for up to 56 processes.
The fixed decision function is based on results collected over MX on the Grig cluster at 
the University of Tennessee at Knoxville.  
I have also added (and commented out) copy of MPICH2 decision function for allgather
(from their IJHPCA 2005 paper).

This commit was SVN r12910.
2006-12-21 18:40:02 +00:00
George Bosilca
476b922074 Updates & upgrades:
- consistent arguments checking (not allowing to select an algorithm which
     is not available)
 - consistent way of computing the segcount (number of datatypes by segment).
 - small cleanups.
 - more informative debugging messages.

This commit was SVN r12545.
2006-11-10 19:54:09 +00:00
George Bosilca
ba3c247f2a Big collective commit. I lightly test it, but I think it should be quite stable. Anyway,
the default decision functions (for broadcast, reduce and barrier) are based on a
high performance network (not TCP). It should give good performance (really good) for
any network having the following caracteristics: small latency (5 microseconds) and good
bandwidth (more than 1Gb/s).
+ Cleanup of the reduce algorithms, plus 2 new algorithms (binary and binomial). Now most
  of the reduce algorithms use a generic tree based function for completing the reduce.
+ Added macros for computing the trees (they are used for bcast and reduce right now).
+ Allow the usage of all 5 topologies.
+ Jelena's implementation of a binary tree that can be used for non commutative operations.
  Right now only the tree building function is there, it will get activated soon.
+ Some others minor cleanups.

This commit was SVN r12326.
2006-10-26 22:53:05 +00:00
George Bosilca
d7d3f9e486 Tuned collectives works only for at least 2 processes. We have the self module
for the other cases.

This commit was SVN r12271.
2006-10-23 22:28:56 +00:00
George Bosilca
6b697ad3dd If the operation is not commutative then force the basic reducve algorithm. The others
cannot be used for non commutative operations ... yet ...

This commit was SVN r12241.
2006-10-20 22:11:44 +00:00
George Bosilca
26b33ec2d7 If there is just one node, we don't need a decision function, just do the copy
and return.

This commit was SVN r12199.
2006-10-19 22:19:36 +00:00
George Bosilca
041fcb8d18 Update the barrier decision function.
This commit was SVN r12190.
2006-10-19 17:14:01 +00:00
George Bosilca
c9da782804 Keep only one function to get the size of a datatype.
This commit was SVN r12170.
2006-10-18 17:33:01 +00:00
George Bosilca
be27ee6fa0 Correct the bcast problem where we always did a bcast with segzise of 0.
Activate the reduce decision function.
Others small updates (mostly TAB to spaces).

This commit was SVN r12161.
2006-10-18 02:00:46 +00:00
Graham Fagg
f64cbbe8f2 ops. some decisions used extent rather than size for decision making
yes this means it WAS possible for two nodes to choice two different algorithms
(discovered by Doug Gregor and figured out by George)
Also changed some names like size to comsize so we know which sizes we are using where
This should be updated in al versions

This commit was SVN r10601.
2006-06-30 21:49:04 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Graham Fagg
232bb9534a Start moving stuff out of modules that should be in the component.
This commit was SVN r8874.
2006-02-01 20:50:14 +00:00
Graham Fagg
5f2d82347f a couple of changes to make barrier synchronous.. means last communication to any possible peer must
be locally completing. for now using synchronous calls until the new functionality is available. then will change
the code to use the new PML send flags.

This commit was SVN r8867.
2006-01-31 23:21:46 +00:00
Graham Fagg
25375759c3 arrgh. reduce could for very small message sizes and proc counts call a linear function
this was implemented using a chain (tree followed with pipeline) by setting the chain fanout to a factor of size etc but the chain datastructure was fixed in length and if exceeded the topo create returned a null which isn't helpfull in cid next function of comdup...
Anyway two fixes, first we do have a real linear function so changed the decision function and second altered the
topo chain create to force chain fanouts of less than 1 to 1 and fanouts bigger than max to max.
next check in will change chain to dynamically allocd array (reallocable) but we shouldn't ever use a chain fanout for a linear tree anyway. 
(lession must rerun all tests for all data sizes when changing decision functions)

This commit was SVN r8662.
2006-01-08 02:41:09 +00:00
Jeff Squyres
54c4bd3ce2 Update to have public symbols be consistent; use new prefix rule
(apparently we've been doing this in opal and orte, but not in ompi
yet).  All public symbols begin with "ompi_coll_tuned_" (not
mca_coll_tuned_) except the component struct.  Now this component
passes the illegal symbol report with no hits.

This commit was SVN r8589.
2005-12-22 13:49:33 +00:00
Graham Fagg
877f7bbe6a File based dynamic up and tested...
Lots of misc fixes: printfs->opal_output, handles fanin/out correctly for forced ops
unused vars, correct calculations on meaning of 'msgsize' for decision functions
(varies depending on algorithm), etc

This commit was SVN r8113.
2005-11-11 04:49:29 +00:00
Graham Fagg
dcd3450e06 simplified the building of different rule sets
(also corrected some prototypes missing 'struct')

This commit was SVN r8003.
2005-11-06 22:05:50 +00:00