1
1
Граф коммитов

127 Коммитов

Автор SHA1 Сообщение Дата
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Rolf vandeVaart
42168575fd Fix for the special case where np=2 and the sendbuf is set to MPI_IN_PLACE.
In that case, sendcount and sendtype are not valid and we need to use
recvcount and recvtype.

This commit fixes trac:943.  Reviewed by Jelena Pjesivac-Grbovic.

This commit was SVN r14022.

The following Trac tickets were found above:
  Ticket 943 --> https://svn.open-mpi.org/trac/ompi/ticket/943
2007-03-13 19:01:20 +00:00
Jelena Pjesivac-Grbovic
9780a000ba Cleanup of generic reduce function and possible (low probability) bug fix.
- fixing line lengths and some of the comments
- possible bug fix (but I do not think we exposed it in any tests so far)
  temporary buffers were allocated as multiples of extent instead of 
  true_extent + (count -1) * extent.
Everything is still passing Intel tests over tcp and btl mx up to 64 nodes.

This commit was SVN r13956.
2007-03-08 00:54:52 +00:00
Jelena Pjesivac-Grbovic
57cbafafd5 Clean up of generic broadcast function: removing unecessary statements and improving comments.
This commit was SVN r13955.
2007-03-07 21:59:53 +00:00
Jelena Pjesivac-Grbovic
0c07654c30 Updating reduce_scatter decision function based on MX results up to 64 nodes and both 1ppn and 2ppn
configurations.

This commit was SVN r13945.
2007-03-07 00:38:33 +00:00
Jelena Pjesivac-Grbovic
e5ed167a6e Adding tuned version of reduce_scatter implementation.
Currently 3 algorithms are available:
- non-overlapping, reduce + scatterv, (works for non-commutative operations)
- recursive halving algorithm (copied from basic module)
- ring algorithm  (similar to allreduce ring, for large messages)

This commit was SVN r13929.
2007-03-05 20:40:39 +00:00
Li-Ta Lo
196e2a86bb addes binomial tree based scatter, passed IBM and intel tests
This commit was SVN r13906.
2007-03-02 23:19:02 +00:00
Li-Ta Lo
11c94cbe76 eliminated the use of MPI_Get_count
This commit was SVN r13904.
2007-03-02 22:57:50 +00:00
Li-Ta Lo
3765e19d15 added ASCII graph for the topologies
This commit was SVN r13892.
2007-03-02 17:17:14 +00:00
Li-Ta Lo
bd75f2f162 change ALLGATHER to GATHER
This commit was SVN r13891.
2007-03-02 17:02:29 +00:00
Li-Ta Lo
c5d8c221b0 added binomial tree based Gather alogrithm, passed IBM and Intel tests
This commit was SVN r13835.
2007-02-28 01:11:01 +00:00
Jelena Pjesivac-Grbovic
627533fe4a Adding segmented ring algorithm for Allreduce for commutative operations.
Algorithm allows user to specify the segment size to be used for computation/communication overlap.
The additional memory requirement for the algorithm is 2 x segment size.
It performed well for (really) large message sizes over MX and it passed intel Allreduce_c and Allreduce_loc_c tests.

This commit was SVN r13832.
2007-02-27 20:32:30 +00:00
George Bosilca
bec20422ee Remove the warnings about printf data-type mismatch.
This commit was SVN r13804.
2007-02-26 22:20:35 +00:00
Bill D'Amico
db1c2a58c4 Removed cruft - unused variables causing warnings during OMPI build.
This commit was SVN r13772.
2007-02-23 18:55:41 +00:00
Li-Ta Lo
049921a5ec the temporary buffer is not needed for the MPI_IN_PLACE cases if the underlying Gather is implemented correctly
This commit was SVN r13740.
2007-02-21 20:39:56 +00:00
Jelena Pjesivac-Grbovic
36156f39c2 Modification to allreduce ring algorithm:
- the block sizes are computed in more uniformn way.
  The first k blocks may be 1 element larger than the remaining blocks.
The algorithm passed Intel Allreduce_c and Allreduce_loc_c tests, and 
IMB-3.2 Allreduce, over TCP and both btl and mtl MX (up to 128 processes).
The algorithm still only supports commutative operations.

This commit was SVN r13738.
2007-02-21 19:30:08 +00:00
Jelena Pjesivac-Grbovic
b608887466 Adding variant of linear alltoall algorithm where the number of
outstanding requests can be limited using mca parameters.
The implementation passed Intel, IMB-3.2, and mpi_test_suite tests over
TCP and MX up to 128 processes (64 nodes), on both 32-bit and 64-bit machines.
It is not activated by default, but it should be useful for really large
communicator sizes.

This commit was SVN r13720.
2007-02-20 04:25:00 +00:00
Jelena Pjesivac-Grbovic
d2d02642ca Removing compilation warnings about the output format.
This commit was SVN r13693.
2007-02-16 23:32:47 +00:00
Jelena Pjesivac-Grbovic
e532b928af Adding segmented binary reduce algorithm which works with non-commutative operations.
Implementation passed intel: MPI_Reduce_c , MPI_Reduce_loc_c, and MPI_Reduce_user_c tests
over TCP, BTL MX, and MTL MX, as well as, mpi_test_suite Reduce tests (up to 64 nodes).

The algorithm is still not activated by decision function (will be in the near future).

This commit was SVN r13657.
2007-02-14 22:38:38 +00:00
Jelena Pjesivac-Grbovic
b52dc9e427 Modifying fixed decision function for reduce to utilize linear algorithm only for really small communicator sizes.
This commit was SVN r13597.
2007-02-10 00:31:10 +00:00
Jelena Pjesivac-Grbovic
6efca498ec Fixes trac:692 in trunk: receive buffer in MPI_Reduce operation is no longer overwritten on non-root nodes.
This commit was SVN r13538.

The following Trac tickets were found above:
  Ticket 692 --> https://svn.open-mpi.org/trac/ompi/ticket/692
2007-02-07 18:57:03 +00:00
Jeff Squyres
c91fcd7fbd Fix a bunch of minor typos submitted by Bernhard Fischer.
This commit was SVN r13505.
2007-02-06 12:00:30 +00:00
Jelena Pjesivac-Grbovic
e193d625bc Bugfix for ring allreduce algorithm.
The step used to iterate through buffer was function of true_extent instead of extent.

This may or may not solve ticket #689 because I am still getting failures over btl mx, 
but I cannot reproduce failures over mtl mx nor tcp.

This commit was SVN r13459.
2007-02-02 02:44:16 +00:00
Jelena Pjesivac-Grbovic
33dcb4f810 Minor change to linear alltoall algorithm:
- post isends in reverse order of posting irecvs.
if the messages arrive approximately in order, this should 
minimize the time spent in matching the requests.

I did not see any performance difference over MX up to 64 nodes, but 
the change makes sense and may have some impact when we have (many) 
more nodes.

This commit was SVN r13337.
2007-01-26 21:59:31 +00:00
George Bosilca
6f720f0d26 Add all required explicit conversions in order to be able
to build on Windows.

This commit was SVN r13264.
2007-01-24 00:48:16 +00:00
Jelena Pjesivac-Grbovic
5cbcf42dc3 Removing yet another unsed variable (missed it in previous submit).
This commit was SVN r13259.
2007-01-23 21:30:57 +00:00
Jelena Pjesivac-Grbovic
afbd032ff9 Removing compiler warnings about comparison of unsigned values to signed ones, and
unused variables.

This commit was SVN r13258.
2007-01-23 21:10:07 +00:00
Jelena Pjesivac-Grbovic
568477ade8 Adding new Allreduce algorithms, updating allreduce decision function, and cleaning up util.
- Allreduce algorithms:
  - Recursive doubling is used for small messages (up to 10KB) and can be used for 
    both commutative and non-commutative operations.  
	 Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and
	 Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with 
	 MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes.
  - Ring algorithms performs well for larger messages but cannot be used for 
    non-commutative operations.  It passed the same tests as recursive doubling, except
	 some of the non-commutative tests in Intel benchmarks Allreduce_loc_c and Allreduce_user_c
	 (which was expected).
- MPI_Allreduce with new decision function passed all of the tests mentioned above.
- Cleaning up coll_tuned_util.  Moving isendrecv to static inline just like sendrecv. 

This commit was SVN r13252.
2007-01-23 01:19:11 +00:00
George Bosilca
242292673a sendrecv is a static inline.
This commit was SVN r13237.
2007-01-22 05:50:23 +00:00
Sven Stork
862dcb1a34 - fix compiler warning in ia64
This commit was SVN r13212.
2007-01-19 14:48:47 +00:00
Jelena Pjesivac-Grbovic
85192c01b0 Modifying util functionality:
- removing static qualification on ompi_coll_tuned_sendrecv 
- adding ompi_coll_tuned_isendrecv function which posts isend and irecv requests
These changes are separate from but necessary for new algorithms I am working on.

This commit was SVN r13161.
2007-01-17 21:29:13 +00:00
Jelena Pjesivac-Grbovic
d2921a9d42 Cleanup of Barrier implementation:
- utilizing coll_tuned_util functions
- setting line length to 80.

This implementation uses standard send messages (instead of synchronous ones).
The change improved our performance over MX multiple number of times, however,
there exists a small potential that last message to be sent can be delayed 
(until next mpi call, which means potentially infinitely).

If this shows to be a problem, I will modify the algorithms to use synchronous
send as last operation (which will incur performance penalty again).

This commit was SVN r13071.
2007-01-10 22:49:43 +00:00
Jelena Pjesivac-Grbovic
ccc3ee0b6b Minor changes to allgather implementation with some clean-up of util code.
- in allgather algorithms I replaces irecv-isend-waitall sequence with 
  call to ompi_coll_tuned_sendrecv
- most of the functions in util code and allgather decision function conform to 80 character line width.
- 

This commit was SVN r13069.
2007-01-10 21:56:59 +00:00
Brian Barrett
a34e67d743 Remove unneeded PARAM_INIT_FILE variable in configure.params files used by
components that use configure.m4 for configuration or are always built. 
The macro has not been needed since moving to configure types other than
configure.stub

Fixes trac:590

This commit was SVN r13031.

The following Trac tickets were found above:
  Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
2007-01-08 03:44:22 +00:00
Jelena Pjesivac-Grbovic
eae3df4904 Updated broadcast decision function based on MX results up to 64 nodes.
(The previous decision function did not consider binomial algorithm (since we did not have it at the time)).

This commit was SVN r13007.
2007-01-06 00:37:40 +00:00
Jelena Pjesivac-Grbovic
3494e1bb05 - Updated decision function for Alltoall collective.
Fixes "jump" for intermediate sizes message on 24+ number of nodes
    (at least on Grig cluster).

This commit was SVN r12920.
2006-12-22 19:59:17 +00:00
George Bosilca
b1725e02d4 No more warnings plus some code reordering.
This commit was SVN r12919.
2006-12-21 22:42:15 +00:00
Jelena Pjesivac-Grbovic
f1aec23507 Adding tuned allgather implementation.
It contains four algorithms: 
Bruck (ciel(logP) steps), Recursive Doubling (log(P) for power-of-2 processes), Ring (P-1 steps),
and Neighbor Exchange (P/2 steps for even number of processes).

All algorithms passed occ, IMB-2.3, and intel verification tests from ompi-tests/ for up to 56 processes.
The fixed decision function is based on results collected over MX on the Grig cluster at 
the University of Tennessee at Knoxville.  
I have also added (and commented out) copy of MPICH2 decision function for allgather
(from their IJHPCA 2005 paper).

This commit was SVN r12910.
2006-12-21 18:40:02 +00:00
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
George Bosilca
ec410644ce Implement the send receive as 2 non blocking operations. That will help us
avoiding too many calls to opal_progress.

This commit was SVN r12553.
2006-11-10 23:06:19 +00:00
George Bosilca
c2c6a1b37e Correctly compute the number of elements in a segment.
For broadcast send the correct size for all intermediary nodes.

This commit was SVN r12552.
2006-11-10 23:04:50 +00:00
George Bosilca
7102147b9f Correctly detect when the specified algorithm is out of range. In
this case we reset it to zero.

This commit was SVN r12551.
2006-11-10 21:47:07 +00:00
George Bosilca
af68171253 Use the macro to compute the number of elements in a segment in both
bcast and reduce and update the default values for the variables
as required by the comment in the coll_tuned.h file.

This commit was SVN r12546.
2006-11-10 20:04:08 +00:00
George Bosilca
476b922074 Updates & upgrades:
- consistent arguments checking (not allowing to select an algorithm which
     is not available)
 - consistent way of computing the segcount (number of datatypes by segment).
 - small cleanups.
 - more informative debugging messages.

This commit was SVN r12545.
2006-11-10 19:54:09 +00:00
George Bosilca
77ef979457 New architecture for broadcast. A generic broadcast working on a tree
description. Most of the bcast algorithms can be completed using this
generic function once we create the tree structure. Add all kind of
trees.

There are 2 versions of the generic bcast function. One using overlapping
between receives (for intermediary nodes) and then blocking sends to all
childs and another where all sends are non blocking. I still have to
figure out which one give the smallest overhead.

This commit was SVN r12530.
2006-11-10 05:53:50 +00:00
George Bosilca
1d80f685b5 Remove one compiler warning.
This commit was SVN r12520.
2006-11-09 20:08:43 +00:00
George Bosilca
73eec4bfef Show the MCA parameter coll_base_verbose only if Open MPI is compiled in
debug mode. Otherwise there is no debug anyway ...

This commit was SVN r12516.
2006-11-09 19:02:32 +00:00
George Bosilca
a82ce427e4 Update the number of reduce algorithms available.
This commit was SVN r12503.
2006-11-08 22:20:34 +00:00
George Bosilca
0914892044 Small cleanups, some explicit casts.
This commit was SVN r12494.
2006-11-08 16:54:03 +00:00
George Bosilca
8529238d93 Add 2 more algorithms to the dynamic list.
This commit was SVN r12415.
2006-11-02 19:19:08 +00:00