
413 Commits

Author SHA1 Message Date
Brian Barrett
a34e67d743 Remove unneeded PARAM_INIT_FILE variable in configure.params files used by
components that use configure.m4 for configuration or are always built. 
The macro has not been needed since moving to configure types other than
configure.stub

Fixes trac:590

This commit was SVN r13031.

The following Trac tickets were found above:
  Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
2007-01-08 03:44:22 +00:00
Jelena Pjesivac-Grbovic
eae3df4904 Updated broadcast decision function based on MX results up to 64 nodes.
The previous decision function did not consider the binomial algorithm, since we did not have it at the time.

This commit was SVN r13007.
2007-01-06 00:37:40 +00:00
Brian Barrett
936fdd2ae1 Remove some code that accidentally came in with r12974. Refs trac:587
This commit was SVN r12991.

The following SVN revision numbers were found above:
  r12974 --> open-mpi/ompi@27cea44a9c

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-04 20:17:07 +00:00
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
  * Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data.

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
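
The ompi_ptr_t problem described in 27cea44a9c can be illustrated with a small sketch. The type and function names below (sketch_ptr_t, sketch_ptr_set, sketch_ptr_get) are hypothetical and are not the macros from the commit. Instead of selecting the correct 4 bytes for pval, this sketch takes a simpler route and always stores through the 64-bit lval, which zeroes the unused bytes on a 32 bit machine and does not depend on endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ompi_ptr_t: a pointer overlaid on a 64-bit
 * integer so the value has the same size on 32- and 64-bit hosts. */
typedef union {
    uint64_t lval;
    void *pval;
} sketch_ptr_t;

/* Store a pointer by writing the full 64-bit lval. On a 32-bit host the
 * upper 4 bytes become zero instead of keeping stale data. */
static void sketch_ptr_set(sketch_ptr_t *p, void *ptr)
{
    assert(p != NULL);
    p->lval = (uint64_t)(uintptr_t)ptr;
}

/* Read the pointer back through lval, independent of byte order. */
static void *sketch_ptr_get(const sketch_ptr_t *p)
{
    assert(p != NULL);
    return (void *)(uintptr_t)p->lval;
}
```

Callers never touch pval directly in this sketch, which is the same goal the commit states for its macros: no casts scattered through the code.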
Jelena Pjesivac-Grbovic
3494e1bb05 - Updated decision function for Alltoall collective.
Fixes the "jump" for intermediate message sizes on 24 or more nodes
    (at least on the Grig cluster).

This commit was SVN r12920.
2006-12-22 19:59:17 +00:00
George Bosilca
b1725e02d4 No more warnings plus some code reordering.
This commit was SVN r12919.
2006-12-21 22:42:15 +00:00
Jelena Pjesivac-Grbovic
f1aec23507 Adding tuned allgather implementation.
It contains four algorithms:
Bruck (ceil(log(P)) steps), Recursive Doubling (log(P) steps for a power-of-2 number of processes), Ring (P-1 steps),
and Neighbor Exchange (P/2 steps for an even number of processes).

All algorithms passed occ, IMB-2.3, and intel verification tests from ompi-tests/ for up to 56 processes.
The fixed decision function is based on results collected over MX on the Grig cluster at 
the University of Tennessee at Knoxville.  
I have also added (and commented out) a copy of the MPICH2 decision function for allgather
(from their IJHPCA 2005 paper).

This commit was SVN r12910.
2006-12-21 18:40:02 +00:00
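
The step counts quoted in f1aec23507 can be written down as a few helper functions. This is only a sketch of the arithmetic, assuming base-2 logarithms; the function names are hypothetical and the return value -1 is an illustrative convention for "algorithm not applicable to this process count", not something taken from the tuned component.

```c
#include <assert.h>

/* Smallest k such that 2^k >= p. */
static int ceil_log2(int p)
{
    int steps = 0;
    int reach = 1;
    assert(p > 0);
    while (reach < p) {
        reach <<= 1;
        steps++;
    }
    return steps;
}

/* Bruck: ceil(log2(P)) steps for any P. */
static int bruck_steps(int p)
{
    return ceil_log2(p);
}

/* Recursive doubling: log2(P) steps, only for a power-of-2 P. */
static int recursive_doubling_steps(int p)
{
    assert(p > 0);
    if ((p & (p - 1)) != 0) {
        return -1; /* not a power of two */
    }
    return ceil_log2(p);
}

/* Ring: P-1 steps for any P. */
static int ring_steps(int p)
{
    assert(p > 0);
    return p - 1;
}

/* Neighbor exchange: P/2 steps, only for an even P. */
static int neighbor_exchange_steps(int p)
{
    assert(p > 0);
    if (p % 2 != 0) {
        return -1; /* odd process count */
    }
    return p / 2;
}
```

For the 56 processes mentioned in the commit, this gives 6 steps for Bruck, 55 for Ring, and 28 for Neighbor Exchange, while Recursive Doubling does not apply directly.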
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
Ralph Castain
6d6cebb4a7 Bring over the update to terminate orteds that are generated by a dynamic spawn such as comm_spawn. This introduces the concept of a job "family" - i.e., jobs that have a parent/child relationship. Comm_spawn'ed jobs have a parent (the one that spawned them). We track that relationship throughout the lineage - i.e., if a comm_spawned job in turn calls comm_spawn, then it has a parent (the one that spawned it) and a "root" job (the original job that started things).
Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it.

I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn).

This commit was SVN r12597.
2006-11-14 19:34:59 +00:00
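
The job "family" idea from 6d6cebb4a7 can be sketched with a simple parent table. Everything here is hypothetical: the array layout, the NO_PARENT marker, and the function names are illustrations only and do not reflect the ORTE name service API. The include_descendants flag plays the role the commit assigns to the ORTE_NS_INCLUDE_DESCENDANTS attribute.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PARENT (-1)

/* Walk up the parent links until the original (root) job is reached. */
static int job_root(const int *parent, int job)
{
    assert(parent != NULL && job >= 0);
    while (parent[job] != NO_PARENT) {
        job = parent[job];
    }
    return job;
}

/* Return 1 if `job` was spawned, directly or indirectly, by `ancestor`. */
static int job_descends_from(const int *parent, int job, int ancestor)
{
    assert(parent != NULL && job >= 0);
    while (parent[job] != NO_PARENT) {
        job = parent[job];
        if (job == ancestor) {
            return 1;
        }
    }
    return 0;
}

/* Collect the jobs an operation such as terminate_job would act on:
 * the job itself, plus all of its descendants when requested.
 * `out` must have room for njobs entries. Returns the number written. */
static int collect_targets(const int *parent, int njobs, int job,
                           int include_descendants, int *out)
{
    int n = 0;
    for (int j = 0; j < njobs; j++) {
        if (j == job ||
            (include_descendants && job_descends_from(parent, j, job))) {
            out[n++] = j;
        }
    }
    return n;
}
```

With parent = {NO_PARENT, 0, 1, 0}, job 2 has parent 1 and root 0, and targeting job 1 with descendants included selects jobs 1 and 2.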
George Bosilca
ec410644ce Implement the send receive as 2 non blocking operations. That will help us
avoid too many calls to opal_progress.

This commit was SVN r12553.
2006-11-10 23:06:19 +00:00
George Bosilca
c2c6a1b37e Correctly compute the number of elements in a segment.
For broadcast send the correct size for all intermediary nodes.

This commit was SVN r12552.
2006-11-10 23:04:50 +00:00
George Bosilca
7102147b9f Correctly detect when the specified algorithm is out of range. In
this case we reset it to zero.

This commit was SVN r12551.
2006-11-10 21:47:07 +00:00
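
The range check described in 7102147b9f amounts to something like the following sketch. The function name and the assumption that valid algorithm ids run from 0 to num_algorithms inclusive are hypothetical; the commit only states that an out-of-range value is reset to zero.

```c
#include <assert.h>

/* Return the requested algorithm id if it is in range, otherwise 0.
 * Assumption: valid ids are 0..num_algorithms inclusive. */
static int checked_algorithm(int requested, int num_algorithms)
{
    assert(num_algorithms >= 0);
    if (requested < 0 || requested > num_algorithms) {
        return 0; /* out of range: reset to zero */
    }
    return requested;
}
```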
George Bosilca
af68171253 Use the macro to compute the number of elements in a segment in both
bcast and reduce and update the default values for the variables
as required by the comment in the coll_tuned.h file.

This commit was SVN r12546.
2006-11-10 20:04:08 +00:00
George Bosilca
476b922074 Updates & upgrades:
- consistent arguments checking (not allowing to select an algorithm which
     is not available)
 - consistent way of computing the segcount (number of datatypes by segment).
 - small cleanups.
 - more informative debugging messages.

This commit was SVN r12545.
2006-11-10 19:54:09 +00:00
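
A minimal sketch of a segcount computation of the kind 476b922074 refers to (number of datatype elements per segment) might look as follows. The function names and the exact edge-case rules are assumptions: a segment size of 0 is treated as "no segmentation", and a segment smaller than one element is rounded up to a single element. The real macro in coll_tuned.h may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Number of datatype elements that fit in one segment.
 * segsize:  requested segment size in bytes (0 means no segmentation)
 * typesize: size of one datatype element in bytes
 * count:    total number of elements in the message */
static size_t compute_segcount(size_t segsize, size_t typesize, size_t count)
{
    size_t segcount;
    assert(typesize > 0);
    if (segsize == 0 || count == 0) {
        return count; /* send everything as one segment */
    }
    segcount = segsize / typesize;
    if (segcount == 0) {
        segcount = 1; /* segment smaller than one element */
    }
    if (segcount > count) {
        segcount = count; /* segment larger than the message */
    }
    return segcount;
}

/* Number of segments needed to cover `count` elements. */
static size_t compute_num_segments(size_t segcount, size_t count)
{
    if (segcount == 0) {
        return 0;
    }
    return (count + segcount - 1) / segcount;
}
```

For example, a 1024-byte segment of 8-byte elements holds 128 elements, so a 1000-element message is split into 8 segments.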
George Bosilca
77ef979457 New architecture for broadcast. A generic broadcast working on a tree
description. Most of the bcast algorithms can be completed using this
generic function once we create the tree structure. Add all kinds of
trees.

There are 2 versions of the generic bcast function. One overlaps the
receives (for intermediary nodes) with blocking sends to all
children, and another where all sends are non blocking. I still have to
figure out which one gives the smallest overhead.

This commit was SVN r12530.
2006-11-10 05:53:50 +00:00
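
To make the "tree description" in 77ef979457 concrete, here is a sketch of one possible tree: a binomial tree in which each rank can compute its own parent and children. The function names and this particular numbering convention (the parent clears the lowest set bit of the rank relative to the root) are assumptions for illustration, not the tree builders used by the tuned component. A generic broadcast would receive from the parent and then forward to each child.

```c
#include <assert.h>

/* Parent of `rank` in a binomial tree rooted at `root`, or -1 for the root. */
static int binomial_parent(int rank, int root, int size)
{
    int vrank;
    assert(size > 0 && rank >= 0 && rank < size && root >= 0 && root < size);
    vrank = (rank - root + size) % size; /* rank relative to the root */
    if (vrank == 0) {
        return -1;
    }
    /* Clearing the lowest set bit gives the parent's relative rank. */
    return ((vrank & (vrank - 1)) + root) % size;
}

/* Write the children of `rank` into `children` (room for at least
 * ceil(log2(size)) entries) and return how many there are. */
static int binomial_children(int rank, int root, int size, int *children)
{
    int vrank;
    int n = 0;
    assert(size > 0 && rank >= 0 && rank < size && root >= 0 && root < size);
    vrank = (rank - root + size) % size;
    for (int mask = 1; mask < size; mask <<= 1) {
        if (vrank & mask) {
            break; /* higher bits belong to this rank's ancestors */
        }
        if (vrank + mask < size) {
            children[n++] = (vrank + mask + root) % size;
        }
    }
    return n;
}
```

With 8 processes and root 0, rank 0 forwards to ranks 1, 2 and 4, rank 4 forwards to 5 and 6, and rank 6 forwards to 7.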
George Bosilca
1d80f685b5 Remove one compiler warning.
This commit was SVN r12520.
2006-11-09 20:08:43 +00:00
George Bosilca
73eec4bfef Show the MCA parameter coll_base_verbose only if Open MPI is compiled in
debug mode. Otherwise there is no debug anyway ...

This commit was SVN r12516.
2006-11-09 19:02:32 +00:00
George Bosilca
a82ce427e4 Update the number of reduce algorithms available.
This commit was SVN r12503.
2006-11-08 22:20:34 +00:00
George Bosilca
0914892044 Small cleanups, some explicit casts.
This commit was SVN r12494.
2006-11-08 16:54:03 +00:00
George Bosilca
74d3946342 Remove the call to set_args. This is only required for the MPI level,
because there we have to be able to return to the user the description
of the data.

This commit was SVN r12493.
2006-11-08 16:52:48 +00:00
Jeff Squyres
427c20af0d Use a new algorithm for allgatherv. The old algorithm essentially did
N gatherv's:

  for (i = 0 ... size)
    MPI_Gatherv(..., root = i, ...)

The new algorithm simply does (effectively):

  MPI_Gatherv(..., root = 0, ...)
  MPI_Bcast(..., root = 0, ...)

This commit was SVN r12469.
2006-11-07 18:07:55 +00:00
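
The new allgatherv scheme from 427c20af0d (one gatherv to root 0, then one bcast from root 0) can be simulated in a single process without MPI. This is only an illustration of the data movement: the function name and the flat buffer layout are hypothetical, and memcpy stands in for the actual MPI_Gatherv and MPI_Bcast calls.

```c
#include <assert.h>
#include <string.h>

/* Simulate allgatherv over `nranks` ranks inside one process.
 * sendbufs: nranks rows of `stride` ints; row r holds counts[r] valid ints.
 * recvbufs: nranks rows of `total` ints; row r is rank r's result.
 * displs[r] is where rank r's data lands in every receive buffer. */
static void sim_allgatherv(int nranks, int stride, const int *sendbufs,
                           const int *counts, const int *displs,
                           int total, int *recvbufs)
{
    int *root = recvbufs; /* rank 0's receive buffer */
    assert(nranks > 0 && stride > 0 && total >= 0);

    /* Step 1: a single gatherv with root = 0. */
    for (int r = 0; r < nranks; r++) {
        assert(counts[r] <= stride && displs[r] + counts[r] <= total);
        memcpy(root + displs[r], sendbufs + (size_t)r * stride,
               (size_t)counts[r] * sizeof(int));
    }

    /* Step 2: a single bcast of the assembled buffer from root = 0. */
    for (int r = 1; r < nranks; r++) {
        memcpy(recvbufs + (size_t)r * total, root,
               (size_t)total * sizeof(int));
    }
}
```

Compared with the old loop of N gatherv calls, each rank's contribution is collected once at the root, and the assembled buffer is then distributed in one broadcast.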
George Bosilca
8529238d93 Add 2 more algorithms to the dynamic list.
This commit was SVN r12415.
2006-11-02 19:19:08 +00:00
George Bosilca
393657ee26 Initialize the sndbuf in all cases. Do not forget to initialize the
tree used in each of the broadcast functions.

This commit was SVN r12332.
2006-10-27 00:13:33 +00:00
George Bosilca
126a68dc9a Big datatype commit. Remove all unused features of the datatype engine. As the memory
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There are fewer arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affects all components using datatypes. I tested most of them, but I might have
missed some ... If that's the case please let me know (don't shoot the pianist!!).

This commit was SVN r12331.
2006-10-26 23:11:26 +00:00
George Bosilca
ba3c247f2a Big collective commit. I tested it lightly, but I think it should be quite stable. Anyway,
the default decision functions (for broadcast, reduce and barrier) are based on a
high performance network (not TCP). It should give good performance (really good) for
any network having the following characteristics: small latency (5 microseconds) and good
bandwidth (more than 1Gb/s).
+ Cleanup of the reduce algorithms, plus 2 new algorithms (binary and binomial). Now most
  of the reduce algorithms use a generic tree based function for completing the reduce.
+ Added macros for computing the trees (they are used for bcast and reduce right now).
+ Allow the usage of all 5 topologies.
+ Jelena's implementation of a binary tree that can be used for non commutative operations.
  Right now only the tree building function is there, it will get activated soon.
+ Some other minor cleanups.

This commit was SVN r12326.
2006-10-26 22:53:05 +00:00
George Bosilca
99631ccf66 Cleanups.
This commit was SVN r12272.
2006-10-23 22:29:17 +00:00
George Bosilca
d7d3f9e486 Tuned collectives work only for at least 2 processes. We have the self module
for the other cases.

This commit was SVN r12271.
2006-10-23 22:28:56 +00:00
George Bosilca
b848a5ad06 Remove all ompi_coll_chain_t references.
This commit was SVN r12269.
2006-10-23 21:47:50 +00:00
George Bosilca
39cd8d3d17 One to rule them all. We only need one kind of topology information: a tree. How we
build it is what makes the difference.

This commit was SVN r12268.
2006-10-23 21:46:30 +00:00
George Bosilca
9cf3040e5f Allocate enough memory for the reduce operation when MPI_IN_PLACE is specified.
This commit was SVN r12260.
2006-10-23 17:51:36 +00:00
George Bosilca
6b697ad3dd If the operation is not commutative then force the basic reduce algorithm. The others
cannot be used for non commutative operations ... yet ...

This commit was SVN r12241.
2006-10-20 22:11:44 +00:00
George Bosilca
a7b6078b73 No more segfault. Still some wrong data around ...
This commit was SVN r12238.
2006-10-20 20:17:34 +00:00
George Bosilca
02759cf515 Update the reduce chain collective.
This commit was SVN r12237.
2006-10-20 19:47:52 +00:00
George Bosilca
06563b5dec Last set of explicit conversions. We are now close to the zero warnings on
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
  a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
  when it's a size_t on most of the UNIXes.
As these only happen on Windows, I think we're set for now :)

This commit was SVN r12215.
2006-10-20 03:57:44 +00:00
George Bosilca
527bb7a197 Remove a double ;
This commit was SVN r12213.
2006-10-20 03:28:51 +00:00
George Bosilca
caefd6d0ee Do not leak memory. Allocate the intermediary buffer only when we really need it
(not on leaves) and release it the same way.

This commit was SVN r12200.
2006-10-19 22:20:33 +00:00
George Bosilca
26b33ec2d7 If there is just one node, we don't need a decision function, just do the copy
and return.

This commit was SVN r12199.
2006-10-19 22:19:36 +00:00
George Bosilca
3eb2f90ceb For the recursive doubling correctly compute the closest power of 2 number of
nodes.

This commit was SVN r12191.
2006-10-19 17:14:57 +00:00
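
For the recursive doubling fix in 3eb2f90ceb, the "closest power of 2" is assumed here to mean the largest power of two that does not exceed the number of nodes, which is the usual choice when the extra nodes are folded into a power-of-2 core. The commit itself does not spell this out, and the function name below is hypothetical.

```c
#include <assert.h>

/* Largest power of two that is <= n. Comparing against n / 2 avoids
 * shifting past INT_MAX for large n. */
static int pow2_not_above(int n)
{
    int p = 1;
    assert(n > 0);
    while (p <= n / 2) {
        p <<= 1;
    }
    return p;
}
```

With 56 nodes this returns 32, leaving 24 extra nodes to be handled outside the power-of-2 exchange pattern.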
George Bosilca
041fcb8d18 Update the barrier decision function.
This commit was SVN r12190.
2006-10-19 17:14:01 +00:00
George Bosilca
c9da782804 Keep only one function to get the size of a datatype.
This commit was SVN r12170.
2006-10-18 17:33:01 +00:00
George Bosilca
21ade43b96 Remove an unreachable statement.
This commit was SVN r12166.
2006-10-18 16:43:55 +00:00
George Bosilca
be27ee6fa0 Correct the bcast problem where we always did a bcast with segsize of 0.
Activate the reduce decision function.
Other small updates (mostly TAB to spaces).

This commit was SVN r12161.
2006-10-18 02:00:46 +00:00
George Bosilca
8852c00c36 Looks like a big commit, but in fact it addresses only one issue: the way we work with the
size and displacement of data-types. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.

This commit was SVN r12146.
2006-10-17 20:20:58 +00:00
Jeff Squyres
a8e9fa09da Fix some compiler warnings introduced in r11619. I checked with
George: ompi_ddt_type_size() returns a signed int only because of the
MPI spec; it will never return a negative value.  So casting the
return value out of it to a (uint32_t) is safe, and makes the
comparisons be between two unsigned values. 

This commit was SVN r11639.

The following SVN revision numbers were found above:
  r11619 --> open-mpi/ompi@8667648a1b
2006-09-13 16:42:31 +00:00
Graham Fagg
8667648a1b Simple fix (for ticket 363). We push segment size to type size. In other algorithms we switch off segmenting altogether. But really the DDT can probably handle partial types, so we could keep the segsize constant (for all but reduce ops) and treat it just as byte arrays.
TODOs: macroize it, as we do it 10 different ways; add MCA params to control handling (push up size, no change, switch off segmenting).

This commit was SVN r11619.
2006-09-12 00:01:27 +00:00
Jeff Squyres
fb4d7ab268 * Fix svn:ignore
* Remove files that should not be in SVN

This commit was SVN r11565.
2006-09-08 10:35:45 +00:00
George Bosilca
3b39df8ae1 More protection around what we really want to get exported.
This commit was SVN r11437.
2006-08-27 04:49:02 +00:00
Sami Ayyorgun
aa8cd63418 changed some barrier variables for shared-memory to volatile
This commit was SVN r11403.
2006-08-24 16:53:10 +00:00
Torsten Hoefler
6b22641669 added LibNBC (http://www.unixer.de/NBC) as collv1 (blocking) component.
I know it does not make much sense but one can play around with the
performance. Numbers are available at http://www.unixer.de/research/nbcoll/perf/.
This is the first step towards collv2. Next step includes the addition
of non-blocking functions to the MPI-Layer and the collv1 interface.

It implements all MPI-1 collective algorithms in a non-blocking manner.
However, the collv1 interface does not allow non-blocking collectives so
that all collectives are used blocking by the ompi-glue layer.

I wanted to add LibNBC as a separate subdirectory, but I could not
convince the buildsystem (and did not have the time). So the component looks
pretty messy. It would be great if somebody could explain to me how to move
all nbc*{c,h}, and {hb,dict}*{c,h} to a separate subdirectory.

It's .ompi_ignored because I did not test it exhaustively yet.

This commit was SVN r11401.
2006-08-24 16:47:18 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00