
24 Commits

Author SHA1 Message Date
Edgar Gabriel
149ecb8d7d 1. debug the four new algorithms
2. fix a bug in the initial communicator creation of llcomm
3. fix a bug which showed up as a result of fixing issue number 2: we now
   have to check whether llcomm has really been created before freeing the
   corresponding llcomm in hierarch_destruct.

This commit was SVN r19361.
2008-08-18 21:54:35 +00:00
Edgar Gabriel
7cbc4a4077 adding four different algorithms for a hierarchical bcast which try to
generate an overlap between the different layers. Why four versions? Because
right now there is always a trade-off between using non-blocking operations
on a layer with a trivial, linear algorithm and using the more sophisticated
algorithms in a blocking manner.

- bcast_intra_seg: uses the bcast of lcomm and llcomm, similarly
  to the original algorithm in hierarch. However, it can segment
  the message, such that we might get an overlap between the two
  layers. This overlap is based on the assumption that a process
  might be done early with a bcast and can start the next one.
- bcast_intra_seg1: replaces the llcomm->bcast by isend/irecvs
  to increase the overlap, but keeps the lcomm->bcast
- bcast_intra_seg2: replaces the lcomm->bcast by isend/irecvs
  to increase the overlap, but keeps the llcomm->bcast
- bcast_intra_seg3: replaces both lcomm->bcast and llcomm->bcast
  by isend/irecvs

The code is lightly tested; more testing will follow shortly.

This commit was SVN r19358.
2008-08-18 16:05:44 +00:00
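
The overlap that commit 7cbc4a4077 aims for can be illustrated with a small timing model. The sketch below is a Python simulation, not Open MPI code: `split_segments`, `two_layer_finish_time`, and the per-byte costs are hypothetical names and values chosen for illustration, and the model assumes that each layer handles one segment at a time.

```python
def split_segments(total_bytes, segment_bytes):
    """Return (offset, length) pairs that cover total_bytes in order."""
    if total_bytes < 0 or segment_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and segment_bytes > 0")
    return [(offset, min(segment_bytes, total_bytes - offset))
            for offset in range(0, total_bytes, segment_bytes)]


def two_layer_finish_time(segments, upper_cost, lower_cost):
    """Finish time of a two-stage pipeline over the given segments.

    upper_cost and lower_cost are hypothetical per-byte costs of the
    leader-level broadcast and the node-local broadcast.
    """
    upper_done = 0.0
    lower_done = 0.0
    for _offset, length in segments:
        # The upper layer forwards segments back to back.
        upper_done += length * upper_cost
        # The lower layer needs the segment to have arrived and must
        # have finished the previous segment.
        start = max(upper_done, lower_done)
        lower_done = start + length * lower_cost
    return lower_done


whole = two_layer_finish_time(split_segments(10, 10), 1, 1)
piped = two_layer_finish_time(split_segments(10, 4), 1, 1)
print(whole, piped)
```

With a per-byte cost of 1 on both layers, a 10-byte message sent as a single segment finishes at time 20 in this model, while 4-byte segments finish at time 14, because the lower layer works on one segment while the upper layer forwards the next.
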
Jeff Squyres
0af7ac53f2 Fixes trac:1392, #1400
 * add "register" function to mca_base_component_t
   * converted coll:basic and paffinity:linux and paffinity:solaris to
     use this function
   * we'll convert the rest over time (I'll file a ticket once all
     this is committed)
 * add 32 bytes of "reserved" space to the end of mca_base_component_t
   and mca_base_component_data_2_0_0_t to make future upgrades
   [slightly] easier
   * new mca_base_component_t size: 196 bytes
   * new mca_base_component_data_2_0_0_t size: 36 bytes
 * MCA base version bumped to v2.0
   * '''We now refuse to load components that are not MCA v2.0.x'''
 * all MCA frameworks versions bumped to v2.0
 * be a little more explicit about version numbers in the MCA base
   * add big comment in mca.h about versioning philosophy

This commit was SVN r19073.

The following Trac tickets were found above:
  Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392
2008-07-28 22:40:57 +00:00
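
The two ideas in commit 0af7ac53f2, trailing reserved space and a strict version check, can be sketched as follows. This is an illustrative Python model using `ctypes`, not the actual MCA structures: `ComponentHeader`, `ComponentHeaderNext`, the `flags` field, and `can_load` are hypothetical, and the sizes here do not reproduce the 196 and 36 bytes quoted in the commit message.

```python
import ctypes


class ComponentHeader(ctypes.Structure):
    """Hypothetical component header with trailing reserved space."""
    _fields_ = [
        ("major", ctypes.c_int32),
        ("minor", ctypes.c_int32),
        ("release", ctypes.c_int32),
        ("reserved", ctypes.c_char * 32),  # unused for now
    ]


class ComponentHeaderNext(ctypes.Structure):
    """A later layout that carves a new field out of the reserved bytes."""
    _fields_ = [
        ("major", ctypes.c_int32),
        ("minor", ctypes.c_int32),
        ("release", ctypes.c_int32),
        ("flags", ctypes.c_int32),         # hypothetical new field
        ("reserved", ctypes.c_char * 28),  # 32 - 4 bytes remain
    ]


def can_load(header, base_major=2, base_minor=0):
    """Accept only components whose major.minor matches the base.

    The release number is ignored, mirroring the "v2.0.x" rule.
    """
    return header.major == base_major and header.minor == base_minor


# Both layouts have the same size, so the structure can grow in place.
print(ctypes.sizeof(ComponentHeader), ctypes.sizeof(ComponentHeaderNext))
print(can_load(ComponentHeader(2, 0, 1)), can_load(ComponentHeader(1, 0, 0)))
```

Because the new field is taken from the reserved bytes, the overall size and the offsets of the existing fields stay the same, which is what makes a later upgrade slightly easier.
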
Edgar Gabriel
77057a50a3 - adding the two-level hierarchy detection algorithm
- minor fix in the temporary collectives 
- removing the symmetric parameter, since it didn't really make sense.

This commit was SVN r17359.
2008-02-01 17:11:36 +00:00
George Bosilca
906e8bf1d1 Replace the ompi_pointer_array with opal_pointer_array. As a next step
(sometime after the merge with the ORTE branch), the opal_pointer_array
will become the only pointer_array implementation (the orte_pointer_array
will be removed).

This commit was SVN r17007.
2007-12-21 06:02:00 +00:00
Edgar Gabriel
a2f5cada1a convert the hierarch component to the new structure. More testing required before we remove the .ompi_ignore flag again.
This commit was SVN r15954.
2007-08-23 20:41:29 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
George Bosilca
3b39df8ae1 More protection around what we really want to get exported.
This commit was SVN r11437.
2006-08-27 04:49:02 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
  - move files out of toplevel include/ and etc/, moving them into the
    sub-projects
  - rather than including config headers with <project>/include,
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h,
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Edgar Gabriel
2ec5fa5d24 - The component will remove itself from the list of potential collective
modules if its priority is zero (the default value). The reasons for that are:
  + if there is no other module with a priority > 0, the hierarchical
    collective module has a problem anyway, since it has to rely on the coll
    modules of the subcommunicators. On the other hand, if its priority is
    zero, it won't be chosen anyway, and we can simply save the
    allreduce/allgather and comm_split operations which might occur during
    hierarchy detection.
  + to improve the startup times until we have the modex thing which we
    discussed with Jeff and Tim in Knoxville in place

- adding an mca parameter indicating a symmetric configuration. This can
  speed up startup times, since each process can infer the data of the other
  processes from its own data -> no need for the allreduce operations. By
  default this parameter is set to "no".

This commit was SVN r7932.
2005-10-30 16:01:13 +00:00
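
The early exit described in commit 2ec5fa5d24 can be sketched in a few lines. This is a simplified Python model, not the component's real query function: `comm_query`, the `detect_hierarchy` callback, and the returned dictionary are hypothetical stand-ins.

```python
def comm_query(priority, detect_hierarchy):
    """Return a module description, or None to withdraw from selection.

    detect_hierarchy is a hypothetical callback standing in for the
    allreduce/allgather and comm_split work done during detection.
    """
    if priority == 0:
        # The component would not be chosen anyway, so skip the
        # expensive hierarchy detection entirely.
        return None
    return {"priority": priority, "levels": detect_hierarchy()}


calls = []


def fake_detection():
    """Record that detection ran and report a two-level hierarchy."""
    calls.append("detect")
    return 2


print(comm_query(0, fake_detection), calls)
print(comm_query(10, fake_detection), calls)
```

In this model the detection callback never runs when the priority is zero, which corresponds to the saved collective operations the commit message mentions.
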
Edgar Gabriel
00c04ab56a moving the hierarch collective component to the new parameter
registration interface.

This commit was SVN r7867.
2005-10-25 18:34:47 +00:00
Edgar Gabriel
818b4af554 - reverting the logic in the hierarchy detection stuff. This can reduce the
number of collective operations and simplifies the logic significantly.
- introducing a special case if size of comm == 1, thus avoiding collective
 operations as well (i.e. no need for hierarchies)
- fix for an asymmetric case. Still to be tested.

This commit was SVN r7799.
2005-10-18 18:17:50 +00:00
Edgar Gabriel
7e45f64065 reduce has now been tested quite extensively for all (predefined) operations
and for all root nodes and passed all tests.
First cut of barrier (which from my perspective does not make sense from a
performance point of view) and of allreduce (which might make sense).

This commit was SVN r7774.
2005-10-15 22:24:44 +00:00
Edgar Gabriel
3fab9c628c switching the root and creating (if necessary) the new local leader sub-communicators seems to work as well. Thoroughly tested with bcast, not yet that exhaustively tested for the reduction.
This commit was SVN r7773.
2005-10-15 21:13:44 +00:00
Edgar Gabriel
ba163c611c checkpoint before moving to a real cluster. Most of the recoding should be
done. This version also doesn't break ompi (at least if it's not chosen :-) ).
New features compared to the version from last Thursday (where bcast and
reduce seemed to work in most scenarios):
- clearer internal infrastructure
- ability to handle all root processes with a (hopefully) minimal number of
local leader communicators. 

This commit was SVN r7769.
2005-10-15 17:04:01 +00:00
Edgar Gabriel
84c070fc0f get rid of the different modes of storing the colorarray for now. Might be
reintroduced later as an optimization.

This commit was SVN r7762.
2005-10-14 18:11:21 +00:00
Edgar Gabriel
6d14440972 checkpoint for moving again to another machine. major rewrite to clean
up internal interfaces in progress.

This commit was SVN r7761.
2005-10-14 17:41:44 +00:00
Edgar Gabriel
770aeaf97b modifications towards adding new local-leader communicators.
This commit was SVN r7760.
2005-10-14 12:18:29 +00:00
Edgar Gabriel
48f2563b4c checkpoint. Moving to another machine.
This commit was SVN r7757.
2005-10-13 20:04:26 +00:00
Edgar Gabriel
25518b63c5 first version of coll_hierarch which does not crash the rest of the
library as long as it's not selected :-)

This commit was SVN r7707.
2005-10-11 22:05:24 +00:00
Edgar Gabriel
083d0b9630 Checkpoint: most of the coding should be done for the basic
infrastructure.

This commit was SVN r7696.
2005-10-11 19:45:21 +00:00
Edgar Gabriel
b42d4ac780 Checkpoint:
- update the hierarch stuff to use btl's instead of ptl's
- start the new logic regarding how to handle local leader communicators

This commit was SVN r7691.
2005-10-11 17:29:59 +00:00
Jeff Squyres
4ab17f019b Rename src -> ompi
This commit was SVN r6269.
2005-07-02 13:43:57 +00:00