1
1
Граф коммитов

44 Коммитов

Автор SHA1 Сообщение Дата
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Brian Barrett
a34e67d743 Remove unneeded PARAM_INIT_FILE variable in configure.params files used by
components that use configure.m4 for configuration or are always built. 
The macro has not been needed since moving to configure types other than
configure.stub

Fixes trac:590

This commit was SVN r13031.

The following Trac tickets were found above:
  Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
2007-01-08 03:44:22 +00:00
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
George Bosilca
527bb7a197 Remove a double ;
This commit was SVN r12213.
2006-10-20 03:28:51 +00:00
George Bosilca
8852c00c36 Look like a big commit but in fact it address only one issue. The way we're working with
size and diplacement of data-type. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.

This commit was SVN r12146.
2006-10-17 20:20:58 +00:00
George Bosilca
3b39df8ae1 More protection around what we really want to get exported.
This commit was SVN r11437.
2006-08-27 04:49:02 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Edgar Gabriel
2ec5fa5d24 - The component will remove itself from the list of potential collective
modules, if its priority is zero (the default value). Reason for that is
  + if there is no other module with a priority > 0, the hierarchical
    collective module has a problem anyway, since it has to rely on the coll
    modules of the subcommunicators. On the other hand, if its priority is
    zero, it won't be chosen anyway, and we can simply save the
    allreduce/allgather and comm_split operations which might occur during
    hierarchy detection.
  + to improve the startup times until we have the modex thing which we
    discussed with Jeff and Tim in Knoxville in place

- adding an mca parameter indicating a symmetric configuration. This can 
  speed up startup times, since each process can conclude from its data onto
  the data of the other processes -> no need for the allreduce operations. Per
  default  this parameter is set to "no".

This commit was SVN r7932.
2005-10-30 16:01:13 +00:00
Jeff Squyres
d47ce065e9 Minor Makefile.am fix for static builds.
This commit was SVN r7882.
2005-10-26 15:57:58 +00:00
Edgar Gabriel
ba3bf6592f fixing some warnings. No idea yet why the static builds fail...
This commit was SVN r7879.
2005-10-26 12:56:56 +00:00
Edgar Gabriel
d009d8de57 opening the hierarchical collective component to the public. I am at this
stage fairly confident that 
- it works in most scenarious (with symmetric hierarchies, with asymmetric
  hierarchies, wihout hierarchies - it just removes itself)
- it does not create too many problems (I am not aware of any at least)
- it does not slow down startup anymore dramatically (thanks to the fixes of
  Brian, Jeff, Tim and a significant reduction in the number of collective
  operations in the comm_query)
Any feedback is highly welcome.

This commit was SVN r7868.
2005-10-25 18:38:43 +00:00
Edgar Gabriel
00c04ab56a moving the hierarch collective component to the new parameter
registration interface.

This commit was SVN r7867.
2005-10-25 18:34:47 +00:00
Edgar Gabriel
3a7efaf4d9 fix for reduce and allreduce for an unsymmetric case
This commit was SVN r7802.
2005-10-18 19:20:48 +00:00
Edgar Gabriel
818b4af554 - reverting the logic in the hierarchy detection stuff. This can reduce the
number of collective operations and simplifies the logic significantly.
- introducing a special case if size of comm == 1, avoiding thus collective
 operations as well ( i.e. no need for hierarchies)
- fix for an unsymmetric case. Still to be tested.

This commit was SVN r7799.
2005-10-18 18:17:50 +00:00
Edgar Gabriel
7e45f64065 reduce has now been tested quite extensively for all (predefined) operations
and for all root nodes and passed all tests.
First cut on barrier (which from my perspective does not make sense from the
performance point of view) and on allreduce (which might make sense),

This commit was SVN r7774.
2005-10-15 22:24:44 +00:00
Edgar Gabriel
3fab9c628c switching the root and creating (if necessary) the new local leader sub-communicators seems to work as well. Thoroughly tested with bcast, not yet that exhaustivly tested for the reduction.
This commit was SVN r7773.
2005-10-15 21:13:44 +00:00
Edgar Gabriel
7d34770456 further bugfixes. The hierarchy detection works now as far as I can see (even in unsymmetric sitations). Bcast and reduce work as well. Still to test: the code which generates new local leader communicators, in case the root of the operation is not yet part of the lleader comm.
This commit was SVN r7772.
2005-10-15 19:36:54 +00:00
Edgar Gabriel
63554d245f further bugfixes
This commit was SVN r7771.
2005-10-15 18:44:57 +00:00
Edgar Gabriel
92c7b77cbc minor bug fixes
This commit was SVN r7770.
2005-10-15 18:32:40 +00:00
Edgar Gabriel
ba163c611c checkpoint before moving to a real cluster. Most of the recoding should be
done. This version also doesn't break ompi (at least if its not chosen :-) ).
New features compared to the version from last Thursday (where bcast and
reduce seemed to work in most scenarios):
- clearer internal infrastructure
- ability to handle all root processes with a (hopefully) minimal number of
local leader communicators. 

This commit was SVN r7769.
2005-10-15 17:04:01 +00:00
Edgar Gabriel
84c070fc0f get rid of the different modes how to store the colorarray for now. Might be
reintroduced later as an optimization.

This commit was SVN r7762.
2005-10-14 18:11:21 +00:00
Edgar Gabriel
6d14440972 checkpoint for moving again to another machine. major rewrite to clean
up internal interfaces in progress.

This commit was SVN r7761.
2005-10-14 17:41:44 +00:00
Edgar Gabriel
770aeaf97b modifications towards adding new local-leader communicators.
This commit was SVN r7760.
2005-10-14 12:18:29 +00:00
Edgar Gabriel
48f2563b4c checkpoint. Moving to another machine.
This commit was SVN r7757.
2005-10-13 20:04:26 +00:00
Edgar Gabriel
4b05359b16 minor fixes when freeing the component
This commit was SVN r7756.
2005-10-13 18:22:16 +00:00
Edgar Gabriel
0a5a346bbb first cut on the reduce operation.
This commit was SVN r7755.
2005-10-13 17:58:13 +00:00
Edgar Gabriel
30af775d40 further fixes. The first hierarchical MPI_Bcast works! Its just ~ 100 times slower then basic at the moment :-)
This commit was SVN r7754.
2005-10-13 17:34:42 +00:00
Edgar Gabriel
460b5cb840 further corrections to the hierarchy detection algorithms. It seems to work now as far as my tests show...
This commit was SVN r7753.
2005-10-13 16:21:13 +00:00
Edgar Gabriel
f5d16419b2 fix in the logic regarding protocol detection.
This commit was SVN r7749.
2005-10-13 15:07:35 +00:00
Edgar Gabriel
3e5ad3e681 Updates
This commit was SVN r7738.
2005-10-12 20:56:29 +00:00
Edgar Gabriel
25518b63c5 first version of coll_hierarch which does not crash the rest of the
library as long as its not selected :-)

This commit was SVN r7707.
2005-10-11 22:05:24 +00:00
Edgar Gabriel
0675c22dab updating with Jeff's help to the recent autogen/configure system
This commit was SVN r7705.
2005-10-11 21:50:16 +00:00
Edgar Gabriel
7b07dbc163 another round of fixes. Unfortunatly, I also have to provide a trivial
version of reduce and gather to make all this work....

This commit was SVN r7702.
2005-10-11 21:26:07 +00:00
Edgar Gabriel
c8adc2e65e coding around the collective operations
This commit was SVN r7698.
2005-10-11 20:34:17 +00:00
Edgar Gabriel
083d0b9630 Checkpoint: most of the coding should be done for the basic
infrastructure.

This commit was SVN r7696.
2005-10-11 19:45:21 +00:00
Edgar Gabriel
b42d4ac780 Checkpoint:
- update the hierarch stuff to use btl's instead of ptl's
- start the new logic regarding how to handle local leader communicators

This commit was SVN r7691.
2005-10-11 17:29:59 +00:00
Jeff Squyres
15d0a95202 - Remove extra whitespace from Makefile.am's from when we removed
Makefile.options
- Sample in each of the three projects of how to link againt the
  relevant libraries so that when components are loaded into a parent
  process' space, we don't rely on the libopal/liborte/libmpi symbols
  being in the parent's public symbol namespace -- instead,
  dynamically link to the relevant libraries, allowing the dynamic
  linker to pull those libraries in at run-time, if needed

This commit was SVN r7397.
2005-09-15 20:56:18 +00:00
Jeff Squyres
cf16a521c8 Ensure to get ompi/include/constants.h
This commit was SVN r6845.
2005-08-12 21:42:07 +00:00
Jeff Squyres
888f0c5afd Remove the EXTRA_DIST=VERSION stuff from all the Makefile.am's so that
"make dist" can succeed.  Duh.  :-\

This commit was SVN r6351.
2005-07-05 19:01:47 +00:00
Jeff Squyres
ba99409628 Major simplifications to component versioning:
- After long discussions and ruminations on how we run components in
  LAM/MPI, made the decision that, by default, all components included
  in Open MPI will use the version number of their parent project
  (i.e., OMPI or ORTE).  They are certaint free to use a different
  number, but this simplification makes the common cases easy:
  - components are only released when the parent project is released
  - it is easy (trivial?) to distinguish which version component goes
    with with version of the parent project
- removed all autogen/configure code for templating the version .h
  file in components
- made all ORTE components use ORTE_*_VERSION for version numbers
- made all OMPI components use OMPI_*_VERSION for version numbers
- removed all VERSION files from components
- configure now displays OPAL, ORTE, and OMPI version numbers
- ditto for ompi_info
- right now, faking it -- OPAL and ORTE and OMPI will always have the
  same version number (i.e., they all come from the same top-level
  VERSION file).  But this paves the way for the Great Configure
  Reorganization, where, among other things, each project will have
  its own version number.

So all in all, we went from a boatload of version numbers to
[effectively] three.  That's pretty good.  :-)

This commit was SVN r6344.
2005-07-04 20:12:36 +00:00
Brian Barrett
a13166b500 * rename ompi_output to opal_output
This commit was SVN r6329.
2005-07-03 23:31:27 +00:00
Jeff Squyres
4ab17f019b Rename src -> ompi
This commit was SVN r6269.
2005-07-02 13:43:57 +00:00