96 Commits

Author SHA1 Message Date
Ralph Castain
40a2bfa238 WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.

Please note that configure requirements on components HAVE CHANGED. For example, a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.

This commit was SVN r23764.
2010-09-17 23:04:06 +00:00
Shiqing Fan
7a301bc417 Add support for ompi_ext on Windows.
This commit was SVN r23632.
2010-08-23 13:16:30 +00:00
Josh Hursey
e12ca48cd9 A number of C/R enhancements per RFC below:
http://www.open-mpi.org/community/lists/devel/2010/07/8240.php

Documentation:
  http://osl.iu.edu/research/ft/

Major Changes: 
-------------- 
 * Added C/R-enabled Debugging support. 
   Enabled with the --enable-crdebug flag. See the following website for more information: 
   http://osl.iu.edu/research/ft/crdebug/ 
 * Added Stable Storage (SStore) framework for checkpoint storage 
   * 'central' component does a direct to central storage save 
   * 'stage' component stages checkpoints to central storage while the application continues execution. 
     * 'stage' supports offline compression of checkpoints before moving (sstore_stage_compress) 
     * 'stage' supports local caching of checkpoints to improve automatic recovery (sstore_stage_caching) 
 * Added Compression (compress) framework to support 
 * Add two new ErrMgr recovery policies 
   * {{{crmig}}} C/R Process Migration 
   * {{{autor}}} C/R Automatic Recovery 
 * Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}} ErrMgr component 
 * Added CR MPI Ext functions (enable them with {{{--enable-mpi-ext=cr}}} configure option) 
   * {{{OMPI_CR_Checkpoint}}} (Fixes trac:2342) 
   * {{{OMPI_CR_Restart}}} 
   * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules) 
   * {{{OMPI_CR_INC_register_callback}}} (Fixes trac:2192) 
   * {{{OMPI_CR_Quiesce_start}}} 
   * {{{OMPI_CR_Quiesce_checkpoint}}} 
   * {{{OMPI_CR_Quiesce_end}}} 
   * {{{OMPI_CR_self_register_checkpoint_callback}}} 
   * {{{OMPI_CR_self_register_restart_callback}}} 
   * {{{OMPI_CR_self_register_continue_callback}}} 
 * The ErrMgr predicted_fault() interface has been changed to take an opal_list_t of ErrMgr defined types. This will allow us to better support a wider range of fault prediction services in the future. 
 * Add a progress meter to: 
   * FileM rsh (filem_rsh_process_meter) 
   * SnapC full (snapc_full_progress_meter) 
   * SStore stage (sstore_stage_progress_meter) 
 * Added 2 new command line options to ompi-restart 
   * --showme : Display the full command line that would have been exec'ed. 
   * --mpirun_opts : Command line options to pass directly to mpirun. (Fixes trac:2413) 
 * Deprecated some MCA params: 
   * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir 
   * snapc_base_global_snapshot_dir deprecated, use sstore_base_global_snapshot_dir 
   * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared 
   * snapc_base_store_in_place deprecated, replaced with different components of SStore 
   * snapc_base_global_snapshot_ref deprecated, use sstore_base_global_snapshot_ref 
   * snapc_base_establish_global_snapshot_dir deprecated, never well supported 
   * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem 
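
The deprecation list above amounts to a rename table. A minimal sketch of how a site script might translate old parameter names, assuming the values carry over unchanged (the commit does not say so); `migrate_mca_params` is a hypothetical helper, not part of Open MPI:

```python
# Old -> new MCA parameter names, copied from the deprecation list above.
# None marks parameters for which the list names no single replacement.
DEPRECATED_MCA_PARAMS = {
    "crs_base_snapshot_dir": "sstore_stage_local_snapshot_dir",
    "snapc_base_global_snapshot_dir": "sstore_base_global_snapshot_dir",
    "snapc_base_global_shared": "sstore_stage_global_is_shared",
    "snapc_base_store_in_place": None,  # replaced by choice of SStore component
    "snapc_base_global_snapshot_ref": "sstore_base_global_snapshot_ref",
    "snapc_base_establish_global_snapshot_dir": None,  # never well supported
    "snapc_full_skip_filem": "sstore_stage_skip_filem",
}


def migrate_mca_params(params):
    """Return (updated, warnings) for a dict of MCA parameter name -> value.

    Hypothetical helper: assumes a renamed parameter keeps its old value.
    """
    updated, warnings = {}, []
    for name, value in params.items():
        if name not in DEPRECATED_MCA_PARAMS:
            updated[name] = value  # not deprecated, pass through untouched
            continue
        replacement = DEPRECATED_MCA_PARAMS[name]
        if replacement is None:
            warnings.append(f"{name} is deprecated and has no direct replacement")
        else:
            updated[replacement] = value
            warnings.append(f"{name} is deprecated, use {replacement}")
    return updated, warnings
```

Parameters with no listed replacement are dropped with a warning rather than guessed at.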

Minor Changes: 
-------------- 
 * Fixes trac:1924 : {{{ompi-restart}}} now recognizes path prefixed checkpoint handles and does the right thing. 
 * Fixes trac:2097 : {{{ompi-info}}} should now report all available CRS components 
 * Fixes trac:2161 : Manual checkpoint movement. A user can 'mv' a checkpoint directory from the original location to another and still restart from it. 
 * Fixes trac:2208 : Honor various TMPDIR variables instead of forcing {{{/tmp}}} 
 * Move {{{ompi_cr_continue_like_restart}}} to {{{orte_cr_continue_like_restart}}} to be more flexible in where this should be set. 
 * opal_crs_base_metadata_write* functions have been moved to SStore to support a wider range of metadata handling functionality. 
 * Cleanup the CRS framework and components to work with the SStore framework. 
 * Cleanup the SnapC framework and components to work with the SStore framework (cleans up these code paths considerably). 
 * Add 'quiesce' hook to CRCP for a future enhancement. 
 * We now require a BLCR version that supports {{{cr_request_file()}}} or {{{cr_request_checkpoint()}}} in order to make the code more maintainable. Note that {{{cr_request_file}}} has been deprecated since 0.7.0, so we prefer to use {{{cr_request_checkpoint()}}}. 
 * Add optional application level INC callbacks (registered through the CR MPI Ext interface). 
 * Increase the {{{opal_cr_thread_sleep_wait}}} parameter to 1000 microseconds to make the C/R thread less aggressive. 
 * {{{opal-restart}}} now looks for cache directories before falling back on stable storage when asked. 
 * {{{opal-restart}}} also supports local decompression before restarting 
 * {{{orte-checkpoint}}} now uses the SStore framework to work with the metadata 
 * {{{orte-restart}}} now uses the SStore framework to work with the metadata 
 * Remove the {{{orte-restart}}} preload option. This was removed since the user only needs to select the 'stage' component in order to support this functionality. 
 * Since the '-am' parameter is saved in the metadata, {{{ompi-restart}}} no longer hard codes {{{-am ft-enable-cr}}}. 
 * Fix {{{hnp}}} ErrMgr so that if a previous component in the stack has 'fixed' the problem, then it should be skipped. 
 * Make sure to decrement the number of 'num_local_procs' in the orted when one goes away. 
 * odls now checks the SStore framework to see if it needs to load any checkpoint files before launching (to support 'stage'). This separates the SStore logic from the --preload-[binary|files] options. 
 * Add unique IDs to the named pipes established between the orted and the app in SnapC. This is to better support migration and automatic recovery activities. 
 * Improve the checks for 'already checkpointing' error path. 
 * Add a recovery output timer, to show how long it takes to restart a job 
 * Do a better job of cleaning up the old session directory on restart. 
 * Add a local module to the autor and crmig ErrMgr components. These small modules prevent the 'orted' component from attempting a local recovery (Which does not work for MPI apps at the moment) 
 * Add a fix for bounding the checkpointable region between MPI_Init and MPI_Finalize. 
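
The trac:2208 item in the list above (honor TMPDIR variables instead of forcing {{{/tmp}}}) can be illustrated with a short sketch. The variable names and their order (`TMPDIR`, `TMP`, `TEMP`) are assumptions for illustration; the commit message does not list which variables are consulted, and `pick_tmp_dir` is a hypothetical name:

```python
import os


def pick_tmp_dir(environ=None, candidates=("TMPDIR", "TMP", "TEMP")):
    """Return the first non-empty temp-dir variable, falling back to /tmp.

    The candidate names and their order are assumed, not taken from the commit.
    """
    env = os.environ if environ is None else environ
    for var in candidates:
        value = env.get(var)
        if value:  # skip unset and empty variables
            return value
    return "/tmp"
```

Passing the environment as an argument keeps the lookup easy to exercise without touching the real process environment.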

This commit was SVN r23587.

The following Trac tickets were found above:
  Ticket 1924 --> https://svn.open-mpi.org/trac/ompi/ticket/1924
  Ticket 2097 --> https://svn.open-mpi.org/trac/ompi/ticket/2097
  Ticket 2161 --> https://svn.open-mpi.org/trac/ompi/ticket/2161
  Ticket 2192 --> https://svn.open-mpi.org/trac/ompi/ticket/2192
  Ticket 2208 --> https://svn.open-mpi.org/trac/ompi/ticket/2208
  Ticket 2342 --> https://svn.open-mpi.org/trac/ompi/ticket/2342
  Ticket 2413 --> https://svn.open-mpi.org/trac/ompi/ticket/2413
2010-08-10 20:51:11 +00:00
Shiqing Fan
8de5654bf9 Add new files into the tarball.
This commit was SVN r23377.
2010-07-12 16:21:46 +00:00
Ralph Castain
1102f0c171 Replace old platform file with newer ones
This commit was SVN r23322.
2010-06-29 15:00:10 +00:00
Shiqing Fan
681df0089b Add a few new files into the tarball.
This commit was SVN r23297.
2010-06-22 16:45:56 +00:00
Ralph Castain
a1bc589f23 Include new cisco platform files in tarball
This commit was SVN r23209.
2010-05-25 22:39:10 +00:00
Ralph Castain
12fae43969 Correct the makefile
This commit was SVN r23103.
2010-05-05 01:46:11 +00:00
Brad Benton
7fe33ec90b add ibm platform files.
This commit was SVN r22933.
2010-04-06 21:47:12 +00:00
Ralph Castain
1a100812a9 Add some new cisco platform files
This commit was SVN r22898.
2010-03-28 15:40:51 +00:00
Shiqing Fan
c29a668e37 Remove flex.exe and its license file from the tarball.
cmr:v1.4
cmr:v1.5

This commit was SVN r22469.
2010-01-22 16:40:13 +00:00
Shiqing Fan
51025c10c6 Update the Makefile.am so that the tarball is built correctly.
This commit was SVN r22456.
2010-01-19 18:10:24 +00:00
Ralph Castain
3905d38bc5 No need for a separate tarball script - add -no-ompi support to the main tarball script and get rid of the orte_tarball ones
This commit was SVN r22385.
2010-01-09 01:06:52 +00:00
Ralph Castain
ea8b2dc752 Update the Cisco platform files. Create a make_tarball variant for creating orte-level tarballs
This commit was SVN r22338.
2009-12-23 16:44:57 +00:00
Ralph Castain
170da86ae5 Add a missing file to the tarball, and clean it up a little
This commit was SVN r22337.
2009-12-22 20:38:39 +00:00
Rainer Keller
507e22a7d3 - As promised in
http://www.open-mpi.org/faq/?category=debugging#valgrind_clean
   provide openmpi-valgrind.supp suppression file

This commit was SVN r22164.
2009-10-28 23:33:16 +00:00
Shiqing Fan
48dd7ff7d0 Get rid of the shadow file for mpi.h.in on Windows.
This commit was SVN r22154.
2009-10-28 15:49:01 +00:00
Shiqing Fan
454c5c2e12 Update the Makefile.am for the previous commit.
This commit was SVN r22146.
2009-10-27 18:25:27 +00:00
Rainer Keller
efb67579a0 - Add the ornl_configure_self_contained to the Makefile.am; this should
show up in the nightly tar-ball.
   Generalize the self_contained file from compiler, so that it can be
   called regardless (with intel and cray compiler coming into play...) 

This commit was SVN r22120.
2009-10-22 01:58:13 +00:00
Ralph Castain
214e26b539 Per Jeff (this work was done on a branch of mine, so I will do the commit):
Re-enable "./autogen.sh -no-ompi" again. If you -no-ompi, the entire OMPI
configury is skipped and the entire ompi/ subtree is not built. There's
some simple m4-isms that prune out the relevant parts.

I added ompi/config/, orte/config/, and opal/config/ directories. I moved a
bunch of m4 files from the top-level config/ dir into ompi/config/, and a few
into orte/config/.

Note that all 3 <project>/config directories have a config_files.m4 file. This
file contains the AC_CONFIG_FILES list for that project. The AC_CONFIG_FILES
call cannot be in an AC_DEFUN macro and conditionally called -- if it is
included at all, Autoconf will process it. Hence, these config_files.m4 files
don't AC_DEFUN -- they just have AC_CONFIG_FILES. m4_ifdef() is used to
conditionally include the files or not.

I moved a bunch of obvious OMPI-only m4 files from config/ to ompi/config/,
but I'm sure that there's more that could go. A ticket will be filed with
thoughts on future work in this area.

This commit was SVN r22113.
2009-10-20 23:44:20 +00:00
Ralph Castain
60c4ebab45 Update makefile so it finds new platform file directory
This commit was SVN r22097.
2009-10-14 01:48:30 +00:00
Shiqing Fan
1b6db85988 Complete the support for building on UNC path.
This commit was SVN r21897.
2009-08-27 07:57:26 +00:00
Shiqing Fan
e7cbfda432 Add the new CMake module into the tarball.
This commit was SVN r21807.
2009-08-12 09:57:49 +00:00
Shiqing Fan
03e6a117c5 Forgot to put the new CMake file into the tarball, it caused a MTT failure on Windows last night.
This commit was SVN r21678.
2009-07-15 09:40:19 +00:00
Ralph Castain
e5496fcc8a Add some more platform files and update some others
This commit was SVN r21544.
2009-06-26 20:49:41 +00:00
Shiqing Fan
a006c9ae97 This should also go into the tarball.
This commit was SVN r21236.
2009-05-14 13:34:17 +00:00
Jeff Squyres
5cdd3bb8ee Remove no-longer-present files.
This commit was SVN r21152.
2009-05-05 11:37:18 +00:00
Shiqing Fan
d5f887c79b Add the copyright file for flex into the tarball too.
This commit was SVN r21084.
2009-04-28 10:40:34 +00:00
Ralph Castain
1663e51d9d Adjust LANL platform files. Remove non-cross compile versions for cells now that cross-compiling works
This commit was SVN r21081.
2009-04-27 19:07:25 +00:00
Shiqing Fan
3d4e0472d6 Add windows support files into the tarball, including .windows, CMakeLists.txt files, and CMake modules. Thanks to Jeff for testing it on Linux.
This commit was SVN r21069.
2009-04-24 16:39:33 +00:00
Ralph Castain
d7a8b3038f Ensure new files are included in tarball
This commit was SVN r21052.
2009-04-21 20:08:54 +00:00
Ralph Castain
4f90b678d1 Crud - this didn't get committed due to too many interruptions this afternoon. My apologies for crashing the nightly tarball.
This commit was SVN r21026.
2009-04-16 02:23:33 +00:00
Ralph Castain
4f4af6e8cb Update LANL files again in the endless search through the weeds of the OMPI configure system.
I no longer wonder why people say OMPI is so hard to use. :-/

This commit was SVN r21004.
2009-04-14 20:11:17 +00:00
Ralph Castain
aef296bee8 Cleanup makefile to remove no longer existing entries
This commit was SVN r20850.
2009-03-24 01:38:19 +00:00
Ralph Castain
22bb995439 Add platform files for embedded operations.
Update the slave platform file to specify script wrapper compilers.

This commit was SVN r20761.
2009-03-11 16:05:17 +00:00
Ralph Castain
4588c677e2 Ensure slave platform files are in tarballs
This commit was SVN r20494.
2009-02-09 21:15:43 +00:00
Ralph Castain
56c7bb9484 Argh...add one more layer of redirection
This commit was SVN r19915.
2008-11-04 17:52:50 +00:00
Ralph Castain
49852bdc19 Add LANL platform files to tarballs
This commit was SVN r19914.
2008-11-04 17:02:32 +00:00
Galen Shipman
19c986bb57 include ornl specifics in the dist
This commit was SVN r18301.
2008-04-25 15:30:03 +00:00
Josh Hursey
5f4ca3b0c4 small make fix for new platform file
This commit was SVN r15118.
2007-06-17 13:20:57 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Josh Hursey
0404444dbe * Added 2 new MCA parameters
  - mca_base_param_file_prefix
     (Default: NULL)
     This is the full name of the "-am" mpirun option. Used to specify a ':'
     separated list of AMCA parameter set files.
  - mca_base_param_file_path
     (Default: $SYSCONFDIR/amca-param-sets/:$CWD)
     The path to search for AMCA files with relative paths. A warning will be
     printed if the AMCA file cannot be found.

* Added a new function "mca_base_param_recache_files" that re-reads the file
configurations. This is used internally to help bootstrap the MCA system.

* Added a new orterun/mpirun command line option '-am' that is an alias for the
mca_base_param_file_prefix MCA parameter

* Exposed the opal_path_access function as it is generally useful in other
places in the code.

* New function "opal_cmd_line_make_opt_mca" which will allow you to append a
new command line option with MCA parameter identifiers to set at the same
time. Previously this could only be done at command line declaration time.

* Added a new directory under the $pkgdatadir named "amca-param-sets" where all
the 'shipped with' Open MPI AMCA parameter sets are placed. This is the first
place to search for AMCA sets with relative paths.

* An example.conf AMCA parameter set file is located in
contrib/amca-param-sets/.

* Jeff Squyres contributed an OpenIB AMCA set for benchmarking.
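
The lookup described above (a ':'-separated list of AMCA files, resolved against a ':'-separated search path, with a warning when a file cannot be found) can be sketched as follows. This is an illustration of the described behavior, not the OPAL implementation; `resolve_amca_files` is a hypothetical name and the handling of absolute names is an assumption:

```python
import os


def resolve_amca_files(prefix, search_path):
    """Resolve a ':'-separated list of AMCA files against a ':'-separated path.

    Returns (found, missing). Names with relative paths are tried in each
    search directory in order; absolute names are used as-is (an assumption).
    """
    dirs = [d for d in search_path.split(":") if d]
    found, missing = [], []
    for name in (n for n in prefix.split(":") if n):
        if os.path.isabs(name):
            candidates = [name]
        else:
            candidates = [os.path.join(d, name) for d in dirs]
        hit = next((c for c in candidates if os.path.isfile(c)), None)
        if hit is None:
            missing.append(name)  # caller would print the warning here
        else:
            found.append(hit)
    return found, missing
```

A caller would pass the value of mca_base_param_file_prefix as `prefix` and the expanded value of mca_base_param_file_path as `search_path`, then warn about each entry in `missing`.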

Note: You will need to autogen with this commit as it adds a configure param.
  Sorry :(

This commit was SVN r13867.
2007-03-01 13:39:20 +00:00
Brian Barrett
fd8fe94e6f * add symlink for Cray XT3 to Red Storm since they're the same platform and
all that

This commit was SVN r9899.
2006-05-11 15:23:43 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Brian Barrett
ed56e743b7 * update configure.ac to use the modern version of AC_INIT and
AM_INIT_AUTOMAKE, instead of the deprecated version.
* Work around dumbness in modern AC_INIT that requires the version
  number to be set at autoconf time (instead of at configure time, as
  it was before).  Set the version number, minus the subversion r number,
  at autoconf time.  Override the internal variables to include the r
  number (if needed) at configure time.  Basically, the right thing
  should always happen.  The only place it might not is the version
  reported as part of configure --help will not have an r number.
* Since AM_INIT_AUTOMAKE takes a list of options, no need to specify
  them in all the Makefile.am files.
* Adds support for subdir-objects, meaning that object files are put
  in the directory containing source files, even if the Makefile.am is
  in another directory.  This should start making it feasible to
  reduce the number of Makefile.am files we have in the tree, which
  will greatly reduce the time to run autogen and configure.

This commit was SVN r7211.
2005-09-07 05:54:53 +00:00
Brian Barrett
ba1f742f34 * fix bug in ompi_mca.m4 that caused components that were forced
  not to build to be left out of the ALL_COMPONENTS list and therefore
  not distributed in a tarball
* add some of the contrib/ stuff to the dist tarball (the stuff to
  make binary packages and the "--with-platform" files)

This commit was SVN r6955.
2005-08-21 21:04:52 +00:00