Commit graph

1538 commits

Author SHA1 Message Date
Ralph Castain
40a2bfa238 WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.

Please note that configure requirements on components HAVE CHANGED. For example, a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.

This commit was SVN r23764.
2010-09-17 23:04:06 +00:00
Rolf vandeVaart
09750d0310 Need output.h header file for opal_output() definition.
Otherwise, build will fail when configuring with --enable-picky.

This commit was SVN r23763.
2010-09-17 12:22:17 +00:00
Shiqing Fan
9a47ca1995 Correct the place of including the if.h, and change retain_loopback to opal_if_retain_loopback for windows module too.
This commit was SVN r23756.
2010-09-14 14:03:48 +00:00
Ralph Castain
c74ce1632a Catch a couple of places (one hidden inside an #if 0, other in solaris module) where retain_loopback needs to be opal_if_retain_loopback
This commit was SVN r23755.
2010-09-14 11:37:10 +00:00
Shiqing Fan
95b17c1e82 Add a missing header for the if windows module.
This commit was SVN r23754.
2010-09-14 07:51:38 +00:00
Ralph Castain
e96b5f486f Reorganize the opal interface code in opal/util/if.c per prior emails and telecon discussions. Move the interface discovery code into a framework so that configuration logic can separate it out (instead of the prior #if-#else confusion).
All interface APIs for accessing the info remain unchanged in opal/util/if.c.

This has been tested on Mac, Linux, and NetBSD. Nobody else seemed interested in testing it, so there may be some future problems revealed as people try it on other OSs.

This commit was SVN r23743.
2010-09-13 01:58:51 +00:00
Jeff Squyres
3b14366c85 Fix a copyright statement
This commit was SVN r23741.
2010-09-12 09:55:01 +00:00
Rolf vandeVaart
ef8090ec71 Fix the ia32 atomic add and subtract functions so they
do the right thing.  They now properly return
the value after the update.  This also fixes all warnings
reported by the Sun Studio compiler.  George provided the
new assembly routines.  I added some configure code to make
sure the compilers could handle it.

This fixes trac:2560.

This commit was SVN r23721.

The following Trac tickets were found above:
  Ticket 2560 --> https://svn.open-mpi.org/trac/ompi/ticket/2560
2010-09-08 10:47:15 +00:00
Rolf vandeVaart
14e7bcc383 Create new entries in the wrapper data files so the
administrator can specify compiler flags that get
inserted into the command before the user's flags.
These flags can be specified at configure time.
Reviewed by Jeff Squyres.

This fixes ticket #2474.

This commit was SVN r23709.
2010-09-02 10:47:55 +00:00
Rainer Keller
97511912ec - Fixup several functions, that cannot return
- Add one instance where we do not use a parameter in a function
 - Fix a buglet in commit r23689, where the attribute-for-function ptrs
   was applied.

This commit was SVN r23690.

The following SVN revision numbers were found above:
  r23689 --> open-mpi/ompi@5eb571c458
2010-08-31 12:21:13 +00:00
Rainer Keller
5eb571c458 - As suggested in CMR #2558, attribute-macros should be
tested on function pointers and assigned accordingly,
   instead of using the pre-processor in the header files.

   A functional change is (re-) specifying __opal_attribute_noreturn__
   on orte_errmgr_base_abort(): All modules in the errmgr framework
   either use this function, or define their own abort function,
   which sets __opal_attribute_noreturn__.
   This attribute was taken out with the errmgr overhaul in r22872.

This commit was SVN r23689.

The following SVN revision numbers were found above:
  r22872 --> open-mpi/ompi@e4f2d03d28
2010-08-31 10:28:51 +00:00
Brad Benton
09c4f4d95c Added copyright notices for the files modified in r23669.
This commit was SVN r23687.

The following SVN revision numbers were found above:
  r23669 --> open-mpi/ompi@271cfa8c9a
2010-08-30 17:46:47 +00:00
Jeff Squyres
3eedbee7a4 Fixes trac:2541. Ensure that we keep CPPFLAGS if a non-standard valgrind location was specified. CMR:v1.4.3 CMR:v1.5
This commit was SVN r23680.

The following Trac tickets were found above:
  Ticket 2541 --> https://svn.open-mpi.org/trac/ompi/ticket/2541
2010-08-27 22:45:02 +00:00
Rainer Keller
4abcf5a0d7 - The Sun-compiler 12 update 1 complains about noreturn-attributes
assigned to function-declarations.
   Check this case and mark the currently only case existing in trunk.

   Thanks to Paul Hargrove for bringing this up.

   Let's test the svn commit msg CMR:v1.5

This commit was SVN r23676.
2010-08-27 09:18:30 +00:00
Rainer Keller
044b387d3c - If we don't compile with PGI, then mark the parameter as unused,
otherwise we get swamped with warnings by gcc, everywhere header is
   included.
 - Remove redundant declaration of opal_datatype_safeguard_pointer_debug_breakpoint

   Check whether  CMR:v1.5 works

This commit was SVN r23674.
2010-08-26 15:07:18 +00:00
Nysal Jan
271cfa8c9a Fix the opal_path_nfs test for GPFS. Reported by Paul H. Hargrove.
This commit was SVN r23669.
2010-08-26 10:10:16 +00:00
Jeff Squyres
97fb426325 Per long-ago RFC, now that the odls default module reports errors nicely, remove all paffinity components except for hwloc and test.
This commit was SVN r23666.
2010-08-25 22:34:30 +00:00
Jeff Squyres
a5ce58f098 Define that we return OPAL_ERR_TIMEOUT if the other end of the socket
closes in an opal_fd_read().

This commit was SVN r23650.
2010-08-24 19:07:04 +00:00
Ethan Mallove
f42c2a737f Fixes trac:2532 - "MPI_Put can result in SIGBUS on SPARC"
Reviewed by Rolf V and Brian B

This commit was SVN r23649.

The following Trac tickets were found above:
  Ticket 2532 --> https://svn.open-mpi.org/trac/ompi/ticket/2532
2010-08-24 18:10:43 +00:00
Ralph Castain
51833bfe6c Not -everyone- wants to ignore loopback devices. Give us a choice.
This commit was SVN r23637.
2010-08-24 02:37:05 +00:00
Shiqing Fan
c110edbf44 Use exclude lists for non-ordinary sub directories check.
This commit was SVN r23631.
2010-08-23 09:43:05 +00:00
Rolf vandeVaart
e71827b8ff Undo 4 of the 5 changes introduced by r22638. Leave
one of them in as it may still be needed on Solaris.

This fixes trac:2530.

This commit was SVN r23626.

The following SVN revision numbers were found above:
  r22638 --> open-mpi/ompi@2a4b1227d9

The following Trac tickets were found above:
  Ticket 2530 --> https://svn.open-mpi.org/trac/ompi/ticket/2530
2010-08-18 20:06:50 +00:00
Rainer Keller
33f2b9398e - This warning is no longer supported. Using it generates
a warning itself (when another warning is generated within the file),
   which can be rather annoying.
   Therefore check for output regarding this unrecognized warning.

This commit was SVN r23624.
2010-08-18 06:01:23 +00:00
Ralph Castain
23904c2f3e Correct the extra_dist path to the .windows file
This commit was SVN r23613.
2010-08-14 01:21:58 +00:00
Jeff Squyres
a2f349167e Update hwloc to 1.0.3a1r2398. This fixes a problem with
linking against libibverbs on Solaris.

Sorry for the mid-day configure change folks; I meant to commit this
last night and forgot.  :-(

This commit was SVN r23606.
2010-08-13 13:18:09 +00:00
Shiqing Fan
550f180014 Add a windows support file into the tarball.
This commit was SVN r23605.
2010-08-13 11:54:13 +00:00
Rainer Keller
14aad075eb - On Jaguar, we don't have pretty printed stackframe, aka no opal_stackframe_output*
This commit was SVN r23602.
2010-08-12 14:44:56 +00:00
Shiqing Fan
330999e36c Some fixes for C/R enhancement on Windows. Add the option and fix some type casts, just let it compile.
This commit was SVN r23599.
2010-08-12 13:31:37 +00:00
Josh Hursey
e12ca48cd9 A number of C/R enhancements per RFC below:
http://www.open-mpi.org/community/lists/devel/2010/07/8240.php

Documentation:
  http://osl.iu.edu/research/ft/

Major Changes: 
-------------- 
 * Added C/R-enabled Debugging support. 
   Enabled with the --enable-crdebug flag. See the following website for more information: 
   http://osl.iu.edu/research/ft/crdebug/ 
 * Added Stable Storage (SStore) framework for checkpoint storage 
   * 'central' component does a direct to central storage save 
   * 'stage' component stages checkpoints to central storage while the application continues execution. 
     * 'stage' supports offline compression of checkpoints before moving (sstore_stage_compress) 
     * 'stage' supports local caching of checkpoints to improve automatic recovery (sstore_stage_caching) 
 * Added Compression (compress) framework to support 
 * Add two new ErrMgr recovery policies 
   * {{{crmig}}} C/R Process Migration 
   * {{{autor}}} C/R Automatic Recovery 
 * Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}} ErrMgr component 
 * Added CR MPI Ext functions (enable them with {{{--enable-mpi-ext=cr}}} configure option) 
   * {{{OMPI_CR_Checkpoint}}} (Fixes trac:2342) 
   * {{{OMPI_CR_Restart}}} 
   * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules) 
   * {{{OMPI_CR_INC_register_callback}}} (Fixes trac:2192) 
   * {{{OMPI_CR_Quiesce_start}}} 
   * {{{OMPI_CR_Quiesce_checkpoint}}} 
   * {{{OMPI_CR_Quiesce_end}}} 
   * {{{OMPI_CR_self_register_checkpoint_callback}}} 
   * {{{OMPI_CR_self_register_restart_callback}}} 
   * {{{OMPI_CR_self_register_continue_callback}}} 
 * The ErrMgr predicted_fault() interface has been changed to take an opal_list_t of ErrMgr defined types. This will allow us to better support a wider range of fault prediction services in the future. 
 * Add a progress meter to: 
   * FileM rsh (filem_rsh_process_meter) 
   * SnapC full (snapc_full_progress_meter) 
   * SStore stage (sstore_stage_progress_meter) 
 * Added 2 new command line options to ompi-restart 
   * --showme : Display the full command line that would have been exec'ed. 
   * --mpirun_opts : Command line options to pass directly to mpirun. (Fixes trac:2413) 
 * Deprecated some MCA params: 
   * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir 
   * snapc_base_global_snapshot_dir deprecated, use sstore_base_global_snapshot_dir 
   * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared 
   * snapc_base_store_in_place deprecated, replaced with different components of SStore 
   * snapc_base_global_snapshot_ref deprecated, use sstore_base_global_snapshot_ref 
   * snapc_base_establish_global_snapshot_dir deprecated, never well supported 
   * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem 

Minor Changes: 
-------------- 
 * Fixes trac:1924 : {{{ompi-restart}}} now recognizes path prefixed checkpoint handles and does the right thing. 
 * Fixes trac:2097 : {{{ompi-info}}} should now report all available CRS components 
 * Fixes trac:2161 : Manual checkpoint movement. A user can 'mv' a checkpoint directory from the original location to another and still restart from it. 
 * Fixes trac:2208 : Honor various TMPDIR variables instead of forcing {{{/tmp}}} 
 * Move {{{ompi_cr_continue_like_restart}}} to {{{orte_cr_continue_like_restart}}} to be more flexible in where this should be set. 
 * opal_crs_base_metadata_write* functions have been moved to SStore to support a wider range of metadata handling functionality. 
 * Cleanup the CRS framework and components to work with the SStore framework. 
 * Cleanup the SnapC framework and components to work with the SStore framework (cleans up these code paths considerably). 
 * Add 'quiesce' hook to CRCP for a future enhancement. 
 * We now require a BLCR version that supports {{{cr_request_file()}}} or {{{cr_request_checkpoint()}}} in order to make the code more maintainable. Note that {{{cr_request_file}}} has been deprecated since 0.7.0, so we prefer to use {{{cr_request_checkpoint()}}}. 
 * Add optional application level INC callbacks (registered through the CR MPI Ext interface). 
 * Increase the {{{opal_cr_thread_sleep_wait}}} parameter to 1000 microseconds to make the C/R thread less aggressive. 
 * {{{opal-restart}}} now looks for cache directories before falling back on stable storage when asked. 
 * {{{opal-restart}}} also supports local decompression before restarting 
 * {{{orte-checkpoint}}} now uses the SStore framework to work with the metadata 
 * {{{orte-restart}}} now uses the SStore framework to work with the metadata 
 * Remove the {{{orte-restart}}} preload option. This was removed since the user only needs to select the 'stage' component in order to support this functionality. 
 * Since the '-am' parameter is saved in the metadata, {{{ompi-restart}}} no longer hard codes {{{-am ft-enable-cr}}}. 
 * Fix {{{hnp}}} ErrMgr so that if a previous component in the stack has 'fixed' the problem, then it should be skipped. 
 * Make sure to decrement the number of 'num_local_procs' in the orted when one goes away. 
 * odls now checks the SStore framework to see if it needs to load any checkpoint files before launching (to support 'stage'). This separates the SStore logic from the --preload-[binary|files] options. 
 * Add unique IDs to the named pipes established between the orted and the app in SnapC. This is to better support migration and automatic recovery activities. 
 * Improve the checks for 'already checkpointing' error path. 
 * Add a recovery output timer to show how long it takes to restart a job 
 * Do a better job of cleaning up the old session directory on restart. 
 * Add a local module to the autor and crmig ErrMgr components. These small modules prevent the 'orted' component from attempting a local recovery (Which does not work for MPI apps at the moment) 
 * Add a fix for bounding the checkpointable region between MPI_Init and MPI_Finalize. 

This commit was SVN r23587.

The following Trac tickets were found above:
  Ticket 1924 --> https://svn.open-mpi.org/trac/ompi/ticket/1924
  Ticket 2097 --> https://svn.open-mpi.org/trac/ompi/ticket/2097
  Ticket 2161 --> https://svn.open-mpi.org/trac/ompi/ticket/2161
  Ticket 2192 --> https://svn.open-mpi.org/trac/ompi/ticket/2192
  Ticket 2208 --> https://svn.open-mpi.org/trac/ompi/ticket/2208
  Ticket 2342 --> https://svn.open-mpi.org/trac/ompi/ticket/2342
  Ticket 2413 --> https://svn.open-mpi.org/trac/ompi/ticket/2413
2010-08-10 20:51:11 +00:00
Terry Dontje
b74ef351b7 Added new solaris sysinfo module. Also added code to assign
orte_local_chip_type and orte_local_chip_model in MPI processes if the
appropriate sysinfo module found the values on the machine.

This commit was SVN r23581.
2010-08-09 19:28:56 +00:00
Nysal Jan
b6524f6a92 Fix the conditional branch, jump to the correct location. Reported by Matthew Clark
This commit was SVN r23576.
2010-08-09 10:07:58 +00:00
Ralph Castain
9c69175117 If debug is enabled, provide an mca param and supporting logic to output when OPAL_ACQUIRE_THREAD is waiting and has obtained the thread, and when OPAL_RELEASE_THREAD releases it.
This commit was SVN r23557.
2010-08-05 16:25:32 +00:00
Shiqing Fan
b8db8d0ef8 Need to change another variable name.
This commit was SVN r23556.
2010-08-05 12:38:28 +00:00
Shiqing Fan
714883d472 A better way to make this work with VS 2010.
This commit was SVN r23544.
2010-08-03 09:06:50 +00:00
Shiqing Fan
e822f465b5 Remove a bunch of warnings due to the new POSIX supplement in VS 2010.
This commit was SVN r23540.
2010-08-02 12:16:29 +00:00
Josh Hursey
ba7e94dd89 Some relatively minor C/R related cleanup
* Fix a configure warning for checking --enable-ft-thread
 * In hnp and orted ErrMgr components check to see if other components have already recovered this process before trying to recover it again.
 * Fix 'npernode' for restarting using the resilient rmaps component
 * export ompi_info_set, so that internal functionality can use it.

This commit was SVN r23535.
2010-07-30 18:59:34 +00:00
Shiqing Fan
ea7bf2bd9e Correctly check the data type alignment for VS 2010 environment, and set the event include paths to global level, in order to make the clever VS load them.
This commit was SVN r23534.
2010-07-30 14:25:15 +00:00
Ralph Castain
0ed98967ed Update the thread protection in the ring_buffer class
This commit was SVN r23532.
2010-07-29 02:12:44 +00:00
Rolf vandeVaart
3d9b05ba2b Fix bug introduced by r23463. We now handle positive
error codes correctly again.  Also fix a typo.
Reviewed by Jeff Squyres. 

This commit was SVN r23531.

The following SVN revision numbers were found above:
  r23463 --> open-mpi/ompi@2af3e6e5ae
2010-07-28 19:19:27 +00:00
Jeff Squyres
f313257022 This file should really be distclean, not maintainer clean (it's not
shipped in the tarball).

This commit was SVN r23525.
2010-07-28 14:24:51 +00:00
Jeff Squyres
dca1ee8822 Revert r23495. Per on-list discussion, it doesn't do what it was
supposed to do, and there's disagreement about whether the concept
that it was supposed to do was the Right Thing anyway.

http://www.open-mpi.org/community/lists/devel/2010/07/8223.php

This commit was SVN r23517.

The following SVN revision numbers were found above:
  r23495 --> open-mpi/ompi@32e6dae8b0
2010-07-27 22:38:07 +00:00
Jeff Squyres
88b7923fc5 At least on NetBSD 5.0_STABLE with Libtool 2.2.6b, lt_dlerror() can
sometimes return NULL, so be sure to handle that case properly.

This commit was SVN r23503.
2010-07-27 14:15:53 +00:00
Jeff Squyres
245dc1a86d Add a cast to avoid a compiler warning on BSD.
This commit was SVN r23502.
2010-07-27 14:14:37 +00:00
Jeff Squyres
0ce1a82cde This commit looks much bigger than it is. There are only 2
substantive changes in this commit; the rest are minor style changes:

 1. Change an OBJ_NEW(opal_list_item_t) to OBJ_NEW(opal_if_t).  This
    was causing memory corruption in the BSD code paths.
 2. Move some local variables from the top of opal_if_init() to inside
    the non-BSD code paths so that we avoid bunches of warnings about
    unused variables when compiling on BSD.  In doing so, I indented
    the whole non-BSD section one level deeper, making the commit look
    huge. 

I also added a few {} around 1-line blocks, added some spaces, broke a
few lines, re-formatted a few comments, ...etc.  Trivial stuff.

This commit was SVN r23501.
2010-07-27 13:46:55 +00:00
Ralph Castain
b3a8a394f0 Cleanup some lingering references to OMPI_SETUP_C and OMPI_SETUP_CXX that generated warnings. Follow the new naming convention by changing OMPI_SETUP_ASM to OPAL_SETUP_ASM
This commit was SVN r23500.
2010-07-27 04:51:50 +00:00
Jeff Squyres
41edaa1fe5 While we're here, also rename this macro: it really should be
OPAL_SETUP_CC. 

This commit was SVN r23496.
2010-07-26 22:09:24 +00:00
Jeff Squyres
32e6dae8b0 Add -gstabs+ compiler switch if we're on OSX and -g is in CFLAGS and that flag works with a test compile
This commit was SVN r23495.
2010-07-26 22:05:41 +00:00
Shiqing Fan
71d2749b6b Fix a header problem on Windows.
This commit was SVN r23483.
2010-07-23 07:52:34 +00:00
Jeff Squyres
7d7c0aa48f Somehow the check for the specific value "external" got dropped in the
logic (even though the "else" clause for handling it was there).  This
commit puts back the specific check for the word "external".

Thanks to Jed Brown for noticing the issue.  Fixes trac:2503.

This commit was SVN r23475.

The following Trac tickets were found above:
  Ticket 2503 --> https://svn.open-mpi.org/trac/ompi/ticket/2503
2010-07-22 11:42:15 +00:00
Jeff Squyres
29c1ad4196 Forgot BEGIN/END C_DECLS.
This commit was SVN r23453.
2010-07-21 11:05:08 +00:00
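
The commit messages above end with regular trailer lines: "This commit was SVN rNNNNN.", plus optional cross-reference lines of the form "rNNNNN --> open-mpi/ompi@SHA" and "Ticket NNNN --> URL". The following is a minimal sketch of how those trailers could be extracted from one message. The function name `parse_trailers` and the shape of the returned dictionary are hypothetical, and the patterns are inferred only from the entries in this listing, so other parts of the history may need adjustments.

```python
import re

# Patterns inferred from the trailer lines in this listing (an assumption,
# not a documented format).
SVN_REV = re.compile(r"^This commit was SVN r(\d+)\.$")
SVN_XREF = re.compile(r"^r(\d+) --> (\S+)@([0-9a-f]+)$")
TRAC_XREF = re.compile(r"^Ticket (\d+) --> (\S+)$")


def parse_trailers(message):
    """Return the SVN revision and cross-references found in one message.

    Hypothetical helper: the result layout is chosen for illustration only.
    """
    result = {"svn_rev": None, "svn_xrefs": {}, "tickets": {}}
    for raw in message.splitlines():
        line = raw.strip()
        m = SVN_REV.match(line)
        if m:
            result["svn_rev"] = int(m.group(1))
            continue
        m = SVN_XREF.match(line)
        if m:
            # Map the old SVN revision to (repository, abbreviated SHA).
            result["svn_xrefs"][int(m.group(1))] = (m.group(2), m.group(3))
            continue
        m = TRAC_XREF.match(line)
        if m:
            # Map the Trac ticket number to its URL.
            result["tickets"][int(m.group(1))] = m.group(2)
    return result


# Message text copied from commit e71827b8ff in the listing above.
example = """Undo 4 of the 5 changes introduced by r22638. Leave
one of them in as it may still be needed on Solaris.

This fixes trac:2530.

This commit was SVN r23626.

The following SVN revision numbers were found above:
  r22638 --> open-mpi/ompi@2a4b1227d9

The following Trac tickets were found above:
  Ticket 2530 --> https://svn.open-mpi.org/trac/ompi/ticket/2530
"""

if __name__ == "__main__":
    print(parse_trailers(example))
```

Anchoring each pattern to the whole stripped line means that in-prose mentions such as "introduced by r22638" or "fixes trac:2530" are not picked up; only the generated trailer lines are.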