1
1
Граф коммитов

66 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
7fe860c8a2 Per http://www.open-mpi.org/community/lists/devel/2012/08/11362.php,
there actually are some cases where we don't want to uniq-ize all
flags.  Thanks to P. Martin for raising the issue.

This commit was SVN r26955.
2012-08-06 21:05:11 +00:00
Jeff Squyres
92adbc5d72 Fix some problems with the Fortran wrapper compiler flags,
particularly with respect to threading flags.  

Before this change, the following scenario would fail (e.g., on Linux
with pthreads):

{{{
$ ./configure --disable-shared --enable-static ...
$ make clean install
$ cd examples
$ make clean all
}}}

Linking the Fortran examples would fail with missing pthread symbols.

This commit was SVN r26927.
2012-07-31 18:12:24 +00:00
Jeff Squyres
11808852be Fixes trac:2913. Ensure that OMPI_VERSION is tolerant of having spaces in
it (i.e., so that it doesn't screw up the ident version string).

This commit was SVN r26651.

The following Trac tickets were found above:
  Ticket 2913 --> https://svn.open-mpi.org/trac/ompi/ticket/2913
2012-06-25 17:04:25 +00:00
Jeff Squyres
c30d1ef0df Patch from Evan Clinton, reviewed by Leif Lindholm, for supporting
ARM5 and ARM6.

This commit was SVN r26361.
2012-04-30 20:49:55 +00:00
Jeff Squyres
d4a05ec26e Disable inline assembly for the PGI C++ compiler per patch from Larry
Baker.  We already had it disabled for the PGI C compiler; I'm not
sure why it was left enabled for the C++ compiler.

This commit was SVN r26083.
2012-03-02 20:45:51 +00:00
Jeff Squyres
9c8b2a3489 Gah! Fix confusion about what files are in the source tree and what
files are in the build tree.

This commit was SVN r26023.
2012-02-23 16:13:09 +00:00
Jeff Squyres
94549d024b Fix the search for ltdl_advise support in VPATH builds.
This commit was SVN r26006.
2012-02-22 22:03:45 +00:00
Jeff Squyres
be16137072 Fix copyright.
This commit was SVN r25948.
2012-02-17 11:40:44 +00:00
Jeff Squyres
eb47a97025 After a bunch more back-n-forth with Paul Hargrove, hopefully this
visibility stuff will now be fixed!

This commit was SVN r25944.
2012-02-17 00:09:32 +00:00
Jeff Squyres
717b04f198 Strengthen and simplify the visibility test. Adding an "fprintf"
makes the test a bit more difficult, per discussions with Paul
Hargrove when going through the release process for hwloc 1.3.2.

This "improved" test is already on OMPI v1.4 -- not sure why it's not
on the trunk or OMPI v1.5 branch.  It looks like the extra fprintf got
lost in the translation from AC_TRY_LINK to AC_LINK_IFELSE.

This commit was SVN r25909.
2012-02-13 22:07:25 +00:00
Rolf vandeVaart
bdc7f7a4ef Add check for version of CUDA. Not used yet, but will be in future.
This commit was SVN r25561.
2011-12-02 13:35:20 +00:00
Brian Barrett
86f555121c Add (optional/last ditch effort) support for GCC/Intel __sync_ builtin atomic
operations.  Much easier than adding support for a new architecture.

This commit was SVN r25498.
2011-11-23 04:25:41 +00:00
Terry Dontje
1f53b32216 This commit fixes trac:2917. By using the cleaned up version of check_visibility that is in the hwloc trunk repo.
This commit was SVN r25495.

The following Trac tickets were found above:
  Ticket 2917 --> https://svn.open-mpi.org/trac/ompi/ticket/2917
2011-11-22 00:01:09 +00:00
Rainer Keller
4e6a6fc146 - Check, whether the compiler supports __builtin_clz (count leading
zeroes);
   if so, use it for bit-operations like opal_cube_dim and opal_hibit.
   Implement two versions of power-of-two.
   In case of opal_next_poweroftwo, this reduces the average execution
   time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
   measured rdtsc, with loop over 2^27 values).
   Numbers for other functions are similar (but of course heavily depend
   on the usage, e.g. opal_hibit() with a start of 4 does not save
   much).  The bsr instruction on AMD Opteron is also not as fast.

 - Replace various places where the next power-of-two is computed.
   
   Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
   Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.

This commit was SVN r25270.
2011-10-11 22:49:01 +00:00
Ralph Castain
56ebfa23cc ORTE configure options belong in orte/config, not opal.
This commit was SVN r25100.
2011-08-27 14:23:49 +00:00
George Bosilca
531b241304 Define ORTE_ENABLE_EPOCH and ORTE_RESIL_ORTE.
This commit was SVN r25098.
2011-08-27 00:16:21 +00:00
Wesley Bland
4e7ff0bd5e By popular demand the epoch code is now disabled by default.
To enable the epochs and the resilient orte code, use the configure flag:

--enable-resilient-orte

This will define both:

ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE

This commit was SVN r25093.
2011-08-26 22:16:14 +00:00
Josh Hursey
6539a31b23 Cleanup configure checks for C/R functionality.
Add a WANT_FT_CR flag different from WANT_FT so tools like *-checkpoint are not built when a different FT technique is requested.

Also fix the C/R thread check so that it is only enabled if C/R is enabled, not generally when threads are enabled.

This commit was SVN r24769.
2011-06-09 19:45:29 +00:00
Rolf vandeVaart
2634f6401a Add some basic support for sending and receiving CUDA device memory. Feature is disabled by default and has no effect on default code paths.
This commit was SVN r24659.
2011-04-28 23:05:55 +00:00
Jeff Squyres
bffa5c8f7e * Rename OMPI_CHECK_PTHREAD_PIDS to OPAL_CHECK_PTHREAD_PIDS.
* Convert from AC_TRY_RUN to AC_RUN_IFELSE.
 * Excellent suggestion from Paul Hargrove: use AC_CHECK_FUNC to look
   for a Linuxthreads-specific symbol when we're cross compiling to
   see if threads will have different PIDs (because AC_CHECK_FUNC
   works properly even when in cross-compiling environments).

Background: the old/Linuxthreads-based pthreads implementation used
the Linux clone() call to make threads, which effectively meant that
each thread had a different PID.  The new NPTL pthreads implementation
does things better, meaning that threads have the same PID.  

Open MPI no longer supports threads with different PIDs -- we ripped
out the supporting code for threads with different PIDs because we
don't have systems available to test this on anymore (anyone who still
has such a system can still use older versions of Open MPI).  Hence,
configure needs to determine whether the target system will have the
same PID for threads or not -- even if we're cross-compiling.  The
current test compiles and runs a multi-threaded app that checks PIDs
of different threads, but we clearly can't do that in a
cross-compiling environment.  So use AC_CHECK_FUNC in cross-compiling
environments.

Simple, no?

This commit was SVN r24537.
2011-03-17 11:59:54 +00:00
Ralph Castain
d5dfe05521 Remove stale code associated with OPAL_THREADS_HAVE_DIFFERENT_PIDS. In the past, we have supported the case of really, really old Linux kernels where threads have different pids. However, when we updated the event library, we didn't also update that support code. In addition, when we dropped progress thread support, we didn't remove areas of the code that could no longer be compiled (i.e., were protected by "if progress thread && if have different pids).
There was no compelling reason to support such old kernels. Accordingly, convert the test to print a nice error message indicating we no longer support old kernels (but indicate that earlier OMPI versions do) and error out. Remove all code that was protected by "if have different pids" since it can no longer be compiled.

This commit was SVN r24531.
2011-03-15 21:05:03 +00:00
Ralph Castain
45aacd30ab Add prefix for PPC hosts
This commit was SVN r24515.
2011-03-11 22:58:51 +00:00
Brian Barrett
07996af388 Fix register clobber list for x86 assembly. Thanks to Jay Fenlason for the
patch.

This commit was SVN r24449.
2011-02-23 21:54:07 +00:00
Nysal Jan
f0f1d4e311 Older versions of config.guess detect the canonical system name of an AIX 7.1 system to be rs6000-ibm-aix. Add this workaround until AIX 7.1 support is available in the autotools releases
This commit was SVN r24362.
2011-02-08 02:52:10 +00:00
Nysal Jan
ab2f738b0b Recent versions of IBM XL compilers on AIX support GCC inline assembly format
This commit was SVN r24340.
2011-02-02 11:31:30 +00:00
Jeff Squyres
511f87665b Fixes trac:2680: Add ARM support.
This commit was SVN r24308.

The following Trac tickets were found above:
  Ticket 2680 --> https://svn.open-mpi.org/trac/ompi/ticket/2680
2011-01-26 17:22:44 +00:00
Rolf vandeVaart
37d5267895 The fix for ticket #2560 was somehow removed in the
great autogen update.  Therefore, put them back.

This commit was SVN r24053.
2010-11-15 21:41:56 +00:00
Jeff Squyres
e4744b4ed5 Per http://www.open-mpi.org/community/lists/devel/2010/11/8671.php,
change a bunch of OMPI_<foo> names to OPAL_<foo>.

This commit was SVN r24046.
2010-11-12 23:22:11 +00:00
Jeff Squyres
52f5dd906c Fixes trac:2611: get updated code from GASNet (thanks Paul Hargrove!) that
handles more modern versions of Autoconf's --program-transform
arguments.  Also make it clear that the message is coming from Open
MPI logic, so that we don't blame Autoconf, Red Hat, or anyone else
next time!

This commit was SVN r24024.

The following Trac tickets were found above:
  Ticket 2611 --> https://svn.open-mpi.org/trac/ompi/ticket/2611
2010-11-09 23:36:30 +00:00
Ralph Castain
fceabb2498 Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac.
This is a fairly intrusive change, but outside of the moving of opal/event to opal/mca/event, the only changes involved (a) changing all calls to opal_event functions to reflect the new framework instead, and (b) ensuring that all opal_event_t objects are properly constructed since they are now true opal_objects.

Note: Shiqing has just returned from vacation and has not yet had a chance to complete the Windows integration. Thus, this commit almost certainly breaks Windows support on the trunk. However, I want this to have a chance to soak for as long as possible before I become less available a week from today (going to be at a class for 5 days, and thus will only be sparingly available) so we can find and fix any problems.

Biggest change is moving the libevent code from opal/event to a new opal/mca/event framework. This was done to make it much easier to update libevent in the future. New versions can be inserted as a new component and tested in parallel with the current version until validated, then we can remove the earlier version if we so choose. This is a statically built framework ala installdirs, so only one component will build at a time. There is no selection logic - the sole compiled component simply loads its function pointers into the opal_event struct.

I have gone thru the code base and converted all the libevent calls I could find. However, I cannot compile nor test every environment. It is therefore quite likely that errors remain in the system. Please keep an eye open for two things:

1. compile-time errors: these will be obvious as calls to the old functions (e.g., opal_evtimer_new) must be replaced by the new framework APIs (e.g., opal_event.evtimer_new)

2. run-time errors: these will likely show up as segfaults due to missing constructors on opal_event_t objects. It appears that it became a typical practice for people to "init" an opal_event_t by simply using memset to zero it out. This will no longer work - you must either OBJ_NEW or OBJ_CONSTRUCT an opal_event_t. I tried to catch these cases, but may have missed some. Believe me, you'll know when you hit it.

There is also the issue of the new libevent "no recursion" behavior. As I described on a recent email, we will have to discuss this and figure out what, if anything, we need to do.

This commit was SVN r23925.
2010-10-24 18:35:54 +00:00
Jeff Squyres
082086085b Fix some wordings in the test messages
This commit was SVN r23924.
2010-10-23 14:35:25 +00:00
Jeff Squyres
8ffb046649 Fix a few problems with the compiler visibility test:
* Update to be safe for AC 2.68 by using AC_LINK_IFELSE instead of
   AC_TRY_LINK
 * If enable visibility was used, ensure we fail if the compiler
   doesn't support it
 * Rename OMPI_CHECK_VISIBILITY -> OPAL_CHECK_VISIBILITY (and all
   internal variables)

This commit was SVN r23923.
2010-10-23 14:32:44 +00:00
Ralph Castain
d9389689b1 Fix yet another mangling
This commit was SVN r23818.
2010-09-30 17:53:52 +00:00
Jeff Squyres
7ef20f60f3 Autoconf updates to make us compatible with AC 2.68. Thanks to Ralf W. for the patch!
This commit was SVN r23797.
2010-09-23 22:37:52 +00:00
Ralph Castain
40a2bfa238 WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.

Please note that configure requirements on components HAVE CHANGED. For example. a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.

This commit was SVN r23764.
2010-09-17 23:04:06 +00:00
Ralph Castain
e96b5f486f Reorganize the opal interface code in opal/util/if.c per prior emails and telecon discussions. Move the interface discovery code into a framework so that configuration logic can separate it out (instead of the prior #if-#else confusion).
All interface APIs for accessing the info remain unchanged in opal/util/if.c.

This has been tested on Mac, Linux, and NetBSD. Nobody else seemed interested in testing it, so there may be some future problems revealed as people try it on other OSs.

This commit was SVN r23743.
2010-09-13 01:58:51 +00:00
Rolf vandeVaart
ef8090ec71 Fix the ia32 atomic add and subtract functions so they
do the right thing.  They now properly return
the value after the update.  This also fixes all warnings
reported by the Sun Studio compiler.  George provided the
new assembly routines.  I added some configure code to make
sure the compilers could handle it.

This fixes trac:2560.

This commit was SVN r23721.

The following Trac tickets were found above:
  Ticket 2560 --> https://svn.open-mpi.org/trac/ompi/ticket/2560
2010-09-08 10:47:15 +00:00
Rolf vandeVaart
14e7bcc383 Create new entries in the wrapper data files so the
administrator can specify compiler flags that get
inserted into the command before the user's flags.
These flags can be specified at configure time.
Reviewed by Jeff Squyres.

This fixes ticket #2474.

This commit was SVN r23709.
2010-09-02 10:47:55 +00:00
Rainer Keller
4abcf5a0d7 - The Sun-compiler 12 update 1 complains about noreturn-attributes
assigned to function-declarations.
   Check this case and mark the currently only case existing in trunk.

   Thanks to Paul Hargrove for bringing this up.

   Let's test the svn commit msg CMR:v1.5

This commit was SVN r23676.
2010-08-27 09:18:30 +00:00
Rainer Keller
33f2b9398e - This warning now is not supported anymore. Using it generates
a warning itselve (when another warning is generated within the file),
   which can be rather anying.
   Therefore check for output regarding this unrecognized warning.

This commit was SVN r23624.
2010-08-18 06:01:23 +00:00
Josh Hursey
e12ca48cd9 A number of C/R enhancements per RFC below:
http://www.open-mpi.org/community/lists/devel/2010/07/8240.php

Documentation:
  http://osl.iu.edu/research/ft/

Major Changes: 
-------------- 
 * Added C/R-enabled Debugging support. 
   Enabled with the --enable-crdebug flag. See the following website for more information: 
   http://osl.iu.edu/research/ft/crdebug/ 
 * Added Stable Storage (SStore) framework for checkpoint storage 
   * 'central' component does a direct to central storage save 
   * 'stage' component stages checkpoints to central storage while the application continues execution. 
     * 'stage' supports offline compression of checkpoints before moving (sstore_stage_compress) 
     * 'stage' supports local caching of checkpoints to improve automatic recovery (sstore_stage_caching) 
 * Added Compression (compress) framework to support 
 * Add two new ErrMgr recovery policies 
   * {{{crmig}}} C/R Process Migration 
   * {{{autor}}} C/R Automatic Recovery 
 * Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}} ErrMgr component 
 * Added CR MPI Ext functions (enable them with {{{--enable-mpi-ext=cr}}} configure option) 
   * {{{OMPI_CR_Checkpoint}}} (Fixes trac:2342) 
   * {{{OMPI_CR_Restart}}} 
   * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules) 
   * {{{OMPI_CR_INC_register_callback}}} (Fixes trac:2192) 
   * {{{OMPI_CR_Quiesce_start}}} 
   * {{{OMPI_CR_Quiesce_checkpoint}}} 
   * {{{OMPI_CR_Quiesce_end}}} 
   * {{{OMPI_CR_self_register_checkpoint_callback}}} 
   * {{{OMPI_CR_self_register_restart_callback}}} 
   * {{{OMPI_CR_self_register_continue_callback}}} 
 * The ErrMgr predicted_fault() interface has been changed to take an opal_list_t of ErrMgr defined types. This will allow us to better support a wider range of fault prediction services in the future. 
 * Add a progress meter to: 
   * FileM rsh (filem_rsh_process_meter) 
   * SnapC full (snapc_full_progress_meter) 
   * SStore stage (sstore_stage_progress_meter) 
 * Added 2 new command line options to ompi-restart 
   * --showme : Display the full command line that would have been exec'ed. 
   * --mpirun_opts : Command line options to pass directly to mpirun. (Fixes trac:2413) 
 * Deprecated some MCA params: 
   * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir 
   * snapc_base_global_snapshot_dir deprecated, use sstore_base_global_snapshot_dir 
   * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared 
   * snapc_base_store_in_place deprecated, replaced with different components of SStore 
   * snapc_base_global_snapshot_ref deprecated, use sstore_base_global_snapshot_ref 
   * snapc_base_establish_global_snapshot_dir deprecated, never well supported 
   * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem 

Minor Changes: 
-------------- 
 * Fixes trac:1924 : {{{ompi-restart}}} now recognizes path prefixed checkpoint handles and does the right thing. 
 * Fixes trac:2097 : {{{ompi-info}}} should now report all available CRS components 
 * Fixes trac:2161 : Manual checkpoint movement. A user can 'mv' a checkpoint directory from the original location to another and still restart from it. 
 * Fixes trac:2208 : Honor various TMPDIR varaibles instead of forcing {{{/tmp}}} 
 * Move {{{ompi_cr_continue_like_restart}}} to {{{orte_cr_continue_like_restart}}} to be more flexible in where this should be set. 
 * opal_crs_base_metadata_write* functions have been moved to SStore to support a wider range of metadata handling functionality. 
 * Cleanup the CRS framework and components to work with the SStore framework. 
 * Cleanup the SnapC framework and components to work with the SStore framework (cleans up these code paths considerably). 
 * Add 'quiesce' hook to CRCP for a future enhancement. 
 * We now require a BLCR version that supports {{{cr_request_file()}}} or {{{cr_request_checkpoint()}}} in order to make the code more maintainable. Note that {{{cr_request_file}}} has been deprecated since 0.7.0, so we prefer to use {{{cr_request_checkpoint()}}}. 
 * Add optional application level INC callbacks (registered through the CR MPI Ext interface). 
 * Increase the {{{opal_cr_thread_sleep_wait}}} parameter to 1000 microseconds to make the C/R thread less aggressive. 
 * {{{opal-restart}}} now looks for cache directories before falling back on stable storage when asked. 
 * {{{opal-restart}}} also support local decompression before restarting 
 * {{{orte-checkpoint}}} now uses the SStore framework to work with the metadata 
 * {{{orte-restart}}} now uses the SStore framework to work with the metadata 
 * Remove the {{{orte-restart}}} preload option. This was removed since the user only needs to select the 'stage' component in order to support this functionality. 
 * Since the '-am' parameter is saved in the metadata, {{{ompi-restart}}} no longer hard codes {{{-am ft-enable-cr}}}. 
 * Fix {{{hnp}}} ErrMgr so that if a previous component in the stack has 'fixed' the problem, then it should be skipped. 
 * Make sure to decrement the number of 'num_local_procs' in the orted when one goes away. 
 * odls now checks the SStore framework to see if it needs to load any checkpoint files before launching (to support 'stage'). This separates the SStore logic from the --preload-[binary|files] options. 
 * Add unique IDs to the named pipes established between the orted and the app in SnapC. This is to better support migration and automatic recovery activities. 
 * Improve the checks for 'already checkpointing' error path. 
 * A a recovery output timer, to show how long it takes to restart a job 
 * Do a better job of cleaning up the old session directory on restart. 
 * Add a local module to the autor and crmig ErrMgr components. These small modules prevent the 'orted' component from attempting a local recovery (Which does not work for MPI apps at the moment) 
 * Add a fix for bounding the checkpointable region between MPI_Init and MPI_Finalize. 

This commit was SVN r23587.

The following Trac tickets were found above:
  Ticket 1924 --> https://svn.open-mpi.org/trac/ompi/ticket/1924
  Ticket 2097 --> https://svn.open-mpi.org/trac/ompi/ticket/2097
  Ticket 2161 --> https://svn.open-mpi.org/trac/ompi/ticket/2161
  Ticket 2192 --> https://svn.open-mpi.org/trac/ompi/ticket/2192
  Ticket 2208 --> https://svn.open-mpi.org/trac/ompi/ticket/2208
  Ticket 2342 --> https://svn.open-mpi.org/trac/ompi/ticket/2342
  Ticket 2413 --> https://svn.open-mpi.org/trac/ompi/ticket/2413
2010-08-10 20:51:11 +00:00
Josh Hursey
ba7e94dd89 Some relatively minor C/R related cleanup
* Fix a configure warning for checking --enable-ft-thread
 * In hnp and orted ErrMgr components check to see if other components have already recovered this process before trying to recover it again.
 * Fix 'npernode' for restarting using the resilient rmaps component
 * export ompi_info_set, so that internal functionality can use it.

This commit was SVN r23535.
2010-07-30 18:59:34 +00:00
Jeff Squyres
dca1ee8822 Revert r23495. Per on-list discussion, it doesn't do what it was
supposed to do, and there's disagreement about whether the concept
that it was supposed to do was the Right Thing anyway.

http://www.open-mpi.org/community/lists/devel/2010/07/8223.php

This commit was SVN r23517.

The following SVN revision numbers were found above:
  r23495 --> open-mpi/ompi@32e6dae8b0
2010-07-27 22:38:07 +00:00
Ralph Castain
b3a8a394f0 Cleanup some lingering references to OMPI_SETUP_C and OMPI_SETUP_CXX that generated warnings. Follow the new naming convention by chaniging OMPI_SETUP_ASM to OPAL_SETUP_ASM
This commit was SVN r23500.
2010-07-27 04:51:50 +00:00
Jeff Squyres
41edaa1fe5 While we're here, also rename this macro: it really should be
OPAL_SETUP_CC. 

This commit was SVN r23496.
2010-07-26 22:09:24 +00:00
Jeff Squyres
32e6dae8b0 Add -gstabs+ compiler switch if we're on OSX and -g is in CFLAGS and that flag works with a test compile
This commit was SVN r23495.
2010-07-26 22:05:41 +00:00
Jeff Squyres
6bcdadbf0e If we're not building project_ompi, don't do anything with C++. Also
rename OMPI_CHECK_ATTRIBUTES -> OPAL_CHECK_ATTRIBUTES, because it's in
OPAL (somehow that name must have gotten missed in the Great M4 split
of '10...?)

This commit was SVN r23267.
2010-06-12 03:15:47 +00:00
Jeff Squyres
befc0b590b Fix the --disable-dlopen case -- don't expect to build or link anything.
This commit was SVN r23198.
2010-05-21 17:46:46 +00:00
Jeff Squyres
208953f1bf Grr -- also don't reset LIBLTDL unless we're using an external libltdl
build. 

This commit was SVN r23194.
2010-05-21 15:00:03 +00:00
Jeff Squyres
473547481b Don't reset LTDLINCL unless we're using an external libltdl
installation. 

This commit was SVN r23192.
2010-05-21 13:58:53 +00:00