Commit graph

1717 commits

Author SHA1 Message Date
Ralph Castain
847e43703f Remove cruft
This commit was SVN r23950.
2010-10-26 14:49:36 +00:00
Shiqing Fan
b2c3cb300c Correctly configure the new libevent mca for Windows.
This commit was SVN r23946.
2010-10-26 09:33:47 +00:00
Ralph Castain
86c7365e8e Clean up a few initialization issues - don't think these are impacting the shared memory situation as it didn't fix the problem.
Setup the event API to support multiple bases in preparation for splitting the OMPI and ORTE events. Holding here pending shared memory resolution.

This commit was SVN r23943.
2010-10-26 02:41:42 +00:00
Jeff Squyres
ed1e9a412a Need these files in all tarballs -- so don't conditionally add them to
EXTRA_DIST. 

This commit was SVN r23938.
2010-10-25 18:31:38 +00:00
Jeff Squyres
d14474969b Need this variable in optimized builds, too.
This commit was SVN r23937.
2010-10-25 18:31:01 +00:00
George Bosilca
bc3e1376ba event-config.h only exists in the builddir, so we need to explicitly
include it while building.

This commit was SVN r23936.
2010-10-25 18:29:52 +00:00
George Bosilca
c2e40f8616 Remove a warning about signed to unsigned comparison.
This commit was SVN r23935.
2010-10-25 18:29:11 +00:00
George Bosilca
b9a06afd98 opal_event_libevent207 is prototyped as const, so it should be defined as const.
This commit was SVN r23934.
2010-10-25 18:28:42 +00:00
Jeff Squyres
1d1571a86c Fix vpath builds.
This commit was SVN r23932.
2010-10-25 17:48:02 +00:00
Ralph Castain
a04da165bc Remove the sample and test code from the libevent distro - don't need to include them in ompi
This commit was SVN r23931.
2010-10-25 14:53:33 +00:00
Ralph Castain
bab990d812 Revert r23928 as being the incorrect fix. The correct fix is not to include ipv6 interfaces when ipv6 support was not requested.
This commit was SVN r23930.

The following SVN revision numbers were found above:
  r23928 --> open-mpi/ompi@7394f6d167
2010-10-25 14:31:18 +00:00
Abhishek Kulkarni
c671ec52d1 Fix broken trunk compile after the libevent changes.
This commit was SVN r23929.
2010-10-25 14:11:48 +00:00
Ralph Castain
7394f6d167 Silence warnings about IPV6 sa_family not known when ipv6 support is not enabled in configure
This commit was SVN r23928.
2010-10-25 13:56:23 +00:00
Ralph Castain
fceabb2498 Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac.
This is a fairly intrusive change, but outside of the moving of opal/event to opal/mca/event, the only changes involved (a) changing all calls to opal_event functions to reflect the new framework instead, and (b) ensuring that all opal_event_t objects are properly constructed since they are now true opal_objects.

Note: Shiqing has just returned from vacation and has not yet had a chance to complete the Windows integration. Thus, this commit almost certainly breaks Windows support on the trunk. However, I want this to have a chance to soak for as long as possible before I become less available a week from today (going to be at a class for 5 days, and thus will only be sparingly available) so we can find and fix any problems.

Biggest change is moving the libevent code from opal/event to a new opal/mca/event framework. This was done to make it much easier to update libevent in the future. New versions can be inserted as a new component and tested in parallel with the current version until validated, then we can remove the earlier version if we so choose. This is a statically built framework ala installdirs, so only one component will build at a time. There is no selection logic - the sole compiled component simply loads its function pointers into the opal_event struct.

I have gone thru the code base and converted all the libevent calls I could find. However, I cannot compile nor test every environment. It is therefore quite likely that errors remain in the system. Please keep an eye open for two things:

1. compile-time errors: these will be obvious as calls to the old functions (e.g., opal_evtimer_new) must be replaced by the new framework APIs (e.g., opal_event.evtimer_new)

2. run-time errors: these will likely show up as segfaults due to missing constructors on opal_event_t objects. It appears that it became a typical practice for people to "init" an opal_event_t by simply using memset to zero it out. This will no longer work - you must either OBJ_NEW or OBJ_CONSTRUCT an opal_event_t. I tried to catch these cases, but may have missed some. Believe me, you'll know when you hit it.

There is also the issue of the new libevent "no recursion" behavior. As I described on a recent email, we will have to discuss this and figure out what, if anything, we need to do.

This commit was SVN r23925.
2010-10-24 18:35:54 +00:00
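The run-time failure described in fceabb2498 (zeroing an event with memset instead of constructing it) can be illustrated with a small mock. This is only a sketch under stated assumptions: `mock_event_t`, `mock_event_construct`, `mock_event_destruct`, and `mock_event_arm` are hypothetical stand-ins, not the OPAL object system or the libevent API, and the mock reports an error where the real code would more likely segfault.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object whose constructor allocates state. */
typedef struct {
    int constructed;   /* set by the constructor */
    int *backend;      /* state that only the constructor allocates */
} mock_event_t;

/* Plays the role of an object constructor: allocates the backing state. */
void mock_event_construct(mock_event_t *ev)
{
    assert(ev != NULL);
    ev->backend = calloc(1, sizeof(int));
    ev->constructed = (ev->backend != NULL);
}

/* Plays the role of an object destructor: releases the backing state. */
void mock_event_destruct(mock_event_t *ev)
{
    assert(ev != NULL);
    free(ev->backend);
    ev->backend = NULL;
    ev->constructed = 0;
}

/* Uses the event; returns 0 on success, -1 if it was never constructed. */
int mock_event_arm(mock_event_t *ev, int value)
{
    assert(ev != NULL);
    if (!ev->constructed || ev->backend == NULL) {
        return -1;   /* real code would likely dereference NULL here */
    }
    *ev->backend = value;
    return 0;
}

/* Returns 1 if a zero-filled (never constructed) event is rejected. */
int sketch_zeroed_event_is_rejected(void)
{
    mock_event_t ev;
    memset(&ev, 0, sizeof(ev));   /* the old "init" habit */
    return mock_event_arm(&ev, 1) == -1;
}
```

The point of the mock is that zero-filling leaves every constructor-owned field empty, so the first real use has nothing to work with.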
Jeff Squyres
082086085b Fix some wordings in the test messages
This commit was SVN r23924.
2010-10-23 14:35:25 +00:00
Jeff Squyres
8ffb046649 Fix a few problems with the compiler visibility test:
* Update to be safe for AC 2.68 by using AC_LINK_IFELSE instead of
   AC_TRY_LINK
 * If enable visibility was used, ensure we fail if the compiler
   doesn't support it
 * Rename OMPI_CHECK_VISIBILITY -> OPAL_CHECK_VISIBILITY (and all
   internal variables)

This commit was SVN r23923.
2010-10-23 14:32:44 +00:00
Ralph Castain
29b16cc800 Add missing include
This commit was SVN r23892.
2010-10-15 04:04:20 +00:00
Jeff Squyres
e09bbb49a9 No need to have this AC_ARG_WITH in every component configure.m4 -- just put it up in the framework-level configure.m4.
This commit was SVN r23890.
2010-10-14 22:39:48 +00:00
Jeff Squyres
66d15035ab Replace some intentional-segv's with abort(). Seems safer and doesn't
cause all kinds of compiler warnings.

This commit was SVN r23889.
2010-10-14 22:01:14 +00:00
Sylvain Jeaugey
5fb2a2f2c9 Add a check for the ummunotify device before setting up ptmalloc2 hooks.
This commit was SVN r23882.
2010-10-11 15:05:57 +00:00
Sylvain Jeaugey
78176d2aeb Fix missing include in ummunotify
This commit was SVN r23881.
2010-10-11 15:03:00 +00:00
Jeff Squyres
69a64e5905 Fix typo that prevented the valgrind component from configuring properly
This commit was SVN r23874.
2010-10-07 22:39:08 +00:00
Jeff Squyres
a95ca7444e Fix a meaningless compare of an unsigned against 0. Rework the logic
a bit so that the secondary loop isn't even necessary; makes the whole
thing much simpler, anyway.

This commit was SVN r23860.
2010-10-07 15:04:50 +00:00
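The "meaningless compare of an unsigned against 0" fixed in a95ca7444e is a common pitfall when counting down with an unsigned index. The following is a generic sketch of the pitfall and one way around it, not the code touched by that commit; `sketch_find_last` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the last element equal to value, or count if absent. */
size_t sketch_find_last(const int *items, size_t count, int value)
{
    size_t i;

    assert(items != NULL || count == 0);

    /*
     * A loop written as "for (i = count - 1; i >= 0; --i)" never ends,
     * because an unsigned i is always >= 0. Testing before the decrement
     * avoids comparing an unsigned value against zero in that way.
     */
    for (i = count; i-- > 0; ) {
        if (items[i] == value) {
            return i;
        }
    }
    return count;
}
```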
Ralph Castain
d9389689b1 Fix yet another mangling
This commit was SVN r23818.
2010-09-30 17:53:52 +00:00
Jeff Squyres
73bcc4a36b Fix mistake that came in via the ompi-agen tree in r23764. The mistake wasn't part of the core autogen upgrade; it was an additional 'bonus' cleanup. Oops. The mistake will always create a set of directories under installdir, even if you do not --with-devel-headers. The set of directories will be empty, but still -- they should not be there at all. This commit fixes that -- the directories are not created at all if you do not --with-devel-headers
This commit was SVN r23801.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-24 22:53:28 +00:00
Jeff Squyres
7ef20f60f3 Autoconf updates to make us compatible with AC 2.68. Thanks to Ralf W. for the patch!
This commit was SVN r23797.
2010-09-23 22:37:52 +00:00
Ralph Castain
407eefc66d Update the if configure to include "opal" so they will build!
This commit was SVN r23787.
2010-09-22 03:19:15 +00:00
Ralph Castain
3631e4e936 Revert remaining svn kruft from r23764
This commit was SVN r23786.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-22 01:11:40 +00:00
Jeff Squyres
0ca617e570 Make this a warning, not an error.
This commit was SVN r23767.
2010-09-18 07:14:58 +00:00
Ralph Castain
40a2bfa238 WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.

Please note that configure requirements on components HAVE CHANGED. For example, a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.

This commit was SVN r23764.
2010-09-17 23:04:06 +00:00
Rolf vandeVaart
09750d0310 Need output.h header file for opal_output() definition.
Otherwise, build will fail when configuring with --enable-picky.

This commit was SVN r23763.
2010-09-17 12:22:17 +00:00
Shiqing Fan
9a47ca1995 Correct the place of including the if.h, and change retain_loopback to opal_if_retain_loopback for windows module too.
This commit was SVN r23756.
2010-09-14 14:03:48 +00:00
Ralph Castain
c74ce1632a Catch a couple of places (one hidden inside an #if 0, other in solaris module) where retain_loopback needs to be opal_if_retain_loopback
This commit was SVN r23755.
2010-09-14 11:37:10 +00:00
Shiqing Fan
95b17c1e82 Add a missing header for if windows.
This commit was SVN r23754.
2010-09-14 07:51:38 +00:00
Ralph Castain
e96b5f486f Reorganize the opal interface code in opal/util/if.c per prior emails and telecon discussions. Move the interface discovery code into a framework so that configuration logic can separate it out (instead of the prior #if-#else confusion).
All interface APIs for accessing the info remain unchanged in opal/util/if.c.

This has been tested on Mac, Linux, and NetBSD. Nobody else seemed interested in testing it, so there may be some future problems revealed as people try it on other OSs.

This commit was SVN r23743.
2010-09-13 01:58:51 +00:00
Jeff Squyres
3b14366c85 Fix a copyright statement
This commit was SVN r23741.
2010-09-12 09:55:01 +00:00
Rolf vandeVaart
ef8090ec71 Fix the ia32 atomic add and subtract functions so they
do the right thing.  They now properly return
the value after the update.  This also fixes all warnings
reported by the Sun Studio compiler.  George provided the
new assembly routines.  I added some configure code to make
sure the compilers could handle it.

This fixes trac:2560.

This commit was SVN r23721.

The following Trac tickets were found above:
  Ticket 2560 --> https://svn.open-mpi.org/trac/ompi/ticket/2560
2010-09-08 10:47:15 +00:00
Rolf vandeVaart
14e7bcc383 Create new entries in the wrapper data files so the
administrator can specify compiler flags that get
inserted into the command before the user's flags.
These flags can be specified at configure time.
Reviewed by Jeff Squyres.

This fixes ticket #2474.

This commit was SVN r23709.
2010-09-02 10:47:55 +00:00
Rainer Keller
97511912ec - Fixup several functions, that cannot return
- Add one instance where we do not use a parameter in a function
 - Fix a buglet in commit r23689, where the attribute-for-function ptrs
   was applied.

This commit was SVN r23690.

The following SVN revision numbers were found above:
  r23689 --> open-mpi/ompi@5eb571c458
2010-08-31 12:21:13 +00:00
Rainer Keller
5eb571c458 - As suggested in CMR #2558, attribute-macros should be
tested on function pointers and assigned accordingly,
   instead of using the pre-processor in the header files.

   A functional change is (re-) specifying __opal_attribute_noreturn__
   on orte_errmgr_base_abort(): All modules in the errmgr framework
   either use this function, or define their own abort function,
   which sets __opal_attribute_noreturn__.
   This attribute was taken out with the errmgr overhaul in r22872.

This commit was SVN r23689.

The following SVN revision numbers were found above:
  r22872 --> open-mpi/ompi@e4f2d03d28
2010-08-31 10:28:51 +00:00
Brad Benton
09c4f4d95c Added copyright notices for the files modified in r23669.
This commit was SVN r23687.

The following SVN revision numbers were found above:
  r23669 --> open-mpi/ompi@271cfa8c9a
2010-08-30 17:46:47 +00:00
Jeff Squyres
3eedbee7a4 Fixes trac:2541. Ensure that we keep CPPFLAGS if a non-standard valgrind location was specified. CMR:v1.4.3 CMR:v1.5
This commit was SVN r23680.

The following Trac tickets were found above:
  Ticket 2541 --> https://svn.open-mpi.org/trac/ompi/ticket/2541
2010-08-27 22:45:02 +00:00
Rainer Keller
4abcf5a0d7 - The Sun-compiler 12 update 1 complains about noreturn-attributes
assigned to function-declarations.
   Check this case and mark the currently only case existing in trunk.

   Thanks to Paul Hargrove for bringing this up.

   Let's test the svn commit msg CMR:v1.5

This commit was SVN r23676.
2010-08-27 09:18:30 +00:00
Rainer Keller
044b387d3c - If we don't compile with PGI, then mark the parameter as unused,
otherwise we get swamped with warnings by gcc, everywhere header is
   included.
 - Remove redundant declaration of opal_datatype_safeguard_pointer_debug_breakpoint

   Check whether  CMR:v1.5 works

This commit was SVN r23674.
2010-08-26 15:07:18 +00:00
Nysal Jan
271cfa8c9a Fix the opal_path_nfs test for GPFS. Reported by Paul H. Hargrove
This commit was SVN r23669.
2010-08-26 10:10:16 +00:00
Jeff Squyres
97fb426325 Per long-ago RFC, now that the odsl default module reports errors nicely, remove all paffinity components except for hwloc and test.
This commit was SVN r23666.
2010-08-25 22:34:30 +00:00
Jeff Squyres
a5ce58f098 Define that we return OPAL_ERR_TIMEOUT if the other end of the socket
closes in an opal_fd_read().

This commit was SVN r23650.
2010-08-24 19:07:04 +00:00
Ethan Mallove
f42c2a737f Fixes trac:2532 - "MPI_Put can result in SIGBUS on SPARC"
Reviewed by Rolf V and Brian B

This commit was SVN r23649.

The following Trac tickets were found above:
  Ticket 2532 --> https://svn.open-mpi.org/trac/ompi/ticket/2532
2010-08-24 18:10:43 +00:00
Ralph Castain
51833bfe6c Not -everyone- wants to ignore loopback devices. Give us a choice.
This commit was SVN r23637.
2010-08-24 02:37:05 +00:00
Shiqing Fan
c110edbf44 Use exclude lists for non-ordinary sub directories check.
This commit was SVN r23631.
2010-08-23 09:43:05 +00:00
Rolf vandeVaart
e71827b8ff Undo 4 of the 5 changes introduced by r22638. Leave
one of them in as it may still be needed on Solaris.

This fixes trac:2530.

This commit was SVN r23626.

The following SVN revision numbers were found above:
  r22638 --> open-mpi/ompi@2a4b1227d9

The following Trac tickets were found above:
  Ticket 2530 --> https://svn.open-mpi.org/trac/ompi/ticket/2530
2010-08-18 20:06:50 +00:00
Rainer Keller
33f2b9398e - This warning now is not supported anymore. Using it generates
a warning itself (when another warning is generated within the file),
   which can be rather annoying.
   Therefore check for output regarding this unrecognized warning.

This commit was SVN r23624.
2010-08-18 06:01:23 +00:00
Ralph Castain
23904c2f3e Correct the extra_dist path to the .windows file
This commit was SVN r23613.
2010-08-14 01:21:58 +00:00
Jeff Squyres
a2f349167e Update hwloc to 1.0.3a1r2398. This fixes a problem with
linking against libibverbs on Solaris.

Sorry for the mid-day configure change folks; I meant to commit this
last night and forgot.  :-(

This commit was SVN r23606.
2010-08-13 13:18:09 +00:00
Shiqing Fan
550f180014 Add a windows support file into the tarball.
This commit was SVN r23605.
2010-08-13 11:54:13 +00:00
Rainer Keller
14aad075eb - On Jaguar, we don't have pretty printed stackframe, aka no opal_stackframe_output*
This commit was SVN r23602.
2010-08-12 14:44:56 +00:00
Shiqing Fan
330999e36c Some fixes for C/R enhancement on Windows. Add the option and fix some type casts, just let it compile.
This commit was SVN r23599.
2010-08-12 13:31:37 +00:00
Josh Hursey
e12ca48cd9 A number of C/R enhancements per RFC below:
http://www.open-mpi.org/community/lists/devel/2010/07/8240.php

Documentation:
  http://osl.iu.edu/research/ft/

Major Changes: 
-------------- 
 * Added C/R-enabled Debugging support. 
   Enabled with the --enable-crdebug flag. See the following website for more information: 
   http://osl.iu.edu/research/ft/crdebug/ 
 * Added Stable Storage (SStore) framework for checkpoint storage 
   * 'central' component does a direct to central storage save 
   * 'stage' component stages checkpoints to central storage while the application continues execution. 
     * 'stage' supports offline compression of checkpoints before moving (sstore_stage_compress) 
     * 'stage' supports local caching of checkpoints to improve automatic recovery (sstore_stage_caching) 
 * Added Compression (compress) framework to support 
 * Add two new ErrMgr recovery policies 
   * {{{crmig}}} C/R Process Migration 
   * {{{autor}}} C/R Automatic Recovery 
 * Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}} ErrMgr component 
 * Added CR MPI Ext functions (enable them with {{{--enable-mpi-ext=cr}}} configure option) 
   * {{{OMPI_CR_Checkpoint}}} (Fixes trac:2342) 
   * {{{OMPI_CR_Restart}}} 
   * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules) 
   * {{{OMPI_CR_INC_register_callback}}} (Fixes trac:2192) 
   * {{{OMPI_CR_Quiesce_start}}} 
   * {{{OMPI_CR_Quiesce_checkpoint}}} 
   * {{{OMPI_CR_Quiesce_end}}} 
   * {{{OMPI_CR_self_register_checkpoint_callback}}} 
   * {{{OMPI_CR_self_register_restart_callback}}} 
   * {{{OMPI_CR_self_register_continue_callback}}} 
 * The ErrMgr predicted_fault() interface has been changed to take an opal_list_t of ErrMgr defined types. This will allow us to better support a wider range of fault prediction services in the future. 
 * Add a progress meter to: 
   * FileM rsh (filem_rsh_process_meter) 
   * SnapC full (snapc_full_progress_meter) 
   * SStore stage (sstore_stage_progress_meter) 
 * Added 2 new command line options to ompi-restart 
   * --showme : Display the full command line that would have been exec'ed. 
   * --mpirun_opts : Command line options to pass directly to mpirun. (Fixes trac:2413) 
 * Deprecated some MCA params: 
   * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir 
   * snapc_base_global_snapshot_dir deprecated, use sstore_base_global_snapshot_dir 
   * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared 
   * snapc_base_store_in_place deprecated, replaced with different components of SStore 
   * snapc_base_global_snapshot_ref deprecated, use sstore_base_global_snapshot_ref 
   * snapc_base_establish_global_snapshot_dir deprecated, never well supported 
   * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem 

Minor Changes: 
-------------- 
 * Fixes trac:1924 : {{{ompi-restart}}} now recognizes path prefixed checkpoint handles and does the right thing. 
 * Fixes trac:2097 : {{{ompi-info}}} should now report all available CRS components 
 * Fixes trac:2161 : Manual checkpoint movement. A user can 'mv' a checkpoint directory from the original location to another and still restart from it. 
 * Fixes trac:2208 : Honor various TMPDIR variables instead of forcing {{{/tmp}}} 
 * Move {{{ompi_cr_continue_like_restart}}} to {{{orte_cr_continue_like_restart}}} to be more flexible in where this should be set. 
 * opal_crs_base_metadata_write* functions have been moved to SStore to support a wider range of metadata handling functionality. 
 * Cleanup the CRS framework and components to work with the SStore framework. 
 * Cleanup the SnapC framework and components to work with the SStore framework (cleans up these code paths considerably). 
 * Add 'quiesce' hook to CRCP for a future enhancement. 
 * We now require a BLCR version that supports {{{cr_request_file()}}} or {{{cr_request_checkpoint()}}} in order to make the code more maintainable. Note that {{{cr_request_file}}} has been deprecated since 0.7.0, so we prefer to use {{{cr_request_checkpoint()}}}. 
 * Add optional application level INC callbacks (registered through the CR MPI Ext interface). 
 * Increase the {{{opal_cr_thread_sleep_wait}}} parameter to 1000 microseconds to make the C/R thread less aggressive. 
 * {{{opal-restart}}} now looks for cache directories before falling back on stable storage when asked. 
 * {{{opal-restart}}} also supports local decompression before restarting 
 * {{{orte-checkpoint}}} now uses the SStore framework to work with the metadata 
 * {{{orte-restart}}} now uses the SStore framework to work with the metadata 
 * Remove the {{{orte-restart}}} preload option. This was removed since the user only needs to select the 'stage' component in order to support this functionality. 
 * Since the '-am' parameter is saved in the metadata, {{{ompi-restart}}} no longer hard codes {{{-am ft-enable-cr}}}. 
 * Fix {{{hnp}}} ErrMgr so that if a previous component in the stack has 'fixed' the problem, then it should be skipped. 
 * Make sure to decrement the number of 'num_local_procs' in the orted when one goes away. 
 * odls now checks the SStore framework to see if it needs to load any checkpoint files before launching (to support 'stage'). This separates the SStore logic from the --preload-[binary|files] options. 
 * Add unique IDs to the named pipes established between the orted and the app in SnapC. This is to better support migration and automatic recovery activities. 
 * Improve the checks for 'already checkpointing' error path. 
 * Add a recovery output timer, to show how long it takes to restart a job 
 * Do a better job of cleaning up the old session directory on restart. 
 * Add a local module to the autor and crmig ErrMgr components. These small modules prevent the 'orted' component from attempting a local recovery (Which does not work for MPI apps at the moment) 
 * Add a fix for bounding the checkpointable region between MPI_Init and MPI_Finalize. 

This commit was SVN r23587.

The following Trac tickets were found above:
  Ticket 1924 --> https://svn.open-mpi.org/trac/ompi/ticket/1924
  Ticket 2097 --> https://svn.open-mpi.org/trac/ompi/ticket/2097
  Ticket 2161 --> https://svn.open-mpi.org/trac/ompi/ticket/2161
  Ticket 2192 --> https://svn.open-mpi.org/trac/ompi/ticket/2192
  Ticket 2208 --> https://svn.open-mpi.org/trac/ompi/ticket/2208
  Ticket 2342 --> https://svn.open-mpi.org/trac/ompi/ticket/2342
  Ticket 2413 --> https://svn.open-mpi.org/trac/ompi/ticket/2413
2010-08-10 20:51:11 +00:00
Terry Dontje
b74ef351b7 Added new solaris sysinfo module. Also added code to assign
orte_local_chip_type and orte_local_chip_model in MPI processes if the
appropriate sysinfo module found the values on the machine.

This commit was SVN r23581.
2010-08-09 19:28:56 +00:00
Nysal Jan
b6524f6a92 Fix the conditional branch, jump to the correct location. Reported by Matthew Clark
This commit was SVN r23576.
2010-08-09 10:07:58 +00:00
Ralph Castain
9c69175117 If debug is enabled, provide an mca param and supporting logic to output when OPAL_ACQUIRE_THREAD is waiting and has obtained the thread, and when OPAL_RELEASE_THREAD releases it.
This commit was SVN r23557.
2010-08-05 16:25:32 +00:00
Shiqing Fan
b8db8d0ef8 Need to change another variable name.
This commit was SVN r23556.
2010-08-05 12:38:28 +00:00
Shiqing Fan
714883d472 A better way to make this work with VS 2010.
This commit was SVN r23544.
2010-08-03 09:06:50 +00:00
Shiqing Fan
e822f465b5 Remove a bunch of warnings due to the new POSIX supplement in VS 2010.
This commit was SVN r23540.
2010-08-02 12:16:29 +00:00
Josh Hursey
ba7e94dd89 Some relatively minor C/R related cleanup
* Fix a configure warning for checking --enable-ft-thread
 * In hnp and orted ErrMgr components check to see if other components have already recovered this process before trying to recover it again.
 * Fix 'npernode' for restarting using the resilient rmaps component
 * export ompi_info_set, so that internal functionality can use it.

This commit was SVN r23535.
2010-07-30 18:59:34 +00:00
Shiqing Fan
ea7bf2bd9e Correctly check the data type alignment for VS 2010 environment, and set the event include paths to global level, in order to make the clever VS load them.
This commit was SVN r23534.
2010-07-30 14:25:15 +00:00
Ralph Castain
0ed98967ed Update the thread protection in the ring_buffer class
This commit was SVN r23532.
2010-07-29 02:12:44 +00:00
Rolf vandeVaart
3d9b05ba2b Fix bug introduced by r23463. We now handle positive
error codes correctly again.  Also fix a typo.
Reviewed by Jeff Squyres. 

This commit was SVN r23531.

The following SVN revision numbers were found above:
  r23463 --> open-mpi/ompi@2af3e6e5ae
2010-07-28 19:19:27 +00:00
Jeff Squyres
f313257022 This file should really be distclean, not maintainer clean (it's not
shipped in the tarball).

This commit was SVN r23525.
2010-07-28 14:24:51 +00:00
Jeff Squyres
dca1ee8822 Revert r23495. Per on-list discussion, it doesn't do what it was
supposed to do, and there's disagreement about whether the concept
that it was supposed to do was the Right Thing anyway.

http://www.open-mpi.org/community/lists/devel/2010/07/8223.php

This commit was SVN r23517.

The following SVN revision numbers were found above:
  r23495 --> open-mpi/ompi@32e6dae8b0
2010-07-27 22:38:07 +00:00
Jeff Squyres
88b7923fc5 At least on NetBSD 5.0_STABLE with Libtool 2.2.6b, lt_dlerror() can
sometimes return NULL, so be sure to handle that case properly.

This commit was SVN r23503.
2010-07-27 14:15:53 +00:00
Jeff Squyres
245dc1a86d Add a cast to avoid a compiler warning on BSD.
This commit was SVN r23502.
2010-07-27 14:14:37 +00:00
Jeff Squyres
0ce1a82cde This commit looks much bigger than it is. There are only 2
substantive changes in this commit; the rest are minor style changes:

 1. Change an OBJ_NEW(opal_list_item_t) to OBJ_NEW(opal_if_t).  This
    was causing memory corruption in the BSD code paths.
 2. Move some local variables from the top of opal_if_init() to inside
    the non-BSD code paths so that we avoid bunches of warnings about
    unused variables when compiling on BSD.  In doing so, I indented
    the whole non-BSD section one level deeper, making the commit look
    huge. 

I also added a few {} around 1-line blocks, added some spaces, broke a
few lines, re-formatted a few comments, ...etc.  Trivial stuff.

This commit was SVN r23501.
2010-07-27 13:46:55 +00:00
Ralph Castain
b3a8a394f0 Cleanup some lingering references to OMPI_SETUP_C and OMPI_SETUP_CXX that generated warnings. Follow the new naming convention by changing OMPI_SETUP_ASM to OPAL_SETUP_ASM
This commit was SVN r23500.
2010-07-27 04:51:50 +00:00
Jeff Squyres
41edaa1fe5 While we're here, also rename this macro: it really should be
OPAL_SETUP_CC. 

This commit was SVN r23496.
2010-07-26 22:09:24 +00:00
Jeff Squyres
32e6dae8b0 Add -gstabs+ compiler switch if we're on OSX and -g is in CFLAGS and that flag works with a test compile
This commit was SVN r23495.
2010-07-26 22:05:41 +00:00
Shiqing Fan
71d2749b6b Fix a header problem on Windows.
This commit was SVN r23483.
2010-07-23 07:52:34 +00:00
Jeff Squyres
7d7c0aa48f Somehow the check for the specific value "external" got dropped in the
logic (even though the "else" clause for handling it was there).  This
commit puts back the specific check for the word "external".

Thanks to Jed Brown for noticing the issue.  Fixes trac:2503.

This commit was SVN r23475.

The following Trac tickets were found above:
  Ticket 2503 --> https://svn.open-mpi.org/trac/ompi/ticket/2503
2010-07-22 11:42:15 +00:00
Jeff Squyres
29c1ad4196 Forgot BEGIN/END C_DECLS.
This commit was SVN r23453.
2010-07-21 11:05:08 +00:00
Jeff Squyres
b3952e4f07 Use const for the opal_fd_write() function, just to be nice.
This commit was SVN r23452.
2010-07-21 11:01:16 +00:00
Jeff Squyres
ab5fc1b570 Add trivial functions to loop over read()'ing and write()'ing with a
file descriptor (i.e., read and write complete messages, transparently
handling partial reads/writes, EAGAIN, and EINTR).

This code effectively already exists in a few places in the code base;
this is mainly a consolidation.

This commit was SVN r23450.
2010-07-20 19:53:49 +00:00
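The helpers described in ab5fc1b570 (loop until a complete message has been read or written, retrying on partial transfers, EAGAIN, and EINTR) might look roughly like the sketch below. It is not the actual implementation: the `sketch_` names and return codes are hypothetical. Another entry in this log (r23650) says opal_fd_read() reports OPAL_ERR_TIMEOUT when the other end of the socket closes; the sketch uses its own `SKETCH_ERR_CLOSED` code for that case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_SUCCESS      0
#define SKETCH_ERR_CLOSED (-1)   /* hypothetical: peer closed before len bytes */
#define SKETCH_ERR_IO     (-2)   /* hypothetical: unrecoverable errno */

/* Read exactly len bytes from fd into buffer, retrying short reads. */
int sketch_fd_read(int fd, size_t len, void *buffer)
{
    char *p = buffer;

    assert(buffer != NULL || len == 0);
    while (len > 0) {
        ssize_t rc = read(fd, p, len);
        if (rc < 0) {
            if (errno == EINTR || errno == EAGAIN) {
                continue;            /* transient: simply try again */
            }
            return SKETCH_ERR_IO;
        }
        if (rc == 0) {
            return SKETCH_ERR_CLOSED; /* end of file before the full message */
        }
        p += rc;
        len -= (size_t)rc;
    }
    return SKETCH_SUCCESS;
}

/* Write exactly len bytes from buffer to fd, retrying short writes. */
int sketch_fd_write(int fd, size_t len, const void *buffer)
{
    const char *p = buffer;

    assert(buffer != NULL || len == 0);
    while (len > 0) {
        ssize_t rc = write(fd, p, len);
        if (rc < 0) {
            if (errno == EINTR || errno == EAGAIN) {
                continue;
            }
            return SKETCH_ERR_IO;
        }
        p += rc;
        len -= (size_t)rc;
    }
    return SKETCH_SUCCESS;
}
```

Retrying immediately on EAGAIN spins on a non-blocking descriptor; a caller that cares about CPU use would wait for readiness instead.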
Jeff Squyres
64cb8f5d7f Another round of man page cleanups from Debian maintainer Manuel
Prinz.  Many thanks!

This commit was SVN r23445.
2010-07-20 14:07:18 +00:00
Christopher Yeoh
8a3d5d4e1c Adds missing sys/stat.h include needed for more recent versions of glibc
This commit was SVN r23440.
2010-07-20 06:31:16 +00:00
Jeff Squyres
5ab634555a Apparently, Cisco plans to be working on Open MPI for a veeeeery long time!
This commit was SVN r23433.
2010-07-19 19:31:59 +00:00
Jeff Squyres
57d89d1c0c Remove a lot of kruft from the hwloc paffinity directory that we're
not using in Open MPI (i.e., that stuff is only used in the standalone
builds of hwloc -- it's not compiled/installed/used by Open MPI).

This commit was SVN r23416.
2010-07-14 20:46:47 +00:00
Jeff Squyres
dc7d30b0ed We (Ralph and Jeff) discovered that if the OPAL_DESTDIR environment
variable was set, it was prefixed to ''all'' values in the wrapper
compiler data text files.  For example, if OPAL_DESTDIR was set to
/tmp/bogus and a wrapper compiler data file contained the line:

  preprocessor_flags=-pthread

The value would be expanded to:

  /tmp/bogus/-pthread

Which is clearly wrong.  After some back-and-forth with Ralph and
Brian, Brian submitted this patch that fixes the problem.  Now we
handle three cases properly (assume that configure was invoked with
--prefix=/opt/openmpi and no other directory specifications, and
$OPAL_DESTDIR is set to /tmp/buildroot):

1. Individual directories, such as libdir.  These need to be prepended
with DESTDIR.  I.e., return /tmp/buildroot/opt/openmpi/lib.

2. Compiler flags that have ${FIELD} values embedded in them.  For
example, consider if a wrapper compiler data file contains the
line:

  preprocessor_flags=-DMYFLAG="${prefix}/share/randomthingy/"

The value we should return is:

  -DMYFLAG="/tmp/buildroot/opt/openmpi/share/randomthingy/"

3. Compiler flags that do not have any ${FIELD} values.  For example,
consider if a wrapper compiler data file contains the line:

  preprocessor_flags=-pthread

The value we should return is:

  -pthread

Note, too, that this OPAL_DESTDIR futzing only needs to occur during
opal_init().  By the time opal_init() has completed, all values should
be substituted in that need substituting.  Hence, we take an extra
parameter (is_setup) to know whether we should do this futzing or
not.

This commit was SVN r23402.
2010-07-14 00:53:08 +00:00
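The three cases listed in dc7d30b0ed can be sketched as a single expansion routine. This is a simplified illustration, not the patch itself: `sketch_expand` is a hypothetical name, only the `${prefix}` field is handled, libdir is assumed to be written as `${prefix}/lib`, and passing a NULL `destdir` stands in for the is_setup check that switches the prefixing off.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Expand each "${prefix}" in input. When destdir is non-NULL, every
 * expansion becomes destdir + prefix; text outside a ${prefix} reference
 * is copied unchanged. Returns a malloc'd string (caller frees) or NULL.
 */
char *sketch_expand(const char *input, const char *prefix, const char *destdir)
{
    static const char token[] = "${prefix}";
    const size_t token_len = sizeof(token) - 1;
    const size_t dest_len = (destdir != NULL) ? strlen(destdir) : 0;
    size_t prefix_len, hits = 0;
    const char *p;
    char *out, *w;

    assert(input != NULL && prefix != NULL);
    prefix_len = strlen(prefix);

    /* Count references so the output buffer can be sized up front. */
    for (p = strstr(input, token); p != NULL; p = strstr(p + token_len, token)) {
        hits++;
    }
    out = malloc(strlen(input) + hits * (dest_len + prefix_len) + 1);
    if (out == NULL) {
        return NULL;
    }

    w = out;
    for (p = input; *p != '\0'; ) {
        if (strncmp(p, token, token_len) == 0) {
            if (dest_len > 0) {
                memcpy(w, destdir, dest_len);   /* DESTDIR goes in front of the path only */
                w += dest_len;
            }
            memcpy(w, prefix, prefix_len);
            w += prefix_len;
            p += token_len;
        } else {
            *w++ = *p++;                        /* plain flag text is never prefixed */
        }
    }
    *w = '\0';
    return out;
}
```

Because DESTDIR is attached at the point of expansion rather than to the whole value, a flag such as -pthread passes through untouched.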
Shiqing Fan
cdc7e0bec9 Mainly type casts.
Get rid of pthread and other unnecessary stuff for Windows.

This commit was SVN r23376.
2010-07-12 16:17:56 +00:00
Jeff Squyres
c8bb7537e7 Remove include/opal/sys/cache.h -- its only purpose in life was to
#define CACHE_LINE_SIZE to 128.  This name has a conflict on NetBSD,
and it seems kinda odd to have a header file that ''only'' defines a
single value.  Also, we'll soon be raising hwloc to be a first-class
item, so having this file around seemed kinda weird.

Therefore, I replaced CACHE_LINE_SIZE with opal_cache_line_size, an
int (in opal/runtime/opal_init.c and opal/runtime/opal.h) on the
rationale that we can fill this in at runtime with hwloc info (trunk
and v1.5/beyond, only).  The only place we ''needed'' a compile-time
CACHE_LINE_SIZE was in the BTL SM (for struct padding), so I made a
new BTL_SM_ preprocessor macro with the old CACHE_LINE_SIZE value
(128).  That use isn't suitable for run-time hwloc information,
anyway.

This commit was SVN r23349.
2010-07-06 14:33:36 +00:00
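The distinction drawn in c8bb7537e7 between a run-time cache line size and a compile-time one can be sketched as follows. The names are hypothetical (the commit does not give the full name of the new macro, and it is not guessed here); the sketch only shows why struct padding needs a preprocessor constant while other uses can read a variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time constant: array sizes in a struct need one. */
#define SKETCH_SM_CACHE_LINE_SIZE 128

/* Hypothetical run-time value: could be refined at startup from hardware info. */
int sketch_cache_line_size = 128;

/* Padded so that two counters never share a cache line. */
typedef struct {
    volatile int head;
    char pad[SKETCH_SM_CACHE_LINE_SIZE - sizeof(int)];
} sketch_padded_counter_t;

_Static_assert(sizeof(sketch_padded_counter_t) == SKETCH_SM_CACHE_LINE_SIZE,
               "padding must fill exactly one assumed cache line");

/* Run-time uses, such as rounding a buffer size, can use the variable. */
size_t sketch_round_up_to_cache_line(size_t nbytes)
{
    size_t line = (size_t)sketch_cache_line_size;

    assert(line > 0);
    return ((nbytes + line - 1) / line) * line;
}
```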
Jeff Squyres
10185343a7 Ensure that we're actually checking for *linux*. Thanks to Aleksej
Saushev for the patch.

This commit was SVN r23336.
2010-07-01 23:26:49 +00:00
Jeff Squyres
6d07a1cc0b Per comments in this commit, hwloc isn't able to find cores on all
platforms (e.g., PPC64 running RHEL 5.4) -- sometimes it only finds
PUs.  So in that case, just run the same calculation, but with PUs
instead of cores.

This commit was SVN r23305.
2010-06-25 21:36:53 +00:00
Ralph Castain
f325ac030a Add a function to prepend a string to the beginning of an argv array - useful when building app_contexts from user input
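
For illustration only, a prepend operation on a NULL-terminated argv array could look like the sketch below. The name argv_prepend_sketch() and its return convention are assumptions; the actual function added by this commit is not named in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the OPAL API): insert a copy of arg at the
 * front of a NULL-terminated, heap-allocated argv array.  *argv may
 * be NULL for an empty array.  Returns 0 on success, -1 on allocation
 * failure (in which case *argv is left untouched). */
int argv_prepend_sketch(char ***argv, const char *arg)
{
    size_t n = 0;
    char *copy;
    char **grown;

    assert(argv != NULL && arg != NULL);

    /* Count the existing entries. */
    if (*argv != NULL) {
        while ((*argv)[n] != NULL) {
            n++;
        }
    }

    copy = malloc(strlen(arg) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, arg);

    /* Room for the existing entries, the new one, and the NULL. */
    grown = realloc(*argv, (n + 2) * sizeof(char *));
    if (grown == NULL) {
        free(copy);
        return -1;
    }

    /* Shift the existing entries up by one slot. */
    memmove(grown + 1, grown, n * sizeof(char *));
    grown[0] = copy;
    grown[n + 1] = NULL;
    *argv = grown;
    return 0;
}
```

Prepending "hostname" and then "mpirun" to an empty array yields { "mpirun", "hostname", NULL }.
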
This commit was SVN r23303.
2010-06-24 15:52:36 +00:00
Jeff Squyres
5cdd79ef13 Oops -- set the bits one at a time via _set. Using _cpu effectively
zeroed out the cpuset before setting the bit (i.e., we always had a
cpuset of 1).

This commit was SVN r23298.
2010-06-23 20:56:59 +00:00
Jeff Squyres
6bcdadbf0e If we're not building project_ompi, don't do anything with C++. Also
rename OMPI_CHECK_ATTRIBUTES -> OPAL_CHECK_ATTRIBUTES, because it's in
OPAL (somehow that name must have gotten missed in the Great M4 split
of '10...?)

This commit was SVN r23267.
2010-06-12 03:15:47 +00:00
Jeff Squyres
8ce59bb3e3 Use HWLOC_EMBEDDED_LIBS properly (new variable as of 1.0.2a12214).
Should fix some Solaris build issues.

This commit was SVN r23266.
2010-06-09 19:58:42 +00:00
Jeff Squyres
2887fe77c5 Refresh hwloc to an as-yet unreleased tarball from the hwloc 1.0
release branch in order to fix some Solaris bugs.

This commit was SVN r23265.
2010-06-09 19:56:18 +00:00
Jeff Squyres
f1a7b5cc33 Make "processor affinity not supported" error message a little better:
 * Remove OPAL_ERR_PAFFINITY_NOT_SUPPORTED; fit it into the generic
   OPAL_ERR_NOT_SUPPORTED case.
 * When odls_default detects that processor affinity is not supported,
   it prints a specific message about it, and then it suppresses a
   generic HNP help message that would normally follow it (i.e., it's
   easier to have the "processor affinity is not supported" show_help
   message last).
 * Use some symbolic names in odls_default instead of fixed int's,
   just for slight readability improvements in the code.
 * Introduce orte_show_help_suppress(), which gives the ability to
   suppress any future showings of any arbitrary show_help() message.
   This is useful if you display message X and want to suppress
   message Y.  This suppression *only* works in environments where
   orte_show_help() does coalescing.

This commit was SVN r23249.
2010-06-08 20:16:07 +00:00
Shiqing Fan
43bd92272a Remove an unnecessary inline definition, in order to solve the conflict of function exporting on Windows.
This commit was SVN r23230.
2010-06-01 15:44:46 +00:00
Jeff Squyres
61f5528ec4 Update to hwloc 1.0.1rc1:
 * Should fix the issues with 32 bit builds on 64 bit platforms
 * A few windows fixes
 * A few other minor / misc fixes

This commit was SVN r23226.
2010-06-01 14:51:25 +00:00
Jeff Squyres
e41603fb64 Add files into 3 directories that would not otherwise exist in a
distribution tarball, and would therefore cause automake to fail (in
case someone invokes autogen.sh on a distribution tarball).

This commit was SVN r23218.
2010-05-28 19:33:22 +00:00
Jeff Squyres
befc0b590b Fix the --disable-dlopen case -- don't expect to build or link anything.
This commit was SVN r23198.
2010-05-21 17:46:46 +00:00
Jeff Squyres
fec7918eea Some paffinity functions had their return status overloaded:
 * If < 0, it's an OPAL_ERR_* value
 * If >= 0, it's the actual output value of the function

This is problematic for the OPAL_SOS stuff.  This commit changes those
functions to always return OPAL_* statuses and send the output value
back through output parameters (like 95% of the rest of the code
base).  This avoids the confusion with OPAL_SOS stuff and makes
paffinity work again (e.g., mpirun --bind-to-core ...).

I updated all paffinity modules for the new function signatures, and
bumped the paffinity API version up to 2.0.1.  I don't think the
version change will matter, though, because we'll be introducing
support for hardware threads soon, which will either bump the
paffinity version again or we'll replace paffinity with 
a new framework.
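
As a rough sketch of the signature change (all names and values here are hypothetical, not the paffinity API), compare a function whose return value doubles as the answer with one that always returns a status and passes the answer back through an output parameter:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for OPAL_* values. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_NOT_FOUND = -13 };

/* Made-up mapping from logical core number to OS core id. */
static const int sketch_core_os_ids[] = { 0, 2, 4, 6 };
#define SKETCH_NUM_CORES 4

/* Old style: the return value is either an error (< 0) or the
 * answer (>= 0), so callers cannot compare it against a status. */
int sketch_get_core_id_overloaded(int logical)
{
    if (logical < 0 || logical >= SKETCH_NUM_CORES) {
        return SKETCH_ERR_NOT_FOUND;
    }
    return sketch_core_os_ids[logical];
}

/* New style: the return value is always a status; the answer travels
 * through the output parameter and is only written on success. */
int sketch_get_core_id(int logical, int *os_id)
{
    assert(os_id != NULL);
    if (logical < 0 || logical >= SKETCH_NUM_CORES) {
        return SKETCH_ERR_NOT_FOUND;
    }
    *os_id = sketch_core_os_ids[logical];
    return SKETCH_SUCCESS;
}
```

With the second form, a check such as (SKETCH_SUCCESS != ret) is always meaningful, which is the property the status-encoding code relies on.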

This commit was SVN r23197.
2010-05-21 16:55:28 +00:00
Jeff Squyres
208953f1bf Grr -- also don't reset LIBLTDL unless we're using an external libltdl
build. 

This commit was SVN r23194.
2010-05-21 15:00:03 +00:00
Shiqing Fan
857f1669e2 Solve a few compilation problems on Windows.
This commit was SVN r23193.
2010-05-21 14:30:15 +00:00
Jeff Squyres
473547481b Don't reset LTDLINCL unless we're using an external libltdl
installation. 

This commit was SVN r23192.
2010-05-21 13:58:53 +00:00
Jeff Squyres
e597c4f9cd Add --with-libltdl option to allow building Open MPI with an external installation of libltdl. Fixes trac:2407
This commit was SVN r23189.

The following Trac tickets were found above:
  Ticket 2407 --> https://svn.open-mpi.org/trac/ompi/ticket/2407
2010-05-20 22:42:02 +00:00
Josh Hursey
71fa89aca5 Move the sos_init() after the initialization of opal_show_help.
I was getting a funny segv if the param_register failed, and show_help was not initialized yet.

This commit was SVN r23177.
2010-05-19 20:47:05 +00:00
Abhishek Kulkarni
c63c4d6892 Fix bugs where (OMPI_ERROR == *) checks cannot be converted to (OMPI_SUCCESS != *) since the return codes are overloaded to return an "index" on success.
The fix is to just check if the return value is positive or not, since all the SOS encoded errors are *always* negative.

The real fix (as Ralph points out) is to change these functions (opal_pointer_array_add and mca_base_param*) to return the index as a pointer.

This commit was SVN r23173.
2010-05-18 20:54:11 +00:00
Jeff Squyres
32417b9802 Bump up to hwloc v1.0.
This commit was SVN r23171.
2010-05-18 17:11:45 +00:00
Abhishek Kulkarni
0b3e5f5d79 Silence an opal_sos compiler warning.
This commit was SVN r23163.
2010-05-17 23:14:44 +00:00
Abhishek Kulkarni
afbe3e99c6 * Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with
(OMPI_ERR_* == OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a
 SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns
 back the native error code.

* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
  (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
  decode 'ret' to get the native error code.

This commit was SVN r23162.
2010-05-17 23:08:56 +00:00
Abhishek Kulkarni
b0e963299a Adding a new function to return the stack trace (not including the call to the function itself)
as a string (which must be freed by the caller).

This commit was SVN r23160.
2010-05-17 22:57:42 +00:00
Abhishek Kulkarni
5e05546194 Adding SOS headers and package data to the Makefile.
This commit was SVN r23159.
2010-05-17 22:53:33 +00:00
Abhishek Kulkarni
4e33e6aeaa Merge OPAL SOS into the trunk.
The OPAL SOS framework tries to meet the following objectives:

 * reduce the cascading error messages and the amount of code needed to print an error message.
 * build and aggregate stacks of encountered errors and associate related individual errors with each other.
 * allow registration of custom callbacks to intercept error events.

For more information, refer to
https://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages

This commit was SVN r23158.
2010-05-17 22:51:52 +00:00
Jeff Squyres
b43288f01e Add missing header file
This commit was SVN r23154.
2010-05-17 21:31:24 +00:00
Jeff Squyres
b0cfe91eca Re-enable hwloc component; it should be working now.
I forgot to mention one more thing in the r23152 commit message:

 * Copy the fix for hwloc's m4 to disable the configure flag
   --enable-debug when building in embedding mode, because it can be
   hijacked by the outer-level application.  In this case, if you
   configured OMPI with --enable-debug (or have --enable-debug in a
   platform file), you'd see all of hwloc's debug output.  Ick.  hwloc
   1.0 will include this fix.

This commit was SVN r23153.

The following SVN revision numbers were found above:
  r23152 --> open-mpi/ompi@ca3362021e
2010-05-17 21:07:57 +00:00
Jeff Squyres
ca3362021e Fix some problems noted by Ralph:
 * Fix disabling hwloc build (i.e., put the AM_CONDITIONALs where they
   belong in the configure.m4 file)
 * Update some svn:ignores
 * r23142 removed some extraneous code, but forgot to remove the
   variables used only by that code

This commit was SVN r23152.

The following SVN revision numbers were found above:
  r23142 --> open-mpi/ompi@610fc67d12
2010-05-17 21:05:27 +00:00
Ralph Castain
cc8ebe7dd5 Protect against NULL when looking for an MCA param in an environment
This commit was SVN r23151.
2010-05-17 02:50:39 +00:00
Ralph Castain
12590202d8 Cleanup warnings
This commit was SVN r23148.
2010-05-16 20:22:00 +00:00
Ralph Castain
da170a7ab9 Turn off the blasted hwloc component as it generates a ton of garbage. Note that this means linux-based systems will -not- have paffinity for now since the good old plpa module was removed.
Clean up some missing ignores

This commit was SVN r23147.
2010-05-16 20:06:14 +00:00
Jeff Squyres
e2ab4f2baf Should be working now...
This commit was SVN r23143.
2010-05-14 15:20:47 +00:00
Jeff Squyres
610fc67d12 Oops -- don't convert to a processor ID here; just return the OS index
of the core.

This commit was SVN r23142.
2010-05-14 15:14:28 +00:00
Jeff Squyres
a27da2473a Ensure the whole directory is built.
This commit was SVN r23140.
2010-05-14 13:21:09 +00:00
Jeff Squyres
3ba4086b0f Remove another debugging message.
This commit was SVN r23139.
2010-05-14 13:20:46 +00:00
Jeff Squyres
a1848ef8d5 Arf. Ignore this component while I fix vpath builds...
This commit was SVN r23138.
2010-05-14 13:03:02 +00:00
Jeff Squyres
2d01a67516 Remove these generated files from SVN.
This commit was SVN r23137.
2010-05-14 11:58:17 +00:00
Jeff Squyres
8c8efa9bf3 Remove debugging message.
This commit was SVN r23136.
2010-05-14 11:57:43 +00:00
Jeff Squyres
21178f9379 Remove the "linux" paffinity component (i.e., the one that was based
on the now-defunct PLPA) -- the new hwloc component supersedes it.  

So long, PLPA -- we loved ya!

This commit was SVN r23126.
2010-05-13 23:59:21 +00:00
Jeff Squyres
3129ccd9ec Make the hwloc paffinity component available for everyone. hwloc
supports a wide variety of operating systems and platforms; see the
opal/mca/paffinity/hwloc/hwloc/README file for details.

This component includes an embedded copy of hwloc, currently based on
hwloc-1.0rc6.  But note that hwloc is properly SVN imported into the
/vendor branch, so it will be easy to update when 1.0 GA is released.
Note that the hwloc tree embedded in opal/mca/paffinity/hwloc/hwloc is
identical to a hwloc distribution tarball, except that much of the
documentation was rm -rf'ed (because we don't need it for the embedded
case).

Since the paffinity framework currently does not understand hardware
threads, the hwloc component compensates for this by identifying cores
by the "first" hardware thread on that core.  Hopefully we'll update
paffinity someday to understand hardware threads.  :-)

configure grew a --with-hwloc option, analogous to what we do for many
other external libraries that OMPI supports.  However, there's a new
feature: due to the request of several distros, OMPI can be configured
to build with its internal copy of hwloc or with an external copy of
hwloc (e.g., a system-installed hwloc).

 1. If --with-hwloc is not specified, Open MPI will try to use its
    internal copy (but silently fail/ignore hwloc if that fails).
 2. If --with-hwloc=<dir> is supplied, Open MPI looks for hwloc
    support in <dir> (and --with-hwloc-libdir=<dir>, if specified).
 3. If --with-hwloc=external is supplied, Open MPI will look for hwloc
    in a compiler/linker default external location.
 4. If --with-hwloc=internal is supplied, Open MPI will use its
    internal copy of hwloc.

Some of OMPI's main configury had to be slightly re-arranged in the
bootstrapping phase to accommodate hwloc's configury needs.

This commit was SVN r23125.
2010-05-13 23:56:05 +00:00
Jeff Squyres
ca6d95a9c8 Clean up some comments; make paffinity/base/base.h comments agree with
paffinity/paffinity.h. 

This commit was SVN r23124.
2010-05-13 23:43:28 +00:00
Jeff Squyres
bf7954c1de Bump up to 1.0rc6 from the vendor branch.
This commit was SVN r23117.
2010-05-12 17:04:48 +00:00
Jeff Squyres
c7c3de87f5 Add ummunotify support to Open MPI. See
http://marc.info/?l=linux-mm-commits&m=127352503417787&w=2 for more
details.

 * Remove the ptmalloc memory component; replace it with a new "linux"
   memory component.
 * The linux memory component will conditionally compile in support
   for ummunotify.  At run-time, if it has ummunotify support and
   finds run-time support for ummunotify (i.e., /dev/ummunotify), it
   uses it.  If not, it tries to use ptmalloc via the glibc memory
   hooks. 
 * Add some more API functions to the memory framework to accommodate
   the ummunotify model (i.e., poll to see if memory has "changed").
 * Add appropriate calls in the rcache to the new memory APIs to see
   if memory has changed, and to react accordingly.
 * Add a few comments in the openib BTL to indicate why we don't need
   to notify the OPAL memory framework about specific instances of
   registered memory.
 * Add dummy API calls in the solaris malloc component (since it
   doesn't have polling/"did memory change" support).

This commit was SVN r23113.
2010-05-11 21:43:19 +00:00
Ralph Castain
5965d3e620 Include the new error code in the error strings
This commit was SVN r23111.
2010-05-07 18:09:08 +00:00
Ralph Castain
d6a1d7a082 Little more cleanup on paffinity. Provide a specific error code for affinity not supported so we can better report the problem. Move the error reporting to orterun so we only get one error message. Update the darwin paffinity module to return the correct new error codes.
This commit was SVN r23107.
2010-05-07 14:04:55 +00:00
Ralph Castain
d4f56cff61 More cleanup on paffinity....groan
It is okay to not have a paffinity module IF you aren't using paffinity anyway. So don't error out of MPI_Init because a paffinity module wasn't selected.

Cleanup error reporting in the odls default module to (once and for all!) eliminate messages originating in the fork'd process. Create some new error codes to allow us to pass enough info back to the parent process to provide useful error messages.

This commit was SVN r23106.
2010-05-06 20:57:17 +00:00
Jeff Squyres
71cbe1a69f Bump up to hwloc v1.0rc3
This commit was SVN r23070.
2010-04-29 15:59:01 +00:00
Jeff Squyres
f064056a07 We don't need all this stuff in OMPI.
This commit was SVN r23056.
2010-04-28 00:31:15 +00:00
Jeff Squyres
2fe1bc043d Bump up to hwloc 1.0rc2
This commit was SVN r23042.
2010-04-26 21:57:51 +00:00
Ralph Castain
13a7338289 Ensure we get past the '=' in the parameter
This commit was SVN r23039.
2010-04-26 20:46:50 +00:00
Ralph Castain
e1b9f400ba Add some new utilities that support searching an environ string list (not just our own environ) for specific MCA params and returning their value. Helpful when a daemon needs to check an app_context's environ for params that can impact how the daemon launches and/or interacts with it, but don't pertain to the daemon's own environ.
This commit was SVN r23034.
2010-04-26 03:35:09 +00:00
Jeff Squyres
ea8b0ea569 Add a new function in the paffinity base:
opal_paffinity_base_cset2str().  This function basically makes a
prettyprint string out of an opal_paffinity_base_cset_t.

This commit was SVN r23017.
2010-04-21 17:26:36 +00:00
Shiqing Fan
d1e66bdd01 Use variables instead of hard-coded compiler flags, in order to support various C/C++ compilers on Windows.
This commit was SVN r23016.
2010-04-21 12:45:00 +00:00
Christopher Yeoh
cab7982c7e fixes trac:2355 - race in interaction between opal_atomic_lifo_push
and opal_atomic_lifo_pop. Adds memory barriers to remove the race
condition

This commit was SVN r23014.

The following Trac tickets were found above:
  Ticket 2355 --> https://svn.open-mpi.org/trac/ompi/ticket/2355
2010-04-21 00:00:14 +00:00
Jeff Squyres
53ab6600e6 Minor update to comments.
This commit was SVN r23013.
2010-04-20 20:59:42 +00:00
Jeff Squyres
f1d4a748eb Minor fix: pass by pointer to the new function so that the caller
can see the results.

This commit was SVN r23012.
2010-04-20 19:52:47 +00:00
Ralph Castain
7717c970a3 Ahem...it requires 2 hex chars to describe each byte of a bitmask...
This commit was SVN r23001.
2010-04-20 05:11:16 +00:00
Ralph Castain
86228aee38 Provide two new opal paffinity utilities for printing a hex representation of the cpu set and parsing that string back into a cpu set on the other end. Also add a new MCA param for passing the cpu set applied to a process during launch down to that process so it can know what we attempted to do.
All to be used in some new MPI extensions provided by Jeff so that users can easily query their binding situation.
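
A minimal sketch of such a hex round trip, assuming the cpu set is handled as a plain byte array. The function names are hypothetical and the real utilities may format the string differently; the point is only that each byte takes exactly two hex characters, so the string buffer must hold 2 * len + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write mask as lowercase hex into out.
 * Returns 0 on success, -1 if out is too small. */
int mask_to_hex_sketch(const unsigned char *mask, size_t len,
                       char *out, size_t out_size)
{
    size_t i;

    assert(mask != NULL && out != NULL);
    if (out_size < 2 * len + 1) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        /* Two hex chars plus the NUL that snprintf always writes. */
        snprintf(out + 2 * i, 3, "%02x", (unsigned)mask[i]);
    }
    out[2 * len] = '\0';
    return 0;
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_sketch(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: parse exactly 2 * len hex chars back into
 * mask.  Returns 0 on success, -1 on a malformed string. */
int hex_to_mask_sketch(const char *hex, unsigned char *mask, size_t len)
{
    size_t i;

    assert(hex != NULL && mask != NULL);
    if (strlen(hex) != 2 * len) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        int hi = hex_digit_sketch(hex[2 * i]);
        int lo = hex_digit_sketch(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) {
            return -1;
        }
        mask[i] = (unsigned char)((hi << 4) | lo);
    }
    return 0;
}
```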

This commit was SVN r22998.
2010-04-19 22:16:35 +00:00
Jeff Squyres
338920656f Remove the compile-time priorities for paffinity modules (they were
done this way a long time ago for the "gee whiz!" factor -- when in
reality, they really only need one-of-many-run-time priority
selection).

Changed run-time priorities to be as follows:

 * darwin: 20
 * linux: 20
 * posix: 10
 * solaris: 30
 * test: 5
 * windows: 20

I have a very dim (possibly untrue) recollection that Solaris needs to
have a higher priority than others just to ensure that no other is
chosen under Solaris.  Make all other "native" components have a
priority of 20 (they shouldn't conflict with each other).  Make the
posix fallback component have a priority of 10.  Make the test
component priority 5, meaning someone can always select it, but you
can also make a "never select me" component that prioritizes itself
under test.
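
Run-time priority selection of this kind boils down to "pick the highest-priority component that is usable here". The sketch below is an illustration with hypothetical types and names, not the MCA selection code; the availability flags are assumed inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct component_sketch {
    const char *name;
    int priority;
    bool available;    /* assumed: whether it can run on this host */
};

/* Return the available component with the highest priority, or NULL
 * when none is available.  Ties go to the earliest entry. */
const struct component_sketch *
select_component_sketch(const struct component_sketch *list, size_t n)
{
    const struct component_sketch *best = NULL;
    size_t i;

    assert(list != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!list[i].available) {
            continue;
        }
        if (best == NULL || list[i].priority > best->priority) {
            best = &list[i];
        }
    }
    return best;
}
```

On a Linux host where linux (20), posix (10) and test (5) are all usable, this picks linux; if linux were unavailable, it would fall back to posix.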

This commit was SVN r22997.
2010-04-19 22:14:06 +00:00
Jeff Squyres
9f5ddbcc6e 3rd party import hwloc 1.0rc1 into the SVN trunk
This commit was SVN r22996.
2010-04-19 19:48:58 +00:00
Jeff Squyres
8b163ccd70 Add dummy hwloc directory for staged import into svn
This commit was SVN r22994.
2010-04-19 19:43:43 +00:00
Ralph Castain
4d06125a33 Establish a method by which a process knows if it has been bound by mpirun. This helps resolve a problem where a process gets "bound" to all available resources, which looks to the opal paffinity system as "not bound". This can cause mpi_init to attempt to "bind" the process itself, causing unintended behavior.
This commit was SVN r22985.
2010-04-17 01:58:26 +00:00
Ralph Castain
41428e6b61 Issue a warning if a requested binding operation results in processes being bound to all available processors, which is the equivalent of not being bound at all.
See the following email thread for further details:

http://www.open-mpi.org/community/lists/devel/2010/04/7745.php

This commit was SVN r22984.
2010-04-17 01:02:41 +00:00
Jeff Squyres
798202c424 Allow the mca_component_path to change over time.
This commit was SVN r22957.
2010-04-12 22:02:34 +00:00
Jeff Squyres
f77257d931 These don't belong in this file.
This commit was SVN r22956.
2010-04-12 20:50:23 +00:00
Jeff Squyres
1919ba225d Allow static_components to be NULL for cases where we ''know'' there
will be no static components to be searched.

This commit was SVN r22954.
2010-04-12 14:51:47 +00:00
Shiqing Fan
96b20a29b5 An easy solution to make singleton work on Windows.
This commit was SVN r22952.
2010-04-10 16:30:59 +00:00
Terry Dontje
282a537cf7 This commit fixes 2370, by having the solaris paffinity module return error codes for get_physical_processor_id and having odls_default_fork_local_proc check get_physical_processor_id for OPAL_ERROR
This commit was SVN r22948.
2010-04-09 15:10:46 +00:00
Ralph Castain
2b8ab61328 Add another helpful macro
This commit was SVN r22934.
2010-04-06 22:40:45 +00:00
Brad Benton
58a9aeff5a Modify the OPAL_PAFFINITY_PROCESS_IS_BOUND macro to search the cpuset for
the maximum possible number of cpus rather than just the number of cpus
currently online.  This corrects a problem where mpi_paffinity_alone was
not working properly on systems in which there can be cpu namespaces with
holes, such as on ppc64 with smt off (as discussed in #2365).
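
To see why the scan range matters, here is a small sketch with hypothetical names. It assumes a process counts as "bound" when its mask holds at least one CPU but fewer than the number of online CPUs; that criterion is an assumption of the sketch, not a quote of the macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CPU_MAX 64   /* hypothetical upper bound on CPU ids */

/* Count the set bits among CPU ids [0, limit). */
static int sketch_count_bits(uint64_t mask, int limit)
{
    int i, n = 0;

    for (i = 0; i < limit; i++) {
        if (mask & ((uint64_t)1 << i)) {
            n++;
        }
    }
    return n;
}

/* scan_limit is how many CPU ids are examined.  Passing num_online
 * reproduces the old behavior; passing SKETCH_CPU_MAX is the fix. */
bool sketch_is_bound(uint64_t mask, int num_online, int scan_limit)
{
    int n;

    assert(scan_limit >= 0 && scan_limit <= SKETCH_CPU_MAX);
    n = sketch_count_bits(mask, scan_limit);
    return 0 < n && n < num_online;
}
```

Take a machine whose four online CPUs are numbered 0, 4, 8 and 12, and a process bound to CPU 4. Scanning only ids 0 through 3 finds no bits and reports "not bound"; scanning the full range finds the bit and reports "bound".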

This commit was SVN r22927.
2010-04-02 18:24:12 +00:00
Jeff Squyres
8a85c4617f Fixes trac:2366: dragonboy noticed that the PGI compiler is picky about
#if directives -- had to change a pair of #if conditionals in
opal/util/stacktrace.c to make the PGI compiler accept it.

This commit was SVN r22923.

The following Trac tickets were found above:
  Ticket 2366 --> https://svn.open-mpi.org/trac/ompi/ticket/2366
2010-04-01 17:04:06 +00:00
Josh Hursey
62f8d3c471 r22885 missed a few symbol updates when it changed ompi_want_ft to opal_want_ft
This commit was SVN r22916.

The following SVN revision numbers were found above:
  r22885 --> open-mpi/ompi@522a23d6a3
2010-03-30 16:47:39 +00:00
Ralph Castain
522a23d6a3 A few changes to the FT-related configure options:
1. fix a bug that caused an infinite loop in configure when specifying want-ft but not want-ft-thread by removing a stale reference to the opal-progress-thread option

2. add want-ft=orcm so we can build the orcm errmgr component

3. cleanup the use of "ompi_want_ft_xxx" and replace it with "opal_want_ft_xxx" so that naming conventions are preserved

This commit was SVN r22885.
2010-03-25 22:53:48 +00:00
Christopher Yeoh
cd5294944b fixes trac:2355 - race in opal_atomic_lifo
Adds memory barriers to remove race condition which can
occur on PowerPC architectures (and probably others)

This commit was SVN r22880.

The following Trac tickets were found above:
  Ticket 2355 --> https://svn.open-mpi.org/trac/ompi/ticket/2355
2010-03-25 03:44:38 +00:00
Jeff Squyres
c26dae01ce Update the if.c code to properly use the OBJ_* system.
This commit was SVN r22869.
2010-03-23 20:37:06 +00:00
Jeff Squyres
59126b1e0b Update copyrights.
This commit was SVN r22867.
2010-03-23 12:03:20 +00:00
Shiqing Fan
9591680ec0 One of the binaries was generated from a wrong source.
This commit was SVN r22865.
2010-03-23 09:56:11 +00:00
Jeff Squyres
136f926fd1 Really fixes trac:2104. There is a lengthy discussion about this patch on
#2322.

The short version is that this patch consolidates two pieces of code
that call the back-end munmap and ensures that (if dlsym is used) the
corresponding dlsym is only invoked once and that the variable holding
the result is volatile.

This commit was SVN r22863.

The following Trac tickets were found above:
  Ticket 2104 --> https://svn.open-mpi.org/trac/ompi/ticket/2104
2010-03-23 01:04:25 +00:00
Ralph Castain
df2d361b2b Add a pair of convenience macros for handling threads to minimize code duplication
This commit was SVN r22861.
2010-03-22 15:45:03 +00:00
Ralph Castain
b400b84162 Merge in the modified thread configure option branch per today's telecon.
Remove the --enable-progress-threads option as this is no longer functional, and hardcode OPAL_ENABLE_PROGRESS_THREADS to 0.

Replace the --enable-mpi-threads option with --enable-mpi-thread-multiple as this is clearer as to meaning. This option automatically turns "on" opal thread support if it wasn't already so specified. If the user specifies --disable-opal-multi-threads --enable-mpi-thread-multiple, we will error out with a message

Add a new --enable-opal-multi-threads option that turns "on" opal thread support without doing anything wrt mpi-thread-multiple
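
The option interactions above amount to a small decision table. The real logic lives in configure; the C sketch below only restates it for illustration, with hypothetical names, and it assumes that an option left unset counts as "off".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How an --enable-X / --disable-X option was given. */
enum opt_sketch { OPT_UNSET, OPT_ENABLED, OPT_DISABLED };

/* Hypothetical resolver.  Returns 0 and fills in the two results, or
 * -1 for the contradictory combination. */
int resolve_thread_opts_sketch(enum opt_sketch opal_multi_threads,
                               enum opt_sketch mpi_thread_multiple,
                               bool *opal_threads_on,
                               bool *mpi_multiple_on)
{
    assert(opal_threads_on != NULL && mpi_multiple_on != NULL);

    *mpi_multiple_on = (mpi_thread_multiple == OPT_ENABLED);
    *opal_threads_on = (opal_multi_threads == OPT_ENABLED);

    if (*mpi_multiple_on) {
        if (opal_multi_threads == OPT_DISABLED) {
            /* --disable-opal-multi-threads with
             * --enable-mpi-thread-multiple: error out. */
            return -1;
        }
        /* MPI thread-multiple turns OPAL thread support on. */
        *opal_threads_on = true;
    }
    return 0;
}
```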

This commit was SVN r22841.
2010-03-16 23:10:50 +00:00
Rainer Keller
814fb9399f - Further patches for support on NetBSD (and DragonFly) by
Aleksej Saushev.
   Don't use bash or bashism in shell scripts
   We should use Posix' setpgid(0,0), which is equivalent to setpgrp().

This commit was SVN r22829.
2010-03-15 05:33:42 +00:00
Josh Hursey
e9b5162d79 Fix the configure logic for --with-ft so that it properly takes a comma separated list.
Many of the OPAL_ENABLE_FT should be OPAL_ENABLE_FT_CR, so fix those.

The OPAL Layer INC should call opal_output on restart so that it can refresh the string it prints to reflect the current pid/hostname which may have changed.

This commit was SVN r22824.
2010-03-12 23:57:50 +00:00
Ralph Castain
ed1dbabc0c Remove the last vestiges of mpi_portable_platform.h.in
This commit was SVN r22789.
2010-03-05 21:21:03 +00:00
Nadia Derbey
3f56f9e688 Fix typo in evutil.h
This commit was SVN r22730.
2010-03-01 07:55:08 +00:00
Ralph Castain
8c7f3a0c44 Silence warnings by correctly identifying when we are on a Mac
This commit was SVN r22724.
2010-02-27 08:15:49 +00:00
Iain Bason
7445b23e0d Fixed a minor typo.
This commit was SVN r22706.
2010-02-24 19:05:19 +00:00
Jeff Squyres
af6f1f4b00 Add pkg-config(1) config files to Open MPI. Additionally, fix a minor
bug: libmpi_f90 had libmpi.la in its LIBADD instead of libmpi_f77.la.

Fixes trac:2244.

This commit was SVN r22704.

The following Trac tickets were found above:
  Ticket 2244 --> https://svn.open-mpi.org/trac/ompi/ticket/2244
2010-02-24 18:46:06 +00:00
Jeff Squyres
d9b6b5af0c This commit converts us to the "one big libmpi" scheme that has been
discussed extensively.  See
https://svn.open-mpi.org/trac/ompi/ticket/2092 and the RFC thread
http://www.open-mpi.org/community/lists/devel/2010/02/7447.php.

Specifically:

 * Create LT convenience libraries for OPAL and ORTE if the layer
   above them is being created (use the already-defined
   AM_CONDITIONALs to know if the project above us is being built).
 * ORTE slurps in the LT convenience library for OPAL; OMPI slurps in
   the LT convenience library for ORTE.
 * Wrapper compilers now only -l one library (e.g., ortecc only does
   -lopen-rte, and mpicc only does -lmpi).

This commit was SVN r22691.
2010-02-23 22:20:01 +00:00
Terry Dontje
cfe37fb5a1 Fixed issue with detecting root dir and used appropriate defines for solaris detection
This commit was SVN r22686.
2010-02-23 15:58:49 +00:00
Christopher Yeoh
bccafbb5df Fixes the problem where the rcache and core memory allocation can deadlock itself
This commit fixes trac:2104. Request a cmr:v1.4

This commit was SVN r22675.

The following Trac tickets were found above:
  Ticket 2104 --> https://svn.open-mpi.org/trac/ompi/ticket/2104
2010-02-22 05:12:10 +00:00
George Bosilca
3356c2e241 Don't forget to update the return value for PPC32 and PPC64.
This commit was SVN r22665.
2010-02-18 19:16:41 +00:00
George Bosilca
ab202d0f69 Add the memory and the cc to the clobber list for the cas atomics.
This commit was SVN r22664.
2010-02-18 19:15:50 +00:00
Rainer Keller
a46cecf4f2 - Use strrchr instead of loop for '/' as Nysal suggests.
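
The message does not say which path is being scanned, so the following is just a generic illustration of the idiom: strrchr() finds the last '/' in one call, which replaces a hand-written backwards loop. The helper name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the last path component,
 * i.e., the text after the final '/'.  If there is no '/', the whole
 * string is returned.  No memory is allocated. */
const char *last_component_sketch(const char *path)
{
    const char *slash;

    assert(path != NULL);
    slash = strrchr(path, '/');
    return (slash != NULL) ? slash + 1 : path;
}
```
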
This commit was SVN r22649.
2010-02-17 23:40:08 +00:00
Jeff Squyres
17f0885f12 Add proper BSD interface detection code. Fixes a long-standing
discussion on the users list (see
http://www.open-mpi.org/community/lists/users/2009/12/11526.php). 

Many thanks to Kevin Buckley who did most of the coding work, and to
Aleksej Saushev for his extreme patience in waiting for me to review
and commit this stuff.

This commit was SVN r22640.
2010-02-17 19:43:57 +00:00
Terry Dontje
2a4b1227d9 corrected an array access bug in the latest libevent merge (see #2234) that was causing Solaris binaries to loop infinitely.
This commit was SVN r22638.
2010-02-17 14:50:37 +00:00
Shiqing Fan
3a3018deef Convert the line endings for the added header files. They were changed automatically by Windows when adding new files.
This commit was SVN r22634.
2010-02-16 17:24:44 +00:00
Shiqing Fan
08ffdbe987 Changes for portable platform headers. Commit it on behalf of Ralph.
This commit was SVN r22619.
2010-02-15 22:14:59 +00:00
Shiqing Fan
0b765637d9 A type cast.
This commit was SVN r22618.
2010-02-15 10:26:02 +00:00
Ralph Castain
7a1b2a706e Add a new ring_buffer class
This commit was SVN r22615.
2010-02-14 19:20:19 +00:00
Nysal Jan
0538b1a948 Adding GPFS to the list of file systems checked
This commit was SVN r22612.
2010-02-12 14:15:39 +00:00
Rainer Keller
ea4de16561 - Check whether file is opened on network file-system.
If file does not exist, check the directory it lives in...
   May be used by caller, trying to open mmap() on NFS, Lustre or
   Panasas (thanks Sam).
   For now, this is used to warn about the usage of mmap on such FS.

   Please note, that Ralph mentioned the orte_no_session_dir parameter.
   The help message includes a reference to this.

   Tested on NFS and Lustre on Linux on
     smoky: mpirun --mca orte_tmpdir_base $HOME/tmp -np 2 ./mpi_stub
     jaguar: mpirun ... --mca orte_tmpdir_base /tmp/work/$USER ...

   Fixes trac:1354

   This should   cmr:v1.5   once it has soaked and is shown to work on
   Solaris

This commit was SVN r22604.

The following Trac tickets were found above:
  Ticket 1354 --> https://svn.open-mpi.org/trac/ompi/ticket/1354
2010-02-10 23:18:29 +00:00
Jeff Squyres
a89dc623b0 Brice Goglin noticed that mpi_paffinity_alone didn't seem to be doing
anything for non-MPI apps.  Oops!  (But before you freak out, gentle
reader, note that mpi_paffinity_alone for MPI apps still worked fine.)
When we made the switchover somewhere in the 1.3 series to have the
orted's do processor binding, then stuff like:

  mpirun --mca mpi_paffinity_alone 1 hostname

should have bound hostname to processor 0.  But it didn't because of a
subtle startup ordering issue: the MCA param registration for
opal_paffinity_alone was in the paffinity base (vs. being in
opal/runtime/opal_params.c), but it didn't actually get registered
until after the global variable opal_paffinity_alone was checked to
see if we wanted old-style affinity bindings.  Oops.

However, for MPI apps, even though the orted didn't do the binding,
ompi_mpi_init() would notice that opal_paffinity_alone was set, yet
the process didn't seem to be bound.  So the MPI process would bind
itself (this was done to support the running-without-orteds
scenarios).  Hence, MPI apps still obeyed mpi_paffinity_alone
semantics.

But note that the error described above caused the new mpirun switch
--report-bindings to not work with mpi_paffinity_alone=1, meaning that
the orted would not report the bindings when mpi_paffinity_alone was
set to 1 (it ''did'' correctly report bindings if you used
--bind-to-core or one of the other binding options).

This commit separates out the paffinity base MCA param registration
into a small function that can be called at the Right place during the
startup sequence.

This commit was SVN r22602.
2010-02-10 22:32:00 +00:00
Rainer Keller
80136ac9e2 - We don't configure-check for errno.h and don't need errno.h here...
This commit was SVN r22587.
2010-02-09 01:12:52 +00:00
Jeff Squyres
3c8685ea8c Add a check for strtoll for libevent.
Not having this check was causing distcheck errors on the OMPI
tarball-build machine because it's still a 32-bit-default machine, so
the evutil.c code was failing some #if conditionals (since it didn't
think it had strtoll available).

This commit was SVN r22577.
2010-02-08 20:55:21 +00:00
Jeff Squyres
cd5012d481 Fix "make dist": opal/event/WIN32-Code/misc.* don't exist anymore.
This commit was SVN r22562.
2010-02-05 13:42:45 +00:00
Shiqing Fan
23bb52ad05 Remove a few files from the CMakeList, that no longer exist in the new libevent.
Add #ifdef for including _libevent_time.h.
Use the Windows version of function random().

This commit was SVN r22556.
2010-02-04 15:18:54 +00:00
Brian Barrett
b5e391251f Update libevent to 1.4.13
This commit was SVN r22548.
2010-02-04 05:38:30 +00:00
Ralph Castain
db1b07c02c In multiple places in the code base, we expect opal_bitmap_is_set_bit to return either true or false...not a negative error code. If the index is out of range, this is effectively a "false" condition as the bit was clearly not set by the program.
This revision is consistent with how the function was called in the OMPI code base. Will fix the test to match.
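
A sketch of the behavior described above, using a hypothetical bitmap type rather than opal_bitmap_t. The reason a negative error code is a problem is that a caller writing if (is_set(...)) would treat -1 as "true".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct bitmap_sketch {
    unsigned char *bits;   /* storage, 8 bits per byte */
    int num_bits;          /* number of valid bit positions */
};

/* True only when bit is in range and set.  An out-of-range index is
 * reported as false rather than as a negative error code, so callers
 * can use the result directly in a condition. */
bool bitmap_is_set_sketch(const struct bitmap_sketch *bm, int bit)
{
    assert(bm != NULL);
    if (bit < 0 || bit >= bm->num_bits) {
        return false;
    }
    return (bm->bits[bit / 8] & (1u << (bit % 8))) != 0;
}
```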

This commit was SVN r22544.
2010-02-03 19:45:22 +00:00
Shiqing Fan
7ad3d310b8 Define SIGPIPE for Windows, just for fixing the v1.4 Windows build.
cmr:v1.4.2:reviewer=jsquyres

This commit was SVN r22543.
2010-02-03 18:49:22 +00:00
Rainer Keller
0009d10c4d - This fixes the failing mpic++/mpiCC MTT tests, bailing due to not
finding symbol pthread_atfork, e.g. cxx-test-suite.

   Fixes trac:2088

   cmr:v1.5:reviewer=jsquyres

This commit was SVN r22542.

The following Trac tickets were found above:
  Ticket 2088 --> https://svn.open-mpi.org/trac/ompi/ticket/2088
2010-02-03 18:47:13 +00:00
Brian Barrett
8b4825ff37 Updates to make trunk run on Catamount again:
 * Don't build the pstat component if all defines needed aren't there.
 * Update platform file to work better
 * Work around two places that depended on modex being operational

This commit was SVN r22536.
2010-02-03 05:07:40 +00:00
Jeff Squyres
007a6c7b99 Per #2201, move the user arguments up to be the first set of argv
after the compiler argv tokens.  

Not closing #2201 yet; there's still discussion on that ticket about
whether we want to do more or not.

Refs trac:2201
cmr:v1.4.2 
cmr:v1.5

This commit was SVN r22513.

The following Trac tickets were found above:
  Ticket 2201 --> https://svn.open-mpi.org/trac/ompi/ticket/2201
2010-01-29 22:51:35 +00:00