Commit graph

1901 commits

Author SHA1 Message Date
Ralph Castain
b5de068533 Clean up an error in r24371 - can't use a const parameter as target in asprintf as it changes the value of the address.
Add some new proc/job states

Rename a constant to reflect coming change - remove the arbitrary difference between restarting a proc locally and relocating it to another node in terms of the number of restarts allowed.

Add pretty-print of signals for "proc aborted due to signal" reports.

This commit was SVN r24378.

The following SVN revision numbers were found above:
  r24371 --> open-mpi/ompi@93d28a5792
2011-02-14 19:29:09 +00:00
Ralph Castain
e8c2519280 Restore thread-supported condition waits when thread support requested
This commit was SVN r24377.
2011-02-14 19:10:38 +00:00
Abhishek Kulkarni
93d28a5792 Change opal_err2str_fn_t to return the error string as an argument.
This means that the converters (opal_err2str, orte_err2str) can now
return NULL as a "silent error". The return value of opal_err2str_fn_t
is the status of the operation (OPAL_SUCCESS or OPAL_ERROR).

This fixes the "Unknown error" message issues on the trunk.

This commit was SVN r24371.
2011-02-13 16:09:17 +00:00
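
The calling convention described in this commit (status as the return value, string through an output argument) can be sketched roughly as follows. This is an illustration only, not the Open MPI source: `sk_err2str_fn_t`, `sk_err2str`, `SK_SUCCESS` and `SK_ERROR` are hypothetical stand-ins for `opal_err2str_fn_t`, the converters, `OPAL_SUCCESS` and `OPAL_ERROR`, and treating "success with a NULL string" as the silent case is an assumption.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for OPAL_SUCCESS / OPAL_ERROR. */
#define SK_SUCCESS 0
#define SK_ERROR   (-1)

/* The converter reports its status as the return value and hands the
 * string back through an output argument. */
typedef int (*sk_err2str_fn_t)(int errnum, const char **errmsg);

/* Example converter: knows two codes, treats -99 as a "silent error"
 * (success with a NULL string), and fails for anything else. */
static int sk_err2str(int errnum, const char **errmsg)
{
    switch (errnum) {
    case 0:
        *errmsg = "Success";
        return SK_SUCCESS;
    case -1:
        *errmsg = "Error";
        return SK_SUCCESS;
    case -99:
        *errmsg = NULL;          /* silent: caller prints nothing */
        return SK_SUCCESS;
    default:
        *errmsg = NULL;          /* this converter does not know the code */
        return SK_ERROR;
    }
}
```

A caller can then tell "unknown code" (non-success status) apart from "known code, nothing to print" (success with a NULL string), which a bare string return could not express.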
Nysal Jan
92e06b0a1f Missed this change suggested by Terry
This commit was SVN r24364.
2011-02-08 04:06:52 +00:00
Nysal Jan
a31025bb48 Fix pty setup code on AIX
This commit was SVN r24363.
2011-02-08 02:54:47 +00:00
Nysal Jan
f0f1d4e311 Older versions of config.guess detect the canonical system name of an AIX 7.1 system to be rs6000-ibm-aix. Add this workaround until AIX 7.1 support is available in the autotools releases
This commit was SVN r24362.
2011-02-08 02:52:10 +00:00
Jeff Squyres
b0ce9bae8e Oops. Also need to remove myriexpress.h from the Makefile.am.
This commit was SVN r24357.
2011-02-04 03:29:49 +00:00
Abhishek Kulkarni
d711c5a4b1 SOS fix for the Studio compilers (Thanks to Terry for spotting this).
This commit was SVN r24355.
2011-02-03 22:36:28 +00:00
Jeff Squyres
6421abecc7 Fixes trac:2690.
Temporarily remove hwloc's internal version of myriexpress.h.  It is
causing a problem when compiling Open MPI with MX support because
hwloc uses AC_CONFIG_HEADER in hwloc's hwloc.m4 to generate
opal/mca/paffinity/hwloc/hwloc/include/hwloc/config.h.
AC_CONFIG_HEADER apparently has the (undocumented) side effect of
adding -I$(top_builddir)/opal/mca/paffinity/hwloc/hwloc/include/hwloc
to OMPI's compilation flags.  Hence, when the OMPI MX components are
compiled and #include "myriexpress.h" (or <myriexpress.h>) they see
hwloc's myriexpress.h before the system one.  Badness ensues.

This removal is temporary because we need to figure out a better
solution.  But for now, OMPI is not using hwloc's myriexpress.h file --
so it's safe to remove.  I'll push this issue upstream to hwloc to
figure out a better solution...

This commit was SVN r24354.

The following Trac tickets were found above:
  Ticket 2690 --> https://svn.open-mpi.org/trac/ompi/ticket/2690
2011-02-03 14:24:32 +00:00
Nysal Jan
ab2f738b0b Recent versions of IBM XL compilers on AIX support GCC inline assembly format
This commit was SVN r24340.
2011-02-02 11:31:30 +00:00
Jeff Squyres
4674e62929 These files are superfluous.
This commit was SVN r24331.
2011-02-01 21:31:35 +00:00
Jeff Squyres
c8badb79df Don't instantiate variables in for loops; we don't assume C99
compilers. 

This commit was SVN r24330.
2011-02-01 19:23:14 +00:00
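
As a small illustration of the rule in this commit, the loop below declares its counter at the top of the block instead of inside the `for` statement, so it compiles with pre-C99 compilers. The function name `sum_array` is hypothetical and not taken from the code base.

```c
#include <stddef.h>

/* C89-compatible: the loop counter is declared at the top of the block,
 * not inside the for statement (which requires C99). */
static long sum_array(const int *values, size_t count)
{
    size_t i;            /* declared before any statement */
    long total = 0;

    for (i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```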
Nysal Jan
42015cf30a Fix build failure on AIX
This commit was SVN r24321.
2011-01-28 08:09:45 +00:00
Nysal Jan
857c32784e Fix detection of fd_mask
This commit was SVN r24320.
2011-01-28 06:20:32 +00:00
George Bosilca
d457338f66 Force mips2 asm acceptance before sc and ll.
This commit was SVN r24319.
2011-01-27 22:42:26 +00:00
Jeff Squyres
6c8de8fb76 Bump up to hwloc 1.1.1
This commit was SVN r24312.
2011-01-26 23:20:26 +00:00
Jeff Squyres
511f87665b Fixes trac:2680: Add ARM support.
This commit was SVN r24308.

The following Trac tickets were found above:
  Ticket 2680 --> https://svn.open-mpi.org/trac/ompi/ticket/2680
2011-01-26 17:22:44 +00:00
Josh Hursey
66af515061 Fix C/R functionality with the new libtool. This fixes the case where the restarted process cannot be checkpointed or finalized.
Short Version:
--------------
Event engine needs to be flushed so it does not use old/stale file descriptors.

Long Version:
-------------
The problem was that the restarted process was waiting for the socket to the local daemon to finish establishing during the 'sync' operation. The core problem was that the daemon was sending a header of 36 bytes, but the restarted process only received 35 bytes of the message. So the restarted process became stuck waiting for the last byte to arrive.

After many hours of digging, I figured out that the event engine was using the same file descriptor for its evsig_cb functionality (to signal itself when a signal arrives). So when the daemon wrote into the new fd, the event engine was stealing the first byte (*shakes fist at event engine*) before the recv() could be posted.

The solution is to use the event_reinit() function on restart to re-establish the now-stale file descriptors in the event engine. This seems to have fixed the problem.


A few other minor things:
-------------------------
 * Add a check to make sure the event engine is balanced in its init/finalize
 * Add the opal_event_base_close() to the BLCR restart exec function (still not 100% sure it is needed, but there it is).

This commit was SVN r24296.
2011-01-25 22:43:47 +00:00
Jeff Squyres
afa654746c Somehow this has been sitting, uncommitted, in a local checkout since
last December.  :-(

Add new MCA param: maffinity_libnuma_policy.  Thanks to David
Singleton for the suggestion.  Here's the help text about it:

{{{
   MCA maffinity: parameter "maffinity_libnuma_policy" (current value:
                  <loose>, data source: default value)
                  Binding policy that determines what happens if memory
                  is unavailable on the local NUMA node.  A value of
                  "strict" means that the memory allocation will fail;
                  a value of "loose" means that the memory allocation
                  will spill over to another NUMA node.
}}}

This commit was SVN r24290.
2011-01-24 14:39:16 +00:00
Abhishek Kulkarni
3243b16bb3 Decode SOS error code before checking it with the native error code.
This commit was SVN r24281.
2011-01-20 23:21:38 +00:00
Abhishek Kulkarni
45a53b4f7a Add a missing call to opal_sos_finalize in opal_finalize_util.
This commit was SVN r24280.
2011-01-20 23:18:02 +00:00
Jeff Squyres
189b541dbd Add a proper help message for the mca_verbose MCA param (and shuffle
the code to be slightly more efficient).

This commit was SVN r24256.
2011-01-14 20:18:06 +00:00
George Bosilca
5390fd6f33 Reshape the datatype engine. The basic types are built down in OPAL. MPI types are
either direct link to these basic predefined types, or a combination of them.
Anyway, the first items in the datatype list belong to OPAL, the second round
are MPI datatypes created by composing basic OPAL datatypes, and the last
batch are mapped datatypes (direct correspondence between an OMPI datatype and
an OPAL one such as int -> int32_t).

Modify the op to fit this new scheme.

This commit was SVN r24247.
2011-01-13 06:08:54 +00:00
Ralph Castain
b09f57b03d Update the multicast subsystem - ported from Cisco branch
This commit was SVN r24246.
2011-01-13 01:54:05 +00:00
Terry Dontje
56c03a3853 removing a file I should not have added
This commit was SVN r24220.
2011-01-11 19:02:08 +00:00
Terry Dontje
a374661ead add configure.params to solaris sysinfo module to allow it to be built
This commit was SVN r24219.
2011-01-11 18:31:55 +00:00
Jeff Squyres
cd8f12d8e5 Remove a few useless files that were missed last night.
This commit was SVN r24218.
2011-01-11 14:15:31 +00:00
Jeff Squyres
54cb4eb2b5 Merge over new version of hwloc 1.1 from the vendor branch. Update
the module to use the new hwloc bitmap API (the cpuset API is both
klunkier and deprecated), which simplified a few things.

This commit was SVN r24217.
2011-01-11 01:41:10 +00:00
Ralph Castain
ac1853b5d8 Took me a couple of days, but finally tracked this one down. Some compilers/glibc's don't like composite test statements in a return and just randomly pick one of the two options.
So....don't do that!!!

This commit was SVN r24212.
2011-01-10 16:29:42 +00:00
Josh Hursey
bbfdf04a81 Fix a couple of 'unused variable' warnings, and one return value warning.
{{{
base/paffinity_base_service.c: In function ‘opal_paffinity_base_cset2mapstr’:
base/paffinity_base_service.c:623: warning: unused variable ‘range_last’
base/paffinity_base_service.c:623: warning: unused variable ‘range_first’
base/paffinity_base_service.c:622: warning: unused variable ‘count’
base/paffinity_base_service.c:622: warning: unused variable ‘m’
}}}

{{{
connect/btl_openib_connect_oob.c: In function ‘init_ud_qp’:
connect/btl_openib_connect_oob.c:1111: warning: control reaches end of non-void function
connect/btl_openib_connect_oob.c: In function ‘init_device’:
connect/btl_openib_connect_oob.c:1235: warning: unused variable ‘i’
connect/btl_openib_connect_oob.c: In function ‘get_pathrecord_sl’:
connect/btl_openib_connect_oob.c:1323: warning: unused variable ‘i’
}}}

This commit was SVN r24196.
2010-12-30 15:37:50 +00:00
Ethan Mallove
9251785161 Emit an error (instead of a SEGV) if the "compiler" parameter is not set
in the wrapper data file.

This commit was SVN r24190.
2010-12-21 19:01:39 +00:00
Jeff Squyres
a525e70f46 Convert "opal_show_help" to be a global variable pointer.
It is statically initialized to the real back-end OPAL show_help
function.  During orte_show_help_init(), the variable is re-assigned
with the value of the back-end ORTE show_help function (the one that
does error message aggregation).  

Therefore, anything that calls opal_show_help() after a certain point
in orte_init() will have their show_help messages be aggregated.
w00t!  Even code down in OPAL -- that has no knowledge of ORTE -- will
have their messages aggregated.  '''Double w00t!'''

During orte_show_help_finalize(), we restore the original pointer
value so that if something calls opal_show_help() after
orte_finalize(), it'll still work properly (but it won't be
aggregated).  

This commit was SVN r24185.
2010-12-16 23:00:25 +00:00
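
The pointer-swapping scheme described in this commit can be sketched as below. All names here (`sk_show_help`, `sk_base_show_help`, `sk_aggregating_show_help`, `sk_runtime_init`, `sk_runtime_finalize`) are hypothetical stand-ins for `opal_show_help`, the OPAL and ORTE back ends, and `orte_show_help_init()` / `orte_show_help_finalize()`; the "aggregation" is reduced to a counter for brevity.

```c
#include <stdio.h>

typedef int (*sk_show_help_fn_t)(const char *topic);

/* Number of messages the aggregating back end has absorbed. */
static int sk_aggregated_count = 0;

/* Lower-layer back end: prints each message immediately. */
static int sk_base_show_help(const char *topic)
{
    fprintf(stderr, "help: %s\n", topic);
    return 0;
}

/* Upper-layer back end: collects messages instead of printing them. */
static int sk_aggregating_show_help(const char *topic)
{
    (void)topic;
    ++sk_aggregated_count;
    return 0;
}

/* Global pointer, statically initialized to the lower-layer function. */
sk_show_help_fn_t sk_show_help = sk_base_show_help;

/* Upper-layer init re-points the global at the aggregating version. */
static void sk_runtime_init(void)
{
    sk_show_help = sk_aggregating_show_help;
}

/* Finalize restores the original pointer so later calls still work. */
static void sk_runtime_finalize(void)
{
    sk_show_help = sk_base_show_help;
}
```

Because callers only ever go through the pointer, lower-layer code picks up the aggregating behavior without knowing the upper layer exists.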
Terry Dontje
6da16ab0d7 add format parameter and layout format to OMPI_Affinity_str
This commit was SVN r24182.
2010-12-16 15:11:17 +00:00
Shiqing Fan
883aeedd26 Add support for the nanosleep function using Sleep on Windows. According to the MSDN documentation, the accuracy of the sleep function on Windows is 1 millisecond.
This commit was SVN r24175.
2010-12-15 15:43:25 +00:00
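
A minimal sketch of how a nanosleep-style call might be mapped onto the millisecond-granularity Win32 `Sleep`. This is not the committed code: `sk_timespec_to_ms` and `sk_nanosleep` are hypothetical names, and rounding up to the next millisecond is an assumption made here so that a short non-zero request does not collapse to a zero-length sleep.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Convert a timespec to whole milliseconds, rounding up so that a
 * non-zero request never becomes a zero-length sleep. */
static unsigned long sk_timespec_to_ms(const struct timespec *req)
{
    unsigned long ms = (unsigned long)req->tv_sec * 1000UL;
    ms += (unsigned long)((req->tv_nsec + 999999L) / 1000000L);
    return ms;
}

/* Hypothetical wrapper: Sleep() on Windows, nanosleep() elsewhere. */
static int sk_nanosleep(const struct timespec *req)
{
#ifdef _WIN32
    Sleep((DWORD)sk_timespec_to_ms(req));   /* Sleep takes milliseconds */
    return 0;
#else
    return nanosleep(req, NULL);
#endif
}
```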
Shiqing Fan
ec82e73bce use sockets instead of pipes on Windows.
This commit was SVN r24174.
2010-12-15 14:34:25 +00:00
George Bosilca
b4355408f5 Fix the Sparc and Sparcv9 atomics based on Nicolai Stange's
patch.

CMR:v1.5
CMR:v1.4

This commit was SVN r24150.
2010-12-03 19:16:53 +00:00
George Bosilca
bb412a5ff7 Indentation.
This commit was SVN r24148.
2010-12-03 19:13:57 +00:00
Rolf vandeVaart
3f7dd84278 Fix libevent so it can compile in the few cases where sys/queue.h does not exist.
1. Remove it from libevent207.h because it is not needed.
2. Add compat to the include list so it can use queue.h when needed.

This commit was SVN r24144.
2010-12-02 23:05:02 +00:00
Shiqing Fan
f43862420c Convert the bad dos line endings to unix style for all windows related files.
This commit was SVN r24137.
2010-12-02 12:08:08 +00:00
Ralph Castain
c56185887b Change the event base "wakeup" support to enable the passing of events to the central thread for add/del. Add a macro OPAL_UPDATE_EVBASE for this purpose as it will likely be widely used.
Update the ORTE thread support to utilize this capability. Update the rmcast framework to track the change.

This commit was SVN r24121.
2010-12-01 04:26:43 +00:00
Ralph Castain
2523c9b2e8 Overload the event_base_t struct to include a (hopefully) temporary change to deal with cross-event-base synchronization. This is done transparently so no code changes are required within the rest of the code base. Comments explain what was changed and why.
This commit was SVN r24105.
2010-11-30 16:14:19 +00:00
Ralph Castain
71f116d21f Expose the event_active API
This commit was SVN r24090.
2010-11-24 23:30:13 +00:00
Ralph Castain
380835602c Add support for internal libevent threading support. Add configure logic to define an appropriate flag, and then use that flag to expose the required functions.
This commit was SVN r24088.
2010-11-24 23:24:53 +00:00
Ralph Castain
aa467162da Add a "name" field to the condition wait object to help with debugging
This commit was SVN r24087.
2010-11-24 23:20:06 +00:00
Shiqing Fan
39c9f7468e Add support for managing priorities of windows mca components.
Correct the generated strings in mpi.h.

This commit was SVN r24082.
2010-11-23 19:09:06 +00:00
Rolf vandeVaart
e7ff9375d7 Use pid_t to avoid warnings on some platforms.
This commit was SVN r24072.
2010-11-19 17:14:33 +00:00
Rolf vandeVaart
1735f98c78 Avoid potential warnings by using pid_t in all places.
This commit was SVN r24071.
2010-11-19 16:29:45 +00:00
Shiqing Fan
358b4a5cba Add an option to enable the debug postfix for executables.
This commit was SVN r24070.
2010-11-19 15:54:13 +00:00
Rolf vandeVaart
4b14a6416f No need to conditionalize around this macro. It turns
out it is needed even in one case when we configure
--without-threads.

This commit was SVN r24069.
2010-11-19 15:47:48 +00:00
Shiqing Fan
4fea0f021e Per r24062, this should also be removed.
This commit was SVN r24064.

The following SVN revision numbers were found above:
  r24062 --> open-mpi/ompi@3b0caf7dea
2010-11-17 17:14:55 +00:00
Rolf vandeVaart
3b0caf7dea Remove inclusion of stdbool.h where not needed.
Change OMPI code in libevent to not use bool.
Add some comments to indicate OMPI specific code.
This should fix compiles on Sun Studio Solaris.

This commit was SVN r24062.
2010-11-17 15:14:00 +00:00
George Bosilca
96abaf2e17 Pushing the Debian patch (based on Manuel Prinz modifications).
This commit was SVN r24061.
2010-11-17 02:36:03 +00:00
George Bosilca
d997ef4f49 Update the copyright. Add the fix from Tim Mattox regarding the computation
of the upper bound.

This commit was SVN r24060.
2010-11-17 02:10:15 +00:00
Ralph Castain
32be69eaef Update the OMPI libevent interface module and the internal libevent event.c file to provide ability to disable specific event modes. Basically an issue between #define and checking to see if the value was defined to zero.
This commit was SVN r24056.
2010-11-16 16:06:32 +00:00
Ralph Castain
1b3421f16e Fix a bug spotted by Rolf - ensure that disable-event-xxx results in the corresponding have_event_xxx being undefined or defined to 0
This commit was SVN r24055.
2010-11-16 04:37:30 +00:00
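
The "#define versus defined to zero" issue behind the two commits above can be illustrated with a small, self-contained sketch. `SK_HAVE_EVENT_EPOLL` is a hypothetical macro name chosen for this example; it simulates a configure result where a feature was disabled by defining its macro to 0 instead of leaving it undefined.

```c
/* Simulate a configure result where the feature was disabled by
 * defining the macro to 0 rather than leaving it undefined. */
#define SK_HAVE_EVENT_EPOLL 0

/* Tests only for existence: wrongly reports the feature as enabled. */
static int sk_enabled_by_ifdef(void)
{
#ifdef SK_HAVE_EVENT_EPOLL
    return 1;   /* taken: the macro exists, even though its value is 0 */
#else
    return 0;
#endif
}

/* Tests existence and value: correctly reports the feature as disabled. */
static int sk_enabled_by_value(void)
{
#if defined(SK_HAVE_EVENT_EPOLL) && SK_HAVE_EVENT_EPOLL
    return 1;
#else
    return 0;   /* taken: the macro is defined but evaluates to 0 */
#endif
}
```

Either convention works as long as the configure side and the checking side agree: leave the macro undefined and use `#ifdef`, or always define it to 0 or 1 and test its value.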
Rolf vandeVaart
37d5267895 The fix for ticket #2560 was somehow removed in the
great autogen update.  Therefore, put them back.

This commit was SVN r24053.
2010-11-15 21:41:56 +00:00
Ralph Castain
b43a4509ac Remove stale mca param. Ensure that verbosity gets properly set for event framework debug
This commit was SVN r24050.
2010-11-13 15:37:17 +00:00
Ralph Castain
db014edb0b Initialize boolean
This commit was SVN r24048.
2010-11-13 15:31:55 +00:00
George Bosilca
36ce319869 Add a second version of the datatype copy function using memmove instead of memcpy.
As memmove is slower than memcpy, I added the required logic to only use it when
really necessary.

No modification from the developers' point of view; you should always call
opal_datatype_copy_content_same_ddt.

This commit was SVN r24047.
2010-11-12 23:22:35 +00:00
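
The idea of paying for `memmove` only when it is really needed can be sketched as follows. This is a simplified illustration on raw byte ranges, not the datatype engine's logic; `sk_regions_overlap` and `sk_copy` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return non-zero when [dst, dst+len) and [src, src+len) share a byte. */
static int sk_regions_overlap(const void *dst, const void *src, size_t len)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    return (d < s) ? (s - d < len) : (d - s < len);
}

/* Use memmove only for overlapping regions; memcpy otherwise. */
static void *sk_copy(void *dst, const void *src, size_t len)
{
    if (sk_regions_overlap(dst, src, len)) {
        return memmove(dst, src, len);   /* defined for overlapping ranges */
    }
    return memcpy(dst, src, len);        /* ranges are disjoint */
}
```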
Jeff Squyres
e4744b4ed5 Per http://www.open-mpi.org/community/lists/devel/2010/11/8671.php,
change a bunch of OMPI_<foo> names to OPAL_<foo>.

This commit was SVN r24046.
2010-11-12 23:22:11 +00:00
Shiqing Fan
1f4eae2046 Type cast for compiling under VS 2010.
This commit was SVN r24044.
2010-11-12 08:31:23 +00:00
Shiqing Fan
c03ea1a5f3 A cleaner way to build on Windows.
It's not possible to combine two shared libraries on Windows, so we have to do it a bit differently. First generate a small static event library by just linking the object files, then link it into the other libraries that need the libevent API.

This commit was SVN r24039.
2010-11-11 12:02:54 +00:00
Jeff Squyres
52f5dd906c Fixes trac:2611: get updated code from GASNet (thanks Paul Hargrove!) that
handles more modern versions of Autoconf's --program-transform
arguments.  Also make it clear that the message is coming from Open
MPI logic, so that we don't blame Autoconf, Red Hat, or anyone else
next time!

This commit was SVN r24024.

The following Trac tickets were found above:
  Ticket 2611 --> https://svn.open-mpi.org/trac/ompi/ticket/2611
2010-11-09 23:36:30 +00:00
Jeff Squyres
dded8a9756 Ensure to always remove the .new file
This commit was SVN r24023.
2010-11-09 23:33:53 +00:00
Shiqing Fan
482a621e31 Change the behavior of exporting/importing symbols on Windows to fit the new build procedure, i.e. import statically linked opal/orte libraries for other libraries/binaries. There are several use cases when creating DLLs:
1. create DLL A, export symbols of A, import nothing  (A normally is OPAL)
   should define _USRDLL , A_EXPORT 

2. create DLL B, export symbols of B, import A.lib    (B could be ORTE, OMPI or other ompi tools)
   should define _USRDLL, B_EXPORT

3. create DLL C, import B.dll    (C could be external libs or apps)
   should define B_IMPORT

This commit was SVN r24016.
2010-11-09 16:13:30 +00:00
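
The three use cases above are commonly handled with a declaration macro along these lines. This is a sketch under assumptions, not the actual Open MPI headers: `B_DECLSPEC` and `b_answer` are hypothetical names, and only the `_USRDLL`, `B_EXPORT` and `B_IMPORT` defines come from the commit message.

```c
/* Pick the right storage-class attribute for library "B". */
#if defined(_WIN32)
#  if defined(_USRDLL) && defined(B_EXPORT)
#    define B_DECLSPEC __declspec(dllexport)   /* building B as a DLL */
#  elif defined(B_IMPORT)
#    define B_DECLSPEC __declspec(dllimport)   /* consuming B.dll */
#  else
#    define B_DECLSPEC                         /* static link */
#  endif
#else
#  define B_DECLSPEC                           /* non-Windows: no attribute */
#endif

/* Public declaration as it would appear in B's header. */
B_DECLSPEC int b_answer(void);

/* Definition inside library B. */
int b_answer(void)
{
    return 42;
}
```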
Shiqing Fan
7bac326920 Fix Windows build, add custom command to generate static libraries (opal and orte) for shared build.
This commit was SVN r24012.
2010-11-09 08:32:45 +00:00
Ralph Castain
fa919be622 Add support for thread_kill
This commit was SVN r24008.
2010-11-08 19:06:10 +00:00
Terry Dontje
8e0b24a45b add comment to r23998 code change to be able to track libevent code change better
This commit was SVN r24005.

The following SVN revision numbers were found above:
  r23998 --> open-mpi/ompi@e8aa8984a8
2010-11-08 14:36:28 +00:00
Terry Dontje
e8aa8984a8 corrected stdbool.h inclusion to allow Oracle C++ compilers to work with OMPI
This commit was SVN r23998.
2010-11-05 18:54:19 +00:00
Josh Hursey
676adfb7cc fix compile error, need to revisit this line later
This commit was SVN r23991.
2010-11-03 17:34:11 +00:00
Shiqing Fan
a7dc32afb0 Remove the OPAL_DECLSPEC for the event functions.
This commit was SVN r23987.
2010-11-03 09:10:12 +00:00
Shiqing Fan
505efbaa27 Update the CMake scripts, solve a few export symbols for Windows.
This commit was SVN r23976.
2010-11-02 16:39:27 +00:00
Jeff Squyres
9c15a30b75 Really fix the libevent make distcheck problem. The main issue is how
libevent creates its event-config.h during "make all" (vs. during
configure).  The prior method around this didn't work because it wrote
an event-config.h.in in the source tree -- a Bad Idea(tm).  The new
way uses AC_CONFIG_COMMAND to get stuff executed at the end of
config.status to create event-config.h.  This seems to work properly
during make distcheck.

This commit was SVN r23975.
2010-11-01 23:28:50 +00:00
Jeff Squyres
6bd41cf5d8 Fixes for vpath builds; this should enable 'make dist' again.
This commit was SVN r23973.
2010-10-29 22:07:52 +00:00
Ralph Castain
0171e05942 Only add include paths for event headers if --with-devel-headers was specified
This commit was SVN r23968.
2010-10-29 00:43:10 +00:00
Ralph Castain
838ed14401 Include the libevent headers when --with-devel-headers is specified. Ensure that the proper include paths are added to the wrapper compilers - thanks to Jeff for figuring out how to do it.
This commit was SVN r23967.
2010-10-28 21:26:07 +00:00
Ralph Castain
9ea2b196ce Convert the opal_event framework to use direct function calls instead of hiding functions behind function pointers. Eliminate the opal_object_t abstraction of libevent's event struct so it can be directly passed to the libevent functions.
Note: the ompi_check_libfca.m4 file had to be modified to avoid it stomping on global CPPFLAGS and the like. The file was also relocated to the ompi/config directory as it pertains solely to an ompi-layer component.

Forgive the mid-day configure change, but I know Shiqing is working the windows issues and don't want to cause him unnecessary redo work.

This commit was SVN r23966.
2010-10-28 15:22:46 +00:00
Brian Barrett
50394a05f2 Restore ordering to installdirs components
This commit was SVN r23964.
2010-10-28 01:03:16 +00:00
Terry Dontje
b3f2ac8d46 removed direct include of stdbool.h from event.h that was causing studio C++ issues. Also removed include of stdbool.h in a couple other places since it was already being pulled in via opal_config_bottom.h.
This commit was SVN r23963.
2010-10-27 20:47:42 +00:00
Shiqing Fan
199df1eadf Rename a few var names.
This commit was SVN r23959.
2010-10-27 11:52:57 +00:00
Jeff Squyres
33c3b71317 We had long-ago added a new loop type to libevent: EVLOOP_ONELOOP.
After talking with Brian, we're pretty sure that this is only because
really, really old libevent didn't allow bitwise or-ing of the other
loop types, because what we really need is (EVLOOP_ONCE |
EVLOOP_NONBLOCK).  And that's what EVLOOP_ONELOOP did (i.e., we
changed the logic of libevent's event.c to let ONELOOP do both ONCE
and NONBLOCK things).

In the new libevent version, we didn't implement EVLOOP_ONELOOP
properly.  As a result, we got hangs in the SM BTL add_procs
function.  Note that the SM BTL wasn't to blame -- it was purely a
side-effect of bad ONELOOP integration (i.e., if you got past the SM
BTL add_procs, you may well have hung somewhere else).

This commit removes all ONELOOP customizations from event.c and
returns it to (almost) its original state from the libevent 2.0.7-rc
distribution.  Everywhere in the code base where we used ONELOOP, we
now use (ONCE | NONBLOCK).

This commit was SVN r23957.
2010-10-26 20:29:22 +00:00
Ralph Castain
a5c440c974 Turn off libevent's internal thread support to (hopefully) minimize performance hit
This commit was SVN r23956.
2010-10-26 20:10:44 +00:00
Shiqing Fan
fae7076d64 add new files into the tarball.
This commit was SVN r23952.
2010-10-26 14:55:37 +00:00
Shiqing Fan
a3d9c91ff7 Exclude stdbool.h for Windows, and use the definition in opal. Migrate the socket pair support from libevent. Fix other minor things and make it compile.
This commit was SVN r23951.
2010-10-26 14:53:50 +00:00
Ralph Castain
847e43703f Remove cruft
This commit was SVN r23950.
2010-10-26 14:49:36 +00:00
Shiqing Fan
b2c3cb300c Correctly configure the new libevent mca for Windows.
This commit was SVN r23946.
2010-10-26 09:33:47 +00:00
Ralph Castain
86c7365e8e Clean up a few initialization issues - don't think these are impacting the shared memory situation as it didn't fix the problem.
Setup the event API to support multiple bases in preparation for splitting the OMPI and ORTE events. Holding here pending shared memory resolution.

This commit was SVN r23943.
2010-10-26 02:41:42 +00:00
Jeff Squyres
ed1e9a412a Need these files in all tarballs -- so don't conditionally add them to
EXTRA_DIST. 

This commit was SVN r23938.
2010-10-25 18:31:38 +00:00
Jeff Squyres
d14474969b Need this variable in optimized builds, too.
This commit was SVN r23937.
2010-10-25 18:31:01 +00:00
George Bosilca
bc3e1376ba event-config.h only exists in the builddir, so we need to explicitly
include it while building.

This commit was SVN r23936.
2010-10-25 18:29:52 +00:00
George Bosilca
c2e40f8616 Remove a warning about signed to unsigned comparison.
This commit was SVN r23935.
2010-10-25 18:29:11 +00:00
George Bosilca
b9a06afd98 opal_event_libevent207 is prototyped as const, so it should be defined as const.
This commit was SVN r23934.
2010-10-25 18:28:42 +00:00
Jeff Squyres
1d1571a86c Fix vpath builds.
This commit was SVN r23932.
2010-10-25 17:48:02 +00:00
Ralph Castain
a04da165bc Remove the sample and test code from the libevent distro - don't need to include them in ompi
This commit was SVN r23931.
2010-10-25 14:53:33 +00:00
Ralph Castain
bab990d812 Revert r23928 as being the incorrect fix. The correct fix is not to include ipv6 interfaces when ipv6 support was not requested.
This commit was SVN r23930.

The following SVN revision numbers were found above:
  r23928 --> open-mpi/ompi@7394f6d167
2010-10-25 14:31:18 +00:00
Abhishek Kulkarni
c671ec52d1 Fix broken trunk compile after the libevent changes.
This commit was SVN r23929.
2010-10-25 14:11:48 +00:00
Ralph Castain
7394f6d167 Silence warnings about IPV6 sa_family not known when ipv6 support is not enabled in configure
This commit was SVN r23928.
2010-10-25 13:56:23 +00:00
Ralph Castain
fceabb2498 Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac.
This is a fairly intrusive change, but outside of the moving of opal/event to opal/mca/event, the only changes involved (a) changing all calls to opal_event functions to reflect the new framework instead, and (b) ensuring that all opal_event_t objects are properly constructed since they are now true opal_objects.

Note: Shiqing has just returned from vacation and has not yet had a chance to complete the Windows integration. Thus, this commit almost certainly breaks Windows support on the trunk. However, I want this to have a chance to soak for as long as possible before I become less available a week from today (going to be at a class for 5 days, and thus will only be sparingly available) so we can find and fix any problems.

Biggest change is moving the libevent code from opal/event to a new opal/mca/event framework. This was done to make it much easier to update libevent in the future. New versions can be inserted as a new component and tested in parallel with the current version until validated, then we can remove the earlier version if we so choose. This is a statically built framework ala installdirs, so only one component will build at a time. There is no selection logic - the sole compiled component simply loads its function pointers into the opal_event struct.

I have gone thru the code base and converted all the libevent calls I could find. However, I cannot compile nor test every environment. It is therefore quite likely that errors remain in the system. Please keep an eye open for two things:

1. compile-time errors: these will be obvious as calls to the old functions (e.g., opal_evtimer_new) must be replaced by the new framework APIs (e.g., opal_event.evtimer_new)

2. run-time errors: these will likely show up as segfaults due to missing constructors on opal_event_t objects. It appears that it became a typical practice for people to "init" an opal_event_t by simply using memset to zero it out. This will no longer work - you must either OBJ_NEW or OBJ_CONSTRUCT an opal_event_t. I tried to catch these cases, but may have missed some. Believe me, you'll know when you hit it.

There is also the issue of the new libevent "no recursion" behavior. As I described on a recent email, we will have to discuss this and figure out what, if anything, we need to do.

This commit was SVN r23925.
2010-10-24 18:35:54 +00:00
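
The run-time failure mode described in item 2 of this commit (zeroing an object with memset instead of constructing it) can be sketched with a toy object. Everything here is hypothetical: `sk_event`, `SK_MAGIC`, `sk_event_zero`, `sk_event_construct` and `sk_event_is_valid` only stand in for `opal_event_t` and the `OBJ_NEW` / `OBJ_CONSTRUCT` machinery, whose real internals differ.

```c
#include <string.h>

#define SK_MAGIC 0x5ca1ab1eU

/* Toy object whose constructor establishes non-zero state. */
struct sk_event {
    unsigned magic;   /* set by the constructor */
    int fd;           /* -1 means "no descriptor yet" */
};

/* The old habit: zero the memory and hope for the best. */
static void sk_event_zero(struct sk_event *ev)
{
    memset(ev, 0, sizeof(*ev));
}

/* The required step: run the constructor. */
static void sk_event_construct(struct sk_event *ev)
{
    ev->magic = SK_MAGIC;
    ev->fd = -1;
}

/* Code that uses the object can only trust constructed instances. */
static int sk_event_is_valid(const struct sk_event *ev)
{
    return ev->magic == SK_MAGIC;
}
```

A zeroed instance fails the validity check, which mirrors why memset-initialized events stop working once the type has a real constructor.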
Jeff Squyres
082086085b Fix some wordings in the test messages
This commit was SVN r23924.
2010-10-23 14:35:25 +00:00
Jeff Squyres
8ffb046649 Fix a few problems with the compiler visibility test:
 * Update to be safe for AC 2.68 by using AC_LINK_IFELSE instead of
   AC_TRY_LINK
 * If enable visibility was used, ensure we fail if the compiler
   doesn't support it
 * Rename OMPI_CHECK_VISIBILITY -> OPAL_CHECK_VISIBILITY (and all
   internal variables)

This commit was SVN r23923.
2010-10-23 14:32:44 +00:00
Ralph Castain
29b16cc800 Add missing include
This commit was SVN r23892.
2010-10-15 04:04:20 +00:00
Jeff Squyres
e09bbb49a9 No need to have this AC_ARG_WITH in every component configure.m4 -- just put it up in the framework-level configure.m4.
This commit was SVN r23890.
2010-10-14 22:39:48 +00:00
Jeff Squyres
66d15035ab Replace some intentional-segv's with abort(). Seems safer and doesn't
cause all kinds of compiler warnings.

This commit was SVN r23889.
2010-10-14 22:01:14 +00:00
Sylvain Jeaugey
5fb2a2f2c9 Add a check for the ummunotify device before setting up ptmalloc2 hooks.
This commit was SVN r23882.
2010-10-11 15:05:57 +00:00
Sylvain Jeaugey
78176d2aeb Fix missing include in ummunotify
This commit was SVN r23881.
2010-10-11 15:03:00 +00:00
Jeff Squyres
69a64e5905 Fix typo that prevented the valgrind component from configuring properly
This commit was SVN r23874.
2010-10-07 22:39:08 +00:00
Jeff Squyres
a95ca7444e Fix a meaningless compare of an unsigned against 0. Rework the logic
a bit so that the secondary loop isn't even necessary; makes the whole
thing much simpler, anyway.

This commit was SVN r23860.
2010-10-07 15:04:50 +00:00
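
For readers unfamiliar with the pitfall named in this commit, here is a general sketch (not the code that was changed). With an unsigned counter, a condition such as `i >= 0` is always true, so a count-down loop written that way never ends. The hypothetical `sk_find_last` below counts down without that comparison.

```c
#include <stddef.h>

/* Return the last index holding `needle`, or `count` if it is absent.
 * The post-decrement test stops cleanly at zero, so no comparison of an
 * unsigned value against 0 with >= is needed. */
static size_t sk_find_last(const int *values, size_t count, int needle)
{
    size_t i = count;

    while (i-- > 0) {
        if (values[i] == needle) {
            return i;
        }
    }
    return count;
}
```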
Ralph Castain
d9389689b1 Fix yet another mangling
This commit was SVN r23818.
2010-09-30 17:53:52 +00:00
Jeff Squyres
73bcc4a36b Fix mistake that came in via the ompi-agen tree in r23764. The mistake wasn't part of the core autogen upgrade; it was an additional 'bonus' cleanup. Oops. The mistake will always create a set of directories under installdir, even if you do not --with-devel-headers. The set of directories will be empty, but still -- they should not be there at all. This commit fixes that -- the directories are not created at all if you do not --with-devel-headers
This commit was SVN r23801.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-24 22:53:28 +00:00
Jeff Squyres
7ef20f60f3 Autoconf updates to make us compatible with AC 2.68. Thanks to Ralf W. for the patch!
This commit was SVN r23797.
2010-09-23 22:37:52 +00:00
Ralph Castain
407eefc66d Update the if configure to include "opal" so they will build!
This commit was SVN r23787.
2010-09-22 03:19:15 +00:00
Ralph Castain
3631e4e936 Revert remaining svn kruft from r23764
This commit was SVN r23786.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-22 01:11:40 +00:00
Jeff Squyres
0ca617e570 Make this a warning, not an error.
This commit was SVN r23767.
2010-09-18 07:14:58 +00:00
Ralph Castain
40a2bfa238 WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.

Please note that configure requirements on components HAVE CHANGED. For example, a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.

This commit was SVN r23764.
2010-09-17 23:04:06 +00:00
Rolf vandeVaart
09750d0310 Need output.h header file for opal_output() definition.
Otherwise, build will fail when configuring with --enable-picky.

This commit was SVN r23763.
2010-09-17 12:22:17 +00:00
Shiqing Fan
9a47ca1995 Correct the place of including the if.h, and change retain_loopback to opal_if_retain_loopback for windows module too.
This commit was SVN r23756.
2010-09-14 14:03:48 +00:00
Ralph Castain
c74ce1632a Catch a couple of places (one hidden inside an #if 0, other in solaris module) where retain_loopback needs to be opal_if_retain_loopback
This commit was SVN r23755.
2010-09-14 11:37:10 +00:00
Shiqing Fan
95b17c1e82 Add a missing header for if windows.
This commit was SVN r23754.
2010-09-14 07:51:38 +00:00
Ralph Castain
e96b5f486f Reorganize the opal interface code in opal/util/if.c per prior emails and telecon discussions. Move the interface discovery code into a framework so that configuration logic can separate it out (instead of the prior #if-#else confusion).
All interface APIs for accessing the info remain unchanged in opal/util/if.c.

This has been tested on Mac, Linux, and NetBSD. Nobody else seemed interested in testing it, so there may be some future problems revealed as people try it on other OSs.

This commit was SVN r23743.
2010-09-13 01:58:51 +00:00
Jeff Squyres
3b14366c85 Fix a copyright statement
This commit was SVN r23741.
2010-09-12 09:55:01 +00:00
Rolf vandeVaart
ef8090ec71 Fix the ia32 atomic add and subtract functions so they
do the right thing.  They now properly return
the value after the update.  This also fixes all warnings
reported by the Sun Studio compiler.  George provided the
new assembly routines.  I added some configure code to make
sure the compilers could handle it.
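
The "value after the update" contract can be illustrated with C11 atomics. This is only a sketch of the semantics, not the ia32 assembly that George provided; the helper names sketch_atomic_add_32 and sketch_atomic_sub_32 are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add delta and return the value AFTER the update.
 * atomic_fetch_add() returns the value BEFORE the update, so delta is
 * applied once more to the returned copy. */
static inline int32_t sketch_atomic_add_32(_Atomic int32_t *addr, int32_t delta)
{
    assert(addr != NULL);
    return atomic_fetch_add(addr, delta) + delta;
}

/* Hypothetical helper: subtract delta and return the value AFTER the update. */
static inline int32_t sketch_atomic_sub_32(_Atomic int32_t *addr, int32_t delta)
{
    assert(addr != NULL);
    return atomic_fetch_sub(addr, delta) - delta;
}
```

A caller that uses the result as a reference count can then compare it against zero directly, without re-reading the variable.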

This fixes trac:2560.

This commit was SVN r23721.

The following Trac tickets were found above:
  Ticket 2560 --> https://svn.open-mpi.org/trac/ompi/ticket/2560
2010-09-08 10:47:15 +00:00
Rolf vandeVaart
14e7bcc383 Create new entries in the wrapper data files so the
administrator can specify compiler flags that get
inserted into the command before the user's flags.
These flags can be specified at configure time.
Reviewed by Jeff Squyres.

This fixes ticket #2474.

This commit was SVN r23709.
2010-09-02 10:47:55 +00:00
Rainer Keller
97511912ec - Fixup several functions, that cannot return
- Add one instance where we do not use a parameter in a function
 - Fix a buglet in commit r23689, where the attribute-for-function ptrs
   was applied.

This commit was SVN r23690.

The following SVN revision numbers were found above:
  r23689 --> open-mpi/ompi@5eb571c458
2010-08-31 12:21:13 +00:00
Rainer Keller
5eb571c458 - As suggested in CMR #2558, attribute-macros should be
tested on function pointers and assigned accordingly,
   instead of using the pre-processor in the header files.

   A functional change is (re-) specifying __opal_attribute_noreturn__
   on orte_errmgr_base_abort(): All modules in the errmgr framework
   either use this function, or define their own abort function,
   which sets __opal_attribute_noreturn__.
   This attribute was taken out with the errmgr overhaul in r22872.

This commit was SVN r23689.

The following SVN revision numbers were found above:
  r22872 --> open-mpi/ompi@e4f2d03d28
2010-08-31 10:28:51 +00:00
Brad Benton
09c4f4d95c Added copyright notices for the files modified in r23669.
This commit was SVN r23687.

The following SVN revision numbers were found above:
  r23669 --> open-mpi/ompi@271cfa8c9a
2010-08-30 17:46:47 +00:00
Jeff Squyres
3eedbee7a4 Fixes trac:2541. Ensure that we keep CPPFLAGS if a non-standard valgrind location was specified. CMR:v1.4.3 CMR:v1.5
This commit was SVN r23680.

The following Trac tickets were found above:
  Ticket 2541 --> https://svn.open-mpi.org/trac/ompi/ticket/2541
2010-08-27 22:45:02 +00:00
Rainer Keller
4abcf5a0d7 - The Sun-compiler 12 update 1 complains about noreturn-attributes
assigned to function-declarations.
   Check this case and mark the currently only case existing in trunk.

   Thanks to Paul Hargrove for bringing this up.

   Let's test the svn commit msg CMR:v1.5

This commit was SVN r23676.
2010-08-27 09:18:30 +00:00
Rainer Keller
044b387d3c - If we don't compile with PGI, then mark the parameter as unused,
otherwise we get swamped with warnings by gcc, everywhere header is
   included.
 - Remove redundant declaration of opal_datatype_safeguard_pointer_debug_breakpoint

   Check whether  CMR:v1.5 works

This commit was SVN r23674.
2010-08-26 15:07:18 +00:00
Nysal Jan
271cfa8c9a Fix the opal_path_nfs test for GPFS. Reported by Paul H. Hargrove
This commit was SVN r23669.
2010-08-26 10:10:16 +00:00
Jeff Squyres
97fb426325 Per long-ago RFC, now that the odls default module reports errors nicely, remove all paffinity components except for hwloc and test.
This commit was SVN r23666.
2010-08-25 22:34:30 +00:00
Jeff Squyres
a5ce58f098 Define that we return OPAL_ERR_TIMEOUT if the other end of the socket
closes in an opal_fd_read().

This commit was SVN r23650.
2010-08-24 19:07:04 +00:00
Ethan Mallove
f42c2a737f Fixes trac:2532 - "MPI_Put can result in SIGBUS on SPARC"
Reviewed by Rolf V and Brian B

This commit was SVN r23649.

The following Trac tickets were found above:
  Ticket 2532 --> https://svn.open-mpi.org/trac/ompi/ticket/2532
2010-08-24 18:10:43 +00:00
Ralph Castain
51833bfe6c Not -everyone- wants to ignore loopback devices. Give us a choice.
This commit was SVN r23637.
2010-08-24 02:37:05 +00:00
Shiqing Fan
c110edbf44 Use exclude lists for non-ordinary sub directories check.
This commit was SVN r23631.
2010-08-23 09:43:05 +00:00
Rolf vandeVaart
e71827b8ff Undo 4 of the 5 changes introduced by r22638. Leave
one of them in as it may still be needed on Solaris.

This fixes trac:2530.

This commit was SVN r23626.

The following SVN revision numbers were found above:
  r22638 --> open-mpi/ompi@2a4b1227d9

The following Trac tickets were found above:
  Ticket 2530 --> https://svn.open-mpi.org/trac/ompi/ticket/2530
2010-08-18 20:06:50 +00:00
Rainer Keller
33f2b9398e - This warning now is not supported anymore. Using it generates
a warning itself (when another warning is generated within the file),
   which can be rather annoying.
   Therefore check for output regarding this unrecognized warning.

This commit was SVN r23624.
2010-08-18 06:01:23 +00:00
Ralph Castain
23904c2f3e Correct the extra_dist path to the .windows file
This commit was SVN r23613.
2010-08-14 01:21:58 +00:00
Jeff Squyres
a2f349167e Update hwloc to 1.0.3a1r2398. This fixes a problem with
linking against libibverbs on Solaris.

Sorry for the mid-day configure change folks; I meant to commit this
last night and forgot.  :-(

This commit was SVN r23606.
2010-08-13 13:18:09 +00:00
Shiqing Fan
550f180014 Add a windows support file into the tarball.
This commit was SVN r23605.
2010-08-13 11:54:13 +00:00
Rainer Keller
14aad075eb - On Jaguar, we don't have pretty printed stackframe, aka no opal_stackframe_output*
This commit was SVN r23602.
2010-08-12 14:44:56 +00:00
Shiqing Fan
330999e36c Some fixes for C/R enhancement on Windows. Add the option and fix some type casts, just let it compile.
This commit was SVN r23599.
2010-08-12 13:31:37 +00:00
Josh Hursey
e12ca48cd9 A number of C/R enhancements per RFC below:
http://www.open-mpi.org/community/lists/devel/2010/07/8240.php

Documentation:
  http://osl.iu.edu/research/ft/

Major Changes: 
-------------- 
 * Added C/R-enabled Debugging support. 
   Enabled with the --enable-crdebug flag. See the following website for more information: 
   http://osl.iu.edu/research/ft/crdebug/ 
 * Added Stable Storage (SStore) framework for checkpoint storage 
   * 'central' component does a direct to central storage save 
   * 'stage' component stages checkpoints to central storage while the application continues execution. 
     * 'stage' supports offline compression of checkpoints before moving (sstore_stage_compress) 
     * 'stage' supports local caching of checkpoints to improve automatic recovery (sstore_stage_caching) 
 * Added Compression (compress) framework to support 
 * Add two new ErrMgr recovery policies 
   * {{{crmig}}} C/R Process Migration 
   * {{{autor}}} C/R Automatic Recovery 
 * Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}} ErrMgr component 
 * Added CR MPI Ext functions (enable them with {{{--enable-mpi-ext=cr}}} configure option) 
   * {{{OMPI_CR_Checkpoint}}} (Fixes trac:2342) 
   * {{{OMPI_CR_Restart}}} 
   * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules) 
   * {{{OMPI_CR_INC_register_callback}}} (Fixes trac:2192) 
   * {{{OMPI_CR_Quiesce_start}}} 
   * {{{OMPI_CR_Quiesce_checkpoint}}} 
   * {{{OMPI_CR_Quiesce_end}}} 
   * {{{OMPI_CR_self_register_checkpoint_callback}}} 
   * {{{OMPI_CR_self_register_restart_callback}}} 
   * {{{OMPI_CR_self_register_continue_callback}}} 
 * The ErrMgr predicted_fault() interface has been changed to take an opal_list_t of ErrMgr defined types. This will allow us to better support a wider range of fault prediction services in the future. 
 * Add a progress meter to: 
   * FileM rsh (filem_rsh_process_meter) 
   * SnapC full (snapc_full_progress_meter) 
   * SStore stage (sstore_stage_progress_meter) 
 * Added 2 new command line options to ompi-restart 
   * --showme : Display the full command line that would have been exec'ed. 
   * --mpirun_opts : Command line options to pass directly to mpirun. (Fixes trac:2413) 
 * Deprecated some MCA params: 
   * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir 
   * snapc_base_global_snapshot_dir deprecated, use sstore_base_global_snapshot_dir 
   * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared 
   * snapc_base_store_in_place deprecated, replaced with different components of SStore 
   * snapc_base_global_snapshot_ref deprecated, use sstore_base_global_snapshot_ref 
   * snapc_base_establish_global_snapshot_dir deprecated, never well supported 
   * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem 

Minor Changes: 
-------------- 
 * Fixes trac:1924 : {{{ompi-restart}}} now recognizes path prefixed checkpoint handles and does the right thing. 
 * Fixes trac:2097 : {{{ompi-info}}} should now report all available CRS components 
 * Fixes trac:2161 : Manual checkpoint movement. A user can 'mv' a checkpoint directory from the original location to another and still restart from it. 
 * Fixes trac:2208 : Honor various TMPDIR variables instead of forcing {{{/tmp}}}
 * Move {{{ompi_cr_continue_like_restart}}} to {{{orte_cr_continue_like_restart}}} to be more flexible in where this should be set. 
 * opal_crs_base_metadata_write* functions have been moved to SStore to support a wider range of metadata handling functionality. 
 * Cleanup the CRS framework and components to work with the SStore framework. 
 * Cleanup the SnapC framework and components to work with the SStore framework (cleans up these code paths considerably). 
 * Add 'quiesce' hook to CRCP for a future enhancement. 
 * We now require a BLCR version that supports {{{cr_request_file()}}} or {{{cr_request_checkpoint()}}} in order to make the code more maintainable. Note that {{{cr_request_file}}} has been deprecated since 0.7.0, so we prefer to use {{{cr_request_checkpoint()}}}. 
 * Add optional application level INC callbacks (registered through the CR MPI Ext interface). 
 * Increase the {{{opal_cr_thread_sleep_wait}}} parameter to 1000 microseconds to make the C/R thread less aggressive. 
 * {{{opal-restart}}} now looks for cache directories before falling back on stable storage when asked. 
 * {{{opal-restart}}} also supports local decompression before restarting
 * {{{orte-checkpoint}}} now uses the SStore framework to work with the metadata 
 * {{{orte-restart}}} now uses the SStore framework to work with the metadata 
 * Remove the {{{orte-restart}}} preload option. This was removed since the user only needs to select the 'stage' component in order to support this functionality. 
 * Since the '-am' parameter is saved in the metadata, {{{ompi-restart}}} no longer hard codes {{{-am ft-enable-cr}}}. 
 * Fix {{{hnp}}} ErrMgr so that if a previous component in the stack has 'fixed' the problem, then it should be skipped. 
 * Make sure to decrement the number of 'num_local_procs' in the orted when one goes away. 
 * odls now checks the SStore framework to see if it needs to load any checkpoint files before launching (to support 'stage'). This separates the SStore logic from the --preload-[binary|files] options. 
 * Add unique IDs to the named pipes established between the orted and the app in SnapC. This is to better support migration and automatic recovery activities. 
 * Improve the checks for 'already checkpointing' error path. 
 * Add a recovery output timer, to show how long it takes to restart a job
 * Do a better job of cleaning up the old session directory on restart. 
 * Add a local module to the autor and crmig ErrMgr components. These small modules prevent the 'orted' component from attempting a local recovery (Which does not work for MPI apps at the moment) 
 * Add a fix for bounding the checkpointable region between MPI_Init and MPI_Finalize. 

This commit was SVN r23587.

The following Trac tickets were found above:
  Ticket 1924 --> https://svn.open-mpi.org/trac/ompi/ticket/1924
  Ticket 2097 --> https://svn.open-mpi.org/trac/ompi/ticket/2097
  Ticket 2161 --> https://svn.open-mpi.org/trac/ompi/ticket/2161
  Ticket 2192 --> https://svn.open-mpi.org/trac/ompi/ticket/2192
  Ticket 2208 --> https://svn.open-mpi.org/trac/ompi/ticket/2208
  Ticket 2342 --> https://svn.open-mpi.org/trac/ompi/ticket/2342
  Ticket 2413 --> https://svn.open-mpi.org/trac/ompi/ticket/2413
2010-08-10 20:51:11 +00:00
Terry Dontje
b74ef351b7 Added new solaris sysinfo module. Also added code to assign
orte_local_chip_type and orte_local_chip_model in MPI processes if the
appropriate sysinfo module found the values on the machine.

This commit was SVN r23581.
2010-08-09 19:28:56 +00:00
Nysal Jan
b6524f6a92 Fix the conditional branch, jump to the correct location. Reported by Matthew Clark
This commit was SVN r23576.
2010-08-09 10:07:58 +00:00
Ralph Castain
9c69175117 If debug is enabled, provide an mca param and supporting logic to output when OPAL_ACQUIRE_THREAD is waiting and has obtained the thread, and when OPAL_RELEASE_THREAD releases it.
This commit was SVN r23557.
2010-08-05 16:25:32 +00:00
Shiqing Fan
b8db8d0ef8 Need to change another variable name.
This commit was SVN r23556.
2010-08-05 12:38:28 +00:00
Shiqing Fan
714883d472 A better way to make this work with VS 2010.
This commit was SVN r23544.
2010-08-03 09:06:50 +00:00
Shiqing Fan
e822f465b5 Remove a bunch of warnings due to the new POSIX supplement in VS 2010.
This commit was SVN r23540.
2010-08-02 12:16:29 +00:00
Josh Hursey
ba7e94dd89 Some relatively minor C/R related cleanup
* Fix a configure warning for checking --enable-ft-thread
 * In hnp and orted ErrMgr components check to see if other components have already recovered this process before trying to recover it again.
 * Fix 'npernode' for restarting using the resilient rmaps component
 * export ompi_info_set, so that internal functionality can use it.

This commit was SVN r23535.
2010-07-30 18:59:34 +00:00
Shiqing Fan
ea7bf2bd9e Correctly check the data type alignment for VS 2010 environment, and set the event include paths to global level, in order to make the clever VS load them.
This commit was SVN r23534.
2010-07-30 14:25:15 +00:00
Ralph Castain
0ed98967ed Update the thread protection in the ring_buffer class
This commit was SVN r23532.
2010-07-29 02:12:44 +00:00
Rolf vandeVaart
3d9b05ba2b Fix bug introduced by r23463. We now handle positive
error codes correctly again.  Also fix a typo.
Reviewed by Jeff Squyres. 

This commit was SVN r23531.

The following SVN revision numbers were found above:
  r23463 --> open-mpi/ompi@2af3e6e5ae
2010-07-28 19:19:27 +00:00
Jeff Squyres
f313257022 This file should really be distclean, not maintainer clean (it's not
shipped in the tarball).

This commit was SVN r23525.
2010-07-28 14:24:51 +00:00
Jeff Squyres
dca1ee8822 Revert r23495. Per on-list discussion, it doesn't do what it was
supposed to do, and there's disagreement about whether the concept
that it was supposed to do was the Right Thing anyway.

http://www.open-mpi.org/community/lists/devel/2010/07/8223.php

This commit was SVN r23517.

The following SVN revision numbers were found above:
  r23495 --> open-mpi/ompi@32e6dae8b0
2010-07-27 22:38:07 +00:00
Jeff Squyres
88b7923fc5 At least on NetBSD 5.0_STABLE with Libtool 2.2.6b, lt_dlerror() can
sometimes return NULL, so be sure to handle that case properly.

This commit was SVN r23503.
2010-07-27 14:15:53 +00:00
Jeff Squyres
245dc1a86d Add a cast to avoid a compiler warning on BSD.
This commit was SVN r23502.
2010-07-27 14:14:37 +00:00
Jeff Squyres
0ce1a82cde This commit looks much bigger than it is. There are only 2
substantive changes in this commit; the rest are minor style changes:

 1. Change an OBJ_NEW(opal_list_item_t) to OBJ_NEW(opal_if_t).  This
    was causing memory corruption in the BSD code paths.
 1. Move some local variables from the top of opal_if_init() to inside
    the non-BSD code paths so that we avoid bunches of warnings about
    unused variables when compiling on BSD.  In doing so, I indented
    the whole non-BSD section one level deeper, making the commit look
    huge. 
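
Why item 1 corrupted memory can be shown with a reduced model: the object is allocated with the size of the base list item, but code then writes fields that only exist in the larger derived type. The types below (sketch_item_t, sketch_if_t) are hypothetical stand-ins, not the real opal_list_item_t and opal_if_t definitions, and plain calloc() stands in for OBJ_NEW.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical base type, playing the role of a list item. */
typedef struct sketch_item {
    struct sketch_item *next;
} sketch_item_t;

/* Hypothetical derived type: the base comes first, extra fields follow. */
typedef struct sketch_if {
    sketch_item_t super;
    char name[32];
    int index;
} sketch_if_t;

/* Allocate with the size of the DERIVED type.  Allocating only
 * sizeof(sketch_item_t) and then writing name/index would write past
 * the end of the allocation, which is the class of bug fixed here. */
static sketch_if_t *sketch_if_new(const char *name, int index)
{
    sketch_if_t *intf;

    assert(name != NULL);
    intf = calloc(1, sizeof(*intf));
    if (NULL == intf) {
        return NULL;
    }
    /* calloc zero-fills, so the copy stays NUL-terminated. */
    strncpy(intf->name, name, sizeof(intf->name) - 1);
    intf->index = index;
    return intf;
}
```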

I also added a few {} around 1-line blocks, added some spaces, broke a
few lines, re-formatted a few comments, ...etc.  Trivial stuff.

This commit was SVN r23501.
2010-07-27 13:46:55 +00:00
Ralph Castain
b3a8a394f0 Cleanup some lingering references to OMPI_SETUP_C and OMPI_SETUP_CXX that generated warnings. Follow the new naming convention by changing OMPI_SETUP_ASM to OPAL_SETUP_ASM
This commit was SVN r23500.
2010-07-27 04:51:50 +00:00
Jeff Squyres
41edaa1fe5 While we're here, also rename this macro: it really should be
OPAL_SETUP_CC. 

This commit was SVN r23496.
2010-07-26 22:09:24 +00:00
Jeff Squyres
32e6dae8b0 Add -gstabs+ compiler switch if we're on OSX and -g is in CFLAGS and that flag works with a test compile
This commit was SVN r23495.
2010-07-26 22:05:41 +00:00
Shiqing Fan
71d2749b6b Fix a header problem on Windows.
This commit was SVN r23483.
2010-07-23 07:52:34 +00:00
Jeff Squyres
7d7c0aa48f Somehow the check for the specific value "external" got dropped in the
logic (even though the "else" clause for handling it was there).  This
commit puts back the specific check for the word "external".

Thanks to Jed Brown for noticing the issue.  Fixes trac:2503.

This commit was SVN r23475.

The following Trac tickets were found above:
  Ticket 2503 --> https://svn.open-mpi.org/trac/ompi/ticket/2503
2010-07-22 11:42:15 +00:00
Jeff Squyres
29c1ad4196 Forgot BEGIN/END C_DECLS.
This commit was SVN r23453.
2010-07-21 11:05:08 +00:00
Jeff Squyres
b3952e4f07 Use const for the opal_fd_write() function, just to be nice.
This commit was SVN r23452.
2010-07-21 11:01:16 +00:00
Jeff Squyres
ab5fc1b570 Add trivial functions to loop over read()'ing and write()'ing with a
file descriptor (i.e., read and write complete messages, transparently
handling partial reads/writes, EAGAIN, and EINTR).

This code effectively already exists in a few places in the code base;
this is mainly a consolidation.
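
A minimal sketch of such loops, assuming POSIX read()/write(). The function names, the status codes and the choice to report a closed peer as a distinct code are assumptions made for illustration; they are not taken from the committed opal_fd_read()/opal_fd_write() code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical status codes for this sketch only. */
enum { FDSK_SUCCESS = 0, FDSK_ERR_IN_ERRNO = -1, FDSK_ERR_CLOSED = -2 };

/* Read exactly len bytes, retrying on partial reads, EAGAIN and EINTR.
 * Note: on a non-blocking fd this spins on EAGAIN. */
static int sketch_fd_read(int fd, size_t len, void *buffer)
{
    char *b = buffer;

    assert(buffer != NULL || 0 == len);
    while (len > 0) {
        ssize_t rc = read(fd, b, len);
        if (rc < 0 && (EAGAIN == errno || EINTR == errno)) {
            continue;                  /* transient condition: retry */
        } else if (rc > 0) {
            len -= (size_t) rc;        /* partial read: advance, go on */
            b += rc;
        } else if (0 == rc) {
            return FDSK_ERR_CLOSED;    /* peer closed mid-message */
        } else {
            return FDSK_ERR_IN_ERRNO;  /* hard error, see errno */
        }
    }
    return FDSK_SUCCESS;
}

/* Write exactly len bytes, with the same retry rules. */
static int sketch_fd_write(int fd, size_t len, const void *buffer)
{
    const char *b = buffer;

    assert(buffer != NULL || 0 == len);
    while (len > 0) {
        ssize_t rc = write(fd, b, len);
        if (rc < 0 && (EAGAIN == errno || EINTR == errno)) {
            continue;
        } else if (rc > 0) {
            len -= (size_t) rc;
            b += rc;
        } else {
            return FDSK_ERR_IN_ERRNO;  /* hard error or zero-byte write */
        }
    }
    return FDSK_SUCCESS;
}
```

Callers on both ends of a pipe or socket can then exchange fixed-size messages without repeating the retry logic at every call site.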

This commit was SVN r23450.
2010-07-20 19:53:49 +00:00
Jeff Squyres
64cb8f5d7f Another round of man page cleanups from Debian maintainer Manuel
Prinz.  Many thanks!

This commit was SVN r23445.
2010-07-20 14:07:18 +00:00
Christopher Yeoh
8a3d5d4e1c Adds missing sys/stat.h include needed for more recent versions of glibc
This commit was SVN r23440.
2010-07-20 06:31:16 +00:00
Jeff Squyres
5ab634555a Apparently, Cisco plans to be working on Open MPI for a veeeeery long time!
This commit was SVN r23433.
2010-07-19 19:31:59 +00:00
Jeff Squyres
57d89d1c0c Remove a lot of kruft from the hwloc paffinity directory that we're
not using in Open MPI (i.e., that stuff is only used in the standalone
builds of hwloc -- it's not compiled/installed/used by Open MPI).

This commit was SVN r23416.
2010-07-14 20:46:47 +00:00
Jeff Squyres
dc7d30b0ed We (Ralph and Jeff) discovered that if the OPAL_DESTDIR environment
variable was set, it was prefixed to ''all'' values in the wrapper
compiler data text files.  For example, if OPAL_DESTDIR was set to
/tmp/bogus and a wrapper compiler data file contained the line:

  preprocessor_flags=-pthread

The value would be expanded to:

  /tmp/bogus/-pthread

Which is clearly wrong.  After some back-and-forth with Ralph and
Brian, Brian submitted this patch that fixes the problem.  Now we
handle three cases properly (assume that configure was invoked with
--prefix=/opt/openmpi and no other directory specifications, and
$OPAL_DESTDIR is set to /tmp/buildroot):

1. Individual directories, such as libdir.  These need to be prepended
with DESTDIR.  I.e., return /tmp/buildroot/opt/openmpi/lib.

2. Compiler flags that have ${FIELD} values embedded in them.  For
example, consider if a wrapper compiler data file contains the
line:

  preprocessor_flags=-DMYFLAG="${prefix}/share/randomthingy/"

The value we should return is:

  -DMYFLAG="/tmp/buildroot/opt/openmpi/share/randomthingy/"

3. Compiler flags that do not have any ${FIELD} values.  For example,
consider if a wrapper compiler data file contains the line:

  preprocessor_flags=-pthread

The value we should return is:

  -pthread

Note, too, that this OPAL_DESTDIR futzing only needs to occur during
opal_init().  By the time opal_init() has completed, all values should
be substituted in that need substituting.  Hence, we take an extra
parameter (is_setup) to know whether we should do this futzing or
not.
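
The three cases reduce to one rule: prepend the DESTDIR only where a ${FIELD} is actually substituted, never at the start of the whole string. The sketch below handles a single field, ${prefix}, and is not Brian's patch; sketch_expand and its signature are hypothetical, and the handling of is_setup is one reading of the paragraph above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: expand every "${prefix}" in input.  While
 * is_setup is non-zero and destdir is given, destdir is prepended to
 * each substituted value.  Returns a malloc'ed string (caller frees). */
static char *sketch_expand(const char *input, const char *prefix,
                           const char *destdir, int is_setup)
{
    static const char field[] = "${prefix}";
    const size_t field_len = sizeof(field) - 1;
    const char *dd = (is_setup && destdir != NULL) ? destdir : "";
    const char *p;
    size_t count = 0;
    char *out, *w;

    assert(input != NULL && prefix != NULL);

    /* First pass: count the fields so the output can be sized. */
    for (p = input; (p = strstr(p, field)) != NULL; p += field_len) {
        ++count;
    }
    out = malloc(strlen(input) + count * (strlen(dd) + strlen(prefix)) + 1);
    if (NULL == out) {
        return NULL;
    }

    /* Second pass: copy literal text, substitute at each field. */
    w = out;
    p = input;
    for (;;) {
        const char *hit = strstr(p, field);
        if (NULL == hit) {
            strcpy(w, p);              /* no more fields: copy the tail */
            break;
        }
        memcpy(w, p, (size_t) (hit - p));
        w += hit - p;
        w += sprintf(w, "%s%s", dd, prefix);
        p = hit + field_len;
    }
    return out;
}
```

With prefix /opt/openmpi and destdir /tmp/buildroot, a value without any field, such as -pthread, passes through untouched because no substitution site exists.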

This commit was SVN r23402.
2010-07-14 00:53:08 +00:00
Shiqing Fan
cdc7e0bec9 Mainly type casts.
Get rid of pthread and other unnecessary stuff for Windows.

This commit was SVN r23376.
2010-07-12 16:17:56 +00:00
Jeff Squyres
c8bb7537e7 Remove include/opal/sys/cache.h -- its only purpose in life was to
#define CACHE_LINE_SIZE to 128.  This name has a conflict on NetBSD,
and it seems kinda odd to have a header file that ''only'' defines a
single value.  Also, we'll soon be raising hwloc to be a first-class
item, so having this file around seemed kinda weird.

Therefore, I replaced CACHE_LINE_SIZE with opal_cache_line_size, an
int (in opal/runtime/opal_init.c and opal/runtime/opal.h) on the
rationale that we can fill this in at runtime with hwloc info (trunk
and v1.5/beyond, only).  The only place we ''needed'' a compile-time
CACHE_LINE_SIZE was in the BTL SM (for struct padding), so I made a
new BTL_SM_ preprocessor macro with the old CACHE_LINE_SIZE value
(128).  That use isn't suitable for run-time hwloc information,
anyway.
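
The split between the two uses can be sketched as follows: struct padding needs a compile-time constant because array sizes are fixed at compile time, while size calculations can consult a run-time variable. All names below are hypothetical; in particular SKETCH_SM_CACHE_LINE_SIZE is a stand-in, not the name of the new BTL SM macro.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time value, needed for padding (hypothetical name). */
#define SKETCH_SM_CACHE_LINE_SIZE 128

/* Run-time value that could later be filled in from hardware info
 * (hypothetical stand-in for an int like opal_cache_line_size). */
static int sketch_cache_line_size = 128;

/* Keep head and tail on separate cache lines to avoid false sharing.
 * The pad sizes must be constant expressions, so the run-time variable
 * above cannot be used here. */
struct sketch_fifo_ctl {
    volatile int head;
    char pad0[SKETCH_SM_CACHE_LINE_SIZE - sizeof(int)];
    volatile int tail;
    char pad1[SKETCH_SM_CACHE_LINE_SIZE - sizeof(int)];
};

/* Run-time use: round a byte count up to a whole number of lines. */
static size_t sketch_round_up_to_line(size_t bytes)
{
    size_t line;

    assert(sketch_cache_line_size > 0);
    line = (size_t) sketch_cache_line_size;
    return ((bytes + line - 1) / line) * line;
}
```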

This commit was SVN r23349.
2010-07-06 14:33:36 +00:00
Jeff Squyres
10185343a7 Ensure that we're actually checking for *linux*. Thanks to Aleksej
Saushev for the patch.

This commit was SVN r23336.
2010-07-01 23:26:49 +00:00
Jeff Squyres
6d07a1cc0b Per comments in this commit, hwloc isn't able to find cores on all
platforms (e.g., PPC64 running RHEL 5.4) -- sometimes it only finds
PUs.  So in that case, just run the same calculation, but with PUs
instead of cores.

This commit was SVN r23305.
2010-06-25 21:36:53 +00:00
Ralph Castain
f325ac030a Add a function to prepend a string to the beginning of an argv array - useful when building app_contexts from user input
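
A minimal sketch of prepending to a NULL-terminated argv array. The helper name sketch_argv_prepend and its return convention are hypothetical; this is not the function added by the commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: insert a copy of arg at index 0 of a
 * NULL-terminated, heap-allocated argv array.  *argv may be NULL
 * (empty array).  Returns 0 on success, -1 on allocation failure. */
static int sketch_argv_prepend(char ***argv, const char *arg)
{
    size_t n = 0;
    char *copy, **grown;

    assert(argv != NULL && arg != NULL);

    if (*argv != NULL) {
        while ((*argv)[n] != NULL) {
            ++n;                       /* count existing entries */
        }
    }

    copy = malloc(strlen(arg) + 1);
    if (NULL == copy) {
        return -1;
    }
    strcpy(copy, arg);

    /* Room for the old entries, the new one and the terminator. */
    grown = realloc(*argv, (n + 2) * sizeof(char *));
    if (NULL == grown) {
        free(copy);
        return -1;
    }

    /* Shift the old entries up by one slot, then fill slot 0. */
    memmove(grown + 1, grown, n * sizeof(char *));
    grown[0] = copy;
    grown[n + 1] = NULL;
    *argv = grown;
    return 0;
}
```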
This commit was SVN r23303.
2010-06-24 15:52:36 +00:00
Jeff Squyres
5cdd79ef13 Oops -- set the bits one at a time via _set. Using _cpu effectively
zeroed out the cpuset before setting the bit (i.e., we always had a
cpuset of 1).
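
The difference between the two operations can be modeled on a plain 64-bit mask. These helpers are hypothetical and only mirror the semantics described above; they are not the hwloc cpuset functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "_cpu"-style semantics: clear the whole mask, then set one bit. */
static void sketch_mask_only(uint64_t *mask, unsigned bit)
{
    assert(mask != NULL && bit < 64);
    *mask = UINT64_C(1) << bit;
}

/* "_set"-style semantics: keep the existing bits and add one more. */
static void sketch_mask_set(uint64_t *mask, unsigned bit)
{
    assert(mask != NULL && bit < 64);
    *mask |= UINT64_C(1) << bit;
}
```

Calling the first helper in a loop leaves only the last bit set, which matches the symptom of always ending up with a single-entry cpuset.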

This commit was SVN r23298.
2010-06-23 20:56:59 +00:00
Jeff Squyres
6bcdadbf0e If we're not building project_ompi, don't do anything with C++. Also
rename OMPI_CHECK_ATTRIBUTES -> OPAL_CHECK_ATTRIBUTES, because it's in
OPAL (somehow that name must have gotten missed in the Great M4 split
of '10...?)

This commit was SVN r23267.
2010-06-12 03:15:47 +00:00
Jeff Squyres
8ce59bb3e3 Use HWLOC_EMBEDDED_LIBS properly (new variable as of 1.0.2a12214).
Should fix some Solaris build issues.

This commit was SVN r23266.
2010-06-09 19:58:42 +00:00
Jeff Squyres
2887fe77c5 Refresh hwloc to an as-yet unreleased tarball from the hwloc 1.0
release branch in order to fix some Solaris bugs.

This commit was SVN r23265.
2010-06-09 19:56:18 +00:00
Jeff Squyres
f1a7b5cc33 Make "processor affinity not supported" error message a little better:
* Remove OPAL_ERR_PAFFINITY_NOT_SUPPORTED; fit it into the generic
   OPAL_ERR_NOT_SUPPORTED case.
 * When odls_default detects that processor affinity is not supported,
   it prints a specific message about it, and then it suppresses a
   generic HNP help message that would normally follow it (i.e., it's
   easier to have the "processor affinity is not supported" show_help
   message last).
 * Use some symbolic names in odls_default instead of fixed int's,
   just for slight readability improvements in the code.
 * Introduce orte_show_help_suppress(), which gives the ability to
   suppress any future showings of any arbitrary show_help() message.
   This is useful if you display message X and want to suppress
   message Y.  This suppression *only* works in environments where
   orte_show_help() does coalescing.

This commit was SVN r23249.
2010-06-08 20:16:07 +00:00
Shiqing Fan
43bd92272a Remove an unnecessary inline definition, in order to solve the conflict of function exporting on Windows.
This commit was SVN r23230.
2010-06-01 15:44:46 +00:00
Jeff Squyres
61f5528ec4 Update to hwloc 1.0.1rc1:
* Should fix the issues with 32 bit builds on 64 bit platforms
 * A few windows fixes
 * A few other minor / misc fixes

This commit was SVN r23226.
2010-06-01 14:51:25 +00:00
Jeff Squyres
e41603fb64 Add files into 3 directories that would not otherwise exist in a
distribution tarball, and would therefore cause automake to fail (in
case someone invokes autogen.sh on a distribution tarball).

This commit was SVN r23218.
2010-05-28 19:33:22 +00:00
Jeff Squyres
befc0b590b Fix the --disable-dlopen case -- don't expect to build or link anything.
This commit was SVN r23198.
2010-05-21 17:46:46 +00:00
Jeff Squyres
fec7918eea Some paffinity functions had their return status overloaded:
* If < 0, it's an OPAL_ERR_* value
 * If >= 0, it's the actual output value of the function

This is problematic for the OPAL_SOS stuff.  This commit changes those
functions to always return OPAL_* statuses and send the output value
back through output parameters (like 95% of the rest of the code
base).  This avoids the confusion with OPAL_SOS stuff and makes
paffinity work again (e.g., mpirun --bind-to-core ...).

I updated all paffinity modules for the new function signatures, and
bumped the paffinity API version up to 2.0.1.  I don't think the
version change will matter, though, because we'll be introducing
support for hardware threads soon, which will either bump the
paffinity version again or we'll replace paffinity with 
a new framework.
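
The signature change can be sketched with a toy lookup. The function names, status codes and the fixed topology table are hypothetical and exist only to contrast the two calling conventions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes and topology data for this sketch. */
enum { PASK_SUCCESS = 0, PASK_ERR_NOT_FOUND = -13 };
#define PASK_NUM_SOCKETS 2
static const int pask_cores_per_socket[PASK_NUM_SOCKETS] = { 4, 4 };

/* Old style: the return value is overloaded.  A negative value is an
 * error, anything else is the answer, so a caller cannot simply compare
 * the result against a success constant. */
static int sketch_core_count_overloaded(int socket)
{
    if (socket < 0 || socket >= PASK_NUM_SOCKETS) {
        return PASK_ERR_NOT_FOUND;
    }
    return pask_cores_per_socket[socket];
}

/* New style: the return value is always a status; the answer travels
 * through an output parameter. */
static int sketch_core_count(int socket, int *num_cores)
{
    assert(num_cores != NULL);
    if (socket < 0 || socket >= PASK_NUM_SOCKETS) {
        return PASK_ERR_NOT_FOUND;
    }
    *num_cores = pask_cores_per_socket[socket];
    return PASK_SUCCESS;
}
```

With the second form, every caller can use the same (PASK_SUCCESS != rc) test regardless of what the function computes.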

This commit was SVN r23197.
2010-05-21 16:55:28 +00:00
Jeff Squyres
208953f1bf Grr -- also don't reset LIBLTDL unless we're using an external libltdl
build. 

This commit was SVN r23194.
2010-05-21 15:00:03 +00:00
Shiqing Fan
857f1669e2 Solve a few compilation problems on Windows.
This commit was SVN r23193.
2010-05-21 14:30:15 +00:00
Jeff Squyres
473547481b Don't reset LTDLINCL unless we're using an external libltdl
installation. 

This commit was SVN r23192.
2010-05-21 13:58:53 +00:00
Jeff Squyres
e597c4f9cd Add --with-libltdl option to allow building Open MPI with an external installation of libltdl. Fixes trac:2407
This commit was SVN r23189.

The following Trac tickets were found above:
  Ticket 2407 --> https://svn.open-mpi.org/trac/ompi/ticket/2407
2010-05-20 22:42:02 +00:00
Josh Hursey
71fa89aca5 Move the sos_init() after the initialization of opal_show_help.
I was getting a funny segv if the param_register failed, and show_help was not initialized yet.

This commit was SVN r23177.
2010-05-19 20:47:05 +00:00
Abhishek Kulkarni
c63c4d6892 Fix bugs where (OMPI_ERROR == *) checks cannot be converted to (OMPI_SUCCESS != *) since the return codes are overloaded to return an "index" on success.
The fix is to just check if the return value is positive or not, since all the SOS encoded errors are *always* negative.

The real fix (as Ralph points out) is to change these functions (opal_pointer_array_add and mca_base_param*) to return the index as a pointer.

This commit was SVN r23173.
2010-05-18 20:54:11 +00:00
Jeff Squyres
32417b9802 Bump up to hwloc v1.0.
This commit was SVN r23171.
2010-05-18 17:11:45 +00:00
Abhishek Kulkarni
0b3e5f5d79 Silence an opal_sos compiler warning.
This commit was SVN r23163.
2010-05-17 23:14:44 +00:00
Abhishek Kulkarni
afbe3e99c6 * Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with
(OMPI_ERR_* == OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a
 SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns
 back the native error code.

* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
  (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
  decode 'ret' to get the native error code.
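
Why a direct comparison stops working once errors carry extra information can be shown with a toy encoding. The bit layout below is invented for illustration and is NOT the real OPAL SOS encoding; sketch_sos_encode and sketch_sos_get_err_code are hypothetical stand-ins for the real macros.

```c
#include <assert.h>

/* Hypothetical native codes for this sketch. */
enum { SOSK_SUCCESS = 0, SOSK_ERROR = -1, SOSK_ERR_NOT_FOUND = -13 };

/* Toy encoding (an assumption, not the real layout): keep the value
 * negative, store the native code in the low 8 bits of its magnitude
 * and extra info in the bits above. */
static int sketch_sos_encode(int native_err, int info)
{
    assert(native_err < 0 && native_err > -256);
    assert(info >= 0 && info < (1 << 20));
    return -((info << 8) | -native_err);
}

/* Recover the native code.  Success and positive values pass through. */
static int sketch_sos_get_err_code(int ret)
{
    return (ret >= 0) ? ret : -((-ret) & 0xff);
}
```

In this model an encoded value never equals the native constant, so (SOSK_ERR_NOT_FOUND == ret) silently fails while (SOSK_ERR_NOT_FOUND == sketch_sos_get_err_code(ret)) and (SOSK_SUCCESS != ret) both behave as intended.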

This commit was SVN r23162.
2010-05-17 23:08:56 +00:00
Abhishek Kulkarni
b0e963299a Adding a new function to return the stack trace (not including the call to the function itself)
as a string (which must be freed by the caller).

This commit was SVN r23160.
2010-05-17 22:57:42 +00:00
Abhishek Kulkarni
5e05546194 Adding SOS headers and package data to the Makefile.
This commit was SVN r23159.
2010-05-17 22:53:33 +00:00
Abhishek Kulkarni
4e33e6aeaa Merge OPAL SOS into the trunk.
The OPAL SOS framework tries to meet the following objectives:

 * reduce the cascading error messages and the amount of code needed to print an error message.
 * build and aggregate stacks of encountered errors and associate related individual errors with each other.
 * allow registration of custom callbacks to intercept error events.

For more information, refer to
https://svn.open-mpi.org/trac/ompi/wiki/ErrorMessages

This commit was SVN r23158.
2010-05-17 22:51:52 +00:00
Jeff Squyres
b43288f01e Add missing header file
This commit was SVN r23154.
2010-05-17 21:31:24 +00:00
Jeff Squyres
b0cfe91eca Re-enable hwloc component; it should be working now.
I forgot to mention one more thing in the r23152 commit message:

 * Copy the fix for hwloc's m4 to disable the configure flag
   --enable-debug when building in embedding mode, because it can be
   hijacked by the outer-level application.  In this case, if you
   configured OMPI with --enable-debug (or have --enable-debug in a
   platform file), you'd see all of hwloc's debug output.  Ick.  hwloc
   1.0 will include this fix.

This commit was SVN r23153.

The following SVN revision numbers were found above:
  r23152 --> open-mpi/ompi@ca3362021e
2010-05-17 21:07:57 +00:00
Jeff Squyres
ca3362021e Fix some problems noted by Ralph:
* Fix disabling hwloc build (i.e., put the AM_CONDITIONALs where they
   belong in the configure.m4 file)
 * Update some svn:ignores
 * r23142 removed some extraneous code, but forgot to remove the
   variables used only by that code

This commit was SVN r23152.

The following SVN revision numbers were found above:
  r23142 --> open-mpi/ompi@610fc67d12
2010-05-17 21:05:27 +00:00
Ralph Castain
cc8ebe7dd5 Protect against NULL when looking for an MCA param in an environment
This commit was SVN r23151.
2010-05-17 02:50:39 +00:00
Ralph Castain
12590202d8 Cleanup warnings
This commit was SVN r23148.
2010-05-16 20:22:00 +00:00
Ralph Castain
da170a7ab9 Turn off the blasted hwloc component as it generates a ton of garbage. Note that this means linux-based systems will -not- have paffinity for now since the good old plpa module was removed.
Clean up some missing ignores

This commit was SVN r23147.
2010-05-16 20:06:14 +00:00
Jeff Squyres
e2ab4f2baf Should be working now...
This commit was SVN r23143.
2010-05-14 15:20:47 +00:00
Jeff Squyres
610fc67d12 Oops -- don't convert to a processor ID here; just return the OS index
of the core.

This commit was SVN r23142.
2010-05-14 15:14:28 +00:00
Jeff Squyres
a27da2473a Ensure the whole directory is built.
This commit was SVN r23140.
2010-05-14 13:21:09 +00:00
Jeff Squyres
3ba4086b0f Remove another debugging message.
This commit was SVN r23139.
2010-05-14 13:20:46 +00:00
Jeff Squyres
a1848ef8d5 Arf. Ignore this component while I fix vpath builds...
This commit was SVN r23138.
2010-05-14 13:03:02 +00:00
Jeff Squyres
2d01a67516 Remove these generated files from SVN.
This commit was SVN r23137.
2010-05-14 11:58:17 +00:00
Jeff Squyres
8c8efa9bf3 Remove debugging message.
This commit was SVN r23136.
2010-05-14 11:57:43 +00:00
Jeff Squyres
21178f9379 Remove the "linux" paffinity component (i.e., the one that was based
on the now-defunct PLPA) -- the new hwloc component supersedes it.  

So long, PLPA -- we loved ya!

This commit was SVN r23126.
2010-05-13 23:59:21 +00:00
Jeff Squyres
3129ccd9ec Make the hwloc paffinity component available for everyone. hwloc
supports a wide variety of operating systems and platforms; see the
opal/mca/paffinity/hwloc/hwloc/README file for details.

This component includes an embedded copy of hwloc, currently based on
hwloc-1.0rc6.  But note that hwloc is properly SVN imported into the
/vendor branch, so it will be easy to update when 1.0 GA is released.
Note that the hwloc tree embedded in opal/mca/paffinity/hwloc/hwloc is
identical to a hwloc distribution tarball, except that much of the
documentation was rm -rf'ed (because we don't need it for the embedded
case).

Since the paffinity framework currently does not understand hardware
threads, the hwloc component compensates for this by identifying cores
by the "first" hardware thread on that core.  Hopefully we'll update
paffinity someday to understand hardware threads.  :-)

configure grew a --with-hwloc option, analogous to what we do for many
other external libraries that OMPI supports.  However, there's a new
feature: due to the request of several distros, OMPI can be configured
to build with its internal copy of hwloc or with an external copy of
hwloc (e.g., a system-installed hwloc).

 1. If --with-hwloc is not specified, Open MPI will try to use its
    internal copy (but silently fail/ignore hwloc if that fails).
 1. If --with-hwloc=<dir> is supplied, Open MPI looks for hwloc
    support in <dir> (and --with-hwloc-libdir=<dir>, if specified).
 1. If --with-hwloc=external is supplied, Open MPI will look for hwloc
    in a compiler/linker default external location.
 1. If --with-hwloc=internal is supplied, Open MPI will use its
    internal copy of hwloc.

Some of OMPI's main configury had to be slightly re-arranged in the
bootstrapping phase to accommodate hwloc's configury needs.

This commit was SVN r23125.
2010-05-13 23:56:05 +00:00
Jeff Squyres
ca6d95a9c8 Clean up some comments; make paffinity/base/base.h comments agree with
paffinity/paffinity.h. 

This commit was SVN r23124.
2010-05-13 23:43:28 +00:00
Jeff Squyres
bf7954c1de Bump up to 1.0rc6 from the vendor branch.
This commit was SVN r23117.
2010-05-12 17:04:48 +00:00
Jeff Squyres
c7c3de87f5 Add ummunotify support to Open MPI. See
http://marc.info/?l=linux-mm-commits&m=127352503417787&w=2 for more
details.

 * Remove the ptmalloc memory component; replace it with a new "linux"
   memory component.
 * The linux memory component will conditionally compile in support
   for ummunotify.  At run-time, if it has ummunotify support and
   finds run-time support for ummunotify (i.e., /dev/ummunotify), it
   uses it.  If not, it tries to use ptmalloc via the glibc memory
   hooks. 
 * Add some more API functions to the memory framework to accommodate
   the ummunotify model (i.e., poll to see if memory has "changed").
 * Add appropriate calls in the rcache to the new memory APIs to see
   if memory has changed, and to react accordingly.
 * Add a few comments in the openib BTL to indicate why we don't need
   to notify the OPAL memory framework about specific instances of
   registered memory.
 * Add dummy API calls in the solaris malloc component (since it
   doesn't have polling/"did memory change" support).

This commit was SVN r23113.
2010-05-11 21:43:19 +00:00
Ralph Castain
5965d3e620 Include the new error code in the error strings
This commit was SVN r23111.
2010-05-07 18:09:08 +00:00
Ralph Castain
d6a1d7a082 Little more cleanup on paffinity. Provide a specific error code for affinity not supported so we can better report the problem. Move the error reporting to orterun so we only get one error message. Update the darwin paffinity module to return the correct new error codes.
This commit was SVN r23107.
2010-05-07 14:04:55 +00:00
Ralph Castain
d4f56cff61 More cleanup on paffinity....groan
It is okay to not have a paffinity module IF you aren't using paffinity anyway. So don't error out of MPI_Init because a paffinity module wasn't selected.

Cleanup error reporting in the odls default module to (once and for all!) eliminate messages originating in the fork'd process. Create some new error codes to allow us to pass enough info back to the parent process to provide useful error messages.

This commit was SVN r23106.
2010-05-06 20:57:17 +00:00
Jeff Squyres
71cbe1a69f Bump up to hwloc v1.0rc3
This commit was SVN r23070.
2010-04-29 15:59:01 +00:00
Jeff Squyres
f064056a07 We don't need all this stuff in OMPI.
This commit was SVN r23056.
2010-04-28 00:31:15 +00:00
Jeff Squyres
2fe1bc043d Bump up to hwloc 1.0rc2
This commit was SVN r23042.
2010-04-26 21:57:51 +00:00
Ralph Castain
13a7338289 Ensure we get past the '=' in the parameter
This commit was SVN r23039.
2010-04-26 20:46:50 +00:00
Ralph Castain
e1b9f400ba Add some new utilities that support searching an environ string list (not just our own environ) for specific MCA params and returning their value. Helpful when a daemon needs to check an app_context's environ for params that can impact how the daemon launches and/or interacts with it, but don't pertain to the daemon's own environ.
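A minimal sketch of the kind of lookup described above, assuming MCA params appear in the list as `<prefix><name>=<value>` entries. The function name `env_list_lookup` and its signature are hypothetical and are not the utilities added by this commit; the caller supplies the prefix (for example `OMPI_MCA_`).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the actual OPAL API): search a NULL-terminated
 * environ-style list for "<prefix><name>=<value>" and return a pointer to
 * the value, or NULL if the list is NULL or the param is not present. */
static const char *env_list_lookup(char *const *env,
                                   const char *prefix,
                                   const char *name)
{
    size_t plen, nlen;

    if (NULL == env || NULL == prefix || NULL == name) {
        return NULL;
    }
    plen = strlen(prefix);
    nlen = strlen(name);

    for (size_t i = 0; NULL != env[i]; ++i) {
        const char *entry = env[i];
        /* Require an exact name match followed immediately by '=' so that
         * "foo" does not match "foo_bar". */
        if (0 == strncmp(entry, prefix, plen) &&
            0 == strncmp(entry + plen, name, nlen) &&
            '=' == entry[plen + nlen]) {
            return entry + plen + nlen + 1;   /* step past the '=' */
        }
    }
    return NULL;
}
```

Because the list is passed in explicitly, the same routine works on an app_context's environ as well as on the caller's own.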
This commit was SVN r23034.
2010-04-26 03:35:09 +00:00
Jeff Squyres
ea8b0ea569 Add a new function in the paffinity base:
opal_paffinity_base_cset2str().  This function basically makes a
prettyprint string out of an opal_paffinity_base_cset_t.

This commit was SVN r23017.
2010-04-21 17:26:36 +00:00
Shiqing Fan
d1e66bdd01 Use variables instead of hard-coded compiler flags, in order to support various C/C++ compilers on Windows.
This commit was SVN r23016.
2010-04-21 12:45:00 +00:00
Christopher Yeoh
cab7982c7e fixes trac:2355 - race in interaction between opal_atomic_lifo_push
and opal_atomic_lifo_pop. Adds memory barriers to remove the race
condition

This commit was SVN r23014.

The following Trac tickets were found above:
  Ticket 2355 --> https://svn.open-mpi.org/trac/ompi/ticket/2355
2010-04-21 00:00:14 +00:00
Jeff Squyres
53ab6600e6 Minor update to comments.
This commit was SVN r23013.
2010-04-20 20:59:42 +00:00
Jeff Squyres
f1d4a748eb Minor fix: pass by pointer to the new function so that the caller
can see the results.

This commit was SVN r23012.
2010-04-20 19:52:47 +00:00
Ralph Castain
7717c970a3 Ahem...it requires 2 hex chars to describe each byte of a bitmask...
This commit was SVN r23001.
2010-04-20 05:11:16 +00:00
Ralph Castain
86228aee38 Provide two new opal paffinity utilities for printing a hex representation of the cpu set and parsing that string back into a cpu set on the other end. Also add a new MCA param for passing the cpu set applied to a process during launch down to that process so it can know what we attempted to do.
All to be used in some new MPI extensions provided by Jeff so that users can easily query their binding situation.
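As an illustration only, a hex round trip of a byte-array bitmask could look like the sketch below. The names `mask_to_hex` and `hex_to_mask` are hypothetical and are not the utilities added by this commit; the point is that each byte of the mask needs two hex characters, plus one byte for the terminating NUL.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch: write nbytes of mask as 2*nbytes hex characters.
 * Returns 0 on success, -1 if the output buffer is too small. */
static int mask_to_hex(const unsigned char *mask, size_t nbytes,
                       char *out, size_t outlen)
{
    if (outlen < 2 * nbytes + 1) {
        return -1;
    }
    for (size_t i = 0; i < nbytes; ++i) {
        snprintf(out + 2 * i, 3, "%02x", (unsigned) mask[i]);
    }
    out[2 * nbytes] = '\0';
    return 0;
}

/* Convert one hex character to its value, or -1 if it is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical sketch: parse exactly 2*nbytes hex characters back into mask.
 * Returns 0 on success, -1 on a bad digit or a string of the wrong length. */
static int hex_to_mask(const char *hex, unsigned char *mask, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; ++i) {
        int hi = hex_digit(hex[2 * i]);
        if (hi < 0) return -1;
        int lo = hex_digit(hex[2 * i + 1]);
        if (lo < 0) return -1;
        mask[i] = (unsigned char) ((hi << 4) | lo);
    }
    return ('\0' == hex[2 * nbytes]) ? 0 : -1;
}
```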

This commit was SVN r22998.
2010-04-19 22:16:35 +00:00
Jeff Squyres
338920656f Remove the compile-time priorities for paffinity modules (they were
done this way a long time ago for the "gee whiz!" factor -- when in
reality, they really only need one-of-many-run-time priority
selection).

Changed run-time priorities to be as follows:

 * darwin: 20
 * linux: 20
 * posix: 10
 * solaris: 30
 * test: 5
 * windows: 20

I have a very dim (possibly untrue) recollection that Solaris needs to
have a higher priority than others just to ensure that no other is
chosen under Solaris.  Make all other "native" components have a
priority of 20 (they shouldn't conflict with each other).  Make the
posix fallback component have a priority of 10.  Make the test
component priority 5, meaning someone can always select it, but you
can also make a "never select me" component that prioritizes itself
under test.
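The run-time selection described above amounts to "pick the usable component with the highest priority". A minimal sketch, assuming a hypothetical `struct component` with an `available` flag (the real MCA selection logic is more involved and is not reproduced here):

```c
#include <stddef.h>

/* Hypothetical model of a component; not the MCA component struct. */
struct component {
    const char *name;
    int priority;
    int available;   /* nonzero if the component can run on this system */
};

/* Return the available component with the highest priority, or NULL if
 * none is available. On a tie, the first one in the list wins. */
static const struct component *select_highest(const struct component *list,
                                              size_t n)
{
    const struct component *best = NULL;

    for (size_t i = 0; i < n; ++i) {
        if (!list[i].available) {
            continue;
        }
        if (NULL == best || list[i].priority > best->priority) {
            best = &list[i];
        }
    }
    return best;
}
```

With the priorities listed above, a system where only linux, posix, and test are usable would select linux (20); if linux were unavailable, the posix fallback (10) would still beat test (5).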

This commit was SVN r22997.
2010-04-19 22:14:06 +00:00
Jeff Squyres
9f5ddbcc6e 3rd party import hwloc 1.0rc1 into the SVN trunk
This commit was SVN r22996.
2010-04-19 19:48:58 +00:00
Jeff Squyres
8b163ccd70 Add dummy hwloc directory for staged import into svn
This commit was SVN r22994.
2010-04-19 19:43:43 +00:00
Ralph Castain
4d06125a33 Establish a method by which a process knows if it has been bound by mpirun. This helps resolve a problem where a process gets "bound" to all available resources, which looks to the opal paffinity system as "not bound". This can cause mpi_init to attempt to "bind" the process itself, causing unintended behavior.
This commit was SVN r22985.
2010-04-17 01:58:26 +00:00
Ralph Castain
41428e6b61 Issue a warning if a requested binding operation results in processes being bound to all available processors, which is the equivalent of not being bound at all.
See the following email thread for further details:

http://www.open-mpi.org/community/lists/devel/2010/04/7745.php

This commit was SVN r22984.
2010-04-17 01:02:41 +00:00
Jeff Squyres
798202c424 Allow the mca_component_path to change over time.
This commit was SVN r22957.
2010-04-12 22:02:34 +00:00
Jeff Squyres
f77257d931 These don't belong in this file.
This commit was SVN r22956.
2010-04-12 20:50:23 +00:00
Jeff Squyres
1919ba225d Allow static_components to be NULL for cases where we ''know'' there
will be no static components to be searched.

This commit was SVN r22954.
2010-04-12 14:51:47 +00:00
Shiqing Fan
96b20a29b5 An easy solution to make singleton work on Windows.
This commit was SVN r22952.
2010-04-10 16:30:59 +00:00
Terry Dontje
282a537cf7 This commit fixes 2370, by having the solaris paffinity module return error codes for get_physical_processor_id and having odls_default_fork_local_proc check get_physical_processor_id for OPAL_ERROR
This commit was SVN r22948.
2010-04-09 15:10:46 +00:00
Ralph Castain
2b8ab61328 Add another helpful macro
This commit was SVN r22934.
2010-04-06 22:40:45 +00:00
Brad Benton
58a9aeff5a Modify the OPAL_PAFFINITY_PROCESS_IS_BOUND macro to search the cpuset for
the maximum possible number of cpus rather than just the number of cpus
currently online.  This corrects a problem where mpi_paffinity_alone was
not working properly on systems in which there can be cpu namespaces with
holes, such as on ppc64 with smt off (as discussed in #2365).
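To see why the scan limit matters, consider a simplified sketch (not the actual macro). `count_cpus_in_mask` is a hypothetical helper that counts the set bits among CPU indices below a given limit. If four CPUs are online but numbered 0, 4, 8, and 12 (an assumed numbering for illustration), a process bound to CPU 8 looks unbound when only the first four indices are scanned.

```c
#include <stddef.h>

/* Hypothetical helper: count the CPUs set in a byte-array mask, looking
 * only at CPU indices in [0, limit). The mask must hold at least
 * (limit + 7) / 8 bytes. */
static size_t count_cpus_in_mask(const unsigned char *mask, size_t limit)
{
    size_t count = 0;

    for (size_t cpu = 0; cpu < limit; ++cpu) {
        if (mask[cpu / 8] & (1u << (cpu % 8))) {
            ++count;
        }
    }
    return count;
}
```

Passing the number of online CPUs as `limit` misses bit 8 in this scenario, while passing the maximum possible CPU count finds it.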

This commit was SVN r22927.
2010-04-02 18:24:12 +00:00
Jeff Squyres
8a85c4617f Fixes trac:2366: dragonboy noticed that the PGI compiler is picky about
#if directives -- had to change a pair of #if conditionals in
opal/util/stacktrace.c to make the PGI compiler accept it.

This commit was SVN r22923.

The following Trac tickets were found above:
  Ticket 2366 --> https://svn.open-mpi.org/trac/ompi/ticket/2366
2010-04-01 17:04:06 +00:00
Josh Hursey
62f8d3c471 r22885 missed a few symbol updates when it changed ompi_want_ft to opal_want_ft
This commit was SVN r22916.

The following SVN revision numbers were found above:
  r22885 --> open-mpi/ompi@522a23d6a3
2010-03-30 16:47:39 +00:00
Ralph Castain
522a23d6a3 A few changes to the FT-related configure options:
1. fix a bug that caused an infinite loop in configure when specifying want-ft but not want-ft-thread by removing a stale reference to the opal-progress-thread option

2. add want-ft=orcm so we can build the orcm errmgr component

3. cleanup the use of "ompi_want_ft_xxx" and replace it with "opal_want_ft_xxx" so that naming conventions are preserved

This commit was SVN r22885.
2010-03-25 22:53:48 +00:00
Christopher Yeoh
cd5294944b fixes trac:2355 - race in opal_atomic_lifo
Adds memory barriers to remove race condition which can
occur on PowerPC architectures (and probably others)

This commit was SVN r22880.

The following Trac tickets were found above:
  Ticket 2355 --> https://svn.open-mpi.org/trac/ompi/ticket/2355
2010-03-25 03:44:38 +00:00
Jeff Squyres
c26dae01ce Update the if.c code to properly use the OBJ_* system.
This commit was SVN r22869.
2010-03-23 20:37:06 +00:00
Jeff Squyres
59126b1e0b Update copyrights.
This commit was SVN r22867.
2010-03-23 12:03:20 +00:00
Shiqing Fan
9591680ec0 One of the binaries was generated from a wrong source.
This commit was SVN r22865.
2010-03-23 09:56:11 +00:00
Jeff Squyres
136f926fd1 Really fixes trac:2104. There is a lengthy discussion about this patch on
#2322.

The short version is that this patch consolidates two pieces of code
that call the back-end munmap and ensures that (if dlsym is used) the
corresponding dlsym is only invoked once and that the variable holding
the result is volatile.

This commit was SVN r22863.

The following Trac tickets were found above:
  Ticket 2104 --> https://svn.open-mpi.org/trac/ompi/ticket/2104
2010-03-23 01:04:25 +00:00
Ralph Castain
df2d361b2b Add a pair of convenience macros for handling threads to minimize code duplication
This commit was SVN r22861.
2010-03-22 15:45:03 +00:00
Ralph Castain
b400b84162 Merge in the modified thread configure option branch per today's telecon.
Remove the --enable-progress-threads option as this is no longer functional, and hardcode OPAL_ENABLE_PROGRESS_THREADS to 0.

Replace the --enable-mpi-threads option with --enable-mpi-thread-multiple as this is clearer as to meaning. This option automatically turns "on" opal thread support if it wasn't already so specified. If the user specifies --disable-opal-multi-threads --enable-mpi-thread-multiple, we will error out with a message

Add a new --enable-opal-multi-threads option that turns "on" opal thread support without doing anything wrt mpi-thread-multiple

This commit was SVN r22841.
2010-03-16 23:10:50 +00:00
Rainer Keller
814fb9399f - Further patches for support on NetBSD (and DragonFly) by
Aleksej Saushev.
   Don't use bash or bashisms in shell scripts
   We should use Posix' setpgid(0,0), which is equivalent to setpgrp().
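A small sketch of the POSIX call mentioned above. The wrapper `child_becomes_group_leader` is a hypothetical name used only for this illustration; it forks so that the process group of the calling process is left untouched.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child, have it call setpgid(0, 0), and report
 * whether the child ended up as the leader of its own process group.
 * Returns 0 on success, nonzero otherwise. */
static int child_becomes_group_leader(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (0 == pid) {
        /* setpgid(0, 0): pid 0 means "this process", pgid 0 means "use my
         * own pid as the process group ID". */
        if (0 != setpgid(0, 0)) {
            _exit(1);
        }
        _exit(getpgrp() == getpid() ? 0 : 2);
    }
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```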

This commit was SVN r22829.
2010-03-15 05:33:42 +00:00
Josh Hursey
e9b5162d79 Fix the configure logic for --with-ft so that it properly takes a comma separated list.
Many of the OPAL_ENABLE_FT should be OPAL_ENABLE_FT_CR, so fix those.

The OPAL Layer INC should call opal_output on restart so that it can refresh the string it prints to reflect the current pid/hostname which may have changed.

This commit was SVN r22824.
2010-03-12 23:57:50 +00:00
Ralph Castain
ed1dbabc0c Remove the last vestiges of mpi_portable_platform.h.in
This commit was SVN r22789.
2010-03-05 21:21:03 +00:00
Nadia Derbey
3f56f9e688 Fix typo in evutil.h
This commit was SVN r22730.
2010-03-01 07:55:08 +00:00
Ralph Castain
8c7f3a0c44 Silence warnings by correctly identifying when we are on a Mac
This commit was SVN r22724.
2010-02-27 08:15:49 +00:00
Iain Bason
7445b23e0d Fixed a minor typo.
This commit was SVN r22706.
2010-02-24 19:05:19 +00:00
Jeff Squyres
af6f1f4b00 Add pkg-config(1) config files to Open MPI. Additionally, fix a minor
bug: libmpi_f90 had libmpi.la in its LIBADD instead of libmpi_f77.la.

Fixes trac:2244.

This commit was SVN r22704.

The following Trac tickets were found above:
  Ticket 2244 --> https://svn.open-mpi.org/trac/ompi/ticket/2244
2010-02-24 18:46:06 +00:00
Jeff Squyres
d9b6b5af0c This commit converts us to the "one big libmpi" scheme that has been
discussed extensively.  See
https://svn.open-mpi.org/trac/ompi/ticket/2092 and the RFC thread
http://www.open-mpi.org/community/lists/devel/2010/02/7447.php.

Specifically:

 * Create LT convenience libraries for OPAL and ORTE if the layer
   above them is being created (use the already-defined
   AM_CONDITIONALs to know if the project above us is being built).
 * ORTE slurps in the LT convenience library for OPAL; OMPI slurps in
   the LT convenience library for ORTE.
 * Wrapper compilers now only -l one library (e.g., ortecc only does
   -lopen-rte, and mpicc only does -lmpi).

This commit was SVN r22691.
2010-02-23 22:20:01 +00:00
Terry Dontje
cfe37fb5a1 Fixed issue with detecting root dir and used appropriate defines for solaris detection
This commit was SVN r22686.
2010-02-23 15:58:49 +00:00
Christopher Yeoh
bccafbb5df Fixes the problem where the rcache and core memory allocation can deadlock itself
This commit fixes trac:2104. Request a cmr:v1.4

This commit was SVN r22675.

The following Trac tickets were found above:
  Ticket 2104 --> https://svn.open-mpi.org/trac/ompi/ticket/2104
2010-02-22 05:12:10 +00:00
George Bosilca
3356c2e241 Don't forget to update the return value for PPC32 and PPC64.
This commit was SVN r22665.
2010-02-18 19:16:41 +00:00
George Bosilca
ab202d0f69 Add the memory and the cc to the clobber list for the cas atomics.
This commit was SVN r22664.
2010-02-18 19:15:50 +00:00
Rainer Keller
a46cecf4f2 - Use strrchr instead of loop for '/' as Nysal suggests.
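For illustration, finding the last path component with strrchr rather than a hand-written loop might look like this; `last_path_component` is a hypothetical name, not the function touched by this commit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the part of path after the last
 * '/', or path itself if it contains no '/'. */
static const char *last_path_component(const char *path)
{
    const char *slash = strrchr(path, '/');

    return (NULL == slash) ? path : slash + 1;
}
```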
This commit was SVN r22649.
2010-02-17 23:40:08 +00:00
Jeff Squyres
17f0885f12 Add proper BSD interface detection code. Fixes a long-standing
discussion on the users list (see
http://www.open-mpi.org/community/lists/users/2009/12/11526.php). 

Many thanks to Kevin Buckley who did most of the coding work, and to
Aleksej Saushev for his extreme patience in waiting for me to review
and commit this stuff.

This commit was SVN r22640.
2010-02-17 19:43:57 +00:00
Terry Dontje
2a4b1227d9 corrected an array access bug in the latest libevent merge (see #2234) that was causing Solaris binaries to loop infinitely.
This commit was SVN r22638.
2010-02-17 14:50:37 +00:00
Shiqing Fan
3a3018deef Convert the line endings for the added header files. They were changed automatically by Windows when adding new files.
This commit was SVN r22634.
2010-02-16 17:24:44 +00:00
Shiqing Fan
08ffdbe987 Changes for portable platform headers. Commit it on behalf of Ralph.
This commit was SVN r22619.
2010-02-15 22:14:59 +00:00
Shiqing Fan
0b765637d9 A type cast.
This commit was SVN r22618.
2010-02-15 10:26:02 +00:00
Ralph Castain
7a1b2a706e Add a new ring_buffer class
This commit was SVN r22615.
2010-02-14 19:20:19 +00:00
Nysal Jan
0538b1a948 Adding GPFS to the list of file systems checked
This commit was SVN r22612.
2010-02-12 14:15:39 +00:00
Rainer Keller
ea4de16561 - Check whether file is opened on network file-system.
If file does not exist, check the directory it lives in...
   May be used by a caller trying to mmap() a file on NFS, Lustre or
   Panasas (thanks Sam).
   For now, this is used to warn about the usage of mmap on such FS.

   Please note, that Ralph mentioned the orte_no_session_dir parameter.
   The help message includes a reference to this.

   Tested on NFS and Lustre on Linux on
     smoky: mpirun --mca orte_tmpdir_base $HOME/tmp -np 2 ./mpi_stub
     jaguar: mpirun ... --mca orte_tmpdir_base /tmp/work/$USER ...

   Fixes trac:1354

   This should   cmr:v1.5   once it has soaked and is shown to work on
   Solaris

This commit was SVN r22604.

The following Trac tickets were found above:
  Ticket 1354 --> https://svn.open-mpi.org/trac/ompi/ticket/1354
2010-02-10 23:18:29 +00:00
Jeff Squyres
a89dc623b0 Brice Goglin noticed that mpi_paffinity_alone didn't seem to be doing
anything for non-MPI apps.  Oops!  (But before you freak out, gentle
reader, note that mpi_paffinity_alone for MPI apps still worked fine)
When we made the switchover somewhere in the 1.3 series to have the
orted's do processor binding, then stuff like:

  mpirun --mca mpi_paffinity_alone 1 hostname

should have bound hostname to processor 0.  But it didn't because of a
subtle startup ordering issue: the MCA param registration for
opal_paffinity_alone was in the paffinity base (vs. being in
opal/runtime/opal_params.c), but it didn't actually get registered
until after the global variable opal_paffinity_alone was checked to
see if we wanted old-style affinity bindings.  Oops.

However, for MPI apps, even though the orted didn't do the binding,
ompi_mpi_init() would notice that opal_paffinity_alone was set, yet
the process didn't seem to be bound.  So the MPI process would bind
itself (this was done to support the running-without-orteds
scenarios).  Hence, MPI apps still obeyed mpi_paffinity_alone
semantics.

But note that the error described above caused the new mpirun switch
--report-bindings to not work with mpi_paffinity_alone=1, meaning that
the orted would not report the bindings when mpi_paffinity_alone was
set to 1 (it ''did'' correctly report bindings if you used
--bind-to-core or one of the other binding options).

This commit separates out the paffinity base MCA param registration
into a small function that can be called at the Right place during the
startup sequence.

This commit was SVN r22602.
2010-02-10 22:32:00 +00:00
Rainer Keller
80136ac9e2 - We don't configure-check for errno.h and don't need errno.h here...
This commit was SVN r22587.
2010-02-09 01:12:52 +00:00
Jeff Squyres
3c8685ea8c Add a check for strtoll for libevent.
Not having this check was causing distcheck errors on the OMPI
tarball-build machine because it's still a 32-bit-default machine, so
the evutil.c code was failing some #if conditionals (since it didn't
think it had strtoll available).

This commit was SVN r22577.
2010-02-08 20:55:21 +00:00
Jeff Squyres
cd5012d481 Fix "make dist": opal/event/WIN32-Code/misc.* don't exist anymore.
This commit was SVN r22562.
2010-02-05 13:42:45 +00:00
Shiqing Fan
23bb52ad05 Remove a few files from the CMakeList that no longer exist in the new libevent.
Add #ifdef for including _libevent_time.h.
Use the Windows version of function random().

This commit was SVN r22556.
2010-02-04 15:18:54 +00:00
Brian Barrett
b5e391251f Update libevent to 1.4.13
This commit was SVN r22548.
2010-02-04 05:38:30 +00:00
Ralph Castain
db1b07c02c In multiple places in the code base, we expect opal_bitmap_is_set_bit to return either true or false...not a negative error code. If the index is out of range, this is effectively a "false" condition as the bit was clearly not set by the program.
This revision is consistent with how the function was called in the OMPI code base. Will fix the test to match.
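The semantics described above can be sketched as follows. The `struct bitmap` type and `bitmap_is_set_bit` function are hypothetical stand-ins and do not reproduce the opal_bitmap implementation; they only show an out-of-range index being treated as "not set" instead of as an error.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bitmap; not the opal_bitmap_t layout. */
struct bitmap {
    const unsigned char *bits;
    size_t nbits;               /* number of valid bits in the map */
};

/* Return true only if the bit exists and is set. An out-of-range index is
 * reported as false, since the program cannot have set that bit. */
static bool bitmap_is_set_bit(const struct bitmap *bm, size_t bit)
{
    if (NULL == bm || NULL == bm->bits || bit >= bm->nbits) {
        return false;
    }
    return 0 != (bm->bits[bit / 8] & (1u << (bit % 8)));
}
```

Callers can then use the result directly in a condition without first checking for a negative error code.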

This commit was SVN r22544.
2010-02-03 19:45:22 +00:00
Shiqing Fan
7ad3d310b8 Define SIGPIPE for Windows, just for fixing the v1.4 Windows build.
cmr:v1.4.2:reviewer=jsquyres

This commit was SVN r22543.
2010-02-03 18:49:22 +00:00
Rainer Keller
0009d10c4d - This fixes the failing mpic++/mpiCC MTT tests, bailing due to not
finding symbol pthread_atfork, e.g. cxx-test-suite.

   Fixes trac:2088

   cmr:v1.5:reviewer=jsquyres

This commit was SVN r22542.

The following Trac tickets were found above:
  Ticket 2088 --> https://svn.open-mpi.org/trac/ompi/ticket/2088
2010-02-03 18:47:13 +00:00
Brian Barrett
8b4825ff37 Updates to make trunk run on Catamount again:
* Don't build the pstat component if all defines needed aren't there.
 * Update platform file to work better
 * Work around two places that depended on modex being operational

This commit was SVN r22536.
2010-02-03 05:07:40 +00:00
Jeff Squyres
007a6c7b99 Per #2201, move the user arguments up to be the first set of argv
after the compiler argv tokens.  

Not closing #2201 yet; there's still discussion on that ticket about
whether we want to do more or not.

Refs trac:2201
cmr:v1.4.2 
cmr:v1.5

This commit was SVN r22513.

The following Trac tickets were found above:
  Ticket 2201 --> https://svn.open-mpi.org/trac/ompi/ticket/2201
2010-01-29 22:51:35 +00:00
Jeff Squyres
93e930ae13 Fix minor typo
This commit was SVN r22497.
2010-01-26 23:21:00 +00:00
Josh Hursey
b749ecbab8 This commit fixes trac:2190.
Originally the patch was to improve the error message, but when digging into the code I found a subtle bug. If the daemon does not tell the HNP what CRS component it used, then the HNP tries to figure it out from the metadata (this is an uncommon case). The path the HNP used was not complete, so it was unable to find the metadata information. This patch fixes this by adding the 'snapshot_reference' to the 'snapshot_location' which completes the path for this search.

cmr:v1.4 (needs a custom patch)

cmr:v1.5

This commit was SVN r22479.

The following Trac tickets were found above:
  Ticket 2190 --> https://svn.open-mpi.org/trac/ompi/ticket/2190
2010-01-25 20:28:38 +00:00
Ralph Castain
2e2e49e46f Define a standard return value for when a thread exits.
Not sure what Windows or Solaris are looking for, so defined it all the same

This commit was SVN r22471.
2010-01-23 03:57:24 +00:00
Shiqing Fan
c29a668e37 Remove flex.exe and its license file from the tarball.
cmr:v1.4
cmr:v1.5

This commit was SVN r22469.
2010-01-22 16:40:13 +00:00
Shiqing Fan
4836e8878a Update a few more CMake scripts.
This commit was SVN r22454.
2010-01-19 17:34:55 +00:00
Jeff Squyres
596473e7ca Patch from Aleksej Saushev to properly only check for /proc/cpuinfo on Linux-based systems
This commit was SVN r22417.
2010-01-14 23:16:31 +00:00
Shiqing Fan
ad763c327d Restore several linked libraries that were deleted by mistake in r22405.
This commit was SVN r22415.

The following SVN revision numbers were found above:
  r22405 --> open-mpi/ompi@872a4047ba
2010-01-14 21:50:42 +00:00
Shiqing Fan
872a4047ba Fix the bug caused by ADD_DEPENDENCIES() behaving differently across CMake versions.
In CMake 2.6 and earlier, this function adds dependencies for targets and also links the target libraries automatically, but in CMake 2.8 this behavior changed: it only adds the dependencies and does not link, which causes linking errors at build time.

This commit was SVN r22405.
2010-01-14 18:10:20 +00:00
Ralph Castain
f0646b3603 Need separate flag for select when initializing sysinfo framework
This commit was SVN r22394.
2010-01-12 23:22:46 +00:00
Ralph Castain
b35486d945 The CM ess module needs to open the sysinfo framework and select modules prior to when others need it. Thus, setup a flag to avoid multiple open/select within that framework.
This commit was SVN r22393.
2010-01-12 22:03:49 +00:00
Jeff Squyres
d9fc4e0a9d Per http://www.open-mpi.org/community/lists/devel/2010/01/7283.php, allow MCA components to fail the component.register and component.open methods without the MCA base printing errors.
This commit was SVN r22391.
2010-01-12 19:29:12 +00:00
Brian Barrett
86d8356b13 Updates to allow OMPI to build on Cray XT platforms running Catamount
This commit was SVN r22381.
2010-01-07 18:14:03 +00:00
Jeff Squyres
dbb29663e8 Update the embedded PLPA version to v1.3.2. Since this is a 3rd
party/"vendor" import, the changes are actually far smaller than the
size of this changeset implies.  Here's a list of the changes:

 * Update the AMD license header in plpa_map.c to be less restrictive
   (see https://svn.open-mpi.org/trac/plpa/changeset/262 for details)
   -- '''this is the most/only important change of this update.'''  No
   code is changed by this; only removing a clause from a license
   header in plpa_map.c.
 * Changes to the generated {{{configure}}}, {{{config.guess}}}, and
   {{{config.sub}}} scripts (which aren't used by OMPI).
 * soname version tracking changes (which also aren't used by OMPI;
   they're only used when PLPA is built/installed in "standalone"
   mode).
 * Update the "get version" m4 (which was stolen from OMPI's m4 to
   begin with, and is only used during OMPI's autogen.sh step).
 * Update various PLPA version numbers to 1.3.2.
 * Bug fix in plpa-taskset (which is not built in the OMPI PLPA build).

This commit was SVN r22367.
2010-01-06 00:44:14 +00:00
Ralph Castain
fad1ba15b0 Move the test for case-sensitive file system from ompi to opal so that all layers can have that knowledge.
Use that for the orte wrapper compilers

This commit was SVN r22348.
2009-12-29 23:26:45 +00:00
Josh Hursey
313acba4ce Move the mca_base_is_component_required() functionality to mca/base per suggestion so that it can be reused in other components.
This commit was SVN r22327.
2009-12-17 15:12:26 +00:00
George Bosilca
7ba371cd92 Correct the atomics on x86 and x86_64. Thanks to Iain for the catch,
to Eugene, Jeff, and Briand for the help. This patch is supposed to
fix several outstanding issues, notably the one on ticket #2043.

This commit was SVN r22324.
2009-12-16 22:34:56 +00:00