1
1
Граф коммитов

755 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
9dc3a1aa54 Upgrade to hwloc 1.1.2; most likely the last release of the hwloc
1.1.x series

This commit was SVN r24611.
2011-04-12 14:35:26 +00:00
Jeff Squyres
38d3cdd4a6 Update hwloc to 1.1.1. Next stop: 1.1.2.
This commit was SVN r24610.
2011-04-12 14:16:37 +00:00
Shiqing Fan
4b3b713bfc Update the windows installdir component.
Don't use the old env component for windows, so remove the .windows file.

This commit was SVN r24597.
2011-04-05 12:15:41 +00:00
Ralph Castain
f40edd6b4f Add the stupid test word
This commit was SVN r24578.
2011-03-26 03:38:59 +00:00
Ralph Castain
5bfb01c6c8 Only build the linux component of sysinfo if linux is the operating system.
Thanks to Paul Hargrove for the suggestion.

This commit was SVN r24576.
2011-03-25 20:55:57 +00:00
Jeff Squyres
cf6c5e8d48 Fix a bug noted by Gus Correa on the user's list: mpi_paffinity_alone
appeared multiple times in ompi_info output (so did others, but this
is the one that was noticed).  Ensure that we don't repeat
opal_paffinity_base_register_params() multiple times.

This commit was SVN r24569.
2011-03-24 00:58:25 +00:00
Jeff Squyres
324b90142f Fix CID 1583: hwloc bitmap leak.
This commit was SVN r24496.
2011-03-08 16:47:26 +00:00
Josh Hursey
7c737b9274 Some string and state cleanup. Thanks to George Bosilca for the initial patch.
This commit was SVN r24471.
2011-03-01 20:12:23 +00:00
Jeff Squyres
ad985260d3 Ensure to disable XML and Cairo support in hwloc; OMPI doesn't use it. Additionally, ensure that the right flags are passed back to the wrappers in the case of static builds. We probably won't need these (especially since XML has been disabled), but it's the Right Thing to do.
This commit was SVN r24451.
2011-02-23 23:11:45 +00:00
Jeff Squyres
e8ba72258e Patch for PPC64 platforms with smt=off, issue raised by Brad. This
fix will be included in hwloc 1.1.2.

Brad -- can you verify that this fixes the issue for you?

Fixes trac:2732.

This commit was SVN r24450.

The following Trac tickets were found above:
  Ticket 2732 --> https://svn.open-mpi.org/trac/ompi/ticket/2732
2011-02-23 22:43:58 +00:00
Jeff Squyres
8143b201a9 Custom patch for hwloc (that will be included in hwloc 1.1.2) so that
we don't barf on Linux non-NUMA (NNUMA, aka UMA ;-) ) platforms.

This commit was SVN r24448.
2011-02-23 21:02:02 +00:00
Jeff Squyres
2368410eff * Ensure to follow standard filename conventions for output MCA DSO
filenames -- don't include the project name ("opal")
 * Don't link maffinity/hwloc and paffinity/hwloc against the common
   hwloc in the static build case (because this will result in
   duplicate symbols)

This commit was SVN r24447.
2011-02-23 21:00:20 +00:00
Jeff Squyres
5e082d68f6 Fix the compile error for libnuma 0.9.x introduced in r24442;
hopefully, this now compiles for libnuma 0.9.x and libnuma 2.0.x.

Fixes for the strategy discussed in the commit message for r24442
(i.e., check against numa_get_mems_allowed(), which only exists in
libnuma 2.0.x) and the new new new plan on #2698 coming in a separate
commit.

This commit was SVN r24443.

The following SVN revision numbers were found above:
  r24442 --> open-mpi/ompi@90a8fe4aad
2011-02-23 13:44:46 +00:00
Rainer Keller
90a8fe4aad - Addendum to r24421: get mca_maffinity_libnuma to compile on linux
(with libnuma-2.0.4 / LIBNUMA_API_VERSION 2): numa_get_run_node_mask
   returns a struct bitmask *.

   Whether it's a good idea to blindly pass that on to
   numa_set_membind() is another matter: one might want to match against
   the list returned by numa_get_mems_allowed(), which may be set by the
   outside environment.

   Refs trac:2698.

This commit was SVN r24442.

The following SVN revision numbers were found above:
  r24421 --> open-mpi/ompi@31510e683b

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-02-23 12:59:49 +00:00
Jeff Squyres
c1b26005d7 Create new opal/mca/common area, similar to ompi/mca/common. Move hwloc into this new opal MCA common area, and link the hwloc paffinity component against it. Also add a new hwloc maffinity component, and also link it against the opal MCA common hwloc. More development coming soon regarding this common hwloc instance (i.e., an OPAL-ized version of the hwloc API via a new framework so that we can safely use hwloc's services throughout the rest of the OPAL/ORTE/OMPI code bases.
This commit was SVN r24440.
2011-02-22 23:21:48 +00:00
Jeff Squyres
31510e683b Replace r24290 with something more meaningful. In this case, find out
what memory node the process is running on (which is guaranteed to be
a good answer because maffinity won't be invoked unless the process is
already bound to a specific processor), and then bind our memory to
that. 

Refs trac:2698.

This commit was SVN r24421.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-02-21 20:07:11 +00:00
Shiqing Fan
ddc05e05d7 Avoid blocking select on Windows.
This commit was SVN r24396.
2011-02-16 08:48:21 +00:00
Ralph Castain
bf1cff3711 Plug a couple of additional memory leaks - try to highlight a little better that strings returned from reg_string_name must be freed by caller
This commit was SVN r24383.
2011-02-14 20:58:22 +00:00
Ralph Castain
d85916c1c2 Plug memory leak
This commit was SVN r24380.
2011-02-14 19:51:33 +00:00
Jeff Squyres
b0ce9bae8e Oops. Also need to remove myriexpress.h from the Makefile.am.
This commit was SVN r24357.
2011-02-04 03:29:49 +00:00
Jeff Squyres
6421abecc7 Fixes trac:2690.
Temporarily remove hwloc's internal version of myriexpress.h.  It is
causing a problem when compiling Open MPI with MX support because
hwloc uses AC_CONFIG_HEADER in hwloc's hwloc.m4 to generate
opal/mca/paffinity/hwloc/hwloc/include/hwloc/config.h.
AC_CONFIG_HEADER apparently has the (undocumented) side effect of
adding -I$(top_builddir)/opal/mca/paffinity/hwloc/hwloc/include/hwloc
to OMPI's compilation flags.  Hence, when the OMPI MX components are
compiled and #include "myriexpress.h" (or <myriexpress.h>) they see
hwloc's myriexpress.h before the system one.  Badness ensures.

This removal is temporary because we need to figure out a better
solution.  But for now, OMPI is not using hwloc's myriexpress.h file --
so it's safe to remove.  I'll push this issue upstream to hwloc to
figure out a better solution...

This commit was SVN r24354.

The following Trac tickets were found above:
  Ticket 2690 --> https://svn.open-mpi.org/trac/ompi/ticket/2690
2011-02-03 14:24:32 +00:00
Jeff Squyres
4674e62929 These files are superflouos.
This commit was SVN r24331.
2011-02-01 21:31:35 +00:00
Jeff Squyres
c8badb79df Don't instantiate variables in for loops; we don't assume C99
compilers. 

This commit was SVN r24330.
2011-02-01 19:23:14 +00:00
Nysal Jan
42015cf30a Fix build failure on AIX
This commit was SVN r24321.
2011-01-28 08:09:45 +00:00
Nysal Jan
857c32784e Fix detection of fd_mask
This commit was SVN r24320.
2011-01-28 06:20:32 +00:00
Jeff Squyres
6c8de8fb76 Bump up to hwloc 1.1.1
This commit was SVN r24312.
2011-01-26 23:20:26 +00:00
Josh Hursey
66af515061 Fix C/R functionality with the new libtool. This fixes the case where the restarted process cannot be checkpointed or finalized.
Short Version:
--------------
Event engine needs to be flushed so it does not use old/stale file descriptors.

Long Version:
-------------
The problem was that the restarted process was waiting for the socket to the local daemon to finish establishing during the 'sync' operation. The core problem was that the daemon was sending a header of 36 bytes, but the restarted process only received 35 bytes of the message. So the restarted process became stuck waiting for the last byte to arrive.

After many hours of digging, I figured out that the event engine was using the same file descriptor for its evsig_cb functionality (to signal itself when a signal arrives). So when the daemon wrote in to the new fd the event engine was stealing the first byte (*shakes fist at event engine*) before the recv() could be posted.

The solution is to use the event_reinit() function on restart to re-establish the now-stale file descriptors in the event engine. This seems to have fixed the problem.


A few other minor things:
-------------------------
 * Add a check to make sure the event engine is balanced in its init/finalize
 * Add the opal_event_base_close() to the BLCR restart exec function (still not 100% sure it is needed, but there it is).

This commit was SVN r24296.
2011-01-25 22:43:47 +00:00
Jeff Squyres
afa654746c Somehow this has been sitting, uncommitted, in a local checkout since
last December.  :-(

Add new MCA param: maffinity_libnuma_policy.  Thanks to David
Singleton for the suggestion.  Here's the help text about it:

{{{
   MCA maffinity: parameter "maffinity_libnuma_policy" (current value:
                  <loose>, data source: default value)
                  Binding policy that determines what happens if memory
                  is unavailable on the local NUMA node.  A value of
                  "strict" means that the memory allocation will fail;
                  a value of "loose" means that the memory allocation
                  will spill over to another NUMA node.
}}}

This commit was SVN r24290.
2011-01-24 14:39:16 +00:00
Abhishek Kulkarni
3243b16bb3 Decode SOS error code before checking it with the native error code.
This commit was SVN r24281.
2011-01-20 23:21:38 +00:00
Jeff Squyres
189b541dbd Add a proper help message for the mca_verbose MCA param (and shuffle
the code to be slightly more efficient).

This commit was SVN r24256.
2011-01-14 20:18:06 +00:00
Terry Dontje
56c03a3853 removing a file I should not have added
This commit was SVN r24220.
2011-01-11 19:02:08 +00:00
Terry Dontje
a374661ead add configure.params to solaris sysinfo module to allow it to be built
This commit was SVN r24219.
2011-01-11 18:31:55 +00:00
Jeff Squyres
cd8f12d8e5 Remove a few useless files that were missed last night.
This commit was SVN r24218.
2011-01-11 14:15:31 +00:00
Jeff Squyres
54cb4eb2b5 Merge over new version of hwloc 1.1 from the vendor branch. Update
the module to use the new hwloc bitmap API (the cpuset API is both
klunkier and deprecated), which simplified a few things.

This commit was SVN r24217.
2011-01-11 01:41:10 +00:00
Josh Hursey
bbfdf04a81 Fix a couple of 'unused variable' warnings, and one return value warning.
{{{
base/paffinity_base_service.c: In function ‘opal_paffinity_base_cset2mapstr’:
base/paffinity_base_service.c:623: warning: unused variable ‘range_last’
base/paffinity_base_service.c:623: warning: unused variable ‘range_first’
base/paffinity_base_service.c:622: warning: unused variable ‘count’
base/paffinity_base_service.c:622: warning: unused variable ‘m’
}}}

{{{
connect/btl_openib_connect_oob.c: In function ‘init_ud_qp’:
connect/btl_openib_connect_oob.c:1111: warning: control reaches end of non-void function
connect/btl_openib_connect_oob.c: In function ‘init_device’:
connect/btl_openib_connect_oob.c:1235: warning: unused variable ‘i’
connect/btl_openib_connect_oob.c: In function ‘get_pathrecord_sl’:
connect/btl_openib_connect_oob.c:1323: warning: unused variable ‘i’
}}}

This commit was SVN r24196.
2010-12-30 15:37:50 +00:00
Terry Dontje
6da16ab0d7 add format parameter and layout format to OMPI_Affinity_str
This commit was SVN r24182.
2010-12-16 15:11:17 +00:00
Shiqing Fan
ec82e73bce use sockets instead of pipes on Windows.
This commit was SVN r24174.
2010-12-15 14:34:25 +00:00
Rolf vandeVaart
3f7dd84278 Fix libevent so it can compile in the few cases where sys/queue.h does not exist.
1. Remove it from libevent207.h because it is not needed.
2. Add compat to the include list so it can use queue.h when needed.

This commit was SVN r24144.
2010-12-02 23:05:02 +00:00
Shiqing Fan
f43862420c Convert the bad dos line endings to unix style for all windows related files.
This commit was SVN r24137.
2010-12-02 12:08:08 +00:00
Ralph Castain
c56185887b Change the event base "wakeup" support to enable the passing of events to the central thread for add/del. Add a macro OPAL_UPDATE_EVBASE for this purpose as it will likely be widely used.
Update the ORTE thread support to utilize this capability. Update the rmcast framework to track the change.

This commit was SVN r24121.
2010-12-01 04:26:43 +00:00
Ralph Castain
2523c9b2e8 Overload the event_base_t struct to include a (hopefully) temporary change to deal with cross-event-base synchronization. This is done transparently so no code changes are required within the rest of the code base. Comments explain what was changed and why.
This commit was SVN r24105.
2010-11-30 16:14:19 +00:00
Ralph Castain
71f116d21f Expose the event_active API
This commit was SVN r24090.
2010-11-24 23:30:13 +00:00
Ralph Castain
380835602c Add support for internal libevent threading support. Add configure logic to define an appropriate flag, and then use that flag to expose the required functions.
This commit was SVN r24088.
2010-11-24 23:24:53 +00:00
Shiqing Fan
39c9f7468e Add support for managing priorities of windows mca components.
Correct the generated strings in mpi.h.

This commit was SVN r24082.
2010-11-23 19:09:06 +00:00
Rolf vandeVaart
e7ff9375d7 Use pid_t to avoid warnings on some platforms.
This commit was SVN r24072.
2010-11-19 17:14:33 +00:00
Rolf vandeVaart
1735f98c78 Avoid potential warnings by using pid_t in all places.
This commit was SVN r24071.
2010-11-19 16:29:45 +00:00
Shiqing Fan
4fea0f021e Per r24062, this should also be removed.
This commit was SVN r24064.

The following SVN revision numbers were found above:
  r24062 --> open-mpi/ompi@3b0caf7dea
2010-11-17 17:14:55 +00:00
Rolf vandeVaart
3b0caf7dea Remove inclusion of stdbool.h where not needed.
Change OMPI code in libevent to not use bool.
Add some comments to indicate OMPI specific code.
This should fix compiles on Sun Studio Solaris.

This commit was SVN r24062.
2010-11-17 15:14:00 +00:00
Ralph Castain
32be69eaef Update the OMPI libevent interface module and the internal libevent event.c file to provide ability to disable specific event modes. Basically an issue between #define and checking to see if the value was defined to zero.
This commit was SVN r24056.
2010-11-16 16:06:32 +00:00
Ralph Castain
1b3421f16e Fix a bug spotted by Rolf - ensure that disable-event-xxx results in the corresponding have_event_xxx being undefined or defined to 0
This commit was SVN r24055.
2010-11-16 04:37:30 +00:00