1
1
Граф коммитов

1709 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
08c3ecd608 Handle the case where memory stats are in different order, or don't exist on that platform
This commit was SVN r24700.
2011-05-16 13:32:42 +00:00
Ralph Castain
a3e43594a4 Extend node stats to include additional memory info. Change "darwin" pstat module to "test" as we don't really know how to get all the stat info for darwin.
Add a new OPAL_ERROR_LOG macro similar to the ORTE_ERROR_LOG one.

This commit was SVN r24692.
2011-05-08 14:45:16 +00:00
George Bosilca
34abbce82c More accurate and trustworthy descriptions of the netmask exist.
Interested readers can quench their curiosity either with one
of the Richard Stevens books (ISBN 9780201633467) or the
Wikipedia page (http://en.wikipedia.org/wiki/Subnetwork).

This commit was SVN r24680.
2011-05-03 21:59:51 +00:00
Shiqing Fan
b4e5826403 Exclude two non-mca files that shouldn't be compiled under windows.
This commit was SVN r24669.
2011-05-02 14:39:22 +00:00
Ralph Castain
257473ebca Remove an extra "break" - thanks to Rainer for pointing it out.
This commit was SVN r24667.
2011-05-02 12:20:37 +00:00
Ralph Castain
7b29a6153e Cover all the netmask values
This commit was SVN r24665.
2011-04-29 17:56:15 +00:00
Shiqing Fan
4490fdbd34 Add the initial support for MinGW and MSYS.
Correctly check the dependencies of MSYS env.
Set up configure include and lib path for building the package.
update a few more CMake scripts.

This commit was SVN r24663.
2011-04-29 14:42:07 +00:00
Rolf vandeVaart
3e8878f556 Add in missing header file
This commit was SVN r24662.
2011-04-29 13:20:59 +00:00
Rolf vandeVaart
2634f6401a Add some basic support for sending and receiving CUDA device memory. Feature is disabled by default and has no effect on default code paths.
This commit was SVN r24659.
2011-04-28 23:05:55 +00:00
Jeff Squyres
0882d636a6 Oops -- need string.h, too (for strcasecmp).
This commit was SVN r24649.
2011-04-28 15:42:35 +00:00
Jeff Squyres
7362a0730a Change the default to "none". David Singleton raises a good point
that enabling "local_only" by default could cause excessive
by-NUMA-node paging and/or OOMs (rather than allowing memory
allocations to spill over to other NUMA nodes).

This brought home the very real-world example of people buying servers
with more processors/cores than they need, just to get more memory.
We wouldn't want Badness to occur in such scenarios by default.
Instead, let people turn on "only allow memory allocations on my local
NUMA node" if their application would benefit from it.

This commit was SVN r24648.
2011-04-28 15:16:39 +00:00
Jeff Squyres
7b48042ffd Commit patch from upstream hwloc: r3482. Fixes some compiler
warnings. 

This commit was SVN r24641.

The following SVN revision numbers were found above:
  r3482 --> open-mpi/ompi@2435be8d49
2011-04-27 17:08:15 +00:00
Jeff Squyres
d134ff9b4d Refs trac:2698
After a long period of development with many starts and stops, we
finally got this where we wanted it.

This commit introduces 2 new MCA params (note that the
"maffinity_libnuma_policy" MCA param introduced by r24290 was removed
when libnuma support was removed).  Remember that maffinity policies
are only in effect when paffinity is enaabled -- i.e., when processes
are bound to processors!

 * '''maffinity_base_alloc_policy:''' Policy that determines how
   general memory allocations are bound after MPI_INIT.  A value of
   "none" means that no memory policy is applied.  A value of
   "local_only" means that all memory allocations will be restricted
   to the local NUMA node where each process is placed.  Note that
   operating system paging policies are unaffected by this setting.
   For example, if "local_only" is used and local NUMA node memory is
   exhausted, a new memory allocation may cause paging.
 * '''maffinity_base_bind_failure_action:''' What Open MPI will do if
   it explicitly tries to bind memory to a specific NUMA location, and
   fails.  Note that this is a different case than the general
   allocation policy described by maffinity_base_alloc_policy.  A
   value of "warn" means that Open MPI will warn the first time this
   happens, but allow the job to continue (possibly with degraded
   performance).  A value of "error" means that Open MPI will abort
   the job if this happens.

This needs at least a little soak time on the trunk before going to
v1.5.

This commit was SVN r24639.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-04-26 13:31:07 +00:00
Jeff Squyres
926af377fe Refs trac:2778.
Upgrade to hwloc 1.2 (from hwloc 1.1.2).  This should fix the problems
Nathan's seeing in #2778.

Let's let this soak on the trunk for a little while and see how LANL's
MTT's work out.  If that works, then we can CMR this to v1.5.

This commit was SVN r24635.

The following Trac tickets were found above:
  Ticket 2778 --> https://svn.open-mpi.org/trac/ompi/ticket/2778
2011-04-25 19:31:49 +00:00
Jeff Squyres
b8af3b7c4a New comment explains it all -- previous code was failing to find the
Nth core, so it fell over to try to find the Nth PU.

-----

hwloc isn't able to find cores on all platforms.  Example: PPC64
running RHEL 5.4 (linux kernel 2.6.18) only reports NUMA nodes and
PU's.  Fine.

However, note that hwloc_get_obj_by_type() will return NULL in 2
(effectively) different cases:

- no objects of the requested type were found
- the Nth object of the requested type was not found

So first we have to see if we can find *any* cores by looking for the
0th core.  If we find it, then try to find the Nth core.  Otherwise,
try to find the Nth PU.

This commit was SVN r24632.
2011-04-25 16:55:27 +00:00
Jeff Squyres
16d8e9216b Ran across this comment about i18n support, so I figured I'd update
it.  :-)

This commit was SVN r24631.
2011-04-22 12:14:20 +00:00
Ralph Castain
9988b97b97 Extend/update how we handle process stats. Add the ability to collect node-level stats separate from the process stats. Update the process stat memory fields to report in MBytes instead of KBytes as I can't find any process that runs in KBytes nowadays.
Rename the memusage sensor plugin to "resusage" as it will soon be updated to include full process stat monitoring.

Extend the heartbeat sensor to report node and process stats in the heartbeat.

Store the process and node stats in their respective orte_xxx_t object.

This commit was SVN r24629.
2011-04-21 22:55:45 +00:00
Jeff Squyres
2fe94b929a Manually add hwloc v1.1 branch r3418 commit (went in after v1.1.2
released): 

backport hwloc r 3416 from trunk: Add cache info entry _after_ checking
that we need one, thanks Andriy Gapon for the fix

This commit was SVN r24612.

The following SVN revision numbers were found above:
  r3418 --> open-mpi/ompi@9972663a12
2011-04-12 14:41:46 +00:00
Jeff Squyres
9dc3a1aa54 Upgrade to hwloc 1.1.2; most likely the last release of the hwloc
1.1.x series

This commit was SVN r24611.
2011-04-12 14:35:26 +00:00
Jeff Squyres
38d3cdd4a6 Update hwloc to 1.1.1. Next stop: 1.1.2.
This commit was SVN r24610.
2011-04-12 14:16:37 +00:00
Jeff Squyres
48f418ee7b Fixes trac:2768: exclude opal/libltdl from "make distclean" when
--disable-dlopen is used.  Thanks to David Gunter for reporting the
issue. 

This commit was SVN r24603.

The following Trac tickets were found above:
  Ticket 2768 --> https://svn.open-mpi.org/trac/ompi/ticket/2768
2011-04-08 14:59:49 +00:00
Shiqing Fan
4b3b713bfc Update the windows installdir component.
Don't use the old env component for windows, so remove the .windows file.

This commit was SVN r24597.
2011-04-05 12:15:41 +00:00
Terry Dontje
266e663091 Add opal_tree class. This will be used in the future by sysinfo to store hw maps to be used by rmaps for the new affinity code.
This commit was SVN r24594.
2011-03-30 08:05:28 +00:00
Ralph Castain
f40edd6b4f Add the stupid test word
This commit was SVN r24578.
2011-03-26 03:38:59 +00:00
Ralph Castain
5bfb01c6c8 Only build the linux component of sysinfo if linux is the operating system.
Thanks to Paul Hargrove for the suggestion.

This commit was SVN r24576.
2011-03-25 20:55:57 +00:00
Jeff Squyres
58a13f87e6 Oops -- forgot to add opal_config_top.h to Makefile.am (so that it'll
be included in the tarball).

This commit was SVN r24572.
2011-03-25 01:21:11 +00:00
Jeff Squyres
5ae1b15b6e Ensure that other packages defining PACKAGE_ macros don't hurt us, and protect others from our PACKAGE_ macros.
This commit was SVN r24571.
2011-03-24 22:39:56 +00:00
Jeff Squyres
cf6c5e8d48 Fix a bug noted by Gus Correa on the user's list: mpi_paffinity_alone
appeared multiple times in ompi_info output (so did others, but this
is the one that was noticed).  Ensure that we don't repeat
opal_paffinity_base_register_params() multiple times.

This commit was SVN r24569.
2011-03-24 00:58:25 +00:00
Eugene Loh
2770a12beb Continue clean up of thread options started in r22841, 22842, and 22849.
No need for any CMRs to 1.5... that was already done in CMR 2728.

This commit was SVN r24545.

The following SVN revision numbers were found above:
  r22841 --> open-mpi/ompi@b400b84162
2011-03-18 21:36:35 +00:00
Jeff Squyres
bffa5c8f7e * Rename OMPI_CHECK_PTHREAD_PIDS to OPAL_CHECK_PTHREAD_PIDS.
* Convert from AC_TRY_RUN to AC_RUN_IFELSE.
 * Excellent suggestion from Paul Hargrove: use AC_CHECK_FUNC to look
   for a Linuxthreads-specific symbol when we're cross compiling to
   see if threads will have different PIDs (because AC_CHECK_FUNC
   works properly even when in cross-compiling environments).

Background: the old/Linuxthreads-based pthreads implementation used
the Linux clone() call to make threads, which effectively meant that
each thread had a different PID.  The new NPTL pthreads implementation
does things better, meaning that threads have the same PID.  

Open MPI no longer supports threads with different PIDs -- we ripped
out the supporting code for threads with different PIDs because we
don't have systems available to test this on anymore (anyone who still
has such a system can still use older versions of Open MPI).  Hence,
configure needs to determine whether the target system will have the
same PID for threads or not -- even if we're cross-compiling.  The
current test compiles and runs a multi-threaded app that checks PIDs
of different threads, but we clearly can't do that in a
cross-compiling environment.  So use AC_CHECK_FUNC in cross-compiling
environments.

Simple, no?

This commit was SVN r24537.
2011-03-17 11:59:54 +00:00
Ralph Castain
d5dfe05521 Remove stale code associated with OPAL_THREADS_HAVE_DIFFERENT_PIDS. In the past, we have supported the case of really, really old Linux kernels where threads have different pids. However, when we updated the event library, we didn't also update that support code. In addition, when we dropped progress thread support, we didn't remove areas of the code that could no longer be compiled (i.e., were protected by "if progress thread && if have different pids).
There was no compelling reason to support such old kernels. Accordingly, convert the test to print a nice error message indicating we no longer support old kernels (but indicate that earlier OMPI versions do) and error out. Remove all code that was protected by "if have different pids" since it can no longer be compiled.

This commit was SVN r24531.
2011-03-15 21:05:03 +00:00
Ralph Castain
7eede54b39 Solve a problem when cross-compiling for PPC32 - in this case, OPAL_HAVE_ATOMIC_CMPSET_64 is not set, but the code requires that the ADD_64 and SUB_64 values at least be defined.
This commit was SVN r24528.
2011-03-15 15:50:49 +00:00
Ralph Castain
45aacd30ab Add prefix for PPC hosts
This commit was SVN r24515.
2011-03-11 22:58:51 +00:00
Jeff Squyres
324b90142f Fix CID 1583: hwloc bitmap leak.
This commit was SVN r24496.
2011-03-08 16:47:26 +00:00
George Bosilca
95f4e0b502 We do need the name for debugging purposes.
This commit was SVN r24479.
2011-03-02 19:19:15 +00:00
George Bosilca
355d61bb0f No need for a printf.
This commit was SVN r24478.
2011-03-02 19:17:56 +00:00
Josh Hursey
7c737b9274 Some string and state cleanup. Thanks to George Bosilca for the initial patch.
This commit was SVN r24471.
2011-03-01 20:12:23 +00:00
Josh Hursey
7709005d86 Hack to get the C/R thread working again after r24377. Needs to be revisited.
See ticket #2741 for more details.

Refs trac:2741

This commit was SVN r24470.

The following SVN revision numbers were found above:
  r24377 --> open-mpi/ompi@e8c2519280

The following Trac tickets were found above:
  Ticket 2741 --> https://svn.open-mpi.org/trac/ompi/ticket/2741
2011-03-01 18:47:31 +00:00
George Bosilca
f981e02b4a Fix a typo and correct the usage of the defines.
This commit was SVN r24454.
2011-02-24 06:34:30 +00:00
George Bosilca
f79c87f0c3 Correct the assembly using xaddl for IA32.
Add atomic functions for add and sub 32 and 64 bits for AMD64.

This commit was SVN r24453.
2011-02-24 06:31:47 +00:00
George Bosilca
eb8383802e ret might have been used uninitialized. Not anymore.
This commit was SVN r24452.
2011-02-24 03:02:48 +00:00
Jeff Squyres
ad985260d3 Ensure to disable XML and Cairo support in hwloc; OMPI doesn't use it. Additionally, ensure that the right flags are passed back to the wrappers in the case of static builds. We probably won't need these (especially since XML has been disabled), but it's the Right Thing to do.
This commit was SVN r24451.
2011-02-23 23:11:45 +00:00
Jeff Squyres
e8ba72258e Patch for PPC64 platforms with smt=off, issue raised by Brad. This
fix will be included in hwloc 1.1.2.

Brad -- can you verify that this fixes the issue for you?

Fixes trac:2732.

This commit was SVN r24450.

The following Trac tickets were found above:
  Ticket 2732 --> https://svn.open-mpi.org/trac/ompi/ticket/2732
2011-02-23 22:43:58 +00:00
Brian Barrett
07996af388 Fix register clobber list for x86 assembly. Thanks to Jay Fenlason for the
patch.

This commit was SVN r24449.
2011-02-23 21:54:07 +00:00
Jeff Squyres
8143b201a9 Custom patch for hwloc (that will be included in hwloc 1.1.2) so that
we don't barf on Linux non-NUMA (NNUMA, aka UMA ;-) ) platforms.

This commit was SVN r24448.
2011-02-23 21:02:02 +00:00
Jeff Squyres
2368410eff * Ensure to follow standard filename conventions for output MCA DSO
filenames -- don't include the project name ("opal")
 * Don't link maffinity/hwloc and paffinity/hwloc against the common
   hwloc in the static build case (because this will result in
   duplicate symbols)

This commit was SVN r24447.
2011-02-23 21:00:20 +00:00
Jeff Squyres
5e082d68f6 Fix the compile error for libnuma 0.9.x introduced in r24442;
hopefully, this now compiles for libnuma 0.9.x and libnuma 2.0.x.

Fixes for the strategy discussed in the commit message for r24442
(i.e., check against numa_get_mems_allowed(), which only exists in
libnuma 2.0.x) and the new new new plan on #2698 coming in a separate
commit.

This commit was SVN r24443.

The following SVN revision numbers were found above:
  r24442 --> open-mpi/ompi@90a8fe4aad
2011-02-23 13:44:46 +00:00
Rainer Keller
90a8fe4aad - Addendum to r24421: get mca_maffinity_libnuma to compile on linux
(with libnuma-2.0.4 / LIBNUMA_API_VERSION 2): numa_get_run_node_mask
   returns a struct bitmask *.

   Whether it's a good idea to blindly pass that on to
   numa_set_membind() is another matter: one might want to match against
   the list returned by numa_get_mems_allowed(), which may be set by the
   outside environment.

   Refs trac:2698.

This commit was SVN r24442.

The following SVN revision numbers were found above:
  r24421 --> open-mpi/ompi@31510e683b

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-02-23 12:59:49 +00:00
Jeff Squyres
c1b26005d7 Create new opal/mca/common area, similar to ompi/mca/common. Move hwloc into this new opal MCA common area, and link the hwloc paffinity component against it. Also add a new hwloc maffinity component, and also link it against the opal MCA common hwloc. More development coming soon regarding this common hwloc instance (i.e., an OPAL-ized version of the hwloc API via a new framework so that we can safely use hwloc's services throughout the rest of the OPAL/ORTE/OMPI code bases.
This commit was SVN r24440.
2011-02-22 23:21:48 +00:00
Shiqing Fan
90eeba252e Make openib compile again for Windows.
Update the CMake script for checking mca subdirs.
Add windows support for __attribute__ packed structures.
Define usleep and posix_memalign with equivalent windows functions.
And a few minor fixes, type casts.

This commit was SVN r24429.
2011-02-22 15:49:27 +00:00