Commit graph

798 commits

Author SHA1 Message Date
Shiqing Fan
3af7c9f7bb Complete the MinGW build support on Windows.
This commit was SVN r25048.
2011-08-15 09:47:23 +00:00
Jeff Squyres
1cbfb53801 r24976 wasn't quite right -- you now actually get a warning if you
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.

This commit does a few things:

 * Introduce a new OPAL MCA base function:
   mca_base_param_check_exclusive_string().  It checks to see that the
   ''user'' does not set two MCA parameters that are mutually
   exclusive by checking the source of those MCA param values.
 * Use the above function in many BTLs (and the OOB TCP) to ensure
   that <foo>_if_include and <foo>_if_exclude are not both specified
   ''by the user''.
 * Re-arrange many of these BTLs to move their MCA registration code
   into a separate component_register() function (vs. the
   component_open() function).

This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.

This commit was SVN r25043.

The following SVN revision numbers were found above:
  r24976 --> open-mpi/ompi@8f4ac54336
2011-08-10 17:24:36 +00:00
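
The mutual-exclusion check described in r25043 can be illustrated with a minimal sketch. This is not the actual mca_base_param_check_exclusive_string() code or its signature; every type and function name below (sketch_source_t, sketch_param_t, sketch_check_exclusive) is hypothetical. It only shows the idea of looking at where each value came from, so that a defaulted <foo>_if_exclude does not conflict with a user-supplied <foo>_if_include.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: where a parameter value came from. */
typedef enum {
    SKETCH_SOURCE_DEFAULT,  /* built-in default, not chosen by the user */
    SKETCH_SOURCE_USER      /* set by the user (environment, file, command line) */
} sketch_source_t;

/* Hypothetical: a parameter with its current value and source. */
typedef struct {
    const char     *name;
    const char     *value;   /* NULL if unset */
    sketch_source_t source;
} sketch_param_t;

/* Returns 0 if the pair is acceptable, -1 if the user set both parameters. */
int sketch_check_exclusive(const sketch_param_t *a, const sketch_param_t *b)
{
    int a_user;
    int b_user;

    assert(a != NULL && b != NULL);

    /* Only values that the user supplied count toward a conflict. */
    a_user = (a->value != NULL && a->source == SKETCH_SOURCE_USER);
    b_user = (b->value != NULL && b->source == SKETCH_SOURCE_USER);

    return (a_user && b_user) ? -1 : 0;
}
```

Under these assumptions, a user-set include list paired with a defaulted exclude list passes the check, while setting both by hand is reported as an error.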
Samuel Gutierrez
bb791eaa23 change opal_output_verbose level to be consistent with shmem base.
This commit was SVN r25036.
2011-08-09 21:34:12 +00:00
Samuel Gutierrez
b144c8c343 silence warning in shmem posix run-time test when err is not equal to EEXIST.
This commit was SVN r25034.
2011-08-09 21:13:28 +00:00
Jeff Squyres
ba432393d4 Remove some really old (internal) cruft that never ended up getting
used.

This commit was SVN r24988.
2011-08-04 15:24:37 +00:00
Samuel Gutierrez
adde221413 use memcpy in ds_copy.
This commit was SVN r24942.
2011-07-25 17:16:29 +00:00
Ralph Castain
8a7f9f8997 Hide libevent symbols when internal thread support enabled
This commit was SVN r24922.
2011-07-22 19:49:47 +00:00
Ralph Castain
3f0d13efe2 Fix libevent internal thread support
This commit was SVN r24920.
2011-07-22 19:18:49 +00:00
Shiqing Fan
edaa7b96e4 This should not be commented out.
This commit was SVN r24914.
2011-07-21 12:56:18 +00:00
Shiqing Fan
665d1284be Fix a bug where the wrong temp string was being memcpy'ed.
This commit was SVN r24912.
2011-07-21 12:53:03 +00:00
Ralph Castain
6201581544 Fix the symbol visibility issue for libevent by renaming all visible libevent symbols
This commit was SVN r24902.
2011-07-14 07:10:52 +00:00
Ralph Castain
5e99d45ae4 Remove unused variable
This commit was SVN r24887.
2011-07-13 03:42:20 +00:00
Nadia Derbey
0d0cead33a Fix a hang in carto_base_select() if carto_module_init() fails
This commit was SVN r24876.
2011-07-12 05:47:28 +00:00
Abhishek Kulkarni
7363938ba8 add a missing include.
This commit was SVN r24873.
2011-07-11 00:04:31 +00:00
Jeff Squyres
08a05a1e35 Minor additions to make OMPI trunk compatible with the latest GNU
Autotools:

 * Autoconf 2.68
 * Automake 1.11.1
 * Libtool 2.4
 * m4 1.4.16

This commit was SVN r24867.
2011-07-10 12:11:47 +00:00
Jeff Squyres
e2df4d4a8d Some platforms don't have <execinfo.h>, even if they have the
backtrace() function (e.g., NetBSD).  Thanks to Aleksej Saushev for
pointing out the issue.

This commit was SVN r24866.
2011-07-10 11:14:19 +00:00
Jeff Squyres
b2b781e537 Fix a few miscellaneous memory leaks.
This commit was SVN r24865.
2011-07-08 16:39:58 +00:00
Terry Dontje
86a80411f0 update changes from review comments of #2816
This commit was SVN r24856.
2011-07-05 22:51:39 +00:00
Terry Dontje
8c0af7838a add configure check for Solaris Legacy munmap prototype
This commit was SVN r24839.
2011-06-29 23:45:27 +00:00
Ralph Castain
4dc3ee369f If event threads are enabled, we don't need to wake up the event lib to pick up new events - so help valgrind to quit whining about it.
This commit was SVN r24837.
2011-06-29 22:52:28 +00:00
Samuel Gutierrez
93110ce805 place a bandage on ds_copy plus minor cleanup. i need to rethink this part of the framework. thanks to Rolf for pointing out the issue.
This commit was SVN r24831.
2011-06-28 19:37:12 +00:00
Ralph Castain
cd6b8417ec Cleanup a set of warnings that appear to be caused by failure of PRIsize_t on Linux.
Set ignore properties

This commit was SVN r24812.
2011-06-23 15:07:58 +00:00
Samuel Gutierrez
61ff422562 fix a few more spots in posix.
This commit was SVN r24808.
2011-06-22 23:17:26 +00:00
Samuel Gutierrez
7fcf806dc9 fix posix builds on solaris. shmem still needs more cleanup on solaris, but at least shmem will stop breaking builds (i hope).
This commit was SVN r24807.
2011-06-22 23:08:58 +00:00
Samuel Gutierrez
5b5ce434fc fix shmem sysv build on solaris.
This commit was SVN r24806.
2011-06-22 18:05:08 +00:00
Samuel Gutierrez
867df203bc fix shmem mmap build on solaris. thanks terry.
This commit was SVN r24805.
2011-06-22 16:05:50 +00:00
Rolf vandeVaart
856a9c43b1 Add string.h. Needed when configuring with --enable-picky
This commit was SVN r24804.
2011-06-22 15:48:32 +00:00
Samuel Gutierrez
81f38b258a commit of new shared memory backing facility framework (shmem) and its components.
This commit was SVN r24795.
2011-06-21 15:41:57 +00:00
Josh Hursey
b223d355fc Add explicit number for opal_crs_state_type_t enum (for debugging). Also add a MAX so we can easily check for out of bounds states during debugging.
This commit was SVN r24766.
2011-06-09 14:27:24 +00:00
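
The pattern described in r24766 (explicit enum numbers plus a MAX entry for bounds checks) might look like the sketch below. The state names and values are hypothetical and are not the real opal_crs_state_type_t members; the point is only that explicit numbers are easy to read in a debugger and that a trailing MAX value makes an out-of-bounds state easy to detect.

```c
#include <assert.h>

/* Hypothetical states; explicit numbers make raw values readable in a debugger. */
typedef enum {
    SKETCH_STATE_NONE       = 0,
    SKETCH_STATE_CHECKPOINT = 1,
    SKETCH_STATE_CONTINUE   = 2,
    SKETCH_STATE_RESTART    = 3,
    SKETCH_STATE_MAX        = 4   /* sentinel: one past the last valid state */
} sketch_state_t;

/* Returns 1 if raw names a valid state, 0 otherwise. */
int sketch_state_is_valid(int raw)
{
    return raw >= SKETCH_STATE_NONE && raw < SKETCH_STATE_MAX;
}

/* Returns a printable name; asserts on out-of-bounds states while debugging. */
const char *sketch_state_name(int raw)
{
    static const char *names[SKETCH_STATE_MAX] = {
        "none", "checkpoint", "continue", "restart"
    };

    assert(sketch_state_is_valid(raw));
    return names[raw];
}
```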
Ralph Castain
8f401a0563 Enable the ability to constrain applications to hosts on the basis of resources.
This commit was SVN r24736.
2011-05-28 22:18:19 +00:00
Ralph Castain
b47ec2ee87 Remove lingering references to opal_profile option
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
ddf4914094 Plug fd leak
This commit was SVN r24707.
2011-05-18 13:46:27 +00:00
Ralph Castain
4083e23073 Complete cleanup of pstat linux
This commit was SVN r24701.
2011-05-16 14:08:08 +00:00
Ralph Castain
08c3ecd608 Handle the case where memory stats are in different order, or don't exist on that platform
This commit was SVN r24700.
2011-05-16 13:32:42 +00:00
Ralph Castain
a3e43594a4 Extend node stats to include additional memory info. Change "darwin" pstat module to "test" as we don't really know how to get all the stat info for darwin.
Add a new OPAL_ERROR_LOG macro similar to the ORTE_ERROR_LOG one.

This commit was SVN r24692.
2011-05-08 14:45:16 +00:00
Jeff Squyres
0882d636a6 Oops -- need string.h, too (for strcasecmp).
This commit was SVN r24649.
2011-04-28 15:42:35 +00:00
Jeff Squyres
7362a0730a Change the default to "none". David Singleton raises a good point
that enabling "local_only" by default could cause excessive
by-NUMA-node paging and/or OOMs (rather than allowing memory
allocations to spill over to other NUMA nodes).

This brought home the very real-world example of people buying servers
with more processors/cores than they need, just to get more memory.
We wouldn't want Badness to occur in such scenarios by default.
Instead, let people turn on "only allow memory allocations on my local
NUMA node" if their application would benefit from it.

This commit was SVN r24648.
2011-04-28 15:16:39 +00:00
Jeff Squyres
7b48042ffd Commit patch from upstream hwloc: r3482. Fixes some compiler
warnings. 

This commit was SVN r24641.

The following SVN revision numbers were found above:
  r3482 --> open-mpi/ompi@2435be8d49
2011-04-27 17:08:15 +00:00
Jeff Squyres
d134ff9b4d Refs trac:2698
After a long period of development with many starts and stops, we
finally got this where we wanted it.

This commit introduces 2 new MCA params (note that the
"maffinity_libnuma_policy" MCA param introduced by r24290 was removed
when libnuma support was removed).  Remember that maffinity policies
are only in effect when paffinity is enabled -- i.e., when processes
are bound to processors!

 * '''maffinity_base_alloc_policy:''' Policy that determines how
   general memory allocations are bound after MPI_INIT.  A value of
   "none" means that no memory policy is applied.  A value of
   "local_only" means that all memory allocations will be restricted
   to the local NUMA node where each process is placed.  Note that
   operating system paging policies are unaffected by this setting.
   For example, if "local_only" is used and local NUMA node memory is
   exhausted, a new memory allocation may cause paging.
 * '''maffinity_base_bind_failure_action:''' What Open MPI will do if
   it explicitly tries to bind memory to a specific NUMA location, and
   fails.  Note that this is a different case than the general
   allocation policy described by maffinity_base_alloc_policy.  A
   value of "warn" means that Open MPI will warn the first time this
   happens, but allow the job to continue (possibly with degraded
   performance).  A value of "error" means that Open MPI will abort
   the job if this happens.

This needs at least a little soak time on the trunk before going to
v1.5.

This commit was SVN r24639.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-04-26 13:31:07 +00:00
Jeff Squyres
926af377fe Refs trac:2778.
Upgrade to hwloc 1.2 (from hwloc 1.1.2).  This should fix the problems
Nathan's seeing in #2778.

Let's let this soak on the trunk for a little while and see how LANL's
MTT's work out.  If that works, then we can CMR this to v1.5.

This commit was SVN r24635.

The following Trac tickets were found above:
  Ticket 2778 --> https://svn.open-mpi.org/trac/ompi/ticket/2778
2011-04-25 19:31:49 +00:00
Jeff Squyres
b8af3b7c4a New comment explains it all -- previous code was failing to find the
Nth core, so it fell over to try to find the Nth PU.

-----

hwloc isn't able to find cores on all platforms.  Example: PPC64
running RHEL 5.4 (linux kernel 2.6.18) only reports NUMA nodes and
PU's.  Fine.

However, note that hwloc_get_obj_by_type() will return NULL in 2
(effectively) different cases:

- no objects of the requested type were found
- the Nth object of the requested type was not found

So first we have to see if we can find *any* cores by looking for the
0th core.  If we find it, then try to find the Nth core.  Otherwise,
try to find the Nth PU.

This commit was SVN r24632.
2011-04-25 16:55:27 +00:00
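
The lookup order described in r24632 can be sketched without hwloc itself. The code below uses a mock topology and a mock lookup function (all sketch_* names are hypothetical) that, like the behavior described for hwloc_get_obj_by_type(), reports "not found" both when no objects of the type exist and when the Nth one does not exist. It is an illustration of the probe-then-fall-back logic, not the actual Open MPI code.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SKETCH_OBJ_CORE, SKETCH_OBJ_PU } sketch_obj_type_t;

/* Mock topology: just the number of cores and PUs that were detected. */
typedef struct {
    unsigned num_cores;
    unsigned num_pus;
} sketch_topology_t;

typedef struct {
    sketch_obj_type_t type;
    unsigned          index;
} sketch_obj_t;

/* Mock lookup: returns 1 and fills *out if the Nth object of the type exists.
 * Returns 0 both when the type is absent and when n is out of range. */
int sketch_get_obj(const sketch_topology_t *t, sketch_obj_type_t type,
                   unsigned n, sketch_obj_t *out)
{
    unsigned count = (type == SKETCH_OBJ_CORE) ? t->num_cores : t->num_pus;

    if (n >= count) {
        return 0;
    }
    out->type = type;
    out->index = n;
    return 1;
}

/* Probe for the 0th core first; fall back to PUs only if there are no cores. */
int sketch_find_nth(const sketch_topology_t *t, unsigned n, sketch_obj_t *out)
{
    sketch_obj_t probe;

    assert(t != NULL && out != NULL);

    if (sketch_get_obj(t, SKETCH_OBJ_CORE, 0, &probe)) {
        /* Cores exist, so a missing Nth core is a real failure. */
        return sketch_get_obj(t, SKETCH_OBJ_CORE, n, out);
    }
    /* No cores reported at all (e.g., only NUMA nodes and PUs): use PUs. */
    return sketch_get_obj(t, SKETCH_OBJ_PU, n, out);
}
```

Probing index 0 first is what separates the two "not found" cases: without it, a request for a core index that is simply too large would silently be answered with a PU.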
Ralph Castain
9988b97b97 Extend/update how we handle process stats. Add the ability to collect node-level stats separate from the process stats. Update the process stat memory fields to report in MBytes instead of KBytes as I can't find any process that runs in KBytes nowadays.
Rename the memusage sensor plugin to "resusage" as it will soon be updated to include full process stat monitoring.

Extend the heartbeat sensor to report node and process stats in the heartbeat.

Store the process and node stats in their respective orte_xxx_t object.

This commit was SVN r24629.
2011-04-21 22:55:45 +00:00
Jeff Squyres
2fe94b929a Manually add hwloc v1.1 branch r3418 commit (went in after v1.1.2
released): 

backport hwloc r3416 from trunk: Add cache info entry _after_ checking
that we need one, thanks Andriy Gapon for the fix

This commit was SVN r24612.

The following SVN revision numbers were found above:
  r3418 --> open-mpi/ompi@9972663a12
2011-04-12 14:41:46 +00:00
Jeff Squyres
9dc3a1aa54 Upgrade to hwloc 1.1.2; most likely the last release of the hwloc
1.1.x series

This commit was SVN r24611.
2011-04-12 14:35:26 +00:00
Jeff Squyres
38d3cdd4a6 Update hwloc to 1.1.1. Next stop: 1.1.2.
This commit was SVN r24610.
2011-04-12 14:16:37 +00:00
Shiqing Fan
4b3b713bfc Update the windows installdir component.
Don't use the old env component for windows, so remove the .windows file.

This commit was SVN r24597.
2011-04-05 12:15:41 +00:00
Ralph Castain
f40edd6b4f Add the stupid test word
This commit was SVN r24578.
2011-03-26 03:38:59 +00:00
Ralph Castain
5bfb01c6c8 Only build the linux component of sysinfo if linux is the operating system.
Thanks to Paul Hargrove for the suggestion.

This commit was SVN r24576.
2011-03-25 20:55:57 +00:00
Jeff Squyres
cf6c5e8d48 Fix a bug noted by Gus Correa on the users' list: mpi_paffinity_alone
appeared multiple times in ompi_info output (so did others, but this
is the one that was noticed).  Ensure that we don't repeat
opal_paffinity_base_register_params() multiple times.

This commit was SVN r24569.
2011-03-24 00:58:25 +00:00
Jeff Squyres
324b90142f Fix CID 1583: hwloc bitmap leak.
This commit was SVN r24496.
2011-03-08 16:47:26 +00:00