Commit graph

1884 commits

Author SHA1 Message Date
Jeff Squyres
9db4542c2b Move maffinity_base_alloc_policy and
maffinity_base_bind_failure_action MCA params to the hwloc base
(hwloc_base_alloc_policy and hwloc_base_bind_failure_action).  Since
these MCA parameters were never on a release branch, I'm just
moving/renaming them outright and not leaving aliases to the old
names.

Note that some upper layer needs to call
opal_hwloc_base_set_process_membind_policy() to set the
set-by-MCA-param process-wide memory affinity policy.  We can't do
this automatically during hwloc_base_open() because, for reasons
described elsewhere, opal_hwloc_topology is not automatically filled
during hwloc_base_open() (in short: potential scalability issues when
launching many MPI processes simultaneously on a single machine, for
example).

This commit was SVN r25156.
2011-09-19 16:10:37 +00:00
Jeff Squyres
dc70100cee In reviewing CMR #2866, it was noticed that the maffinity/hwloc and
paffinity/hwloc components were still calling hwloc_topology_init/load
themselves, and not using the opal_hwloc_topology.  Doh!

This commit fixes that -- these 2 components no longer have their own
copy of the topology tree; they just use opal_hwloc_topology.

This commit was SVN r25151.
2011-09-17 13:13:36 +00:00
Jeff Squyres
ecd603256a * Rename opal_hwloc_components to opal_hwloc_base_components
* Fix some comments

This commit was SVN r25150.
2011-09-17 11:54:36 +00:00
Jeff Squyres
4a2cf81c6f Fixes to ensure that dependent libraries are carried forward from the embedded hwloc
This commit was SVN r25140.
2011-09-13 22:43:39 +00:00
Jeff Squyres
d6682523f6 Put in proper basename so that "make dist" can find it.
This commit was SVN r25135.
2011-09-13 11:09:56 +00:00
Jeff Squyres
4771c36061 * With some m4 trickery, if no form of --with-hwloc is specified on
   the command line, hwloc is just like any other external dependency
   in OMPI: if we find it, we'll use it. If we don't find it, we'll
   ignore it.  See comments in opal/mca/hwloc/configure.m4 for an
   explanation.
 * Fix some copy-n-paste errors in opal/mca/hwloc/configure.m4
   w.r.t. flags coming in from the winning component.
 * Add another line in ompi_info's output about whether hwloc support
   is included or not.

This commit was SVN r25134.
2011-09-13 00:39:14 +00:00
Jeff Squyres
7dc352a328 Add some notes about maintaining the hwloc framework.
This commit was SVN r25132.
2011-09-12 19:40:18 +00:00
Jeff Squyres
c5bfa09574 Fixes from Brice Goglin, post hwloc v1.2.1 for AMD Magny-Cours. See
http://www.open-mpi.org/community/lists/users/2011/09/17164.php.

This commit was SVN r25131.
2011-09-12 19:03:48 +00:00
Shiqing Fan
8bf5a61265 Fix another compile error for Windows.
This commit was SVN r25129.
2011-09-12 14:19:42 +00:00
Shiqing Fan
b61eed801f Fix the problem of building hwloc on Windows. Temporarily not using it for Windows.
This commit was SVN r25128.
2011-09-12 13:55:34 +00:00
Ralph Castain
6460fe5480 Silence warning
This commit was SVN r25127.
2011-09-12 13:32:21 +00:00
Shiqing Fan
d4aa37c22b This should be defined for both VS and MinGW env.
This commit was SVN r25126.
2011-09-12 08:24:59 +00:00
Shiqing Fan
0aea775837 Set the compiler flags in a better way.
This commit was SVN r25125.
2011-09-12 08:24:27 +00:00
Ralph Castain
92c7372e20 Per the RFC from Jeff, move hwloc from opal/mca/common to its own static framework ala libevent. Have ORTE daemons collect the topology info at startup and, if --enable-hwloc-xml is set, send that info back to the HNP for later use. The HNP only retains unique topology "templates" to reduce memory footprint. Have the daemon include the local topology info in the nidmap buffer sent to each app so the apps don't all hammer the local system to discover it for themselves.
Remove the sysinfo framework as hwloc replaces that functionality.

This commit was SVN r25124.
2011-09-11 19:02:24 +00:00
Shiqing Fan
4dc832e65f Fix the problem that Windows XP does not have inet_ntop and inet_pton. Do not use the system provided functions on Windows Vista and above.
This commit was SVN r25111.
2011-08-31 12:49:08 +00:00
Ralph Castain
b2971df7df Ensure we loop over all cpu's.
Thanks to Nadia Derbey for spotting the error.

This commit was SVN r25102.
2011-08-29 14:24:34 +00:00
Ralph Castain
56ebfa23cc ORTE configure options belong in orte/config, not opal.
This commit was SVN r25100.
2011-08-27 14:23:49 +00:00
George Bosilca
531b241304 Define ORTE_ENABLE_EPOCH and ORTE_RESIL_ORTE.
This commit was SVN r25098.
2011-08-27 00:16:21 +00:00
Wesley Bland
4e7ff0bd5e By popular demand the epoch code is now disabled by default.
To enable the epochs and the resilient orte code, use the configure flag:

--enable-resilient-orte

This will define both:

ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE

This commit was SVN r25093.
2011-08-26 22:16:14 +00:00
Eugene Loh
55a7b474dd Change a stray __volatile to __volatile__.
This commit was SVN r25092.
2011-08-26 15:36:10 +00:00
Jeff Squyres
495ceef60d Upgrade hwloc to v1.2.1.
This commit was SVN r25088.
2011-08-26 13:14:26 +00:00
Ralph Castain
df28c63164 If we are on a single processor, then we are effectively bound - so have the macro correctly report it.
Thanks to Pascal Deveze for the patch.

This commit was SVN r25068.
2011-08-22 16:28:40 +00:00
Ralph Castain
8dd26993fc Correct packing of floats
This commit was SVN r25065.
2011-08-18 17:10:40 +00:00
Shiqing Fan
20ee92c16e Make the compiler wrappers work correctly for MinGW build.
This commit was SVN r25051.
2011-08-16 12:32:41 +00:00
Shiqing Fan
3af7c9f7bb Complete the MinGW build support on Windows.
This commit was SVN r25048.
2011-08-15 09:47:23 +00:00
Jeff Squyres
1cbfb53801 r24976 wasn't quite right -- you now actually get a warning if you
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.

This commit does a few things:

 * Introduce a new OPAL MCA base function:
   mca_base_param_check_exclusive_string().  It checks to see that the
   ''user'' does not set two MCA parameters that are mutually
   exclusive by checking the source of those MCA param values.
 * Use the above function in many BTLs (and the OOB TCP) to ensure
   that <foo>_if_include and <foo>_if_exclude are not both specified
   ''by the user''.
 * Re-arrange many of these BTLs to move their MCA registration code
   into a separate component_register() function (vs. the
   component_open() function).

This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.

This commit was SVN r25043.

The following SVN revision numbers were found above:
  r24976 --> open-mpi/ompi@8f4ac54336
2011-08-10 17:24:36 +00:00
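The exclusivity check described in 1cbfb53801 above hinges on where a value came from, not on whether it is set. The following is a minimal sketch of that idea only; `toy_param_t`, `toy_param_source_t`, and `toy_check_exclusive()` are hypothetical names for illustration and are not the signature of mca_base_param_check_exclusive_string(), and the value "lo" is just an example default.

```c
#include <stddef.h>

/* Hypothetical model of an MCA-style parameter: a value plus its origin. */
typedef enum {
    TOY_SOURCE_DEFAULT,  /* value is the built-in default */
    TOY_SOURCE_USER      /* value was set by the user (env, file, command line) */
} toy_param_source_t;

typedef struct {
    const char *name;
    const char *value;   /* NULL when the parameter has no value */
    toy_param_source_t source;
} toy_param_t;

/* Return -1 only when BOTH parameters were set by the user; a default
 * value on one side does not count as a conflict.  Return 0 otherwise. */
int toy_check_exclusive(const toy_param_t *a, const toy_param_t *b)
{
    int a_user = (a->value != NULL && a->source == TOY_SOURCE_USER);
    int b_user = (b->value != NULL && b->source == TOY_SOURCE_USER);

    return (a_user && b_user) ? -1 : 0;
}
```

With this shape, a user-supplied btl_tcp_if_include next to a defaulted btl_tcp_if_exclude passes the check, which is the case that r24976 reportedly got wrong.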
Samuel Gutierrez
bb791eaa23 change opal_output_verbose level to be consistent with shmem base.
This commit was SVN r25036.
2011-08-09 21:34:12 +00:00
Samuel Gutierrez
b144c8c343 silence warning in shmem posix run-time test when err is not equal to EEXIST.
This commit was SVN r25034.
2011-08-09 21:13:28 +00:00
Ralph Castain
da9bbf68ec Fix the output of error strings. Every convertor is returning OPAL_SUCCESS, so you have to check each convertor to find which one this error belongs to, and then run ONLY that convertor.
This commit was SVN r25009.
2011-08-08 04:10:40 +00:00
Jeff Squyres
ba432393d4 Remove some really old (internal) kruft that never ended up getting
used. 

This commit was SVN r24988.
2011-08-04 15:24:37 +00:00
Jeff Squyres
f539b20a8f Patch from ARM for assembly:
http://www.open-mpi.org/community/lists/devel/2011/08/9586.php

This commit was SVN r24979.
2011-08-02 19:15:24 +00:00
Samuel Gutierrez
adde221413 use memcpy in ds_copy.
This commit was SVN r24942.
2011-07-25 17:16:29 +00:00
Jeff Squyres
5fd57dad37 Add in missing ARM.asm file (this is in addition to r24875, which
included a missing ARM directory).

This commit was SVN r24923.

The following SVN revision numbers were found above:
  r24875 --> open-mpi/ompi@ceabe91484
2011-07-22 20:04:50 +00:00
Ralph Castain
8a7f9f8997 Hide libevent symbols when internal thread support enabled
This commit was SVN r24922.
2011-07-22 19:49:47 +00:00
Ralph Castain
3f0d13efe2 Fix libevent internal thread support
This commit was SVN r24920.
2011-07-22 19:18:49 +00:00
Shiqing Fan
edaa7b96e4 This should not be commented out.
This commit was SVN r24914.
2011-07-21 12:56:18 +00:00
Shiqing Fan
665d1284be Fix a bug that was memcpy'ing the wrong temp string.
This commit was SVN r24912.
2011-07-21 12:53:03 +00:00
Ralph Castain
6201581544 Fix the symbol visibility issue for libevent by renaming all visible libevent symbols
This commit was SVN r24902.
2011-07-14 07:10:52 +00:00
Abhishek Kulkarni
b64ea09d72 Fix C/R-related error messages during initialization.
This commit was SVN r24901.
2011-07-13 23:34:34 +00:00
Ralph Castain
5e99d45ae4 Remove unused variable
This commit was SVN r24887.
2011-07-13 03:42:20 +00:00
Ralph Castain
1ad110d2e9 After a nice, calm, rational discussion between Brian, Jeff, and myself, we decided to revert r24864 and r24862 to restore the reference counters in opal_init/finalize. The rationale was that we should instead change orte_init/finalize to also use reference counters to support multi-embedded libraries. Jeff and Brian will discuss proposing a similar change to mpi_init/finalize to the MPI Forum so that all three libraries will behave in similar manners.
It was agreed that opal_init_util had wound up being used in unintended ways, which raised the problem of getting reference counts to work right. However, fixing it would involve more pain than it was worth - and so long as the other layers are made to behave similarly, I have no preference either way.

Complete implementation will follow - for now, this just reverts the prior changes.

This commit was SVN r24886.

The following SVN revision numbers were found above:
  r24862 --> open-mpi/ompi@aa92e0c4eb
  r24864 --> open-mpi/ompi@a5062385c2
2011-07-12 17:07:41 +00:00
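The reference-counter approach that 1ad110d2e9 above restores can be sketched as follows. This is an illustrative toy, not the opal_init/opal_finalize code; every `toy_` name is hypothetical, and the sketch is not thread-safe.

```c
/* Hypothetical library state: a call counter plus a flag standing in for
 * whatever resources a real init would set up. */
static int toy_refcount = 0;
static int toy_resources_live = 0;

/* Only the first caller performs real initialization; nested callers
 * just bump the counter. */
int toy_init(void)
{
    if (++toy_refcount > 1) {
        return 0;               /* already initialized by an outer layer */
    }
    toy_resources_live = 1;     /* real setup would happen here */
    return 0;
}

/* Only the call that balances the first init tears things down.
 * An unbalanced finalize is reported as an error (-1). */
int toy_finalize(void)
{
    if (toy_refcount == 0) {
        return -1;
    }
    if (--toy_refcount > 0) {
        return 0;               /* some other layer still needs the library */
    }
    toy_resources_live = 0;     /* real teardown would happen here */
    return 0;
}

int toy_is_initialized(void)
{
    return toy_resources_live;
}
```

The point of the counter is that several embedded libraries can each pair their own init and finalize calls without one of them tearing the layer down under the others.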
Nadia Derbey
0d0cead33a Fix a hang in carto_base_select() if carto_module_init() fails
This commit was SVN r24876.
2011-07-12 05:47:28 +00:00
Jeff Squyres
ceabe91484 Yow; we forgot to include the ARM stuff in the tarball. :-(
This commit was SVN r24875.
2011-07-11 23:52:07 +00:00
Abhishek Kulkarni
7363938ba8 add a missing include.
This commit was SVN r24873.
2011-07-11 00:04:31 +00:00
Jeff Squyres
08a05a1e35 Minor additions to make OMPI trunk compatible with the latest GNU
Autotools:

 * Autoconf 2.68
 * Automake 1.11.1
 * Libtool 2.4
 * m4 1.4.16

This commit was SVN r24867.
2011-07-10 12:11:47 +00:00
Jeff Squyres
e2df4d4a8d Some platforms don't have <execinfo.h>, even if they have backtrace()
function (e.g., NetBSD).  Thanks to Aleksej Saushev for pointing out
the issue. 

This commit was SVN r24866.
2011-07-10 11:14:19 +00:00
Jeff Squyres
b2b781e537 Fix a few miscellaneous memory leaks.
This commit was SVN r24865.
2011-07-08 16:39:58 +00:00
Ralph Castain
aa92e0c4eb Replace a useless counter with a boolean check to see if we have already passed thru opal_finalize so we don't call finalize, and then don't pass thru it (as was happening on several tools)
This commit was SVN r24862.
2011-07-08 06:43:19 +00:00
Terry Dontje
86a80411f0 update changes from review comments of #2816
This commit was SVN r24856.
2011-07-05 22:51:39 +00:00
Terry Dontje
8c0af7838a add configure check for Solaris Legacy munmap prototype
This commit was SVN r24839.
2011-06-29 23:45:27 +00:00
Ralph Castain
4dc3ee369f If event threads are enabled, we don't need to wakeup the event lib to pickup new events - so help valgrind to quit whining about it.
This commit was SVN r24837.
2011-06-29 22:52:28 +00:00
Ralph Castain
9244ea10fb Provide a way to look at the head of the ring
This commit was SVN r24832.
2011-06-28 19:46:48 +00:00
Samuel Gutierrez
93110ce805 place a bandage on ds_copy plus minor cleanup. i need to rethink this part of the framework. thanks to Rolf for pointing out the issue.
This commit was SVN r24831.
2011-06-28 19:37:12 +00:00
Ralph Castain
2af867d26f Don't segfault if show_help is called prior to calling opal_init_util
This commit was SVN r24825.
2011-06-27 16:35:19 +00:00
Ralph Castain
cd6b8417ec Cleanup a set of warnings that appear to be caused by failure of PRIsize_t on Linux.
Set ignore properties

This commit was SVN r24812.
2011-06-23 15:07:58 +00:00
Samuel Gutierrez
61ff422562 fix a few more spots in posix.
This commit was SVN r24808.
2011-06-22 23:17:26 +00:00
Samuel Gutierrez
7fcf806dc9 fix posix builds on solaris. shmem still needs more cleanup on solaris, but at least shmem will stop breaking builds (i hope).
This commit was SVN r24807.
2011-06-22 23:08:58 +00:00
Samuel Gutierrez
5b5ce434fc fix shmem sysv build on solaris.
This commit was SVN r24806.
2011-06-22 18:05:08 +00:00
Samuel Gutierrez
867df203bc fix shmem mmap build on solaris. thanks terry.
This commit was SVN r24805.
2011-06-22 16:05:50 +00:00
Rolf vandeVaart
856a9c43b1 Add string.h. Needed when configuring with --enable-picky
This commit was SVN r24804.
2011-06-22 15:48:32 +00:00
Samuel Gutierrez
81f38b258a commit of new shared memory backing facility framework (shmem) and its components.
This commit was SVN r24795.
2011-06-21 15:41:57 +00:00
Josh Hursey
6539a31b23 Cleanup configure checks for C/R functionality.
Add a WANT_FT_CR flag different from WANT_FT so tools like *-checkpoint are not built when a different FT technique is requested.

Also fix the C/R thread check so that it is only enabled if C/R is enabled, not generally when threads are enabled.

This commit was SVN r24769.
2011-06-09 19:45:29 +00:00
Josh Hursey
8cd5280299 Some assorted opal_bitmap extensions
* Protect the '->bitmap' field if init() is called more than once [it shouldn't be, but if it is then this avoids a memory leak].
 * Some new functions
   * opal_bitmap_bitwise_and_inplace
   * opal_bitmap_bitwise_or_inplace
   * opal_bitmap_bitwise_xor_inplace
   * opal_bitmap_are_different
   * opal_bitmap_get_string

Adding these features to the trunk so others have access to them if they need them. A couple of off-trunk branches make use of them.

This commit was SVN r24767.
2011-06-09 14:43:54 +00:00
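For readers unfamiliar with in-place bitmap operations like those listed in 8cd5280299 above, here is a small sketch of the general technique. It assumes a byte-array bitmap with equal sizes on both sides; `toy_bitmap_t` and the `toy_bitmap_*` functions are hypothetical and do not reflect the actual opal_bitmap layout or signatures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap: caller-owned storage plus its length in bytes. */
typedef struct {
    uint8_t *bits;
    size_t nbytes;
} toy_bitmap_t;

/* Each in-place operation overwrites dest with (dest OP src).
 * Returns 0 on success, -1 on NULL input or mismatched sizes. */
int toy_bitmap_and_inplace(toy_bitmap_t *dest, const toy_bitmap_t *src)
{
    if (dest == NULL || src == NULL || dest->nbytes != src->nbytes) return -1;
    for (size_t i = 0; i < dest->nbytes; i++) dest->bits[i] &= src->bits[i];
    return 0;
}

int toy_bitmap_or_inplace(toy_bitmap_t *dest, const toy_bitmap_t *src)
{
    if (dest == NULL || src == NULL || dest->nbytes != src->nbytes) return -1;
    for (size_t i = 0; i < dest->nbytes; i++) dest->bits[i] |= src->bits[i];
    return 0;
}

int toy_bitmap_xor_inplace(toy_bitmap_t *dest, const toy_bitmap_t *src)
{
    if (dest == NULL || src == NULL || dest->nbytes != src->nbytes) return -1;
    for (size_t i = 0; i < dest->nbytes; i++) dest->bits[i] ^= src->bits[i];
    return 0;
}

/* Returns 1 if the bitmaps differ in size or content, 0 if identical. */
int toy_bitmap_are_different(const toy_bitmap_t *a, const toy_bitmap_t *b)
{
    if (a->nbytes != b->nbytes) return 1;
    for (size_t i = 0; i < a->nbytes; i++) {
        if (a->bits[i] != b->bits[i]) return 1;
    }
    return 0;
}
```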
Josh Hursey
b223d355fc Add explicit number for opal_crs_state_type_t enum (for debugging). Also add a MAX so we can easily check for out of bounds states during debugging.
This commit was SVN r24766.
2011-06-09 14:27:24 +00:00
Ralph Castain
4c06c9c07c Simplify the code a little bit by recognizing that end=start isn't an error, but just indicates a partial address typical of CIDR notation.
This commit was SVN r24757.
2011-06-07 11:33:22 +00:00
Ralph Castain
666fdeab8f Okay to return an error on end=start of string conversion so long as the strlen > 0, so restore that error check.
This commit was SVN r24756.
2011-06-07 03:20:01 +00:00
Ralph Castain
f3cae3d6f3 Cleanup the handling of if_include and if_exclude arguments based on CIDR notation.
Fix a bug in the new code that prevented the system from correctly matching addresses.

Remove comments in the show-help text indicating that we would continue in the face of incorrect specifications - leave that to the calling layer to decide.

Modify the new opal_ifmatches so it returns error codes letting the caller better understand the result.

Modify the oob to ensure we abort if we don't find interfaces matching specified constraints, and that we do so without multiple error messages.

NOTE: we have a conflict in our standards. We have been using comma-delimited lists of interfaces for all our params. However, one param - opal_net_private_ipv4 - now uses semicolons instead of comma separators. No idea why, but it is confusing.

This commit was SVN r24755.
2011-06-07 02:09:11 +00:00
George Bosilca
910a289e97 Remove the explicit "attempt to continue".
This commit was SVN r24754.
2011-06-07 01:27:08 +00:00
George Bosilca
7ebd094ecf Cleanup the IPv4 address parsing, and correct the error message.
This commit was SVN r24750.
2011-06-06 03:08:02 +00:00
Ralph Castain
1491d52bd7 Extend the parsing capability of the oob tcp module's if_include and if_exclude options to support subnet+mask notation, and to handle virtual IP addresses (it was previously having problems distinguishing between "eth1" and "eth1.3").
This commit was SVN r24747.
2011-06-05 19:16:42 +00:00
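The subnet+mask matching that 1491d52bd7 above adds to if_include and if_exclude can be illustrated with a short IPv4-only sketch. `toy_ipv4_in_subnet()` is a hypothetical helper, not the oob tcp or opal_ifmatches code, and it assumes the network part is written as a full dotted quad (partial forms such as a shortened prefix are not handled here).

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if addr lies inside the subnet "a.b.c.d/prefix", 0 if it does
 * not, and -1 if either string is malformed. */
int toy_ipv4_in_subnet(const char *addr, const char *cidr)
{
    char net[INET_ADDRSTRLEN];
    const char *slash = strchr(cidr, '/');
    if (slash == NULL) return -1;

    /* Copy the network part (everything before the slash). */
    size_t len = (size_t)(slash - cidr);
    if (len == 0 || len >= sizeof(net)) return -1;
    memcpy(net, cidr, len);
    net[len] = '\0';

    /* Parse the prefix length and reject anything outside 0..32. */
    char *end;
    long prefix = strtol(slash + 1, &end, 10);
    if (end == slash + 1 || *end != '\0' || prefix < 0 || prefix > 32) return -1;

    struct in_addr a, n;
    if (inet_pton(AF_INET, addr, &a) != 1) return -1;
    if (inet_pton(AF_INET, net, &n) != 1) return -1;

    /* Build the netmask in host byte order; a /0 mask matches everything. */
    uint32_t mask = (prefix == 0) ? 0 : (0xFFFFFFFFu << (32 - prefix));

    /* The address matches when it agrees with the network on all mask bits. */
    return ((ntohl(a.s_addr) ^ ntohl(n.s_addr)) & mask) == 0 ? 1 : 0;
}
```

Comparing masked addresses rather than interface names sidesteps the kind of name ambiguity the commit mentions between "eth1" and "eth1.3".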
Ralph Castain
8f401a0563 Enable the ability to constrain applications to hosts on the basis of resources.
This commit was SVN r24736.
2011-05-28 22:18:19 +00:00
Ralph Castain
b47ec2ee87 Remove lingering references to opal_profile option
This commit was SVN r24709.
2011-05-18 18:27:29 +00:00
Ralph Castain
ddf4914094 Plug fd leak
This commit was SVN r24707.
2011-05-18 13:46:27 +00:00
Ralph Castain
486041f89d Get rid of the annoying error messages when setrlimit fails, which seems to be a constant problem on the Mac. Don't use the changed values for max limits if the setrlimit call failed.
This commit was SVN r24703.
2011-05-17 03:27:43 +00:00
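The pattern described in 486041f89d above (stay quiet when setrlimit fails, and keep using the old limit in that case) might look roughly like the sketch below. `toy_raise_nofile_limit()` is a hypothetical helper and the choice of RLIMIT_NOFILE is an assumption for illustration; the commit does not say which limits are involved.

```c
#include <sys/resource.h>

/* Try to raise the soft open-file limit to the hard limit.
 * On return, *effective holds the soft limit that is actually in force.
 * Returns 0 if the limit was raised, 1 if setrlimit failed and the old
 * value was kept, and -1 if the current limit could not be read. */
int toy_raise_nofile_limit(rlim_t *effective)
{
    struct rlimit cur, want;

    if (getrlimit(RLIMIT_NOFILE, &cur) != 0) {
        return -1;
    }

    want = cur;
    want.rlim_cur = cur.rlim_max;

    if (setrlimit(RLIMIT_NOFILE, &want) == 0) {
        *effective = want.rlim_cur;
        return 0;
    }

    /* The call failed: do not print anything, and do not pretend the
     * requested value took effect.  Report the unchanged limit instead. */
    *effective = cur.rlim_cur;
    return 1;
}
```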
Ralph Castain
4083e23073 Complete cleanup of pstat linux
This commit was SVN r24701.
2011-05-16 14:08:08 +00:00
Ralph Castain
08c3ecd608 Handle the case where memory stats are in different order, or don't exist on that platform
This commit was SVN r24700.
2011-05-16 13:32:42 +00:00
Ralph Castain
a3e43594a4 Extend node stats to include additional memory info. Change "darwin" pstat module to "test" as we don't really know how to get all the stat info for darwin.
Add a new OPAL_ERROR_LOG macro similar to the ORTE_ERROR_LOG one.

This commit was SVN r24692.
2011-05-08 14:45:16 +00:00
George Bosilca
34abbce82c More accurate and trustworthy descriptions of the netmask exist.
Interested readers can quench their curiosity either with one
of the Richard Stevens books (ISBN 9780201633467) or the
Wikipedia page (http://en.wikipedia.org/wiki/Subnetwork).

This commit was SVN r24680.
2011-05-03 21:59:51 +00:00
Shiqing Fan
b4e5826403 Exclude two non-mca files that shouldn't be compiled under windows.
This commit was SVN r24669.
2011-05-02 14:39:22 +00:00
Ralph Castain
257473ebca Remove an extra "break" - thanks to Rainer for pointing it out.
This commit was SVN r24667.
2011-05-02 12:20:37 +00:00
Ralph Castain
7b29a6153e Cover all the netmask values
This commit was SVN r24665.
2011-04-29 17:56:15 +00:00
Shiqing Fan
4490fdbd34 Add the initial support for MinGW and MSYS.
Correctly check the dependencies of MSYS env.
Set up configure include and lib path for building the package.
update a few more CMake scripts.

This commit was SVN r24663.
2011-04-29 14:42:07 +00:00
Rolf vandeVaart
3e8878f556 Add in missing header file
This commit was SVN r24662.
2011-04-29 13:20:59 +00:00
Rolf vandeVaart
2634f6401a Add some basic support for sending and receiving CUDA device memory. Feature is disabled by default and has no effect on default code paths.
This commit was SVN r24659.
2011-04-28 23:05:55 +00:00
Jeff Squyres
0882d636a6 Oops -- need string.h, too (for strcasecmp).
This commit was SVN r24649.
2011-04-28 15:42:35 +00:00
Jeff Squyres
7362a0730a Change the default to "none". David Singleton raises a good point
that enabling "local_only" by default could cause excessive
by-NUMA-node paging and/or OOMs (rather than allowing memory
allocations to spill over to other NUMA nodes).

This brought home the very real-world example of people buying servers
with more processors/cores than they need, just to get more memory.
We wouldn't want Badness to occur in such scenarios by default.
Instead, let people turn on "only allow memory allocations on my local
NUMA node" if their application would benefit from it.

This commit was SVN r24648.
2011-04-28 15:16:39 +00:00
Jeff Squyres
7b48042ffd Commit patch from upstream hwloc: r3482. Fixes some compiler
warnings. 

This commit was SVN r24641.

The following SVN revision numbers were found above:
  r3482 --> open-mpi/ompi@2435be8d49
2011-04-27 17:08:15 +00:00
Jeff Squyres
d134ff9b4d Refs trac:2698
After a long period of development with many starts and stops, we
finally got this where we wanted it.

This commit introduces 2 new MCA params (note that the
"maffinity_libnuma_policy" MCA param introduced by r24290 was removed
when libnuma support was removed).  Remember that maffinity policies
are only in effect when paffinity is enabled -- i.e., when processes
are bound to processors!

 * '''maffinity_base_alloc_policy:''' Policy that determines how
   general memory allocations are bound after MPI_INIT.  A value of
   "none" means that no memory policy is applied.  A value of
   "local_only" means that all memory allocations will be restricted
   to the local NUMA node where each process is placed.  Note that
   operating system paging policies are unaffected by this setting.
   For example, if "local_only" is used and local NUMA node memory is
   exhausted, a new memory allocation may cause paging.
 * '''maffinity_base_bind_failure_action:''' What Open MPI will do if
   it explicitly tries to bind memory to a specific NUMA location, and
   fails.  Note that this is a different case than the general
   allocation policy described by maffinity_base_alloc_policy.  A
   value of "warn" means that Open MPI will warn the first time this
   happens, but allow the job to continue (possibly with degraded
   performance).  A value of "error" means that Open MPI will abort
   the job if this happens.

This needs at least a little soak time on the trunk before going to
v1.5.

This commit was SVN r24639.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-04-26 13:31:07 +00:00
Jeff Squyres
926af377fe Refs trac:2778.
Upgrade to hwloc 1.2 (from hwloc 1.1.2).  This should fix the problems
Nathan's seeing in #2778.

Let's let this soak on the trunk for a little while and see how LANL's
MTT's work out.  If that works, then we can CMR this to v1.5.

This commit was SVN r24635.

The following Trac tickets were found above:
  Ticket 2778 --> https://svn.open-mpi.org/trac/ompi/ticket/2778
2011-04-25 19:31:49 +00:00
Jeff Squyres
b8af3b7c4a New comment explains it all -- previous code was failing to find the
Nth core, so it fell over to try to find the Nth PU.

-----

hwloc isn't able to find cores on all platforms.  Example: PPC64
running RHEL 5.4 (linux kernel 2.6.18) only reports NUMA nodes and
PU's.  Fine.

However, note that hwloc_get_obj_by_type() will return NULL in 2
(effectively) different cases:

- no objects of the requested type were found
- the Nth object of the requested type was not found

So first we have to see if we can find *any* cores by looking for the
0th core.  If we find it, then try to find the Nth core.  Otherwise,
try to find the Nth PU.

This commit was SVN r24632.
2011-04-25 16:55:27 +00:00
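The lookup order explained in b8af3b7c4a above can be sketched without hwloc itself. The toy model below only mimics the ambiguity the commit describes (a "not found" result for both "no objects of this type" and "index out of range"); `toy_topology_t`, `toy_get_obj_by_type()`, and `toy_find_nth_core_or_pu()` are hypothetical stand-ins, not hwloc or Open MPI functions.

```c
#include <stddef.h>

typedef enum { TOY_OBJ_CORE, TOY_OBJ_PU } toy_obj_type_t;

/* Hypothetical topology: just a count of objects per type.
 * num_cores == 0 models a platform where cores are not reported. */
typedef struct {
    unsigned num_cores;
    unsigned num_pus;
} toy_topology_t;

typedef struct {
    toy_obj_type_t type;
    unsigned index;
} toy_obj_t;

/* Returns 1 and fills *out if the idx-th object of the type exists.
 * Returns 0 both when the type is absent and when idx is too large,
 * which is the ambiguity the caller has to work around. */
int toy_get_obj_by_type(const toy_topology_t *topo, toy_obj_type_t type,
                        unsigned idx, toy_obj_t *out)
{
    unsigned count = (type == TOY_OBJ_CORE) ? topo->num_cores : topo->num_pus;
    if (idx >= count) return 0;
    out->type = type;
    out->index = idx;
    return 1;
}

/* First probe for the 0th core to learn whether cores exist at all.
 * If they do, the Nth-core lookup is final; otherwise fall back to PUs. */
int toy_find_nth_core_or_pu(const toy_topology_t *topo, unsigned n,
                            toy_obj_t *out)
{
    toy_obj_t probe;
    if (toy_get_obj_by_type(topo, TOY_OBJ_CORE, 0, &probe)) {
        return toy_get_obj_by_type(topo, TOY_OBJ_CORE, n, out);
    }
    return toy_get_obj_by_type(topo, TOY_OBJ_PU, n, out);
}
```

Probing index 0 first is what separates "this platform has no cores" from "core N does not exist", so a missing Nth core on a machine that does report cores is not silently replaced by a PU.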
Jeff Squyres
16d8e9216b Ran across this comment about i18n support, so I figured I'd update
it.  :-)

This commit was SVN r24631.
2011-04-22 12:14:20 +00:00
Ralph Castain
9988b97b97 Extend/update how we handle process stats. Add the ability to collect node-level stats separate from the process stats. Update the process stat memory fields to report in MBytes instead of KBytes as I can't find any process that runs in KBytes nowadays.
Rename the memusage sensor plugin to "resusage" as it will soon be updated to include full process stat monitoring.

Extend the heartbeat sensor to report node and process stats in the heartbeat.

Store the process and node stats in their respective orte_xxx_t object.

This commit was SVN r24629.
2011-04-21 22:55:45 +00:00
Jeff Squyres
2fe94b929a Manually add hwloc v1.1 branch r3418 commit (went in after v1.1.2
released): 

backport hwloc r 3416 from trunk: Add cache info entry _after_ checking
that we need one, thanks Andriy Gapon for the fix

This commit was SVN r24612.

The following SVN revision numbers were found above:
  r3418 --> open-mpi/ompi@9972663a12
2011-04-12 14:41:46 +00:00
Jeff Squyres
9dc3a1aa54 Upgrade to hwloc 1.1.2; most likely the last release of the hwloc
1.1.x series

This commit was SVN r24611.
2011-04-12 14:35:26 +00:00
Jeff Squyres
38d3cdd4a6 Update hwloc to 1.1.1. Next stop: 1.1.2.
This commit was SVN r24610.
2011-04-12 14:16:37 +00:00
Jeff Squyres
48f418ee7b Fixes trac:2768: exclude opal/libltdl from "make distclean" when
--disable-dlopen is used.  Thanks to David Gunter for reporting the
issue. 

This commit was SVN r24603.

The following Trac tickets were found above:
  Ticket 2768 --> https://svn.open-mpi.org/trac/ompi/ticket/2768
2011-04-08 14:59:49 +00:00
Shiqing Fan
4b3b713bfc Update the windows installdir component.
Don't use the old env component for windows, so remove the .windows file.

This commit was SVN r24597.
2011-04-05 12:15:41 +00:00
Terry Dontje
266e663091 Add opal_tree class. This will be used in the future by sysinfo to store hw maps to be used by rmaps for the new affinity code.
This commit was SVN r24594.
2011-03-30 08:05:28 +00:00
Ralph Castain
f40edd6b4f Add the stupid test word
This commit was SVN r24578.
2011-03-26 03:38:59 +00:00
Ralph Castain
5bfb01c6c8 Only build the linux component of sysinfo if linux is the operating system.
Thanks to Paul Hargrove for the suggestion.

This commit was SVN r24576.
2011-03-25 20:55:57 +00:00
Jeff Squyres
58a13f87e6 Oops -- forgot to add opal_config_top.h to Makefile.am (so that it'll
be included in the tarball).

This commit was SVN r24572.
2011-03-25 01:21:11 +00:00
Jeff Squyres
5ae1b15b6e Ensure that other packages defining PACKAGE_ macros don't hurt us, and protect others from our PACKAGE_ macros.
This commit was SVN r24571.
2011-03-24 22:39:56 +00:00
Jeff Squyres
cf6c5e8d48 Fix a bug noted by Gus Correa on the user's list: mpi_paffinity_alone
appeared multiple times in ompi_info output (so did others, but this
is the one that was noticed).  Ensure that we don't repeat
opal_paffinity_base_register_params() multiple times.

This commit was SVN r24569.
2011-03-24 00:58:25 +00:00
Eugene Loh
2770a12beb Continue clean up of thread options started in r22841, 22842, and 22849.
No need for any CMRs to 1.5... that was already done in CMR 2728.

This commit was SVN r24545.

The following SVN revision numbers were found above:
  r22841 --> open-mpi/ompi@b400b84162
2011-03-18 21:36:35 +00:00
Jeff Squyres
bffa5c8f7e * Rename OMPI_CHECK_PTHREAD_PIDS to OPAL_CHECK_PTHREAD_PIDS.
* Convert from AC_TRY_RUN to AC_RUN_IFELSE.
 * Excellent suggestion from Paul Hargrove: use AC_CHECK_FUNC to look
   for a Linuxthreads-specific symbol when we're cross compiling to
   see if threads will have different PIDs (because AC_CHECK_FUNC
   works properly even when in cross-compiling environments).

Background: the old/Linuxthreads-based pthreads implementation used
the Linux clone() call to make threads, which effectively meant that
each thread had a different PID.  The new NPTL pthreads implementation
does things better, meaning that threads have the same PID.  

Open MPI no longer supports threads with different PIDs -- we ripped
out the supporting code for threads with different PIDs because we
don't have systems available to test this on anymore (anyone who still
has such a system can still use older versions of Open MPI).  Hence,
configure needs to determine whether the target system will have the
same PID for threads or not -- even if we're cross-compiling.  The
current test compiles and runs a multi-threaded app that checks PIDs
of different threads, but we clearly can't do that in a
cross-compiling environment.  So use AC_CHECK_FUNC in cross-compiling
environments.

Simple, no?

This commit was SVN r24537.
2011-03-17 11:59:54 +00:00
Ralph Castain
d5dfe05521 Remove stale code associated with OPAL_THREADS_HAVE_DIFFERENT_PIDS. In the past, we have supported the case of really, really old Linux kernels where threads have different pids. However, when we updated the event library, we didn't also update that support code. In addition, when we dropped progress thread support, we didn't remove areas of the code that could no longer be compiled (i.e., were protected by "if progress thread && if have different pids").
There was no compelling reason to support such old kernels. Accordingly, convert the test to print a nice error message indicating we no longer support old kernels (but indicate that earlier OMPI versions do) and error out. Remove all code that was protected by "if have different pids" since it can no longer be compiled.

This commit was SVN r24531.
2011-03-15 21:05:03 +00:00
Ralph Castain
7eede54b39 Solve a problem when cross-compiling for PPC32 - in this case, OPAL_HAVE_ATOMIC_CMPSET_64 is not set, but the code requires that the ADD_64 and SUB_64 values at least be defined.
This commit was SVN r24528.
2011-03-15 15:50:49 +00:00
Ralph Castain
45aacd30ab Add prefix for PPC hosts
This commit was SVN r24515.
2011-03-11 22:58:51 +00:00
Jeff Squyres
324b90142f Fix CID 1583: hwloc bitmap leak.
This commit was SVN r24496.
2011-03-08 16:47:26 +00:00
George Bosilca
95f4e0b502 We do need the name for debugging purposes.
This commit was SVN r24479.
2011-03-02 19:19:15 +00:00
George Bosilca
355d61bb0f No need for a printf.
This commit was SVN r24478.
2011-03-02 19:17:56 +00:00
Josh Hursey
7c737b9274 Some string and state cleanup. Thanks to George Bosilca for the initial patch.
This commit was SVN r24471.
2011-03-01 20:12:23 +00:00
Josh Hursey
7709005d86 Hack to get the C/R thread working again after r24377. Needs to be revisited.
See ticket #2741 for more details.

Refs trac:2741

This commit was SVN r24470.

The following SVN revision numbers were found above:
  r24377 --> open-mpi/ompi@e8c2519280

The following Trac tickets were found above:
  Ticket 2741 --> https://svn.open-mpi.org/trac/ompi/ticket/2741
2011-03-01 18:47:31 +00:00
George Bosilca
f981e02b4a Fix a typo and correct the usage of the defines.
This commit was SVN r24454.
2011-02-24 06:34:30 +00:00
George Bosilca
f79c87f0c3 Correct the assembly using xaddl for IA32.
Add atomic functions for add and sub 32 and 64 bits for AMD64.

This commit was SVN r24453.
2011-02-24 06:31:47 +00:00
George Bosilca
eb8383802e ret might have been used uninitialized. Not anymore.
This commit was SVN r24452.
2011-02-24 03:02:48 +00:00
Jeff Squyres
ad985260d3 Ensure to disable XML and Cairo support in hwloc; OMPI doesn't use it. Additionally, ensure that the right flags are passed back to the wrappers in the case of static builds. We probably won't need these (especially since XML has been disabled), but it's the Right Thing to do.
This commit was SVN r24451.
2011-02-23 23:11:45 +00:00
Jeff Squyres
e8ba72258e Patch for PPC64 platforms with smt=off, issue raised by Brad. This
fix will be included in hwloc 1.1.2.

Brad -- can you verify that this fixes the issue for you?

Fixes trac:2732.

This commit was SVN r24450.

The following Trac tickets were found above:
  Ticket 2732 --> https://svn.open-mpi.org/trac/ompi/ticket/2732
2011-02-23 22:43:58 +00:00
Brian Barrett
07996af388 Fix register clobber list for x86 assembly. Thanks to Jay Fenlason for the
patch.

This commit was SVN r24449.
2011-02-23 21:54:07 +00:00
Jeff Squyres
8143b201a9 Custom patch for hwloc (that will be included in hwloc 1.1.2) so that
we don't barf on Linux non-NUMA (NNUMA, aka UMA ;-) ) platforms.

This commit was SVN r24448.
2011-02-23 21:02:02 +00:00
Jeff Squyres
2368410eff * Ensure to follow standard filename conventions for output MCA DSO
filenames -- don't include the project name ("opal")
 * Don't link maffinity/hwloc and paffinity/hwloc against the common
   hwloc in the static build case (because this will result in
   duplicate symbols)

This commit was SVN r24447.
2011-02-23 21:00:20 +00:00
Jeff Squyres
5e082d68f6 Fix the compile error for libnuma 0.9.x introduced in r24442;
hopefully, this now compiles for libnuma 0.9.x and libnuma 2.0.x.

Fixes for the strategy discussed in the commit message for r24442
(i.e., check against numa_get_mems_allowed(), which only exists in
libnuma 2.0.x) and the new new new plan on #2698 coming in a separate
commit.

This commit was SVN r24443.

The following SVN revision numbers were found above:
  r24442 --> open-mpi/ompi@90a8fe4aad
2011-02-23 13:44:46 +00:00
Rainer Keller
90a8fe4aad - Addendum to r24421: get mca_maffinity_libnuma to compile on linux
(with libnuma-2.0.4 / LIBNUMA_API_VERSION 2): numa_get_run_node_mask
   returns a struct bitmask *.

   Whether it's a good idea to blindly pass that on to
   numa_set_membind() is another matter: one might want to match against
   the list returned by numa_get_mems_allowed(), which may be set by the
   outside environment.

   Refs trac:2698.

This commit was SVN r24442.

The following SVN revision numbers were found above:
  r24421 --> open-mpi/ompi@31510e683b

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-02-23 12:59:49 +00:00
Jeff Squyres
c1b26005d7 Create new opal/mca/common area, similar to ompi/mca/common. Move hwloc into this new opal MCA common area, and link the hwloc paffinity component against it. Also add a new hwloc maffinity component, and also link it against the opal MCA common hwloc. More development coming soon regarding this common hwloc instance (i.e., an OPAL-ized version of the hwloc API via a new framework so that we can safely use hwloc's services throughout the rest of the OPAL/ORTE/OMPI code bases).
This commit was SVN r24440.
2011-02-22 23:21:48 +00:00
Shiqing Fan
90eeba252e Make openib compile again for Windows.
Update the CMake script for checking mca subdirs.
Add windows support for __attribute__ packed structures.
Define usleep and posix_memalign with equivalent windows functions.
And a few minor fixes, type casts.

This commit was SVN r24429.
2011-02-22 15:49:27 +00:00
Shiqing Fan
baad4e1844 fix a non if-controlled brace.
This commit was SVN r24428.
2011-02-22 11:45:43 +00:00
Ralph Castain
e22262602e Extend the opal output code to support systems that cannot allow stdout/err to be output to console or files. This occurs in some embedded environments where file systems are in flash and consoles are redirected to NULL.
Add three new envars (not MCA params!) that control this behavior (see output.h for explanation).

This commit was SVN r24422.
2011-02-21 21:42:59 +00:00
Jeff Squyres
31510e683b Replace r24290 with something more meaningful. In this case, find out
what memory node the process is running on (which is guaranteed to be
a good answer because maffinity won't be invoked unless the process is
already bound to a specific processor), and then bind our memory to
that. 

Refs trac:2698.

This commit was SVN r24421.

The following SVN revision numbers were found above:
  r24290 --> open-mpi/ompi@afa654746c

The following Trac tickets were found above:
  Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
2011-02-21 20:07:11 +00:00
Jeff Squyres
3f4d4886f2 Minor update for something that has been bugging me for quite a while:
OMPI supports multiple different repository systems (SVN, hg, git).
But the VERSION file has listed "want_svn" and "svn_r" as fields, even
though the actual repo system and version may not be SVN.

So search/replace those fields (and derivative values that come from
those fields) with "want_repo_rev" and "repo_rev", respectively.

This commit was SVN r24405.
2011-02-16 22:53:23 +00:00
Shiqing Fan
d464601e9b Define all the missing SIG* for Windows.
This commit was SVN r24402.
2011-02-16 13:46:27 +00:00
Shiqing Fan
ddc05e05d7 Avoid blocking select on Windows.
This commit was SVN r24396.
2011-02-16 08:48:21 +00:00
Ralph Castain
bf1cff3711 Plug a couple of additional memory leaks - try to highlight a little better that strings returned from reg_string_name must be freed by caller
This commit was SVN r24383.
2011-02-14 20:58:22 +00:00
Ralph Castain
d85916c1c2 Plug memory leak
This commit was SVN r24380.
2011-02-14 19:51:33 +00:00
Ralph Castain
b5de068533 Clean up an error in r24371 - can't use a const parameter as target in asprintf as it changes the value of the address.
Add some new proc/job states

Rename a constant to reflect coming change - remove the arbitrary difference between restarting a proc locally and relocating it to another node in terms of the number of restarts allowed.

Add pretty-print of signals for "proc aborted due to signal" reports.

This commit was SVN r24378.

The following SVN revision numbers were found above:
  r24371 --> open-mpi/ompi@93d28a5792
2011-02-14 19:29:09 +00:00
Ralph Castain
e8c2519280 Restore thread-supported condition waits when thread support requested
This commit was SVN r24377.
2011-02-14 19:10:38 +00:00
Abhishek Kulkarni
93d28a5792 Change opal_err2str_fn_t to return the error string as an argument.
This means that the converters (opal_err2str, orte_err2str) can now
return NULL as a "silent error". The return value of opal_err2str_fn_t
is the status of the operation (OPAL_SUCCESS or OPAL_ERROR).

This fixes the "Unknown error" message issues on the trunk.

This commit was SVN r24371.
2011-02-13 16:09:17 +00:00
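The calling convention described in r24371 (status as the return value, error string through an out-parameter) can be sketched as follows. This is only an illustration of the pattern, not the Open MPI code: `demo_err2str`, `demo_err2str_fn_t`, `DEMO_SUCCESS`, `DEMO_ERROR` and the numeric error codes are hypothetical stand-ins for `opal_err2str`, `opal_err2str_fn_t`, `OPAL_SUCCESS` and `OPAL_ERROR`.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for OPAL_SUCCESS / OPAL_ERROR. */
enum { DEMO_SUCCESS = 0, DEMO_ERROR = -1 };

/* Converter type in the new style: the return value is the status of the
 * lookup, and the message comes back through the out-parameter. */
typedef int (*demo_err2str_fn_t)(int errnum, const char **errmsg);

/* Hypothetical converter with made-up error codes. */
int demo_err2str(int errnum, const char **errmsg)
{
    switch (errnum) {
    case 0:
        *errmsg = "Success";
        return DEMO_SUCCESS;
    case -1:
        *errmsg = "Error";
        return DEMO_SUCCESS;
    case -2:
        /* "Silent error": the code is recognized, but there is
         * deliberately nothing to print for it. */
        *errmsg = NULL;
        return DEMO_SUCCESS;
    default:
        /* The code does not belong to this converter. */
        *errmsg = NULL;
        return DEMO_ERROR;
    }
}
```

Under this convention a caller can distinguish "recognized but silent" (success status with a NULL string) from "not recognized" (error status), which a converter that only returns a string cannot express.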
Nysal Jan
92e06b0a1f Missed this change suggested by Terry
This commit was SVN r24364.
2011-02-08 04:06:52 +00:00
Nysal Jan
a31025bb48 Fix pty setup code on AIX
This commit was SVN r24363.
2011-02-08 02:54:47 +00:00
Nysal Jan
f0f1d4e311 Older versions of config.guess detect the canonical system name of an AIX 7.1 system to be rs6000-ibm-aix. Add this workaround until AIX 7.1 support is available in the autotools releases
This commit was SVN r24362.
2011-02-08 02:52:10 +00:00
Jeff Squyres
b0ce9bae8e Oops. Also need to remove myriexpress.h from the Makefile.am.
This commit was SVN r24357.
2011-02-04 03:29:49 +00:00
Abhishek Kulkarni
d711c5a4b1 SOS fix for the Studio compilers (Thanks to Terry for spotting this).
This commit was SVN r24355.
2011-02-03 22:36:28 +00:00
Jeff Squyres
6421abecc7 Fixes trac:2690.
Temporarily remove hwloc's internal version of myriexpress.h.  It is
causing a problem when compiling Open MPI with MX support because
hwloc uses AC_CONFIG_HEADER in hwloc's hwloc.m4 to generate
opal/mca/paffinity/hwloc/hwloc/include/hwloc/config.h.
AC_CONFIG_HEADER apparently has the (undocumented) side effect of
adding -I$(top_builddir)/opal/mca/paffinity/hwloc/hwloc/include/hwloc
to OMPI's compilation flags.  Hence, when the OMPI MX components are
compiled and #include "myriexpress.h" (or <myriexpress.h>) they see
hwloc's myriexpress.h before the system one.  Badness ensues.

This removal is temporary because we need to figure out a better
solution.  But for now, OMPI is not using hwloc's myriexpress.h file --
so it's safe to remove.  I'll push this issue upstream to hwloc to
figure out a better solution...

This commit was SVN r24354.

The following Trac tickets were found above:
  Ticket 2690 --> https://svn.open-mpi.org/trac/ompi/ticket/2690
2011-02-03 14:24:32 +00:00
Nysal Jan
ab2f738b0b Recent versions of IBM XL compilers on AIX support GCC inline assembly format
This commit was SVN r24340.
2011-02-02 11:31:30 +00:00
Jeff Squyres
4674e62929 These files are superfluous.
This commit was SVN r24331.
2011-02-01 21:31:35 +00:00
Jeff Squyres
c8badb79df Don't instantiate variables in for loops; we don't assume C99
compilers. 

This commit was SVN r24330.
2011-02-01 19:23:14 +00:00
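For readers unfamiliar with the restriction behind r24330: declaring a variable inside the `for` initializer is a C99 feature, so code meant to build with pre-C99 compilers declares the loop variable at the top of the enclosing block. A minimal sketch, where `demo_sum` is a hypothetical function and not taken from the Open MPI tree:

```c
#include <stddef.h>

/* C99-only form, rejected by strict C89 compilers:
 *     for (size_t i = 0; i < n; ++i) { ... }
 */
long demo_sum(const int *values, size_t n)
{
    size_t i;        /* declared at the top of the block, as C89 requires */
    long total = 0;

    for (i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```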
Nysal Jan
42015cf30a Fix build failure on AIX
This commit was SVN r24321.
2011-01-28 08:09:45 +00:00
Nysal Jan
857c32784e Fix detection of fd_mask
This commit was SVN r24320.
2011-01-28 06:20:32 +00:00
George Bosilca
d457338f66 Force mips2 asm acceptance before sc and ll.
This commit was SVN r24319.
2011-01-27 22:42:26 +00:00
Jeff Squyres
6c8de8fb76 Bump up to hwloc 1.1.1
This commit was SVN r24312.
2011-01-26 23:20:26 +00:00
Jeff Squyres
511f87665b Fixes trac:2680: Add ARM support.
This commit was SVN r24308.

The following Trac tickets were found above:
  Ticket 2680 --> https://svn.open-mpi.org/trac/ompi/ticket/2680
2011-01-26 17:22:44 +00:00