Interested readers can quench their curiosity either with one
of the Richard Stevens books (ISBN 9780201633467) or the
Wikipedia page (http://en.wikipedia.org/wiki/Subnetwork).
This commit was SVN r24680.
Correctly check the dependencies of MSYS env.
Set up configure include and lib path for building the package.
update a few more CMake scripts.
This commit was SVN r24663.
that enabling "local_only" by default could cause excessive
by-NUMA-node paging and/or OOMs (rather than allowing memory
allocations to spill over to other NUMA nodes).
This brought home the very real-world example of people buying servers
with more processors/cores than they need, just to get more memory.
We wouldn't want Badness to occur in such scenarios by default.
Instead, let people turn on "only allow memory allocations on my local
NUMA node" if their application would benefit from it.
This commit was SVN r24648.
After a long period of development with many starts and stops, we
finally got this where we wanted it.
This commit introduces 2 new MCA params (note that the
"maffinity_libnuma_policy" MCA param introduced by r24290 was removed
when libnuma support was removed). Remember that maffinity policies
are only in effect when paffinity is enaabled -- i.e., when processes
are bound to processors!
* '''maffinity_base_alloc_policy:''' Policy that determines how
general memory allocations are bound after MPI_INIT. A value of
"none" means that no memory policy is applied. A value of
"local_only" means that all memory allocations will be restricted
to the local NUMA node where each process is placed. Note that
operating system paging policies are unaffected by this setting.
For example, if "local_only" is used and local NUMA node memory is
exhausted, a new memory allocation may cause paging.
* '''maffinity_base_bind_failure_action:''' What Open MPI will do if
it explicitly tries to bind memory to a specific NUMA location, and
fails. Note that this is a different case than the general
allocation policy described by maffinity_base_alloc_policy. A
value of "warn" means that Open MPI will warn the first time this
happens, but allow the job to continue (possibly with degraded
performance). A value of "error" means that Open MPI will abort
the job if this happens.
This needs at least a little soak time on the trunk before going to
v1.5.
This commit was SVN r24639.
The following SVN revision numbers were found above:
r24290 --> open-mpi/ompi@afa654746c
The following Trac tickets were found above:
Ticket 2698 --> https://svn.open-mpi.org/trac/ompi/ticket/2698
Upgrade to hwloc 1.2 (from hwloc 1.1.2). This should fix the problems
Nathan's seeing in #2778.
Let's let this soak on the trunk for a little while and see how LANL's
MTT's work out. If that works, then we can CMR this to v1.5.
This commit was SVN r24635.
The following Trac tickets were found above:
Ticket 2778 --> https://svn.open-mpi.org/trac/ompi/ticket/2778
Nth core, so it fell over to try to find the Nth PU.
-----
hwloc isn't able to find cores on all platforms. Example: PPC64
running RHEL 5.4 (linux kernel 2.6.18) only reports NUMA nodes and
PU's. Fine.
However, note that hwloc_get_obj_by_type() will return NULL in 2
(effectively) different cases:
- no objects of the requested type were found
- the Nth object of the requested type was not found
So first we have to see if we can find *any* cores by looking for the
0th core. If we find it, then try to find the Nth core. Otherwise,
try to find the Nth PU.
This commit was SVN r24632.
Rename the memusage sensor plugin to "resusage" as it will soon be updated to include full process stat monitoring.
Extend the heartbeat sensor to report node and process stats in the heartbeat.
Store the process and node stats in their respective orte_xxx_t object.
This commit was SVN r24629.
released):
backport hwloc r 3416 from trunk: Add cache info entry _after_ checking
that we need one, thanks Andriy Gapon for the fix
This commit was SVN r24612.
The following SVN revision numbers were found above:
r3418 --> open-mpi/ompi@9972663a12
--disable-dlopen is used. Thanks to David Gunter for reporting the
issue.
This commit was SVN r24603.
The following Trac tickets were found above:
Ticket 2768 --> https://svn.open-mpi.org/trac/ompi/ticket/2768
appeared multiple times in ompi_info output (so did others, but this
is the one that was noticed). Ensure that we don't repeat
opal_paffinity_base_register_params() multiple times.
This commit was SVN r24569.
No need for any CMRs to 1.5... that was already done in CMR 2728.
This commit was SVN r24545.
The following SVN revision numbers were found above:
r22841 --> open-mpi/ompi@b400b84162
* Convert from AC_TRY_RUN to AC_RUN_IFELSE.
* Excellent suggestion from Paul Hargrove: use AC_CHECK_FUNC to look
for a Linuxthreads-specific symbol when we're cross compiling to
see if threads will have different PIDs (because AC_CHECK_FUNC
works properly even when in cross-compiling environments).
Background: the old/Linuxthreads-based pthreads implementation used
the Linux clone() call to make threads, which effectively meant that
each thread had a different PID. The new NPTL pthreads implementation
does things better, meaning that threads have the same PID.
Open MPI no longer supports threads with different PIDs -- we ripped
out the supporting code for threads with different PIDs because we
don't have systems available to test this on anymore (anyone who still
has such a system can still use older versions of Open MPI). Hence,
configure needs to determine whether the target system will have the
same PID for threads or not -- even if we're cross-compiling. The
current test compiles and runs a multi-threaded app that checks PIDs
of different threads, but we clearly can't do that in a
cross-compiling environment. So use AC_CHECK_FUNC in cross-compiling
environments.
Simple, no?
This commit was SVN r24537.
There was no compelling reason to support such old kernels. Accordingly, convert the test to print a nice error message indicating we no longer support old kernels (but indicate that earlier OMPI versions do) and error out. Remove all code that was protected by "if have different pids" since it can no longer be compiled.
This commit was SVN r24531.
fix will be included in hwloc 1.1.2.
Brad -- can you verify that this fixes the issue for you?
Fixes trac:2732.
This commit was SVN r24450.
The following Trac tickets were found above:
Ticket 2732 --> https://svn.open-mpi.org/trac/ompi/ticket/2732
filenames -- don't include the project name ("opal")
* Don't link maffinity/hwloc and paffinity/hwloc against the common
hwloc in the static build case (because this will result in
duplicate symbols)
This commit was SVN r24447.
hopefully, this now compiles for libnuma 0.9.x and libnuma 2.0.x.
Fixes for the strategy discussed in the commit message for r24442
(i.e., check against numa_get_mems_allowed(), which only exists in
libnuma 2.0.x) and the new new new plan on #2698 coming in a separate
commit.
This commit was SVN r24443.
The following SVN revision numbers were found above:
r24442 --> open-mpi/ompi@90a8fe4aad