Commit graph

26308 commits

Author SHA1 Message Date
Joshua Ladd
57c0c847d0 Merge pull request #2603 from xinzhao3/topic/revert-ucx-mt
Revert "PML/SPML/UCX: add UCX MT support to PML and SPML."
2017-01-04 11:50:37 -05:00
Ralph Castain
5737a45b35 Merge pull request #2658 from rhc54/topic/removal
Remove the bcol, coll/ml, and sbgp code as stale and lacking a maintainer
2017-01-03 20:34:09 -08:00
Ralph Castain
66131b4183 Remove the bcol, coll/ml, and sbgp code as stale and lacking a maintainer
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-03 19:32:48 -08:00
Ralph Castain
dadc6fbaf6 Merge pull request #2448 from thananon/remove_request_lock
Completely removed ompi_request_lock and ompi_request_cond
2017-01-03 19:31:46 -08:00
Jeff Squyres
33d2988985 Merge pull request #2647 from OMGtechy/master
Fixed -Wmisleading-indentation in ad_read_coll.c
2017-01-03 12:24:22 -05:00
Ralph Castain
218aed144d Merge pull request #2654 from rhc54/topic/memory
Remove stale global variables
2017-01-02 15:09:09 -08:00
Ralph Castain
9eab9a1ed3 Remove stale global variables
Revamp the event notification integration to rely on the PMIx event chaining and remove the duplicate chaining in OPAL. This ensures we get system-level events that target non-default handlers.

Restore the hostname entries for MPI-level error messages, but provide an MCA param (orte_hostname_cutoff) to remove them for large clusters where the memory footprint is problematic. Set the default at 1000 nodes in the job (not the allocation).

Begin first cut at memory profiler

Some minor cleanups of memprobe

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-02 14:04:24 -08:00
rhc54
5f68d655d6 Merge pull request #2651 from rhc54/topic/minor
Minor cleanups
2016-12-30 18:52:12 -08:00
Ralph Castain
e8aea2ebfc Minor cleanups
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-30 16:19:42 -08:00
rhc54
56b1e10ac0 Merge pull request #2649 from rhc54/topic/foot2
Update to latest PMIx master
2016-12-30 15:36:03 -08:00
Ralph Castain
08c76a42bb Update to latest PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>

Plug a minor memory leak. Tell the PMIx server not to create a dstore memory region for the daemon job as there is nobody to share it with.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>

Protect users of hwloc membind functions

Signed-off-by: Ralph Castain <rhc@open-mpi.org>

Update PMIx to include NULL string protection

Signed-off-by: Ralph Castain <rhc@open-mpi.org>

Update to PMIx master to include key overwrite protection

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-30 12:44:47 -08:00
rhc54
a16162832b Merge pull request #2648 from rhc54/topic/topo
Only instantiate the HWLOC topology in an MPI process if it actually will be used.
2016-12-29 11:52:08 -08:00
Ralph Castain
fe68f23099 Only instantiate the HWLOC topology in an MPI process if it actually will be used.
There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:

* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
  instead of the topology itself, if available, thus avoiding instantiating the topology

* openib BTL. This uses the distance matrix. At present, I haven't developed a method
  for replacing that reference. Thus, this component will instantiate the topology

* usnic BTL. Uses the distance matrix.

* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
  the topology

* ess base functions. If a process is direct launched and not bound at launch, this
  code attempts to bind it. Thus, procs in this scenario will instantiate the
  topology

Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.

Fix pernode binding policy

Properly handle the unbound case

Correct pointer usage

Do not free static error messages!

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-29 10:33:29 -08:00
Ralph Castain
52533f755e Remove debug
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-28 13:24:39 -08:00
Joshua Gerrard
94e87654c6 Fixed -Wmisleading-indentation in ad_read_coll.c
Signed-off-by: Joshua Gerrard <joshuagerrard+ompi-commit@protonmail.com>
2016-12-28 20:14:13 +00:00
rhc54
acbf1cbaef Merge pull request #2646 from rhc54/topic/squeze
Begin to reduce reliance of application procs on the topology tree it…
2016-12-28 10:16:58 -08:00
Ralph Castain
3a2d6a5ab6 Begin to reduce reliance of application procs on the topology tree itself by having the daemon provide more detailed info. In this case, provide the topology description string so that procs can readily determine the number of types of objects on the node, and a "locality" string that describes which objects this process is executing upon. The latter allows a process to compute the objects of overlap between itself and another proc without consulting the topology tree.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-28 09:14:26 -08:00
rhc54
75be023f90 Merge pull request #2645 from rhc54/topic/maps
Fix mapping directive checks
2016-12-28 03:43:46 -08:00
Ralph Castain
7866bb1119 Add debug, cleanup cpus/rank
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-27 21:25:52 -08:00
Ralph Castain
1e4bffd937 Fix mapping directive checks
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-27 20:42:47 -08:00
Jeff Squyres
31e98401c7 Merge pull request #2636 from jsquyres/pr/fix-cflags-uniqueness
configury: fix OMPI_UNIQUE -> OMPI_FLAGS_UNIQUE
2016-12-27 17:24:06 -05:00
Jeff Squyres
d772fcf8f1 Merge pull request #2509 from OMGtechy/master
Fixed memory leak and some -Werror=unused-result warnings
2016-12-27 17:13:23 -05:00
Jeff Squyres
15c1ee13fb Merge pull request #2624 from jsquyres/pr/one-more-buildrpm-fix
buildrpm.sh: don't use $HOME
2016-12-27 17:02:36 -05:00
Jeff Squyres
fb74c80e4b configury: fix OMPI_UNIQUE -> OMPI_FLAGS_UNIQUE
Looks like we missed one place where we needed to swap OMPI_UNIQUE for
OMPI_FLAGS_UNIQUE.  Thanks to Phil Tooley (@Telemin) for reporting the
issue.

Fixes #2635.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-12-27 13:36:53 -08:00
rhc54
24000aae84 Merge pull request #2638 from rhc54/topic/pmixcflags
Avoid mangling user-provided CFLAGS by using the new PMIX_FLAGS_UNIQ autoconf script in place of PMIX_UNIQ
2016-12-27 10:13:58 -08:00
Ralph Castain
d3aa3777f3 Per @jsquyres: avoid mangling user-provided CFLAGS by using the new PMIX_FLAGS_UNIQ autoconf script in place of PMIX_UNIQ
Refs #2636

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-27 09:00:59 -08:00
Ralph Castain
791f4f1ce3 Adjust debug output for clarity
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-26 14:04:20 -08:00
Gilles Gouaillardet
22db1d36b6 pmix2x: silence misc warnings
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-26 13:35:17 +09:00
rhc54
67a08e825e Merge pull request #2632 from rhc54/topic/updates
Transfer some minor cleanups back from the PMIx reference server
2016-12-23 12:49:33 -08:00
Ralph Castain
ef3f748d0d Transfer some minor cleanups back from the PMIx reference server
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-23 08:46:04 -08:00
Nysal Jan K A
19e3be31e5 Merge pull request #2421 from nysal/master
mpit: Fix MPI_T_pvar_get_index
2016-12-22 15:33:51 +05:30
Gilles Gouaillardet
54c84196a6 btl/vader: plug a memory leak
as reported by Coverity with CID 1362691

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-22 16:04:36 +09:00
Nysal Jan K.A
25ba507ada mpit: Fix MPI_T_pvar_get_index
MPI_T_pvar_get_index was returning an incorrect index. The index
was never set correctly while registering the performance variables.
Additionally fix a missing case in the mca_base_var_type_t to MPI
datatype conversion. This type is currently used for control variables
registered by mxm, fca and hcoll components.

Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
2016-12-22 12:30:21 +05:30
Gilles Gouaillardet
773cad6b3e ompi/debugger: fix mqs_version_string()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-22 15:00:47 +09:00
Jeff Squyres
3571c3c5bb hwloc external: minor fixes to 9649c44
- Fix capitolization typos
- Make comment more correct / flow better
- Use AM_CPPFLAGS, not DEFAULT_INCLUDES
- Remove extra "hwloc/" from external hwloc.h specification

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-12-21 09:06:24 -08:00
Jeff Squyres
6002a8bca5 buildrpm.sh: don't use $HOME
This is news to me: I didn't know that some distros do not set $HOME.
So use "~" instead, and only try to grep ~/.rpmmacros if it exists.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-12-21 07:42:32 -08:00
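The guard described in the commit above (6002a8bca5) could look roughly like the sketch below. It is an illustration only, not the code in buildrpm.sh: the helper name `rpm_macro_value`, its arguments, and the `_topdir` macro used in the call are assumptions made for the example. The point is that the file is only passed to grep if it exists, and that the path is written with "~" rather than $HOME (bash, for example, falls back to the user's password-database entry for "~" when HOME is unset).

```sh
#!/bin/sh
# Hypothetical helper (not taken from buildrpm.sh): print the value of an
# rpm macro defined in the given rpmmacros file, or nothing at all if the
# file does not exist.
rpm_macro_value() {
    macros_file="$1"   # path to the macros file, e.g. ~/.rpmmacros
    macro_name="$2"    # macro name without the leading %, e.g. _topdir

    # Only grep the file if it exists; a missing file is not an error.
    if [ -f "$macros_file" ]; then
        grep "^%${macro_name}[[:space:]]" "$macros_file" | awk '{ print $2 }'
    fi
}

# Refer to the file as ~/.rpmmacros instead of $HOME/.rpmmacros, so the
# script does not depend on HOME being set in the environment.
topdir=$(rpm_macro_value ~/.rpmmacros _topdir)
```

If `~/.rpmmacros` is absent, `topdir` is simply left empty and the caller can fall back to a default of its own choosing.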
Jeff Squyres
678e314c0e opal_configure_options: remove stale option help
--with-libltdl is now added (via AC_ARG_WITH) in
opal/mca/dl/libltdl/configure.m4 -- it no longer belongs up here in
this top-level m4 file.  Plus, the help string in this stale entry is
also stale/incorrect.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-12-21 07:28:31 -08:00
Gilles Gouaillardet
9649c44fa0 hwloc: correctly handle --with-hwloc=external
- simply #include "hwloc.h" to use the external hwloc header
- do use the external hwloc header instead of opal/mca/hwloc/hwloc.h

Thanks Orion Poplawski for the report

Fixes open-mpi/ompi#2616

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-21 11:58:10 +09:00
Jeff Squyres
5ecd271934 buildrpm.sh: minor fixes
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-12-20 10:54:37 -08:00
rhc54
75ec38db7d Merge pull request #2609 from rhc54/topic/psrv
Bring across some more patches from the debugger work
2016-12-19 20:38:28 -08:00
Ralph Castain
ea133206ec Sync the internal OMPI component to PMIx master
Update external PMIx v2.x component
Add missing Makefile

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-19 19:14:16 -08:00
Ralph Castain
4774eb8b5a Update NEWS with 1.10.5 items
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-19 16:48:45 -08:00
Xin Zhao
2d77912c19 Revert "PML/SPML/UCX: add UCX MT support to PML and SPML."
This reverts commit 0ecf3c951c.

Signed-off-by: Xin Zhao <xinz@mellanox.com>
2016-12-19 18:57:48 +02:00
rhc54
0acdcebab2 Merge pull request #2601 from rhc54/topic/dbgr
Transfer across final fixes from debugger attach work
2016-12-19 03:56:42 -08:00
Ralph Castain
256b5adac5 Transfer across final fixes from debugger attach work
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-19 00:34:27 -08:00
rhc54
c1b8538216 Merge pull request #2600 from rhc54/topic/dbg
Transfer debugger support changes
2016-12-17 20:13:40 -08:00
Ralph Castain
c6f6f40529 Transfer debugger support changes
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-17 18:14:46 -08:00
rhc54
54c4925f3f Merge pull request #2598 from rhc54/topic/debugger
Transfer back changes from debugger attach work
2016-12-17 13:09:38 -08:00
Nathan Hjelm
16a2f09cd5 Merge pull request #2596 from hjelmn/x86_rtdtsc
opal/timer: add code to check if rtdtsc is core invariant
2016-12-17 11:14:49 -07:00
Ralph Castain
269753f5c1 Transfer back changes from debugger attach work
Silence warning

Remove debug

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-17 10:00:52 -08:00