27777 Commits

Author SHA1 Message Date
Joshua Hursey
535a621f49 README: Clarify note about ld issue for XL and PGI on PPC
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit d3b82a3cc78b6c13a02c8120c8efea22679e8abd)
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-07 15:14:57 -05:00
Ralph Castain
9921237f99 Merge pull request #4012 from rhc54/topic/p3
Cover the use-cases for OPAL_PREFIX and PMIX_INSTALL_PREFIX options
2017-08-07 11:42:53 -07:00
Ralph Castain
9499acc56a Merge pull request #4043 from rhc54/topic/libpmix
Fix libpmix linking
2017-08-07 11:28:15 -07:00
Ralph Castain
d593e5a4ce When we specify --with-devel-headers, we also emit a copy of libpmix. However, that library was built against the OPAL libevent component, which means all the libevent functions are prefixed with OPAL names. So ensure that the emitted libpmix is linked back against libopen-pal so those symbols will be resolved.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-07 09:36:16 -07:00
Nathan Hjelm
813762334e memory/patcher: hook madvise
It is not possible to use the patcher based memory hooks without
hooking madvise (MADV_DONTNEED). This commit updates the patcher
memory hooks to always hook madvise. This should be safe with recent
rcache updates.

References #3685. Close when merged into v2.0.x, v2.x, and v3.0.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 10:29:45 -06:00
Nathan Hjelm
b6bf3f4d95 rcache/base: reduce probability of deadlock when hooking madvise
The current VMA cache implementation backing rcache/grdma can run into
a deadlock situation in multi-threaded code when madvise is hooked and
the c library uses locks. In this case we may run into the following
situation:

Thread 1:

    ...
    free ()           <- Holding libc lock
    madvise_hook ()
    vma_iteration ()  <- Blocked waiting for vma lock

Thread 2:
    ...
    vma_insert ()     <- Holding vma lock
    vma_item_new ()
    malloc ()         <- Blocked waiting for libc lock

To fix this problem we chose to remove the madvise () hook but that
fix is causing issue #3685. This commit aims to greatly reduce the
chance that the deadlock will be hit by putting vma items into a free
list. This moves the allocation outside the vma lock. In general there
are a relatively small number of vma items so the default is to
allocate 2048 vma items. This default is configurable, but if anything
the number is likely too large rather than too small.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 10:29:45 -06:00
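
The free-list idea in commit b6bf3f4d95 can be illustrated with a minimal sketch. All names here (`vma_cache_t`, `vma_item_get`, and so on) are hypothetical and are not taken from the rcache/grdma code. The sketch assumes a fixed pool that simply fails when exhausted, whereas the commit message only says the default pool size is configurable. The point is that `vma_insert` never calls `malloc ()` while holding the vma lock, so the lock-order inversion with the libc lock cannot occur on this path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical item and cache types, for illustration only. */
typedef struct vma_item {
    struct vma_item *next;
    void *base;
    size_t len;
} vma_item_t;

typedef struct {
    pthread_mutex_t free_lock;  /* guards only the free list */
    vma_item_t *free_head;
    pthread_mutex_t vma_lock;   /* guards the list of live items */
    vma_item_t *live_head;
} vma_cache_t;

/* Preallocate `count` items up front, before any lock is held. */
int vma_cache_init(vma_cache_t *c, size_t count)
{
    assert(c != NULL);
    pthread_mutex_init(&c->free_lock, NULL);
    pthread_mutex_init(&c->vma_lock, NULL);
    c->free_head = NULL;
    c->live_head = NULL;
    for (size_t i = 0; i < count; ++i) {
        vma_item_t *it = malloc(sizeof(*it));
        if (NULL == it) {
            return -1;
        }
        it->next = c->free_head;
        c->free_head = it;
    }
    return 0;
}

/* Pop an item from the free list; this path never calls malloc(). */
vma_item_t *vma_item_get(vma_cache_t *c)
{
    pthread_mutex_lock(&c->free_lock);
    vma_item_t *it = c->free_head;
    if (NULL != it) {
        c->free_head = it->next;
    }
    pthread_mutex_unlock(&c->free_lock);
    return it;
}

/* Obtain the item first, then take the vma lock only to link it in. */
int vma_insert(vma_cache_t *c, void *base, size_t len)
{
    vma_item_t *it = vma_item_get(c);   /* outside vma_lock */
    if (NULL == it) {
        return -1;                      /* pool exhausted in this sketch */
    }
    it->base = base;
    it->len = len;
    pthread_mutex_lock(&c->vma_lock);
    it->next = c->live_head;
    c->live_head = it;
    pthread_mutex_unlock(&c->vma_lock);
    return 0;
}
```

A caller would run `vma_cache_init (&cache, 2048)` once at startup and then call `vma_insert` from any thread. A real implementation would also need a way to return items to the free list and to handle pool exhaustion, which is presumably why the commit speaks of reducing, not eliminating, the chance of deadlock.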
Ralph Castain
8e0ca63bdc Merge pull request #4040 from rhc54/topic/instructions
Add a brief pointer to the HACKING file
2017-08-07 08:04:41 -07:00
Ralph Castain
67655dba02 Add a brief pointer to the HACKING file
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-07 06:53:52 -07:00
Mike Dubman
d168a1e95b Merge pull request #4033 from alex-mikheev/topic/oshmem_sysv_hp_alloc
oshmem: sshmem sysv: auto huge page alloc can fall back to regular pages.
2017-08-07 10:48:06 +03:00
KAWASHIMA Takahiro
d468cdb7a6 test: Update nmcheck_prefix.pl
The linker of Linux/AArch64 (at least) generates `__bss_start__`,
`__bss_end__`, `_bss_end__`, and `__end__` symbols.

`libmpi_usempi_ignore_tkr.so` is added but `libmpi_usempif08.so`
is not added because `use-mpi-f08` has `contains` statements
in modules and compilers automatically generate compiler-specific
symbols for them. For example, gfortran 4.9 generates
`__mpi_f08_callbacks_MOD_mpi_comm_dup_fn` etc.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-08-07 13:54:15 +09:00
Alex Mikheev
692021f637 oshmem: sshmem sysv: auto huge page alloc can fall back to regular pages.
Fall back to regular pages if huge page allocation is set to auto
and the requested amount of memory could not be allocated with
huge pages.

Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-08-06 13:33:04 +03:00
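
The fallback policy described in commit 692021f637 might look roughly like the following sketch. The names (`hp_mode_t`, `seg_create`, `shmget_alloc`, `no_huge_alloc`) are hypothetical and do not come from the oshmem source. The sketch also assumes that an explicit "on" setting fails without falling back, which the commit message does not state. `SHM_HUGETLB` is Linux-specific.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical huge page policy; only HP_AUTO is described in the commit. */
typedef enum { HP_OFF, HP_ON, HP_AUTO } hp_mode_t;

/* Allocator callback: returns a segment id, or -1 on failure. */
typedef int (*seg_alloc_fn)(size_t size, int use_huge);

/* Real allocator: a private SysV segment, optionally backed by huge pages. */
int shmget_alloc(size_t size, int use_huge)
{
    int flags = IPC_CREAT | IPC_EXCL | 0600;
    if (use_huge) {
        flags |= SHM_HUGETLB;
    }
    return shmget(IPC_PRIVATE, size, flags);
}

/* Stand-in allocator simulating a host with no huge pages reserved. */
int no_huge_alloc(size_t size, int use_huge)
{
    (void) size;
    return use_huge ? -1 : 7;   /* 7 is an arbitrary fake segment id */
}

/* Try huge pages first unless disabled; in auto mode, retry with
 * regular pages when the huge page attempt fails. */
int seg_create(size_t size, hp_mode_t mode, seg_alloc_fn alloc, int *used_huge)
{
    assert(alloc != NULL && used_huge != NULL);
    *used_huge = 0;
    if (HP_OFF != mode) {
        int id = alloc(size, 1);
        if (id >= 0) {
            *used_huge = 1;
            return id;
        }
        if (HP_ON == mode) {
            return -1;          /* assumption: explicit request, no fallback */
        }
    }
    return alloc(size, 0);      /* regular pages */
}
```

In production use the caller would pass `shmget_alloc`. Passing the policy its allocator as a callback keeps the fallback logic separate from the system call, so the auto path can be exercised on machines without huge pages configured.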
Ralph Castain
c15df97cc2 Merge pull request #4031 from rhc54/topic/touchups
Silence some compile-time warnings. Update scripts now that AUTHORS is gone
2017-08-04 22:11:56 -06:00
Ralph Castain
d1b7c3d8d5 Silence some compile-time warnings. Update scripts now that AUTHORS is gone
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-04 20:08:31 -07:00
Ralph Castain
a239b4c3c3 Per discussion on the PMIx side, do a better job of detecting mismatches between location directives for OPAL and PMIx. Provide a more helpful error message and error out if we find a mismatch. If any OPAL values are set and the PMIx equivalent is not, then transfer it.
Do not clear PMIX_INSTALL_PREFIX from the daemon's launch environment

Fixes #3980
Closes #4007
Refs #3985

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-04 19:36:00 -07:00
Josh Hursey
11f04094db Merge pull request #4029 from jjhursey/fix/sm-argv
btl/sm: Missing argv header
2017-08-04 20:56:54 -05:00
Joshua Hursey
196b314643 btl/sm: Missing argv header
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-04 21:10:49 -04:00
Howard Pritchard
e79eb85690 Merge pull request #3970 from hppritcha/topic/disable_xrc_by_default
btl/openib: disable XRC by default
2017-08-04 10:25:51 -06:00
Howard Pritchard
8223d4cba0 btl/openib: disable XRC by default
Change the default for the XRC configure option to disabled.  If a user wants
to give it a try, they have to explicitly ask for it.

Modify the configury help message to indicate it is not enabled by default.

Related to #3890
Fixes #3969

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-08-04 09:42:12 -06:00
Ralph Castain
21c0319a2f Merge pull request #4018 from rhc54/topic/test
Fix incorrect usage of '==' in test comparisons
2017-08-04 07:09:09 -06:00
Ralph Castain
f128b4c546 Fix incorrect usage of '==' in test comparisons
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-03 21:21:26 -07:00
Ralph Castain
88a7c9dca3 Merge pull request #4013 from rhc54/topic/hwloc
Silence warning on Mac - we know Mac doesn't support hwloc, and so it…
2017-08-03 15:52:44 -06:00
Howard Pritchard
897c62756b Merge pull request #3999 from hppritcha/topic/slurmd_controls_them_all
SLURM: launch all processes via slurmd
2017-08-03 15:33:44 -06:00
Joshua Ladd
c27beea3a1 Merge pull request #3962 from karasevb/ucx_detect
configure: detect UCX support by default
2017-08-03 16:33:57 -04:00
Mike Dubman
dd3acd9220 Merge pull request #4006 from alex-mikheev/topic/oshmem_shmem_ptr
oshmem: shmem_ptr() implementation
2017-08-03 19:45:38 +03:00
Nathan Hjelm
ebce88b7ad opal: remove generated asm code
Every modern compiler supports either inline assembly or builtin atomic
operations. Because of this it is time to delete all the code associated
with pre-built atomics.

This commit also cleans out the DEC and XLC asm checks. Neither check
does anything and the XLC compiler supports GCC ASM.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-03 09:18:58 -06:00
Nathan Hjelm
29b059e4eb Merge pull request #3971 from plesn/yield_srun
fix srun latency, change default yield_when_idle=0
2017-08-03 07:49:00 -06:00
Alex Mikheev
1b5df76f8b oshmem: shmem_ptr() implementation
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-08-03 13:56:34 +03:00
George Bosilca
3d27e0d3a4 Add support for hwloc 2.0 API.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 02:47:04 -04:00
Gilles Gouaillardet
6b6e65a5bc rtc/hwloc: fix MCA parameter handling
always re-initialize vmhole *before* mca_base_component_var_register()
otherwise the vmhole gets NULL'ified if orte is initialized a second time.
that typically occurs when Open MPI is configure'd with --disable-dlopen
and the app does MPI_T_init_thread(); MPI_T_finalize(); MPI_T_init_thread();

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-03 14:45:43 +09:00
Guillaume Mercier
569239ec44 Check whether the topology is weighted in the partially distributed case
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:47:46 -04:00
George Bosilca
1d7cca75a1 Fix a typo in the copyright.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:47:10 -04:00
George Bosilca
e4db9e574f Fix all warnings.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:47:02 -04:00
George Bosilca
c2927d7e91 Update to the latest version provided by Guillaume.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:46:48 -04:00
George Bosilca
6c8ea09cc5 Use OPAL random generator.
This fix is related to issue #1877, and prevents the OMPI library from
interfering with the user-level random values.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:46:37 -04:00
George Bosilca
5542559130 Cleaning and optimizations.
Including variable renaming and loop merging.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:46:28 -04:00
George Bosilca
bc634dbcb0 Make sure the gather is called in all cases, and not
simply based on some local state. This is the second
part of the patch proposed for open-mpi/ompi#1183.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:46:17 -04:00
Ralph Castain
9f926b8083 Silence warning on Mac - we know Mac doesn't support hwloc, and so it doesn't matter if a VM hole isn't found. It also doesn't matter in general as all it really means is that we have to turn the hwloc shmem support "off".
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-02 20:20:45 -06:00
Gilles Gouaillardet
2216b80b82 README: remove references to the removed coll/hierarch module
Fixes open-mpi/ompi@4005

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-03 09:15:38 +09:00
Brian Barrett
0fba0f4f80 Merge pull request #4008 from open-mpi/revert-3960-topic/treematch
Revert "Topic/treematch"
2017-08-02 15:25:32 -07:00
Howard Pritchard
5ce07a6983 Merge pull request #3997 from hppritcha/topic/swat_compiler_warning
btl/ugni: swat compiler warning
2017-08-02 15:44:09 -06:00
Brian Barrett
1ec3fd38be Revert "Topic/treematch"
2017-08-02 14:40:55 -07:00
Howard Pritchard
d08be74573 SLURM: launch all processes via slurmd
It turns out that the approach of having the HNP do the
fork/exec of MPI ranks on the head node in a SLURM environment
introduces problems when users/sysadmins want to use the SLURM
scancel tool or sbatch --signal option to signal a job.

This commit disables use of the HNP fork/exec procedure when
a job is launched into a SLURM controlled allocation.

Update NEWS with a blurb about the new ras framework MCA parameter.

related to #3998

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-02 14:56:55 -06:00
bosilca
d6048af915 Merge pull request #3960 from bosilca/topic/treematch
Update OMPI support for topologies and reordering.
2017-08-02 12:47:23 -04:00
Ralph Castain
355c71b6b0 Merge pull request #4004 from artpol84/pmix_instdirs/master
pmix: fix PMIx envar name for the installation prefix.
2017-08-02 06:29:32 -06:00
Artem Polyakov
71da0fcbef plm/rsh: Propagate PMIx prefix to orted's
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:06:13 +03:00
Artem Polyakov
500c8be888 pmix: fix PMIx envar name for the installation prefix.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:03:36 +03:00
Nathan Hjelm
31171d04f1 Merge pull request #4000 from hjelmn/sync_check
config: remove erroneous define
2017-08-01 16:01:44 -06:00
Ralph Castain
f39ce67982 Merge pull request #3951 from rhc54/topic/hwloc2
Update to hwloc 2.0.0a
2017-08-01 15:18:31 -06:00
Ralph Castain
69612b3e2a Merge pull request #3990 from rhc54/topic/p2
Move handling of OPAL_PREFIX to PMIX_PREFIX down into embedded PMIx integration code
2017-08-01 15:13:59 -06:00
Nathan Hjelm
35c9b93754 config: remove erroneous define
This removes a copy-and-paste error where we were setting the
OPAL_ASM_SYNC_HAVE_64BIT more than once.

References #3993. Close when on master and v3.0.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-01 14:53:35 -06:00