Commit Graph

27503 Commits

Author SHA1 Message Date
Alex Mikheev
692021f637 oshmem: sshmem sysv: auto huge page alloc can fall back to regular pages.
Fall back to regular pages if huge page allocation is set to auto
and the requested amount of memory could not be allocated with
huge pages.

Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-08-06 13:33:04 +03:00
Ralph Castain
c15df97cc2 Merge pull request #4031 from rhc54/topic/touchups
Silence some compile-time warnings. Update scripts now that AUTHORS is gone
2017-08-04 22:11:56 -06:00
Ralph Castain
d1b7c3d8d5 Silence some compile-time warnings. Update scripts now that AUTHORS is gone
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-04 20:08:31 -07:00
Josh Hursey
11f04094db Merge pull request #4029 from jjhursey/fix/sm-argv
btl/sm: Missing argv header
2017-08-04 20:56:54 -05:00
Joshua Hursey
196b314643 btl/sm: Missing argv header
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-04 21:10:49 -04:00
Howard Pritchard
e79eb85690 Merge pull request #3970 from hppritcha/topic/disable_xrc_by_default
btl/openib: disable XRC by default
2017-08-04 10:25:51 -06:00
Howard Pritchard
8223d4cba0 btl/openib: disable XRC by default
Change the default of the XRC configure option to disabled.  If a user wants
to give it a try, they have to explicitly ask for it.

Modify the configury help message to indicate it is not enabled by default.

Related to #3890
Fixes #3969

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-08-04 09:42:12 -06:00
Ralph Castain
21c0319a2f Merge pull request #4018 from rhc54/topic/test
Fix incorrect usage of '==' in test comparisons
2017-08-04 07:09:09 -06:00
Ralph Castain
f128b4c546 Fix incorrect usage of '==' in test comparisons
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-03 21:21:26 -07:00
Ralph Castain
88a7c9dca3 Merge pull request #4013 from rhc54/topic/hwloc
Silence warning on Mac - we know Mac doesn't support hwloc, and so it…
2017-08-03 15:52:44 -06:00
Howard Pritchard
897c62756b Merge pull request #3999 from hppritcha/topic/slurmd_controls_them_all
SLURM: launch all processes via slurmd
2017-08-03 15:33:44 -06:00
Joshua Ladd
c27beea3a1 Merge pull request #3962 from karasevb/ucx_detect
configure: detect UCX support by default
2017-08-03 16:33:57 -04:00
Mike Dubman
dd3acd9220 Merge pull request #4006 from alex-mikheev/topic/oshmem_shmem_ptr
oshmem: shmem_ptr() implementation
2017-08-03 19:45:38 +03:00
Nathan Hjelm
ebce88b7ad opal: remove generated asm code
Every modern compiler supports either inline assembly or builtin atomic
operations. Because of this it is time to delete all the code associated
with pre-built atomics.

This commit also cleans out the DEC and XLC asm checks. Neither check
does anything, and the XLC compiler supports GCC ASM.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-03 09:18:58 -06:00
Nathan Hjelm
29b059e4eb Merge pull request #3971 from plesn/yield_srun
fix srun latency, change default yield_when_idle=0
2017-08-03 07:49:00 -06:00
Alex Mikheev
1b5df76f8b oshmem: shmem_ptr() implementation
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-08-03 13:56:34 +03:00
Gilles Gouaillardet
6b6e65a5bc rtc/hwloc: fix MCA parameter handling
Always re-initialize vmhole *before* mca_base_component_var_register();
otherwise vmhole gets set to NULL if orte is initialized a second time.
That typically occurs when Open MPI is configured with --disable-dlopen
and the app does MPI_T_init_thread(); MPI_T_finalize(); MPI_T_init_thread();

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-03 14:45:43 +09:00
Ralph Castain
9f926b8083 Silence warning on Mac - we know Mac doesn't support hwloc, and so it doesn't matter if a VM hole isn't found. It also doesn't matter in general as all it really means is that we have to turn the hwloc shmem support "off".
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-02 20:20:45 -06:00
Gilles Gouaillardet
2216b80b82 README: remove references to the removed coll/hierarch module
Fixes open-mpi/ompi@4005

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-03 09:15:38 +09:00
Brian Barrett
0fba0f4f80 Merge pull request #4008 from open-mpi/revert-3960-topic/treematch
Revert "Topic/treematch"
2017-08-02 15:25:32 -07:00
Howard Pritchard
5ce07a6983 Merge pull request #3997 from hppritcha/topic/swat_compiler_warning
btl/ugni: swat compiler warning
2017-08-02 15:44:09 -06:00
Brian Barrett
1ec3fd38be Revert "Topic/treematch"
2017-08-02 14:40:55 -07:00
Howard Pritchard
d08be74573 SLURM: launch all processes via slurmd
It turns out that the approach of having the HNP do the
fork/exec of MPI ranks on the head node in a SLURM environment
introduces problems when users/sysadmins want to use the SLURM
scancel tool or sbatch --signal option to signal a job.

This commit disables use of the HNP fork/exec procedure when
a job is launched into a SLURM controlled allocation.

Update NEWS with a blurb about the new ras framework MCA parameter.

Related to #3998

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-02 14:56:55 -06:00
bosilca
d6048af915 Merge pull request #3960 from bosilca/topic/treematch
Update OMPI support for topologies and reordering.
2017-08-02 12:47:23 -04:00
Ralph Castain
355c71b6b0 Merge pull request #4004 from artpol84/pmix_instdirs/master
pmix: fix PMIx envar name for the installation prefix.
2017-08-02 06:29:32 -06:00
Artem Polyakov
71da0fcbef plm/rsh: Propagate PMIx prefix to orted's
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:06:13 +03:00
Artem Polyakov
500c8be888 pmix: fix PMIx envar name for the installation prefix.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:03:36 +03:00
Nathan Hjelm
31171d04f1 Merge pull request #4000 from hjelmn/sync_check
config: remove erroneous define
2017-08-01 16:01:44 -06:00
Ralph Castain
f39ce67982 Merge pull request #3951 from rhc54/topic/hwloc2
Update to hwloc 2.0.0a
2017-08-01 15:18:31 -06:00
Ralph Castain
69612b3e2a Merge pull request #3990 from rhc54/topic/p2
Move handling of OPAL_PREFIX to PMIX_PREFIX down into embedded PMIx integration code
2017-08-01 15:13:59 -06:00
Nathan Hjelm
35c9b93754 config: remove erroneous define
This removes a copy-and-paste error where we were setting the
OPAL_ASM_SYNC_HAVE_64BIT more than once.

References #3993. Close when on master and v3.0.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-01 14:53:35 -06:00
Brian Barrett
c4ae36f971 Merge pull request #3869 from Zzzoom/find_freq_bogomips
opal: Get x86 TSC frequency from bogomips
2017-08-01 13:23:21 -07:00
Howard Pritchard
12a5aacdfd btl/ugni: swat compiler warning
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-01 12:21:57 -06:00
Howard Pritchard
b0c82737c5 Merge pull request #3994 from hppritcha/topic/swat_issue_3968
oshmem: fix issue with shmem_g c11 generics
2017-08-01 11:29:14 -06:00
Howard Pritchard
1d612da1cb oshmem: fix issue with shmem_g c11 generics
There was a typo in the shmem_g c11 generic interface
in shmem.h.in

Thanks to @nspark for reporting the problem and
specifying the fix.

Fixes #3968

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-01 09:58:20 -06:00
Ralph Castain
8f34fa4a56 Move the detection of OPAL_PREFIX and subsequent posting of PMIX_PREFIX to the internal integration code for PMIx so we only do this when running with the embedded PMIx
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-01 08:24:27 -06:00
Ralph Castain
e94786f4b7 Revert "Check for OPAL_PREFIX and set corresponding PMIX_PREFIX if found"
This reverts commit 3744967adb.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-01 08:14:12 -06:00
KAWASHIMA Takahiro
a7a30424cb Merge pull request #3982 from kawashima-fj/pr/comm-set-error
communicator: Refine `ompi_comm_set` error check
2017-07-31 18:42:47 -05:00
Ralph Castain
08299794a7 Merge pull request #3983 from rhc54/topic/prefix
Check for OPAL_PREFIX and set corresponding PMIX_PREFIX if found
2017-07-31 11:40:53 -06:00
Sylvain Jeaugey
eee494fc8a common/cuda: Fix near-hang when remote side has exited
Ignore errors caused by remote side having exited when closing CUDA IPC mappings.
openmpi/ompi#3244

Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com>
2017-07-31 10:34:45 -07:00
Ralph Castain
3744967adb Check for OPAL_PREFIX and set corresponding PMIX_PREFIX if found
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-31 09:14:01 -06:00
KAWASHIMA Takahiro
3eac4b0c9a communicator: Refine ompi_comm_set error check
The `ompi_comm_set` function never sets its first argument `ncomm`
to `NULL`, so a `NULL` check is unnecessary in its callers. Furthermore,
a `NULL` check may obscure the real return code when an error occurs
if the variable is initialized to `NULL`.

Also, a `NULL` check is added in the `ompi_comm_set` function to
avoid a segmentation fault in an out-of-memory condition.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-07-31 20:26:51 +09:00
KAWASHIMA Takahiro
ebc4eb347c Merge pull request #3701 from kawashima-fj/pr/non-pml-persistent
ompi/request: Support non-PML persistent requests
2017-07-31 02:36:17 -05:00
Ralph Castain
0c8a73a53c Merge pull request #3978 from karasevb/fix_hangs_pmix1
pmix: fixed immediate request
2017-07-28 11:28:18 -05:00
Boris Karasev
e20b581529 pmix: fixed immediate request
This commit fixes a hang when using an external PMIx v1 module.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-28 15:53:48 +06:00
Gilles Gouaillardet
825116044e hwloc/base: fix info message for opal_hwloc_base_binding_policy
If np > 2, the default binding is now "numa".

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-28 11:17:15 +09:00
Brian Barrett
fe8e4a0402 dist: Autogenerate AUTHORS file
Per discussion at the Summer 2017 developers meeting, generate
the AUTHORS list at make dist time, rather than trying to
keep it up to date and merge on the branches by hand.  While
most of the data is generated from git, the organization list
was maintained by hand.  The general feeling at the meeting was
that the organization list was not adding value and there were
concrete cases where it involved much chasing by the RMs, so
it has been removed.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-07-27 13:04:17 -07:00
Edgar Gabriel
d93dae326e Merge pull request #3959 from edgargabriel/topic/performance-fixes
Topic/performance fixes
2017-07-27 09:51:57 -05:00
Piotr Lesnicki
3fa7aabf89 fix srun latency, change default yield_when_idle=0
This changes the default to 0, to avoid yields during progress in srun.

In mpirun, ompi_mpi_yield_when_idle is set to 1 if oversubscribed and
0 otherwise. The default was 1, however, and srun uses the default.
Now srun and mpirun have the same latency in non-oversubscribed cases.

Signed-off-by: Piotr Lesnicki <piotr.lesnicki@atos.net>
2017-07-27 09:41:48 +02:00
Guillaume Mercier
a66dc811b2 Check if topo weighted in case of partially distrib case
2017-07-26 11:54:24 -04:00