Commit Graph

27690 Commits

Author SHA1 Message Date
George Bosilca
bc634dbcb0
Make sure the gather is called in all cases, and not
simply based on some local state. This is the second
part of the patch proposed for open-mpi/ompi#1183.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-03 00:46:17 -04:00
Ralph Castain
9f926b8083 Silence warning on Mac - we know Mac doesn't support hwloc, and so it doesn't matter if a VM hole isn't found. It also doesn't matter in general as all it really means is that we have to turn the hwloc shmem support "off".
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-02 20:20:45 -06:00
Gilles Gouaillardet
2216b80b82 README: remove references to the removed coll/hierarch module
Fixes open-mpi/ompi@4005

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-03 09:15:38 +09:00
Brian Barrett
0fba0f4f80 Merge pull request #4008 from open-mpi/revert-3960-topic/treematch
Revert "Topic/treematch"
2017-08-02 15:25:32 -07:00
Howard Pritchard
5ce07a6983 Merge pull request #3997 from hppritcha/topic/swat_compiler_warning
btl/ugni: swat compiler warning
2017-08-02 15:44:09 -06:00
Brian Barrett
1ec3fd38be Revert "Topic/treematch"
2017-08-02 14:40:55 -07:00
Howard Pritchard
d08be74573 SLURM: launch all processes via slurmd
It turns out that the approach of having the HNP do the
fork/exec of MPI ranks on the head node in a SLURM environment
introduces problems when users/sysadmins want to use the SLURM
scancel tool or sbatch --signal option to signal a job.

This commit disables use of the HNP fork/exec procedure when
a job is launched into a SLURM controlled allocation.

update NEWS with a blurb about new ras framework mca parameter.

related to #3998

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-02 14:56:55 -06:00
bosilca
d6048af915 Merge pull request #3960 from bosilca/topic/treematch
Update OMPI support for topologies and reordering.
2017-08-02 12:47:23 -04:00
Ralph Castain
355c71b6b0 Merge pull request #4004 from artpol84/pmix_instdirs/master
pmix: fix PMIx envar name for the installation prefix.
2017-08-02 06:29:32 -06:00
Artem Polyakov
71da0fcbef plm/rsh: Propagate PMIx prefix to orted's
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:06:13 +03:00
Artem Polyakov
500c8be888 pmix: fix PMIx envar name for the installation prefix.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:03:36 +03:00
Nathan Hjelm
31171d04f1 Merge pull request #4000 from hjelmn/sync_check
config: remove erroneous define
2017-08-01 16:01:44 -06:00
Ralph Castain
f39ce67982 Merge pull request #3951 from rhc54/topic/hwloc2
Update to hwloc 2.0.0a
2017-08-01 15:18:31 -06:00
Ralph Castain
69612b3e2a Merge pull request #3990 from rhc54/topic/p2
Move handling of OPAL_PREFIX to PMIX_PREFIX down into embedded PMIx integration code
2017-08-01 15:13:59 -06:00
Nathan Hjelm
35c9b93754 config: remove erroneous define
This removes a copy-and-paste error where we were setting the
OPAL_ASM_SYNC_HAVE_64BIT more than once.

References #3993. Close when on master and v3.0.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-01 14:53:35 -06:00
Brian Barrett
c4ae36f971 Merge pull request #3869 from Zzzoom/find_freq_bogomips
opal: Get x86 TSC frequency from bogomips
2017-08-01 13:23:21 -07:00
Howard Pritchard
12a5aacdfd btl/ugni: swat compiler warning
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-01 12:21:57 -06:00
Howard Pritchard
b0c82737c5 Merge pull request #3994 from hppritcha/topic/swat_issue_3968
oshmem: fix issue with shmem_g c11 generics
2017-08-01 11:29:14 -06:00
Howard Pritchard
1d612da1cb oshmem: fix issue with shmem_g c11 generics
There was a typo in the shmem_g c11 generic interface
in shmem.h.in

Thanks to @nspark for reporting the problem and
specifying the fix.

Fixes #3968

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-01 09:58:20 -06:00
Ralph Castain
8f34fa4a56 Move the detection of OPAL_PREFIX and subsequent posting of PMIX_PREFIX to the internal integration code for PMIx so we only do this when running with the embedded PMIx
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-01 08:24:27 -06:00
Ralph Castain
e94786f4b7 Revert "Check for OPAL_PREFIX and set corresponding PMIX_PREFIX if found"
This reverts commit 3744967adb.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-01 08:14:12 -06:00
KAWASHIMA Takahiro
a7a30424cb Merge pull request #3982 from kawashima-fj/pr/comm-set-error
communicator: Refine `ompi_comm_set` error check
2017-07-31 18:42:47 -05:00
Ralph Castain
08299794a7 Merge pull request #3983 from rhc54/topic/prefix
Check for OPAL_PREFIX and set corresponding PMIX_PREFIX if found
2017-07-31 11:40:53 -06:00
Sylvain Jeaugey
eee494fc8a common/cuda: Fix near-hang when remote side has exited
Ignore errors caused by remote side having exited when closing CUDA IPC mappings.
openmpi/ompi#3244

Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com>
2017-07-31 10:34:45 -07:00
Ralph Castain
3744967adb Check for OPAL_PREFIX and set corresponding PMIX_PREFIX if found
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-31 09:14:01 -06:00
KAWASHIMA Takahiro
3eac4b0c9a communicator: Refine ompi_comm_set error check
The `ompi_comm_set` function never sets `NULL` to its first argument
`ncomm`.  So `NULL` check is unnecessary in its callers. Furthermore,
`NULL` check may obscure a real return code when an error occurs
if the variable is initialized to a `NULL` value.

Also, `NULL` check is added in the `ompi_comm_set` function to
avoid segmentation fault in an out-of-memory condition.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-07-31 20:26:51 +09:00
KAWASHIMA Takahiro
ebc4eb347c Merge pull request #3701 from kawashima-fj/pr/non-pml-persistent
ompi/request: Support non-PML persistent requests
2017-07-31 02:36:17 -05:00
Ralph Castain
0c8a73a53c Merge pull request #3978 from karasevb/fix_hangs_pmix1
pmix: fixed immediate request
2017-07-28 11:28:18 -05:00
Boris Karasev
e20b581529 pmix: fixed immediate request
This commit fixes a hang when using external PMIx v1 module

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-28 15:53:48 +06:00
Gilles Gouaillardet
825116044e hwloc/base: fix info message for opal_hwloc_base_binding_policy
if np > 2, the default binding is now "numa"

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-28 11:17:15 +09:00
Brian Barrett
fe8e4a0402 dist: Autogenerate AUTHORS file
Per discussion at the Summer 2017 developers meeting, generate
the AUTHORS list at make dist time, rather than trying to
keep it up to date and merge on the branches by hand.  While
most of the data is generated from git, the organization list
was maintained by hand.  The general feeling at the meeting was
that the organization list was not adding value and there were
concrete cases where it involved much chasing by the RMs, so
it has been removed.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-07-27 13:04:17 -07:00
Edgar Gabriel
d93dae326e Merge pull request #3959 from edgargabriel/topic/performance-fixes
Topic/performance fixes
2017-07-27 09:51:57 -05:00
Piotr Lesnicki
3fa7aabf89 fix srun latency, change default yield_when_idle=0
This changes the default to 0, to avoid yields during progress in srun.

In mpirun, ompi_mpi_yield_when_idle is set to 1 if oversubscribed and
to 0 otherwise. The default, however, is 1, and srun uses that default.
Now srun and mpirun have the same latency in non-oversubscribed cases.

Signed-off-by: Piotr Lesnicki <piotr.lesnicki@atos.net>
2017-07-27 09:41:48 +02:00
Guillaume Mercier
a66dc811b2
Check if topo weighted in case of partially distrib case
2017-07-26 11:54:24 -04:00
George Bosilca
8a7f0baee0
Fix call to opal_hwloc_base_get_topology.
Make sure the HWLOC topology is available as early as possible, so that
we can fail gracefully.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-26 11:54:24 -04:00
George Bosilca
6061454055
Fix a typo in the copyright.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-26 11:54:24 -04:00
George Bosilca
911850d82e
Fix all warnings.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-26 11:54:24 -04:00
George Bosilca
2c00c4209a
Update to the latest version provided by Guillaume.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-26 11:54:23 -04:00
George Bosilca
fc21ffadc9
Cleaning and optimizations.
Including variable renaming and loop merging.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-26 11:54:23 -04:00
George Bosilca
081f9bc8db
Use OPAL random generator.
This fix is related to issue #1877, and prevents the OMPI library from
messing the user level random values.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-26 11:54:23 -04:00
George Bosilca
fbe6c22b90
Make sure the gather is called in all cases, and not
simply based on some local state. This is the second
part of the patch proposed for open-mpi/ompi#1183.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-26 11:52:47 -04:00
Jeff Squyres
d954167ecf Merge pull request #3881 from bharatpotnuri/master
master: btl/openib: Handle EOPNOTSUPP
2017-07-26 11:32:40 -04:00
Boris Karasev
cc348fdb40 configure: adds detect UCX by pkg-config
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-26 17:58:24 +06:00
Ralph Castain
6ebaed8c01 Restore support for user-provided cpulist
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 23:51:21 -07:00
Ralph Castain
7a83fdb9bb Update to hwloc 2.0.0a with shmem support.
Update to support passing of HWLOC shmem topology to client procs
Update use of distance API per @bgoglin
Have the openib component lookup its object in the distance matrix
Bring usnic up-to-date
Restore binding for hwloc2

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 20:26:22 -07:00
Ralph Castain
6fe5b36b50 Merge pull request #3963 from rhc54/topic/hwfix
Restore binding support
2017-07-25 22:09:04 -05:00
Ralph Castain
9211b5d86d Merge pull request #3961 from rhc54/topic/tool
Update the tools support so it allows tools to access PMIx
2017-07-25 21:06:07 -05:00
Ralph Castain
96f07aebfa Restore binding support
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 18:44:44 -07:00
Ralph Castain
0042c758f1 Update the tools support so it allows tools to access PMIx
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 17:10:08 -07:00
Boris Karasev
d917d54ddc configure: detect UCX support by default
Adds detecting UCX from following paths: "/usr /usr/local /opt/ucx"

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-25 23:48:49 +03:00