1
1
Граф коммитов

28067 Коммитов

Автор SHA1 Сообщение Дата
Nathan Hjelm
8b8aae372d opal/asm: add atomic min/max convenience functions
This commit adds atomic functions for min/max.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-01-02 08:38:36 -07:00
Gilles Gouaillardet
e2808c9ccc
Merge pull request #4673 from ggouaillardet/topic/ompi_comm_split
ompi/communicator: optimize ompi_comm_split()
2017-12-28 16:23:00 +09:00
Gilles Gouaillardet
2dd345465f ompi/communicator: optimize ompi_comm_split()
set grp_local_rank as MPI_UNDEFINED before invoking
ompi_comm_nexcid() in order to benefit from the optimizations
introduced in open-mpi/ompi@68167ec879

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-28 15:34:04 +09:00
Yossi Itigin
70a2098493
Merge pull request #4655 from yosefe/topic/spml-ucx-fix-rkey-leak
spml_ucx: fix rkey leak
2017-12-27 14:52:31 +02:00
Yossi Itigin
1193e1eb83 spml_ucx: fix rkey leak
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-12-26 20:47:26 +02:00
Yossi Itigin
697a9437e2
Merge pull request #4666 from alex-mikheev/topic/pml_ucx_recv_fix
ompi: pml ucx: improve recv latency
2017-12-26 20:12:18 +02:00
Alex Mikheev
e7bf0617cf
ompi: pml ucx: improve recv latency
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-12-26 16:24:16 +02:00
KAWASHIMA Takahiro
91c4cad343
Merge pull request #4668 from kawashima-fj/pr/f08-set-cancelled
fortran: Call PMPI from PMPI_Status_set_cancelled_f08
2017-12-26 03:12:48 -06:00
KAWASHIMA Takahiro
bd2fe9c324 fortran: Call PMPI from PMPI_Status_set_cancelled_f08
This is a bug which was forgotten to change in c08f97b030.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-12-26 15:53:12 +09:00
Nathan Hjelm
39d598899b rcache/grdma: fix crash when part of a registration is unmapped
This commit fixes an issue when a registration is created for a large
region and then invalidated while part of it is in use.

References #4509

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-22 10:36:35 -07:00
KAWASHIMA Takahiro
4cd917a3fd
Merge pull request #4659 from yanagibashi/pr/fortran2008
Add missing Fortran 2008 binding subroutines
2017-12-22 01:53:36 -06:00
Tsubasa Yanagibashi
3f4b373856 Add missing Fortran 2008 binding subroutines
added missing Fortran 2008 binding pmpi_{*} subroutines to Open MPI.

Signed-off-by: Tsubasa Yanagibashi <fj2505dt@aa.jp.fujitsu.com>
2017-12-22 13:45:53 +09:00
Matias Cabral
a600525836
Merge pull request #4653 from aravindksg/ofi_regression_fix
MTL OFI: Allow retries in MTL progress for interrupted syscalls
2017-12-21 10:55:40 -08:00
Aravind Gopalakrishnan
fb68726baf MTL OFI: Allow retries in MTL progress for interrupted syscalls
This fixes a regression in sockets provider which could return -EINTR value
from fi_cq_read() due to a syscall being interrupted. The error value is
currently interpreted as fatal condition. Relax the rule so that we can retry
fi_cq_read() operation.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-12-20 14:58:49 -08:00
Ralph Castain
3f7494aec1
Merge pull request #4648 from rhc54/topic/cleanup
Silence warnings in optimized build
2017-12-20 14:00:22 -08:00
Ralph Castain
d5471d7898 Silence warnings in optimized build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-20 12:00:28 -08:00
Ralph Castain
ad96fa19d4
Merge pull request #4642 from rhc54/topic/validate
Detect/warn of illegal node names
2017-12-20 10:18:43 -08:00
Brian Barrett
465842294f doc: Add README note about ARM/POWER hangs
As documented in #4563 and #3697, there is an issue on ARM and
POWER platforms when the atomic fifo assembly isn't inlined,
which manifests as a hang.  Document the issue and the
work-around until a proper fix is committed.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-12-20 09:47:27 -08:00
Ralph Castain
1687b04c9e
Merge pull request #4645 from rhc54/topic/debug
Remove debug from rmaps base
2017-12-20 07:53:51 -08:00
Ralph Castain
8a7a57d4e2 Remove debug from rmaps base
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-20 00:22:51 -08:00
Ralph Castain
3269f2de66 Detect/warn of illegal node names
If we detect that someone has given us an incorrect node name, provide a helpful message telling them as it is almost certainly a typo.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-19 12:55:04 -08:00
Ralph Castain
b37315658b
Merge pull request #4636 from rhc54/topic/attrs
Fix the optnone attribute, add extension attribute
2017-12-19 10:18:59 -08:00
Ralph Castain
ccc2fcdfdf
Merge pull request #4627 from ggouaillardet/topic/nidmap
orte/nidmap: correctly handle '-' as a valid hostname character
2017-12-19 09:09:58 -08:00
Ralph Castain
db8ebd33ad Fix the optnone attribute, add extension attribute
See how the various compilers handle these

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 19:18:53 -08:00
Ralph Castain
e9f4e93800
Merge pull request #4606 from rhc54/topic/register
Update to PMIx v3.0 PR for cleanup registration
2017-12-18 07:57:07 -08:00
Nathan Hjelm
47fd2313ab btl/vader: move backing files into /dev/shm on Linux
This commit moves the backing files to /dev/shm to avoid limitations
that may be set on /tmp. The files are registered with pmix to ensure
they are cleaned up after an erroneous exit.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 48101278160672317ade352365592f56ef3b8977)
2017-12-18 07:09:18 -08:00
Ralph Castain
07427c6d89 Update to PMIx v3.0 PR for cleanup registration
If available, have apps use registration capability to cleanup their session directories. Setup capability for vader to register its shared memory file location - let someone familiar with that code do so.

Final cleanup to track uid/gid, update the opal/pmix API to pass flags for ignore and leave top directory alone

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 06:53:11 -08:00
Ralph Castain
a863c26d6f
Merge pull request #4628 from rhc54/topic/treespawn
Fix the tree-spawn-with-rollup
2017-12-18 06:48:10 -08:00
Jeff Squyres
3d80794df2
Merge pull request #4631 from jedbrown/jed/doc-fix-mpi-attr-get
MPI_Attr_get: doc fix: MPI_Comm_create_attr -> MPI_Comm_get_attr
2017-12-17 11:23:03 -05:00
Jed Brown
533800070e MPI_Attr_get: doc fix: MPI_Comm_create_attr -> MPI_Comm_get_attr
MPI_Comm_create_attr does not exist.

Signed-off-by: Jed Brown <jed@jedbrown.org>
2017-12-17 07:44:22 -07:00
Ralph Castain
7a58f91ab9 Fix the tree-spawn-with-rollup
Somehow, the code for passing a daemon's parent was accidentally removed, thus breaking the tree-spawn callback sequence and causing all daemons to phone directly home. Note that this is noticeably slower than no-tree-spawn for small clusters where directly ssh launch of the child daemons from the HNP doesn't overload the available file descriptors.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-15 16:03:43 -08:00
Gilles Gouaillardet
f3e2a313af orte/nidmap: correctly handle '-' as a valid hostname character
'-' is not an alpha character nor a digit, but it is a valid hostname
character and should be handled as an alpha character, otherwise, nodes
such as node-001 do not get "compressed" in the regex.

Refs open-mpi/ompi#4621

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-15 15:28:50 +09:00
Brian Barrett
ea35820246 dist: Update NEWS to match release branches
Pull in changes from the v2.0x, v2.x, and v3.0.x release branches
so that master includes all items from released releases.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-12-14 15:11:11 -08:00
Ralph Castain
fc1acea533
Merge pull request #4626 from rhc54/topic/optnone
Add the __optnone__ attribute
2017-12-14 15:01:56 -08:00
Ralph Castain
5c4185abd8 Add the __optnone__ attribute to help avoid optimizing out MPIR_Breakpoint
Thanks to @kiranchandramohan for the suggestion

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-14 13:14:21 -08:00
Ralph Castain
6ec4ab57fc
Merge pull request #4620 from rhc54/topic/closefd
Close the shmemfd to avoid leaking it
2017-12-13 11:09:35 -08:00
Ralph Castain
cfa810f125 Close the shmemfd to avoid leaking it
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-13 10:20:05 -08:00
Ralph Castain
c4501185b7
Merge pull request #4614 from rhc54/topic/hwloc
Silence error messages and ensure we still support binding
2017-12-12 12:57:07 -08:00
Ralph Castain
9a7b0d8d9c
Merge pull request #4586 from rhc54/topic/addhosts
Fix add-host support by including the location for procs of prior jobs when spawning new daemons.
2017-12-12 12:45:57 -08:00
Ralph Castain
84c51847b1 Silence error messages and ensure we still support binding, even if shmem support for hwloc isn't available
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-12 12:11:26 -08:00
Nathan Hjelm
d3fa1bbbb0 rcache/grdma: try to prevent erroneous free error messages
It is possible to have parts of an in-use registered region be passed
to munmap or madvise. This does not necessarily mean the user has made
an error but does mean the entire region should be invalidated. This
commit checks that the munmap or madvise base matches the beginning of
the cached region. If it does and the region is in-use then we print
an error. There will certainly be false-negatives where a user
unmaps something that really is in-use but that is preferrable to a
false-positive.

References #4509

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-12 09:18:39 -07:00
Nathan Hjelm
1e630f4a46 configure: check for C11 and atomic types
This commit updates the configure code for Open MPI to check for C11
support. The features requested are: atomics and thread local
storage.

References #3879

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-11 15:27:12 -07:00
Nathan Hjelm
a82f761a4a btl/vader: change the way fast boxes are used
There were multiple paths that could lead to a fast box
allocation. One of them made little sense (in-place send) so it has
been removed to allow a rework of the fast-box send function. This
should fix a number of issues with hanging/crashing when using the
vader btl.

References #4260

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-11 10:38:33 -07:00
Gilles Gouaillardet
3e12cfff03
Merge pull request #4593 from ggouaillardet/topic/USEMPIF08_MOD
mpiext: fix path to Fortran 2008 modules
2017-12-11 11:49:50 +09:00
Gilles Gouaillardet
794cc09d3e mpiext: fix path to Fortran 2008 modules
OMPI_FORTRAN_USEMPIF08_MOD macro was removed in open-mpi/ompi@791bcee6c0
so this macro is now manually expanded to mpi/fortran/use-mpi-f08/mod

Thanks to Nathan T. Weeks for reporting

Refs open-mpi/ompi#3605

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-11 11:02:43 +09:00
Ralph Castain
f9b15797df
Merge pull request #4585 from rhc54/topic/signal
Ensure we don't send a kill signal to pid=0 as that hits ourselves and initiates an infinite loop.
2017-12-07 16:30:55 -08:00
Ralph Castain
4316213805 Fix add-host support by including the location for procs of prior jobs when spawning new daemons.
Thanks to CalugaruVaxile for the report

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-07 14:48:58 -08:00
Ralph Castain
ee2a93cb2e Ensure we don't send a kill signal to pid=0 as that hits ourselves and initiates an infinite loop.
Thanks to Michael Fenn for the report.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-07 10:38:11 -08:00
Jeff Squyres
0d1c58853b
Merge pull request #4579 from bwbarrett/examples-build
build: Clean up flag handling for examples
2017-12-07 08:29:30 -08:00
Gilles Gouaillardet
11e5f86bf8 mpool/base: plug a memory leak
set the key of all mpool_tree_item objects, so they can be retrieved
in mpool_base_free and then returned back to the
mca_mpool_base_tree_item_free_list free list.

Refs. open-mpi/ompi#4567

Thanks Philip Blakely for the bug report.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-07 09:06:25 +09:00