Commit Graph

28051 Commits

Author SHA1 Message Date
Ralph Castain
ad96fa19d4
Merge pull request #4642 from rhc54/topic/validate
Detect/warn of illegal node names
2017-12-20 10:18:43 -08:00
Brian Barrett
465842294f doc: Add README note about ARM/POWER hangs
As documented in #4563 and #3697, there is an issue on ARM and
POWER platforms when the atomic fifo assembly isn't inlined,
which manifests as a hang.  Document the issue and the
work-around until a proper fix is committed.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-12-20 09:47:27 -08:00
Ralph Castain
1687b04c9e
Merge pull request #4645 from rhc54/topic/debug
Remove debug from rmaps base
2017-12-20 07:53:51 -08:00
Ralph Castain
8a7a57d4e2 Remove debug from rmaps base
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-20 00:22:51 -08:00
Ralph Castain
3269f2de66 Detect/warn of illegal node names
If we detect that someone has given us an incorrect node name, provide a helpful message telling them as it is almost certainly a typo.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-19 12:55:04 -08:00
Ralph Castain
b37315658b
Merge pull request #4636 from rhc54/topic/attrs
Fix the optnone attribute, add extension attribute
2017-12-19 10:18:59 -08:00
Ralph Castain
ccc2fcdfdf
Merge pull request #4627 from ggouaillardet/topic/nidmap
orte/nidmap: correctly handle '-' as a valid hostname character
2017-12-19 09:09:58 -08:00
Ralph Castain
db8ebd33ad Fix the optnone attribute, add extension attribute
See how the various compilers handle these

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 19:18:53 -08:00
Ralph Castain
e9f4e93800
Merge pull request #4606 from rhc54/topic/register
Update to PMIx v3.0 PR for cleanup registration
2017-12-18 07:57:07 -08:00
Nathan Hjelm
47fd2313ab btl/vader: move backing files into /dev/shm on Linux
This commit moves the backing files to /dev/shm to avoid limitations
that may be set on /tmp. The files are registered with pmix to ensure
they are cleaned up after an erroneous exit.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 48101278160672317ade352365592f56ef3b8977)
2017-12-18 07:09:18 -08:00
Ralph Castain
07427c6d89 Update to PMIx v3.0 PR for cleanup registration
If available, have apps use registration capability to cleanup their session directories. Setup capability for vader to register its shared memory file location - let someone familiar with that code do so.

Final cleanup to track uid/gid, update the opal/pmix API to pass flags for ignore and leave top directory alone

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 06:53:11 -08:00
Ralph Castain
a863c26d6f
Merge pull request #4628 from rhc54/topic/treespawn
Fix the tree-spawn-with-rollup
2017-12-18 06:48:10 -08:00
Jeff Squyres
3d80794df2
Merge pull request #4631 from jedbrown/jed/doc-fix-mpi-attr-get
MPI_Attr_get: doc fix: MPI_Comm_create_attr -> MPI_Comm_get_attr
2017-12-17 11:23:03 -05:00
Jed Brown
533800070e MPI_Attr_get: doc fix: MPI_Comm_create_attr -> MPI_Comm_get_attr
MPI_Comm_create_attr does not exist.

Signed-off-by: Jed Brown <jed@jedbrown.org>
2017-12-17 07:44:22 -07:00
Ralph Castain
7a58f91ab9 Fix the tree-spawn-with-rollup
Somehow, the code for passing a daemon's parent was accidentally removed, thus breaking the tree-spawn callback sequence and causing all daemons to phone directly home. Note that this is noticeably slower than no-tree-spawn for small clusters where direct ssh launch of the child daemons from the HNP doesn't overload the available file descriptors.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-15 16:03:43 -08:00
Gilles Gouaillardet
f3e2a313af orte/nidmap: correctly handle '-' as a valid hostname character
'-' is not an alpha character nor a digit, but it is a valid hostname
character and should be handled as an alpha character, otherwise, nodes
such as node-001 do not get "compressed" in the regex.

Refs open-mpi/ompi#4621

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-15 15:28:50 +09:00
Brian Barrett
ea35820246 dist: Update NEWS to match release branches
Pull in changes from the v2.0x, v2.x, and v3.0.x release branches
so that master includes all items from released releases.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-12-14 15:11:11 -08:00
Ralph Castain
fc1acea533
Merge pull request #4626 from rhc54/topic/optnone
Add the __optnone__ attribute
2017-12-14 15:01:56 -08:00
Ralph Castain
5c4185abd8 Add the __optnone__ attribute to help avoid optimizing out MPIR_Breakpoint
Thanks to @kiranchandramohan for the suggestion

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-14 13:14:21 -08:00
Ralph Castain
6ec4ab57fc
Merge pull request #4620 from rhc54/topic/closefd
Close the shmemfd to avoid leaking it
2017-12-13 11:09:35 -08:00
Ralph Castain
cfa810f125 Close the shmemfd to avoid leaking it
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-13 10:20:05 -08:00
Ralph Castain
c4501185b7
Merge pull request #4614 from rhc54/topic/hwloc
Silence error messages and ensure we still support binding
2017-12-12 12:57:07 -08:00
Ralph Castain
9a7b0d8d9c
Merge pull request #4586 from rhc54/topic/addhosts
Fix add-host support by including the location for procs of prior jobs when spawning new daemons.
2017-12-12 12:45:57 -08:00
Ralph Castain
84c51847b1 Silence error messages and ensure we still support binding, even if shmem support for hwloc isn't available
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-12 12:11:26 -08:00
Nathan Hjelm
d3fa1bbbb0 rcache/grdma: try to prevent erroneous free error messages
It is possible to have parts of an in-use registered region be passed
to munmap or madvise. This does not necessarily mean the user has made
an error but does mean the entire region should be invalidated. This
commit checks that the munmap or madvise base matches the beginning of
the cached region. If it does and the region is in-use then we print
an error. There will certainly be false-negatives where a user
unmaps something that really is in-use but that is preferable to a
false-positive.

References #4509

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-12 09:18:39 -07:00
Nathan Hjelm
1e630f4a46 configure: check for C11 and atomic types
This commit updates the configure code for Open MPI to check for C11
support. The features requested are: atomics and thread local
storage.

References #3879

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-11 15:27:12 -07:00
Nathan Hjelm
a82f761a4a btl/vader: change the way fast boxes are used
There were multiple paths that could lead to a fast box
allocation. One of them made little sense (in-place send) so it has
been removed to allow a rework of the fast-box send function. This
should fix a number of issues with hanging/crashing when using the
vader btl.

References #4260

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-11 10:38:33 -07:00
Gilles Gouaillardet
3e12cfff03
Merge pull request #4593 from ggouaillardet/topic/USEMPIF08_MOD
mpiext: fix path to Fortran 2008 modules
2017-12-11 11:49:50 +09:00
Gilles Gouaillardet
794cc09d3e mpiext: fix path to Fortran 2008 modules
OMPI_FORTRAN_USEMPIF08_MOD macro was removed in open-mpi/ompi@791bcee6c0
so this macro is now manually expanded to mpi/fortran/use-mpi-f08/mod

Thanks to Nathan T. Weeks for reporting

Refs open-mpi/ompi#3605

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-11 11:02:43 +09:00
Ralph Castain
f9b15797df
Merge pull request #4585 from rhc54/topic/signal
Ensure we don't send a kill signal to pid=0 as that hits ourselves and initiates an infinite loop.
2017-12-07 16:30:55 -08:00
Ralph Castain
4316213805 Fix add-host support by including the location for procs of prior jobs when spawning new daemons.
Thanks to CalugaruVaxile for the report

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-07 14:48:58 -08:00
Ralph Castain
ee2a93cb2e Ensure we don't send a kill signal to pid=0 as that hits ourselves and initiates an infinite loop.
Thanks to Michael Fenn for the report.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-07 10:38:11 -08:00
Jeff Squyres
0d1c58853b
Merge pull request #4579 from bwbarrett/examples-build
build: Clean up flag handling for examples
2017-12-07 08:29:30 -08:00
Gilles Gouaillardet
11e5f86bf8 mpool/base: plug a memory leak
set the key of all mpool_tree_item objects, so they can be retrieved
in mpool_base_free and then returned back to the
mca_mpool_base_tree_item_free_list free list.

Refs. open-mpi/ompi#4567

Thanks Philip Blakely for the bug report.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-07 09:06:25 +09:00
Brian Barrett
97294dca4a build: Clean up flag handling for examples
Fix ability to build examples from 32-bit builds.  Remove the implicit
rule usage, so that we know what flags are being used.  Make the override
of the FLAGS variables additive so that we don't wipe out FLAGS variables
set in the environment.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-12-06 19:43:41 +00:00
Nathan Hjelm
2e74befa13
Merge pull request #4565 from benmenadue/master
Use malloc instead of posix_memalign for small (<= sizeof(void *)) alignments
2017-12-05 16:35:24 -07:00
Nathan Hjelm
ad59b93266
Merge pull request #4566 from kawashima-fj/pr/arm64-atomic
opal/asm/arm64: Fix `opal_atomic_compare_exchange_*` bug
2017-12-05 16:34:51 -07:00
Nathan Hjelm
8e0e184bc9 opal/asm: fix compilation of 128-bit compare-exchange with gcc7
This commit removes eax and edx from the clobber list. Older versions
of gcc handled these ok but gcc 7 does not. They are not required as
eax and edx are specified in output constraints.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-05 15:42:47 -07:00
Ben Menadue
90fa8af10b Use correct alignment request in mca_mpool_base_alloc.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2017-12-06 07:02:17 +11:00
Nathan Hjelm
641bdc4ab7 opal/asm: fix 32-bit build
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-05 11:49:13 -07:00
KAWASHIMA Takahiro
08254e8b12 opal/asm/arm64: Fix opal_atomic_compare_exchange_* bug
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-12-05 15:57:29 +09:00
Ben Menadue
db3e25edad Update mca_mpool_base_alloc to use malloc instead of posix_memalign for alignment requests of <= sizeof(void *). This works around issue #4564.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2017-12-05 09:51:31 +11:00
Matias Cabral
2c86b8723d
Merge pull request #4510 from matcabral/mtl_psm2_shadow_vars
New flag for MCA parameters that allows a default value of "unset".
2017-12-04 12:25:37 -08:00
Howard Pritchard
b160cf6339
Merge pull request #4533 from hppritcha/topic/ofi_mtl_mprobe_fixes
mtl/ofi: fix problem with mprobe/mrecv
2017-12-04 09:11:47 -07:00
Howard Pritchard
2233e44848
Merge pull request #4534 from hppritcha/topic/fix_a_segv_in_request
pml/cm: check for request comp. before completing bsend
2017-12-04 09:09:41 -07:00
Ralph Castain
452b1ca736
Merge pull request #4562 from ggouaillardet/topic/odls_base_num_threads
odls/base: fix handling of the odls_base_num_threads MCA param
2017-12-04 05:39:49 -08:00
Gilles Gouaillardet
4a481f66e6 odls/base: fix orte_odls_base_harvest_threads()
Do not try to finalize odls progress threads if they have not been started yet

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-04 15:18:04 +09:00
Gilles Gouaillardet
d062db1a98 sync_builtin: fix misc typos
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-04 11:59:50 +09:00
Gilles Gouaillardet
3496897961 odls/base: fix handling of the odls_base_num_threads MCA param
If a number of odls threads is explicitly required, then use
that number no matter what.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-04 11:19:25 +09:00
Gilles Gouaillardet
2f5b1e9fe0
Merge pull request #4551 from ggouaillardet/topic/communicator_mutex_c_lock
Make usage of ompi_communicator_t, ompi_file_t and ompi_win_t mutex consistent
2017-12-04 09:20:52 +09:00
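
The orte/nidmap commit above (f3e2a313af) says that '-' must be treated like an alpha character so that names such as node-001 are "compressed". The following is a minimal Python sketch of that idea only, not the ORTE C code: the function names `split_name` and `compress`, and the bracketed output format, are hypothetical and chosen for illustration. It also assumes each name is a letters-and-dashes prefix followed by a run of digits, which is simpler than what real hostnames allow.

```python
import string

# Characters accepted in the non-numeric prefix. '-' is neither a letter nor
# a digit, but it is a valid hostname character, so it is treated as a letter.
HOST_ALPHA = set(string.ascii_letters) | {"-"}
DIGITS = set(string.digits)


def split_name(name):
    """Split 'node-001' into ('node-', '001'); return None if it does not fit."""
    i = len(name)
    while i > 0 and name[i - 1] in DIGITS:
        i -= 1
    prefix, suffix = name[:i], name[i:]
    if not prefix or not suffix or any(c not in HOST_ALPHA for c in prefix):
        return None
    return prefix, suffix


def compress(names):
    """Collapse names sharing a prefix and digit width into one ranged entry."""
    groups = {}      # (prefix, digit width) -> list of numeric suffixes
    leftovers = []   # names that cannot be compressed, kept verbatim
    for name in names:
        parts = split_name(name)
        if parts is None:
            leftovers.append(name)
            continue
        prefix, suffix = parts
        groups.setdefault((prefix, len(suffix)), []).append(int(suffix))

    out = []
    for (prefix, width), nums in groups.items():
        nums = sorted(set(nums))
        ranges = []
        start = prev = nums[0]
        for n in nums[1:]:
            if n == prev + 1:
                prev = n
                continue
            ranges.append((start, prev))
            start = prev = n
        ranges.append((start, prev))
        body = ",".join(
            f"{a:0{width}d}" if a == b else f"{a:0{width}d}-{b:0{width}d}"
            for a, b in ranges
        )
        out.append(f"{prefix}[{body}]")
    return out + leftovers


print(compress(["node-001", "node-002", "node-003", "node-007", "login"]))
```

If '-' were removed from `HOST_ALPHA`, `split_name("node-001")` would return None and every such node would fall through to the uncompressed list, which is the behaviour the commit message describes as the bug.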