28039 Commits

Author SHA1 Message Date
Ralph Castain
a863c26d6f
Merge pull request #4628 from rhc54/topic/treespawn
Fix the tree-spawn-with-rollup
2017-12-18 06:48:10 -08:00
Jeff Squyres
3d80794df2
Merge pull request #4631 from jedbrown/jed/doc-fix-mpi-attr-get
MPI_Attr_get: doc fix: MPI_Comm_create_attr -> MPI_Comm_get_attr
2017-12-17 11:23:03 -05:00
Jed Brown
533800070e MPI_Attr_get: doc fix: MPI_Comm_create_attr -> MPI_Comm_get_attr
MPI_Comm_create_attr does not exist.

Signed-off-by: Jed Brown <jed@jedbrown.org>
2017-12-17 07:44:22 -07:00
Ralph Castain
7a58f91ab9 Fix the tree-spawn-with-rollup
Somehow, the code for passing a daemon's parent was accidentally removed, thus breaking the tree-spawn callback sequence and causing all daemons to phone directly home. Note that this is noticeably slower than no-tree-spawn for small clusters where direct ssh launch of the child daemons from the HNP doesn't overload the available file descriptors.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-15 16:03:43 -08:00
Brian Barrett
ea35820246 dist: Update NEWS to match release branches
Pull in changes from the v2.0x, v2.x, and v3.0.x release branches
so that master includes all items from released releases.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-12-14 15:11:11 -08:00
Ralph Castain
fc1acea533
Merge pull request #4626 from rhc54/topic/optnone
Add the __optnone__ attribute
2017-12-14 15:01:56 -08:00
Ralph Castain
5c4185abd8 Add the __optnone__ attribute to help avoid optimizing out MPIR_Breakpoint
Thanks to @kiranchandramohan for the suggestion

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-14 13:14:21 -08:00
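A minimal sketch of the idea behind this commit, not the actual Open MPI change: a debugger hook with an empty or trivial body is a prime candidate for removal by the optimizer, so it is marked as not optimizable. The macro name `DEMO_NO_OPTIMIZE` and the function `demo_breakpoint` are hypothetical; the sketch assumes clang's `optnone` attribute and falls back to GCC's `optimize("O0")` attribute.

```c
/* Hypothetical macro: pick an attribute that keeps the function unoptimized.
 * clang also defines __GNUC__, so it has to be tested first. */
#if defined(__clang__)
#  define DEMO_NO_OPTIMIZE __attribute__((optnone, noinline))
#elif defined(__GNUC__)
#  define DEMO_NO_OPTIMIZE __attribute__((optimize("O0"), noinline))
#else
#  define DEMO_NO_OPTIMIZE
#endif

/* Counter used only so the demo has an observable effect. */
static volatile int demo_breakpoint_hits = 0;

/* Stand-in for a debugger hook such as MPIR_Breakpoint: a debugger sets a
 * breakpoint on this symbol, so the call must survive optimization. */
DEMO_NO_OPTIMIZE void demo_breakpoint(void)
{
    demo_breakpoint_hits++;
}
```

A debugger would set a breakpoint on `demo_breakpoint`; the attribute only helps keep the function body and its call sites intact.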
Ralph Castain
6ec4ab57fc
Merge pull request #4620 from rhc54/topic/closefd
Close the shmemfd to avoid leaking it
2017-12-13 11:09:35 -08:00
Ralph Castain
cfa810f125 Close the shmemfd to avoid leaking it
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-13 10:20:05 -08:00
Ralph Castain
c4501185b7
Merge pull request #4614 from rhc54/topic/hwloc
Silence error messages and ensure we still support binding
2017-12-12 12:57:07 -08:00
Ralph Castain
9a7b0d8d9c
Merge pull request #4586 from rhc54/topic/addhosts
Fix add-host support by including the location for procs of prior jobs when spawning new daemons.
2017-12-12 12:45:57 -08:00
Ralph Castain
84c51847b1 Silence error messages and ensure we still support binding, even if shmem support for hwloc isn't available
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-12 12:11:26 -08:00
Nathan Hjelm
d3fa1bbbb0 rcache/grdma: try to prevent erroneous free error messages
It is possible to have parts of an in-use registered region be passed
to munmap or madvise. This does not necessarily mean the user has made
an error but does mean the entire region should be invalidated. This
commit checks that the munmap or madvise base matches the beginning of
the cached region. If it does and the region is in-use then we print
an error. There will certainly be false negatives where a user
unmaps something that really is in-use but that is preferable to a
false positive.

References #4509

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-12 09:18:39 -07:00
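The decision described in this commit message can be sketched as below. This is an illustration under assumptions, not the rcache/grdma code: the types `demo_region_t` and `demo_action_t` and the function `demo_on_release` are hypothetical, and a region is reduced to a base address, a length, and an in-use flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a cached registration. */
typedef struct {
    uintptr_t base;
    size_t len;
    bool in_use;
} demo_region_t;

typedef enum {
    DEMO_NO_OVERLAP,          /* released range does not touch the region */
    DEMO_INVALIDATE,          /* invalidate the whole region silently */
    DEMO_INVALIDATE_AND_WARN  /* invalidate and report a likely user error */
} demo_action_t;

/* Decide what to do when [base, base + len) is passed to munmap or madvise. */
static demo_action_t demo_on_release(const demo_region_t *reg,
                                     uintptr_t base, size_t len)
{
    bool overlaps = base < reg->base + reg->len && reg->base < base + len;

    if (!overlaps) {
        return DEMO_NO_OVERLAP;
    }
    /* Only warn when the released base matches the start of an in-use
     * region; any other partial overlap is invalidated without a message. */
    if (reg->in_use && base == reg->base) {
        return DEMO_INVALIDATE_AND_WARN;
    }
    return DEMO_INVALIDATE;
}
```

Restricting the warning to an exact base match trades missed reports for fewer spurious ones, which is the trade-off the commit message states.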
Nathan Hjelm
1e630f4a46 configure: check for C11 and atomic types
This commit updates the configure code for Open MPI to check for C11
support. The features requested are: atomics and thread local
storage.

References #3879

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-11 15:27:12 -07:00
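As a rough illustration of what such a check looks for, the following translation unit compiles only when the compiler provides C11 atomics and thread-local storage. It is a sketch, not the test program used by the Open MPI configure script; the names `demo_tls_counter`, `demo_shared`, and `demo_probe` are hypothetical.

```c
/* Fail the build early if C11 or its optional atomics are unavailable. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 201112L
#  error "C11 is required"
#endif
#if defined(__STDC_NO_ATOMICS__)
#  error "C11 atomics are not available"
#endif

#include <stdatomic.h>

/* One instance per thread (thread local storage). */
static _Thread_local int demo_tls_counter = 0;

/* Shared between threads, updated atomically. */
static atomic_int demo_shared = 0;

/* Touch both features so the compiler and linker must support them. */
static int demo_probe(void)
{
    demo_tls_counter++;
    atomic_fetch_add(&demo_shared, 1);
    return demo_tls_counter + atomic_load(&demo_shared);
}
```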
Nathan Hjelm
a82f761a4a btl/vader: change the way fast boxes are used
There were multiple paths that could lead to a fast box
allocation. One of them made little sense (in-place send) so it has
been removed to allow a rework of the fast-box send function. This
should fix a number of issues with hanging/crashing when using the
vader btl.

References #4260

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-11 10:38:33 -07:00
Gilles Gouaillardet
3e12cfff03
Merge pull request #4593 from ggouaillardet/topic/USEMPIF08_MOD
mpiext: fix path to Fortran 2008 modules
2017-12-11 11:49:50 +09:00
Gilles Gouaillardet
794cc09d3e mpiext: fix path to Fortran 2008 modules
OMPI_FORTRAN_USEMPIF08_MOD macro was removed in open-mpi/ompi@791bcee6c0
so this macro is now manually expanded to mpi/fortran/use-mpi-f08/mod

Thanks to Nathan T. Weeks for reporting

Refs open-mpi/ompi#3605

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-11 11:02:43 +09:00
Ralph Castain
f9b15797df
Merge pull request #4585 from rhc54/topic/signal
Ensure we don't send a kill signal to pid=0 as that hits ourselves and initiates an infinite loop.
2017-12-07 16:30:55 -08:00
Ralph Castain
4316213805 Fix add-host support by including the location for procs of prior jobs when spawning new daemons.
Thanks to CalugaruVaxile for the report

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-07 14:48:58 -08:00
Ralph Castain
ee2a93cb2e Ensure we don't send a kill signal to pid=0 as that hits ourselves and initiates an infinite loop.
Thanks to Michael Fenn for the report.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-07 10:38:11 -08:00
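For context: POSIX `kill()` treats `pid == 0` as "every process in the caller's process group", which includes the caller. A minimal guard might look like the sketch below. The helper name `demo_safe_kill` is hypothetical, and rejecting all non-positive pids (not only zero) is an assumption made for this example rather than what the commit necessarily does.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: refuse pids that address a process group or
 * "all processes" instead of a single child. */
static int demo_safe_kill(pid_t pid, int sig)
{
    if (pid <= 0) {
        /* pid == 0 would signal our own process group, including us. */
        errno = EINVAL;
        return -1;
    }
    return kill(pid, sig);
}
```

Calling `demo_safe_kill(child_pid, SIGTERM)` behaves like `kill()` for a real child pid, while an uninitialized pid of 0 is rejected instead of signalling the caller.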
Jeff Squyres
0d1c58853b
Merge pull request #4579 from bwbarrett/examples-build
build: Clean up flag handling for examples
2017-12-07 08:29:30 -08:00
Gilles Gouaillardet
11e5f86bf8 mpool/base: plug a memory leak
set the key of all mpool_tree_item objects, so they can be retrieved
in mpool_base_free and then returned back to the
mca_mpool_base_tree_item_free_list free list.

Refs. open-mpi/ompi#4567

Thanks Philip Blakely for the bug report.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-07 09:06:25 +09:00
Brian Barrett
97294dca4a build: Clean up flag handling for examples
Fix ability to build examples from 32-bit builds.  Remove the implicit
rule usage, so that we know what flags are being used.  Make the override
of the FLAGS variables additive so that we don't wipe out FLAGS variables
set in the environment.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-12-06 19:43:41 +00:00
Nathan Hjelm
2e74befa13
Merge pull request #4565 from benmenadue/master
Use malloc instead of posix_memalign for small (<= sizeof(void *)) alignments
2017-12-05 16:35:24 -07:00
Nathan Hjelm
ad59b93266
Merge pull request #4566 from kawashima-fj/pr/arm64-atomic
opal/asm/arm64: Fix `opal_atomic_compare_exchange_*` bug
2017-12-05 16:34:51 -07:00
Nathan Hjelm
8e0e184bc9 opal/asm: fix compilation of 128-bit compare-exchange with gcc7
This commit removes eax and edx from the clobber list. Older versions
of gcc handled these ok but gcc 7 does not. They are not required as
eax and edx are specified in output constraints.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-05 15:42:47 -07:00
Ben Menadue
90fa8af10b Use correct alignment request in mca_mpool_base_alloc.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2017-12-06 07:02:17 +11:00
Nathan Hjelm
641bdc4ab7 opal/asm: fix 32-bit build
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-05 11:49:13 -07:00
KAWASHIMA Takahiro
08254e8b12 opal/asm/arm64: Fix opal_atomic_compare_exchange_* bug
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-12-05 15:57:29 +09:00
Ben Menadue
db3e25edad Update mca_mpool_base_alloc to use malloc instead of posix_memalign for alignment requests of <= sizeof(void *). This works around issue #4564.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2017-12-05 09:51:31 +11:00
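The workaround can be sketched as follows. `posix_memalign()` requires the alignment to be a power of two and a multiple of `sizeof(void *)`, while `malloc()` already returns memory aligned for any fundamental type, which on common platforms covers alignments up to `sizeof(void *)`. The function name `demo_aligned_alloc` is hypothetical and this is not the `mca_mpool_base_alloc` implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: small alignment requests go to malloc, larger ones
 * to posix_memalign. The result is released with free() in both cases. */
static void *demo_aligned_alloc(size_t size, size_t align)
{
    void *ptr = NULL;

    if (align <= sizeof(void *)) {
        /* malloc's default alignment is assumed to be sufficient here. */
        return malloc(size);
    }
    /* align must be a power of two multiple of sizeof(void *). */
    if (0 != posix_memalign(&ptr, align, size)) {
        return NULL;
    }
    return ptr;
}
```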
Matias Cabral
2c86b8723d
Merge pull request #4510 from matcabral/mtl_psm2_shadow_vars
New flag for MCA parameters that allows a default value of "unset".
2017-12-04 12:25:37 -08:00
Howard Pritchard
b160cf6339
Merge pull request #4533 from hppritcha/topic/ofi_mtl_mprobe_fixes
mtl/ofi: fix problem with mprobe/mrecv
2017-12-04 09:11:47 -07:00
Howard Pritchard
2233e44848
Merge pull request #4534 from hppritcha/topic/fix_a_segv_in_request
pml/cm: check for request comp. before completing bsend
2017-12-04 09:09:41 -07:00
Ralph Castain
452b1ca736
Merge pull request #4562 from ggouaillardet/topic/odls_base_num_threads
odls/base: fix handling of the odls_base_num_threads MCA param
2017-12-04 05:39:49 -08:00
Gilles Gouaillardet
4a481f66e6 odls/base: fix orte_odls_base_harvest_threads()
Do not try to finalize odls progress threads if they have not been started yet

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-04 15:18:04 +09:00
Gilles Gouaillardet
d062db1a98 sync_builtin: fix misc typos
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-04 11:59:50 +09:00
Gilles Gouaillardet
3496897961 odls/base: fix handling of the odls_base_num_threads MCA param
If a number of odls threads is explicitly required, then use
that number no matter what.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-04 11:19:25 +09:00
Gilles Gouaillardet
2f5b1e9fe0
Merge pull request #4551 from ggouaillardet/topic/communicator_mutex_c_lock
Make usage of ompi_communicator_t, ompi_file_t and ompi_win_t mutex consistent
2017-12-04 09:20:52 +09:00
Edgar Gabriel
710fb72afa
Merge pull request #4559 from edgargabriel/topic/disable-amode-overwrite
io/ompio: introduce a new function to retrieve mca parameter values
2017-12-01 13:28:15 -06:00
Edgar Gabriel
1f151be6d2 io/ompio: introduce a new function to retrieve mca parameter values
ompio has the unique problem that mca parameters set in the io/ompio component
have to be accessible from other frameworks as well. This is mostly done to avoid
a replication in the parameter names and to reduce the number of mca parameters that
an end-user has to worry about.

This commit introduces a generic function to retrieve ompio mca parameters; the function pointer
is stored on the file handle. It replaces two functions that used the same concept already for
one parameter each.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-12-01 10:00:23 -06:00
Gilles Gouaillardet
b8e77ba759 mpi/c: use OPAL_THREAD[UN]LOCK() instead of opal_mutex_[un]lock()
in order to keep consistency between ompi_communicator_t, ompi_file_t
and ompi_win_t.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-01 16:06:32 +09:00
Gilles Gouaillardet
1ba4c185bc ompi/communicator: destruct ompi_communicator_t's c_lock in the destructor
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-01 16:06:27 +09:00
Gilles Gouaillardet
5f1a967351 ompi/file: rename ompi_file_t's f_mutex into f_lock
in order to use a consistent name between ompi_file_t,
ompi_win_t and ompi_communicator_t

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-01 16:06:22 +09:00
Gilles Gouaillardet
a4755b694b odls/pspawn: record the pid of the spawn'ed process
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-01 15:00:50 +09:00
Ralph Castain
4225b89c47
Merge pull request #4556 from rhc54/topic/pspawn
Add a new posix_spawn component to the ODLS framework.
2017-11-30 20:23:23 -08:00
bosilca
5cb72aa568
Merge pull request #4552 from hjelmn/asm_cleanup2
Add atomic fetch-and-op and compare-exchange functions
2017-11-30 22:29:38 -05:00
Ralph Castain
b5bf0a7f1d Add a new posix_spawn component to the ODLS framework.
Only selectable when specifically requested via "-mca odls pspawn"

Note that there are several concerns:
  * we aren't getting SIGCHLD calls when the procs terminate
  * we aren't seeing the IO pipes close on termination, though
    we are getting output forwarded to mpirun
  * I haven't found a way to bind the child process prior to exec.
    If we want to use this method, we probably need someone to
    implement a cgroup component for the orte/rtc framework

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-11-30 18:01:31 -08:00
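For readers unfamiliar with the API, the basic `posix_spawn` call pattern looks roughly like the sketch below. It is not the ODLS component: the helper `demo_spawn_and_wait` is hypothetical, uses `posix_spawnp` with no file actions or attributes, and simply waits for the child. Note that the child starts running the new program immediately, so there is no point between fork and exec where the parent's code could bind it, which matches the concern listed in the commit message.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: launch argv[0] (searched in PATH), wait for it,
 * and return its exit status, or -1 on failure or abnormal termination. */
static int demo_spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* NULL file actions and attributes: the child inherits our settings. */
    if (0 != posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ)) {
        return -1;
    }
    /* The pid is known as soon as the spawn call returns. */
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```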
Nathan Hjelm
7893248c5a opal/asm: add fetch-and-op atomics
This commit adds support for fetch-and-op atomics. This is needed
because AND and OR are irreversible operations so there needs to be a
way to get the old value atomically. These are also the only semantics
supported by C11 (there is no atomic_op_fetch, just
atomic_fetch_op). The old op-and-fetch atomics have been defined in
terms of fetch-and-op.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:23 -07:00
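The relationship between the two flavours can be shown with plain C11 atomics. This is a sketch, not the opal/asm code; the function names `demo_fetch_and` and `demo_and_fetch` are hypothetical. With addition the old value can be recovered from the new one by subtracting the operand, but with a bitwise AND or OR the bits that were cleared or set are lost, so only fetch-and-op can serve as the primitive.

```c
#include <stdatomic.h>
#include <stdint.h>

/* fetch-and-op: returns the value stored BEFORE the AND was applied. */
static int32_t demo_fetch_and(_Atomic int32_t *addr, int32_t mask)
{
    return atomic_fetch_and(addr, mask);
}

/* op-and-fetch: returns the NEW value, derived from fetch-and-op by
 * reapplying the operation to the old value. */
static int32_t demo_and_fetch(_Atomic int32_t *addr, int32_t mask)
{
    return atomic_fetch_and(addr, mask) & mask;
}
```

The reverse derivation is not possible for AND: knowing only `old & mask` and `mask` does not determine `old`.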
Nathan Hjelm
1282e98a01 opal/asm: rename existing arithmetic atomic functions
This commit renames the arithmetic atomic operations in opal to
indicate that they return the new value not the old value. This naming
differentiates these routines from new functions that return the old
value.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:22 -07:00
Nathan Hjelm
9d0b3fe9f4 opal/asm: remove opal_atomic_bool_cmpset functions
This commit eliminates the old opal_atomic_bool_cmpset functions. They
have been replaced by the opal_atomic_compare_exchange_strong
functions.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:22 -07:00
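The difference between the two styles can be illustrated with C11's `atomic_compare_exchange_strong`, which the replacement functions are named after. The Open MPI signatures are not shown in this log, so the sketch below uses only the C11 interface; `demo_bool_cmpset` and `demo_increment` are hypothetical names. A boolean compare-and-set only reports success, whereas compare-exchange also writes the value it actually found back into `expected`, which saves a reload in retry loops.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Old style: success or failure only; the observed value is discarded. */
static bool demo_bool_cmpset(_Atomic int32_t *addr, int32_t oldval,
                             int32_t newval)
{
    return atomic_compare_exchange_strong(addr, &oldval, newval);
}

/* New style in a retry loop: on failure, expected is refreshed with the
 * current contents of *addr, so no separate load is needed. */
static int32_t demo_increment(_Atomic int32_t *addr)
{
    int32_t expected = atomic_load(addr);

    while (!atomic_compare_exchange_strong(addr, &expected, expected + 1)) {
        /* expected now holds the value another thread stored; retry. */
    }
    return expected + 1;
}
```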