Commit Graph

29027 Commits

Author SHA1 Message Date
Gilles Gouaillardet
080e20fa02 mtl/psm2: fix a misc memory leak
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@316e4e38f4)
2018-09-10 09:17:54 +09:00
Gilles Gouaillardet
baf41aceed pmix/pmix3x: plug a memory leak in external_register()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@aeddd2f249)
2018-09-10 09:17:31 +09:00
Gilles Gouaillardet
8015e6f929 pmix/base: plug a memory leak in opal_pmix_base_select()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@6e47c5708e)
2018-09-10 09:16:56 +09:00
Howard Pritchard
71d3afdc59
Merge pull request #5657 from gpaulsen/gpaulsen_v4_vers2
Updating VERSION for shared libs for v4.0.0
2018-09-09 14:29:35 -06:00
Geoff Paulsen
60c16ca9d2
Merge pull request #5640 from hoopoepg/topic/suppress-shmem-wait-until-warning-v4.0
OSHMEM: removed incorrect pshmem_wait_until macro redefinition - v4.0
2018-09-08 09:11:57 -05:00
Geoffrey Paulsen
449020aeaa Updating VERSION for shared libs for v4.0.0
This was done after discussions with core developers about any
  potential ABI breakage for any of the libs the user directly
  links against.  Also, compatibility tests were done using the
  ibm test suite and building with v3.1.x and running with v4.0.x
  see: https://github.com/open-mpi/ompi/issues/5447

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2018-09-08 08:38:37 -05:00
Geoff Paulsen
954692f06e
Merge pull request #5614 from karasevb/v4.0.x_fix_hwloc_numa_obj
v4.0.x: Fixed the NUMA obj detection for hwloc ver >= 2.0.0
2018-09-07 14:49:26 -05:00
Geoff Paulsen
b3d6adbf22
Merge pull request #5655 from hppritcha/topic/update_readme
README: update for 4.0
2018-09-07 14:37:51 -05:00
Geoff Paulsen
a0b4fcce6f
Merge pull request #5649 from hjelmn/v4.0.x_btl_uct_fix
btl/uct: add missing opal_mem_hooks_unregister_release call
2018-09-07 14:02:36 -05:00
Jeff Squyres
38c7364896 btl/openib: don't complain about no NICs
Since openib is on its long, slow way out the door, don't let it
complain about not being able to find any NICs at run time.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 098ec55e37)
2018-09-07 12:01:33 -07:00
Geoff Paulsen
100f7c1f78
Merge pull request #5654 from hppritcha/topic/NEWS_update2_for_400
NEWS: part II of NEWS update for 4.0.0
2018-09-07 13:57:23 -05:00
Howard Pritchard
4d5df45bc8 NEWS: part II of NEWS update for 4.0.0
[skip ci]

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-09-07 12:50:21 -06:00
Howard Pritchard
2743fadfec README: update for 4.0
some clean up of old cruft for configure options that
we should have cleaned up a while ago.

[skip ci]

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-09-07 12:39:45 -06:00
Howard Pritchard
69f632b8f0
Merge pull request #5619 from jsquyres/pr/v4.0.x/atomic-128-fixes
v4.0.x: atomic 128 fixes
2018-09-06 18:40:55 -06:00
matcabral
8fa172e60b MTL PSM2: Remove shadow variables from v4.0.x
As agreed on #4574, these were removed in past release branches
to avoid performance impacts in the default values for
some parameters.

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2018-09-05 18:44:40 -04:00
Nathan Hjelm
2fb1a5e1b2 btl/uct: add missing opal_mem_hooks_unregister_release call
This commit fixes a bug when using the UCT btl with the UCX memory
hooks disabled. We were missing a call to
opal_mem_hooks_unregister_release to remove the btl memory hook
callback.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 36c206d2d6)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-05 14:02:58 -06:00
Howard Pritchard
b6dafb6b90
Merge pull request #5636 from jsquyres/pr/v4.0.x/verbs-usnic-configury-moar-strictness
v4.0.x: make common/verbs-usnic actually check if it can compile
2018-09-04 09:13:12 -06:00
Howard Pritchard
7e10bc0833
Merge pull request #5607 from edgargabriel/pr/sharedfp-naming-conflict-v4.0
sharedfp/sm and lockedfile: fix naming bug
2018-09-02 16:03:14 -04:00
Sergey Oblomov
6f9da0c3d5 OSHMEM: removed incorrect pshmem_wait_until macro redefinition
- fixes https://github.com/open-mpi/ompi/issues/5585

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 7a5ff6a076)
2018-09-02 08:32:33 +03:00
Geoff Paulsen
3282c61048
Merge pull request #5625 from hoopoepg/topic/optimize-blocked-calls-v4.0
PML/UCX: blocked calls optimizations - v4.0
2018-08-31 14:11:11 -05:00
Geoff Paulsen
334748753c
Merge pull request #5626 from hoopoepg/topic/opal-mem-hooks-syno-v4.0
MCA/COMMON/UCX: added synonim to opal_mem_hook variable - v4.0
2018-08-31 14:09:14 -05:00
Geoff Paulsen
51e685ff40
Merge pull request #5622 from aravindksg/ofi_race_fix_40x
MTL OFI: Fix race condition due to global progress entries array
2018-08-31 14:07:42 -05:00
Geoff Paulsen
1bf00630b9
Merge pull request #5616 from edgargabriel/pr/file-delete-fs-framework-open-v4.0
io/base: fixes to file_delete selection logic
2018-08-31 14:02:22 -05:00
Geoff Paulsen
325e8d26e8
Merge pull request #5608 from hppritcha/topic/pr5600_to_v4.0.x
Deal with special case during cleanup
2018-08-31 14:00:04 -05:00
Geoff Paulsen
b2daa0001f
Merge pull request #5565 from rhc54/cmr40/pmix301
Update to PMIx 3.0.1
2018-08-31 13:58:41 -05:00
Geoff Paulsen
118f61c928
Merge pull request #5606 from hppritcha/topic/sync_news_for_4.0.x
NEWS: sync 4.0.x NEWS with 3.1.x
2018-08-31 13:57:48 -05:00
Howard Pritchard
d364553667
Merge pull request #5627 from hjelmn/v4.0.x_odls_alps_fix
odls/alps: resolve hang when launching with mpirun on Crays
2018-08-30 18:12:09 -04:00
Jeff Squyres
1d0e695bd6 README: Add note about --with-verbs-usnic
This option isn't needed on modern distros; add a note to README about
it.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 9a8b0d0e18)
2018-08-30 15:11:36 -07:00
Jeff Squyres
3e842348d1 common/verbs-usnic: check that it will actually compile
If someone specifies --with-verbs-usnic, actually do a configury check
to ensure that it will compile (vs. assuming that it will compile if
someone asks for it).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 05e5f61fe1)
2018-08-30 14:56:37 -07:00
Nathan Hjelm
4eeb41506c odls/alps: resolve hang when launching with mpirun on Crays
This commit removes some code that protected the odls/alps component
from closing alps file descriptors. For some unknown reason, leaving
these file descriptors open can cause an orted to hang when
launching apps.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 98172163e6)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-29 09:22:37 -06:00
Sergey Oblomov
028bcb8a73 MCA/COMMON/UCX: added synonim to opal_mem_hook variable
- added synonym for common UCX variables so that
  they are printed by opal_info -a

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit e00f7a68ba)
2018-08-29 15:17:00 +03:00
Sergey Oblomov
9215eb9a3b PML/UCX: blocked calls optimizations
- refactoring of opal/UCX progress calls
- added UCX progress priority

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit b0f87f2235)
2018-08-29 14:38:22 +03:00
Aravind Gopalakrishnan
37d1a202be MTL OFI: Fix race condition due to global progress entries array
Since progress entries array is globally allocated, it is susceptible
to race conditions when using multi-threaded applications. Allocating it
on the stack resolves any potential races as it is thread local by default.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit ed2343034d)
2018-08-28 14:23:56 -07:00
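The fix described in the MTL OFI commit above (moving a scratch array from global storage to the stack) can be sketched as follows. This is a minimal illustration, not the actual MTL OFI code: `event_t`, `read_events`, `progress_once`, and `MAX_EVENTS` are hypothetical names, and `read_events` is a stand-in for whatever call fills the progress entries.

```c
#include <stddef.h>

#define MAX_EVENTS 8

/* Hypothetical event record; the real progress entry type differs. */
typedef struct {
    int tag;
} event_t;

/* Hypothetical stand-in for a call that fills up to `max` entries. */
static int read_events(event_t *out, int max)
{
    for (int i = 0; i < max; i++) {
        out[i].tag = i;
    }
    return max;
}

/* The entries array lives on the stack, so every thread that calls
 * progress_once() works on its own private copy. A file-scope array
 * here would be shared by all threads and could be overwritten while
 * another thread is still iterating over it. */
int progress_once(void)
{
    event_t entries[MAX_EVENTS];
    int count = read_events(entries, MAX_EVENTS);
    int handled = 0;

    for (int i = 0; i < count; i++) {
        if (entries[i].tag >= 0) {
            handled++;
        }
    }
    return handled;
}
```

The trade-off is a small amount of extra stack use per call in exchange for not needing a lock around the array.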
Jeff Squyres
8a0f5c57f3 opal_functions.m4: minor typo fixes
Thanks to George for finding/fixing these.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 9194dbbe7b)
2018-08-28 12:02:22 -07:00
Jeff Squyres
55fd437d0f opal_config_asm.m4: replace tabs with spaces
Whitespace change only; no code or logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 63560fe9c4)
2018-08-28 12:02:21 -07:00
Jeff Squyres
420ffe7588 opal_config_asm.m4: Fix the detection of 128 bits atomics.
Thanks to Stefan Teleman for identifying this issue and providing a
proof-of-concept patch.  We ended up revamping the detection of
128-bit atomics to reduce duplicated code and take a slightly simpler --
albeit perhaps a bit more verbose -- approach:

- Remove the --enable-cross-* options; they were confusing and
  unnecessary.
- Always try to compile / link the compiler-intrinsic 128-bit atomic
  functions.
  - Strengthen the C tests we use to be more robust.
  - Use m4 to avoid duplicating the C tests multiple times in the .m4
    source.
- If not cross-compiling, try to run a short test and ensure that they
  actually work (as of Aug 2018, there's at least one platform where
  they don't: clang 6 on ARM64).  If cross-compiling, just assume that
  they work.
- Add more comments about what is going on with all the tests; it's
  tricky stuff.  Our Future Selves will thank us.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit ff9df91887)
2018-08-28 12:02:21 -07:00
Howard Pritchard
ea4d30b16f
Merge pull request #5601 from ggouaillardet/topic/v4.0.x/haiku
v4.0.x: add missing #ifdef protection around header files
2018-08-28 11:07:53 -04:00
Edgar Gabriel
2e3cf6fb12 io/base: fixes to file_delete selection logic
file_delete triggers underneath the hood the full component selection
logic, since we do not have a file handle, just a file name.

As part of the selection logic, we have to however initiate the
framework-open of the fs component in case of ompio, since ompio
will call the delete function of the selected fs component, which
is based on the file system where the file is located.

This was not handled correctly so far. The problem however only
shows up if the first I/O operation to be executed is a file_delete,
otherwise the file_open will lead to the correct opening and initialization
of the fs framework. This commit ensures that we do the right thing
even if file_delete is the first file I/O operation in the application.

Fixes issue #5611

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-08-28 08:18:59 -05:00
Boris Karasev
31ca3842da Fixed copyrights of prev commit.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
(cherry picked from commit beb0697f24)
2018-08-28 12:29:16 +03:00
Boris Karasev
d995fb1b3f Fixed the NUMA obj detection for hwloc ver >= 2.0.0
Since version 2.0.0, hwloc has a new organization of NUMA nodes in the
topology tree. This commit adds the detection of the local NUMA object for
hwloc >= 2.0.0, which fixes the process binding policy for the rmaps mindist
component.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
(cherry picked from commit e5291ccc34)
2018-08-28 12:29:08 +03:00
Ralph Castain
5725e91c1a Deal with special case during cleanup
In some scenarios, we can have a daemon sharing the node with mpirun. In
those cases, we need to avoid race conditions in cleanup.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 8d1be27a1e)
2018-08-27 13:20:11 -06:00
Howard Pritchard
5e8e33706e
Merge pull request #5598 from rhc54/cmr40/root
v4.0.x: Allow run-as-root if 2 envars are set
2018-08-27 15:14:21 -04:00
Edgar Gabriel
a489a6fc9d sharedfp/sm and lockedfile: fix naming bug
If an application opens a file for reading from multiple processes
using MPI_COMM_SELF (or another communicator that has distinct
process groups but the same comm-id, as can happen as the result
of comm_split), the naming chosen for the lockedfile or the mmapped
file used by the sharedfp/sm component would collide. This patch
ensures that the filename is different by integrating the process id
of rank 0 for each sub-communicator.

This fixes one aspect of the problem reported in github issue 5593

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-08-27 14:11:03 -05:00
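The naming fix described in the sharedfp commit above can be illustrated with a short sketch. It assumes a helper that builds the lock-file name from the base file name, the communicator id, and the process id of rank 0. The function name `build_sharedfp_name`, its parameters, and the name layout are hypothetical and are not taken from the Open MPI sources.

```c
#include <stdio.h>

/* Hypothetical helper: compose a per-communicator file name.
 * Including the pid of rank 0 keeps two communicators that share
 * the same comm id (for example, several MPI_COMM_SELF instances)
 * from mapping to the same file.
 * Returns 0 on success, -1 if the buffer is too small. */
int build_sharedfp_name(char *buf, size_t len, const char *base,
                        unsigned comm_id, long rank0_pid)
{
    int n = snprintf(buf, len, "%s_cid-%u-%ld.lock",
                     base, comm_id, rank0_pid);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

With this scheme, two processes that open the same file on communicators with the same id but different rank-0 pids get distinct names.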
Howard Pritchard
221fc3ec66 NEWS: sync 4.0.x NEWS with 3.1.x
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-08-27 10:51:14 -06:00
Gilles Gouaillardet
2e4955427d test: protect <sys/mount.h> with the HAVE_SYS_MOUNT_H macro
Thanks Zoltan Mizsei for bringing this to our attention.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@a02be5e91a)
2018-08-27 09:49:29 +09:00
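The guard pattern named in the commit above looks roughly like this. The sketch assumes that `HAVE_SYS_MOUNT_H` is defined by the build system only when `<sys/mount.h>` exists on the platform; `have_sys_mount_header` is a hypothetical helper added here only to make the effect visible.

```c
/* Assumption: HAVE_SYS_MOUNT_H comes from a configure-generated
 * header or a -D compiler flag. If it is absent, the include is
 * skipped and the file still compiles. */
#ifdef HAVE_SYS_MOUNT_H
#include <sys/mount.h>
#endif

/* Hypothetical helper: reports whether the header was available
 * when this file was compiled. */
int have_sys_mount_header(void)
{
#ifdef HAVE_SYS_MOUNT_H
    return 1;
#else
    return 0;
#endif
}
```

Any code that uses declarations from the header needs the same `#ifdef` guard, so that platforms without it fall back to a code path that does not depend on it.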
Zoltán Mizsei
b2628129fd fcntl include bugfix
Signed-off-by: Zoltán Mizsei <zmizsei@extrowerk.com>

(cherry picked from commit open-mpi/ompi@ac3f8a16ed)
2018-08-27 09:48:54 +09:00
Howard Pritchard
37440aca90
Merge pull request #5497 from markalle/apply_romio314_patch_to_v40x
v4.0.x: apply romio314 patch to romio321
2018-08-25 11:12:08 -04:00
Ralph Castain
b4ae5d005f Allow run-as-root if 2 envars are set
Per suggestion by @bangerth, allow mpirun to execute as root if two
envars are set to specific values

Per conversation with @jsquyres, name the envars OMPI_ALLOW_RUN_AS_ROOT
and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM

Fixes #4451

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 7f1444d5f9)
2018-08-24 20:24:22 -07:00
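The two-variable confirmation described in the commit above can be sketched as a small check. The commit message only says the variables must be set to "specific values"; this sketch assumes that value is the string "1" for both. The function names are hypothetical and this is not the mpirun implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 only if the value is exactly "1" (assumed value). */
static int is_one(const char *value)
{
    return value != NULL && strcmp(value, "1") == 0;
}

/* Hypothetical check: both values must be present and correct.
 * Requiring two variables makes an accidental opt-in less likely. */
int root_run_confirmed(const char *allow, const char *confirm)
{
    return is_one(allow) && is_one(confirm);
}

/* Reads the two variables named in the commit message. */
int root_run_confirmed_from_env(void)
{
    return root_run_confirmed(getenv("OMPI_ALLOW_RUN_AS_ROOT"),
                              getenv("OMPI_ALLOW_RUN_AS_ROOT_CONFIRM"));
}
```

Splitting the pure check from the environment lookup keeps the logic easy to test without modifying the process environment.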
Jeff Squyres
4fd51a1563
Merge pull request #5592 from hjelmn/v4.0.x_sc_emu
btl/vader: clean up debuging and squash warning
2018-08-23 17:15:09 -07:00
Nathan Hjelm
eba44d3709 btl/vader: clean up debuging and squash warning
References #5512

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit c74cf666a9)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-23 15:53:43 -06:00