
29197 Commits

Author SHA1 Message Date
Boris Karasev
beb0697f24 Fixed copyrights of prev commit.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-08-27 09:50:11 +03:00
Gilles Gouaillardet
1665b8db8f
Merge pull request #5519 from extrowerk/haiku_patches
fcntl include bugfix
2018-08-27 09:46:43 +09:00
Sergey Oblomov
b72dd83f05 MCA/COMMON/UCX: added synonyms for common ucx variables
- added synonyms for atomic/osc modules

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-26 18:25:21 +03:00
Ralph Castain
8d1be27a1e Deal with special case during cleanup
In some scenarios, we can have a daemon sharing the node with mpirun. In
those cases, we need to avoid race conditions in cleanup

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-08-25 07:45:28 -07:00
Jeff Squyres
9194dbbe7b opal_functions.m4: minor typo fixes
Thanks to George for finding/fixing these.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-25 05:21:24 -07:00
Jeff Squyres
63560fe9c4 opal_config_asm.m4: replace tabs with spaces
Whitespace change only; no code or logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-25 05:21:24 -07:00
Jeff Squyres
ff9df91887 opal_config_asm.m4: Fix the detection of 128 bits atomics.
Thanks to Stefan Teleman for identifying this issue and providing a
proof-of-concept patch.  We ended up revamping the detection of
128-bit atomics to reduce duplicated code and be a slightly simpler --
albeit perhaps a bit more verbose -- approach:

- Remove the --enable-cross-* options; they were confusing and
  unnecessary.
- Always try to compile / link the compiler-intrinsic 128-bit atomic
  functions.
  - Strengthen the C tests we use to be more robust.
  - Use m4 to avoid duplicating the C tests multiple times in the .m4
    source.
- If not cross-compiling, try to run a short test and ensure that they
  actually work (as of Aug 2018, there's at least one platform where
  they don't: clang 6 on ARM64).  If cross-compiling, just assume that
  they work.
- Add more comments about what is going on with all the tests; it's
  tricky stuff.  Our Future Selves will thank us.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-25 05:21:24 -07:00
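
The detection order described in the commit above can be modelled as a small decision function. This is only an illustrative sketch in Python: the real checks are written in m4 and shell inside opal_config_asm.m4, and the function name and its boolean parameters are hypothetical stand-ins for the outcome of each configure step.

```python
def have_128bit_atomics(compiles_and_links: bool,
                        cross_compiling: bool,
                        runtime_test_passes: bool = False) -> bool:
    """Model of the detection order described in the commit message.

    All names here are hypothetical; the real checks live in
    opal_config_asm.m4 and are not written in Python.
    """
    # Step 1: the compiler-intrinsic functions must compile and link.
    if not compiles_and_links:
        return False
    # Step 2: when cross-compiling, the test binary cannot be run on the
    # build host, so a successful compile/link is trusted as-is.
    if cross_compiling:
        return True
    # Step 3: on a native build, a short run-time test has the final say.
    return runtime_test_passes
```

Under this model, a native build whose test program links but misbehaves at run time (the case the commit message mentions) reports no support, while a cross-compile with the same toolchain would report support.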
Ralph Castain
a0ea197e97
Merge pull request #5597 from rhc54/topic/root
Allow run-as-root if 2 envars are set
2018-08-24 20:20:58 -07:00
Ralph Castain
7f1444d5f9 Allow run-as-root if 2 envars are set
Per suggestion by @bangerth, allow mpirun to execute as root if two
envars are set to specific values

Per conversation with @jsquyres, name the envars OMPI_ALLOW_RUN_AS_ROOT
and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM

Fixes #4451

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-24 18:12:51 -07:00
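
A minimal sketch of the two-variable opt-in described above, written in Python for illustration only. The variable names come from the commit message; the required value "1" and the function name are assumptions made for this sketch, since the message only says "specific values".

```python
import os


def root_run_allowed(env=None) -> bool:
    """Return True only if both opt-in variables hold the expected value.

    The variable names come from the commit message; the required value
    "1" is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    required = ("OMPI_ALLOW_RUN_AS_ROOT", "OMPI_ALLOW_RUN_AS_ROOT_CONFIRM")
    # Both variables must be present and set; one alone is not enough.
    return all(env.get(name) == "1" for name in required)
```

Requiring a second, separately named confirmation variable makes it harder to enable root execution by accident, for example through a single stale export in a shell profile.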
Boris Karasev
e5291ccc34 Fixed the NUMA obj detection for hwloc ver >= 2.0.0
Since version 2.0.0, hwloc has a new organization of NUMA nodes in the
topology tree. This commit adds detection of the local NUMA object for
hwloc >= 2.0.0, which fixes the process binding policy for the rmaps
mindist component.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-08-24 19:11:52 +03:00
Jeff Squyres
f52142dc2b
Merge pull request #5595 from jsquyres/pr/misc-warnings-fixes
Miscellaneous compiler warning stomps.
2018-08-24 08:54:47 -07:00
Jeff Squyres
fe0852bcb4 Miscellaneous compiler warning stomps.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-24 07:39:14 -07:00
Jeff Squyres
f0899b9d1d
Merge pull request #5586 from hoopoepg/topic/suppress-shmem-wait-until-warning
OSHMEM: removed incorrect pshmem_wait_until macro redefinition
2018-08-24 07:15:57 -07:00
Jeff Squyres
a731da024e
Merge pull request #5594 from ggouaillardet/topic/opal_path_nfs
test: protect <sys/mount.h> with the HAVE_SYS_MOUNT_H macro
2018-08-24 06:51:45 -07:00
Gilles Gouaillardet
a02be5e91a test: protect <sys/mount.h> with the HAVE_SYS_MOUNT_H macro
Thanks Zoltan Mizsei for bringing this to our attention.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-08-24 17:03:54 +09:00
Nathan Hjelm
feb0e90301
Merge pull request #5589 from hjelmn/threads_cleanup
config: remove OPAL_ENABLE_MULTI_THREADS config macro
2018-08-23 15:43:13 -06:00
Nathan Hjelm
d0cd80e902 osc/rdma: clean out stale aggregation code
The aggregation code in osc/rdma is currently broken and will likely
not be reused. This commit cleans it out.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-23 15:40:21 -06:00
Aravind Gopalakrishnan
5cbcae79d8 MTL OFI: Ask for FI_THREAD_DOMAIN support when not using MPI_THREAD_MULTIPLE
When an application is not using multiple threads to call into MPI, we can
safely ask for FI_THREAD_DOMAIN setting from the provider as it should
translate to the least amount of locking in provider.

Conversely, for applications using THREAD_MULTIPLE, explicitly ask for
FI_THREAD_SAFE to prevent race conditions.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2018-08-23 14:18:32 -07:00
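
The selection rule in the commit above can be summarised as follows. This is a hedged Python sketch of the decision only: the two threading model names are taken from the commit message, while the function and its parameter are hypothetical and do not correspond to the MTL's actual C code.

```python
def requested_threading_model(mpi_thread_multiple: bool) -> str:
    """Pick the name of the threading model to request from the provider.

    The two model names come from the commit message; the function and
    its parameter are hypothetical.
    """
    # With THREAD_MULTIPLE, several threads may call into MPI at once,
    # so the provider must be fully thread safe.
    if mpi_thread_multiple:
        return "FI_THREAD_SAFE"
    # Otherwise a weaker guarantee is enough and lets the provider
    # take fewer locks.
    return "FI_THREAD_DOMAIN"
```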
Nathan Hjelm
1c84f48640 config: remove OPAL_ENABLE_MULTI_THREADS config macro
We long ago hard-coded this value to 1. This commit cleans it out
entirely.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-23 13:47:02 -06:00
Nathan Hjelm
0a9c358ef8
Merge pull request #5577 from hjelmn/btl_vader_sc_emu_warnings
btl/vader: clean up debugging and squash warning
2018-08-23 09:41:13 -06:00
Sergey Oblomov
7a5ff6a076 OSHMEM: removed incorrect pshmem_wait_until macro redefinition
- fixes https://github.com/open-mpi/ompi/issues/5585

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-23 18:03:01 +03:00
Yossi Itigin
764cc9fd21
Merge pull request #5576 from hoopoepg/topic/c11-compilation-fix
OSHMEM/API/C11: fixed API macro
2018-08-23 02:55:51 +03:00
Ralph Castain
f7655280cb
Merge pull request #5503 from aravindksg/aravindksg/fix_ofi_race
MTL OFI: Fix race condition due to global progress entries array
2018-08-22 14:31:38 -07:00
Nathan Hjelm
c74cf666a9 btl/vader: clean up debugging and squash warning
References #5512

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-22 10:57:32 -06:00
Nathan Hjelm
a99dba6cb7
Merge pull request #5553 from markalle/info_snprintf2
snprintf() length fix for info
2018-08-22 10:27:14 -06:00
Nathan Hjelm
2574468d02
Merge pull request #5575 from hjelmn/osc_rdma_warning
osc/rdma: quiet warning
2018-08-22 10:25:49 -06:00
Sergey Oblomov
2f941ae864 OSHMEM/C11: removed unused macro
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-22 17:37:31 +03:00
Sergey Oblomov
6a7f66d9c2 MCA/COMMON/UCX: renamed synonym to opal_mem_hook variable
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-22 14:12:33 +03:00
Sergey Oblomov
be0ea1d764 OSHMEM/API/C11: fixed API macro
- updated compilation of C11 compiler for API macro

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-22 11:16:15 +03:00
Nathan Hjelm
29320872b3 osc/rdma: quiet warning
gcc complains about ret possibly being used uninitialized. That will
never happen but we should still quiet the warning. This commit sets
ret to a valid value.

Fixes #5513

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-21 15:54:53 -06:00
Nathan Hjelm
69a4f93ccd
Merge pull request #5572 from hjelmn/osc_c99
osc/pt2pt: use c99 for module initialization
2018-08-21 12:35:01 -06:00
Brian Barrett
4b56b0df08 dist: Update NEWS with 2.1.5 updates
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-08-21 11:03:33 -07:00
Nathan Hjelm
438c40de03 osc/pt2pt: use c99 for module initialization
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-21 11:23:33 -06:00
Sergey Oblomov
e00f7a68ba MCA/COMMON/UCX: added synonym to opal_mem_hook variable
- added a synonym for the opal_mem_hook variable so that
  opal_info -a can print it

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-21 15:05:12 +03:00
Ralph Castain
f12a693d80
Merge pull request #5568 from rhc54/topic/pmix301
Update to PMIx v3.0.1
2018-08-20 17:24:52 -07:00
Ralph Castain
5cfa2a7fca Complete integration of job_control
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-08-20 16:10:50 -07:00
Ralph Castain
9948084130 Update to PMIx v3.0.1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-08-20 15:05:24 -07:00
Edgar Gabriel
e6a344ba63
Merge pull request #5561 from edgargabriel/pr/file_open_sharedfp_ordering
common/ompio: fix an ordering problem during file_open
2018-08-20 10:18:14 -05:00
Edgar Gabriel
eaabfdd028
Merge pull request #5539 from DDNStorage/ime-support
ompio: support for DDN's Infinite Memory Engine
2018-08-20 09:52:22 -05:00
Edgar Gabriel
2742273ee3 common/ompio: fix an ordering problem during file_open
The sharedfp component has to be selected and opened before
we set the default file view during file_open. Otherwise
there is a spurious error message from the sharefp_file_seek
operation that is called during the file_set_view.

Fixes Issue #5560

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-08-20 09:28:29 -05:00
Aurélien Bouteiller
e46c907468
Adding error handling in OpenIB BTL
bugfix: major: openib send credits returned correctly after a fault for pending frags to dead processes; also tweak the default IB retry timeouts to make this happen faster

Make it compile in non-debug builds

Mark the IB endpoint as failed when invoking an error; this resolves UDCM connection deadlocks

Changing the default IB retry timeouts is not a good idea.
We'll need to find another way to speedup credit recovery in failure cases.

Remove ULFM specific cases

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2018-08-16 16:53:17 -04:00
Jeff Squyres
fc9218dd01
Merge pull request #5552 from jsquyres/pr/remove-excess-fortran-tkr-type-extent-declaration
fortran/use TKR: remove excess declaration for PMPI_Type_extent
2018-08-16 16:12:00 -04:00
Mark Allen
ee6eb9b12b snprintf() length fix for info
The important part of this fix is a couple places 5 was hard-coded that needed to be
strlen(OPAL_INFO_SAVE_PREFIX).

But also this contains a fix for a gcc 7.3.0 compiler warning about snprintf(). There
was an "if" statement making sure all the arguments had appropriate strlen(), but gcc
still complained about the following snprintf() because the size of the struct element
is iterator->ie_key[OPAL_MAX_INFO_KEY + 1].

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2018-08-16 15:32:04 -04:00
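
The first part of the fix above replaces a hard-coded length with one derived from the prefix itself. The original code is C and uses strlen(OPAL_INFO_SAVE_PREFIX); the Python analogue below only illustrates why the hard-coded 5 is fragile. The prefix value and both function names are hypothetical, since the commit message does not give the real string.

```python
# Hypothetical stand-in for OPAL_INFO_SAVE_PREFIX; the real value is not
# given in the commit message.
SAVE_PREFIX = "_SAVED_KEY_"


def strip_prefix_hardcoded(key: str) -> str:
    # Fragile pattern: assumes the prefix is exactly 5 characters long.
    return key[5:]


def strip_prefix(key: str) -> str:
    # Robust pattern: derive the offset from the prefix itself, which is
    # the Python counterpart of using strlen() on the prefix in C.
    return key[len(SAVE_PREFIX):]
```

With any prefix that is not exactly five characters long, the hard-coded variant leaves part of the prefix in the result, while the length-derived variant returns the original key.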
Jeff Squyres
8a0b5454ae fortran/use TKR: remove excess declaration for PMPI_Type_extent
This declaration was accidentally left behind in 89da9651bb2fe.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-16 10:31:41 -07:00
Gaëtan Bossu
ccc96efc2e DDN's Infinite Memory Engine support for OMPIO
Changes made:
 - Create a new fs component for IME
 - Create a new fbtl component for IME
 - Modify the close function of OMPIO to finalize IME if necessary

Signed-off-by: Gaëtan Bossu <gbossu@ddn.com>
Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
2018-08-16 11:45:47 +02:00
Brian Barrett
ac53ab9f5b dist: Update master NEWS with 2.1.4
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-08-15 12:21:08 -07:00
Aurelien Bouteiller
6acebc40a1
Handle error cases in TCP BTL
When an error is returned by the socket operations, trigger the
appropriate error path in the PML to give an opportunity for
rerouting/error handling.

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2018-08-14 15:35:24 -04:00
Jeff Squyres
1b96be5f2f
Merge pull request #5536 from hjelmn/btl_vader_mr_fix
btl/vader: move memory barrier to where it belongs
2018-08-14 11:32:49 -04:00
Nathan Hjelm
dca3516765 btl/vader: move memory barrier to where it belongs
The write memory barrier was intended to precede setting a fast-box
header but instead follows it. This commit moves the memory barrier to
the intended location.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-08-13 10:14:34 -06:00
Nathan Hjelm
e0f73866ef
Merge pull request #5525 from hjelmn/event_threading
opal/progress: protect against multiple threads in event base
2018-08-13 09:21:44 -06:00