Commit graph

29394 commits

Author SHA1 Message Date
Howard Pritchard
176206fe8c
Merge pull request #6098 from jjhursey/enh/v4.0.x/vpid-unpack
Add OPAL_VPID to unpacking
2018-11-26 13:30:20 -07:00
Yossi Itigin
a112d10c93 pml_ucx: initialize req_mpi_object.comm for error handler
without this fix, an error handler invoked on pml_ucx request would
segfault while trying to dereference requests[i]->req_mpi_object.comm

(picked from master f36eeef)

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-11-26 11:57:34 +02:00
KAWASHIMA Takahiro
6f68483fd5 README & man: Update pcollreq documentation
The feature of persistent collectives is approved in the Sept. 2018
MPI Forum meeting and 2018 Draft Specification of the MPI standard is
published during SC18.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit 5f0fcf0f45)
2018-11-26 18:28:08 +09:00
Sergey Oblomov
63cbe36cab OSHMEM/AMO: added int/uint/32/64 atomics calls
- added int/uint/32/64 atomics calls
- added SHMEM_SYNC_SIZE macro

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 4c071da565)
2018-11-26 09:29:55 +02:00
Sergey Oblomov
38a4953707 OSC/UCX: added UCX version evaluation
- added UCX version evaluation to set OSC UCX priority

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit e91f214982)
2018-11-22 11:31:53 +02:00
Sergey Oblomov
012e27af77 OSC: set UCX module used by default
- OSC/UCX module set priority to 200 to be used by default

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 36934a8bb2)
2018-11-22 10:59:43 +02:00
Joshua Hursey
e1f75d5ff1 Add OPAL_VPID to unpacking
* Needed to properly read PMIx job data like the following
   - `OPAL_PMIX_LOCALLDR`
   - `OPAL_PMIX_RANK`
   - `OPAL_PMIX_GLOBAL_RANK`
   - `OPAL_PMIX_APPLDR`
   - `OPAL_PMIX_APP_RANK`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit a557c4130c)
2018-11-21 11:48:58 -06:00
Geoff Paulsen
206b574bb3
Merge pull request #6076 from hppritcha/topic/on_to_v4.0.1
roll to v4.0.1a1
2018-11-20 09:55:55 -06:00
Geoff Paulsen
6898de01a7
Merge pull request #6012 from hoopoepg/topic/added-missing-amo-datatypes-v4.0
OSHMEM/AMO: added missing C11 macro datatypes - v4.0
2018-11-19 14:20:15 -06:00
Howard Pritchard
8adaeb1536
Merge pull request #6007 from aravindksg/coll-tuned-fix-40x
coll/tuned: Fix MPI_IN_PLACE processing in tuned algorithms
2018-11-19 13:15:40 -07:00
Howard Pritchard
3369b0d10f
Merge pull request #6011 from hoopoepg/topic/fixed-oshmem-profile-build-v4.0
OSHMEM/PROFILE: fixed profile build - v4.0
2018-11-19 13:15:09 -07:00
Howard Pritchard
24dae8609e
Merge pull request #5926 from hjelmn/v4.0.x_need_to_unblock_sigchld_in_some_cases
v4.0.x: Ensure SIGCHLD is unblocked
2018-11-19 13:13:10 -07:00
Howard Pritchard
d6bab4e26d
Merge pull request #6064 from jsquyres/pr/v4.0.x/rmaps-help-message-update
v4.0.x: orte-rmaps-base: update out-of-slots show_help message
2018-11-19 13:12:41 -07:00
Howard Pritchard
ec79631ba2
Merge pull request #5936 from edgargabriel/pr/testmpio-v4.0.x
Pr/testmpio v4.0.x
2018-11-19 13:11:50 -07:00
Howard Pritchard
116a140be8 roll to v4.0.1a1
long live 4.0.1!

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-11-13 07:18:57 -07:00
Gilles Gouaillardet
9366c6eb2e mpiext/cuda: do not include automatically generated file into dist tarball
ompi/mpiext/cuda/c/mpiext_cuda_c.h is automatically generated from
ompi/mpiext/cuda/c/mpiext_cuda_c.h.in at configure time.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@f8318f0a8f)
(cherry picked from commit open-mpi/ompi@b3ce25af95)
2018-11-13 00:09:01 -06:00
Howard Pritchard
725f62554e
Merge pull request #6067 from jsquyres/pr/v4.0.x/fix-readme-lsf-references
v4.0.x: README: Make LSF text more accurate
2018-11-10 14:47:40 -07:00
Jeff Squyres
c6d8caf302 README: Make LSF text more accurate
Also remove a now-outdated LSF reference.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 419852ab43)
2018-11-08 18:32:34 -05:00
Jeff Squyres
8be14b9b07 orte-rmaps-base: slightly amend help message
Follow on to 430c659908: clarify the help message and fix one typo.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit e9bf318dcb)
2018-11-08 18:20:28 -05:00
Howard Pritchard
e47c3eaa8d
Merge pull request #6003 from yosefe/topic/scoll-basic-fix-pSync-v4.0.x
SCOLL/BASIC: Fix invalid pSync pointer passed to barrier func
2018-11-08 15:45:39 -07:00
Jeff Squyres
76d4c1843e orte-rmaps-base: update out-of-slots show_help message
Update the show_help message for when there are not enough slots to
run an application.

Also, remove a bunch of copies of this message in various show_help
text files that aren't used/referred to anywhere in the code.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 430c659908)
2018-11-08 16:03:28 -05:00
Howard Pritchard
9275b692b7
Merge pull request #6043 from hjelmn/v4.0.x_fix_this_damn_memory_barrier_bug_that_is_referenced_in_github_bug_6014_now_lets_get_this_release_out_the_door
v4.0.x: opal/asm: work around possible gcc compiler bug
2018-11-08 08:01:33 -07:00
Geoff Paulsen
d99518c0d4
Merge pull request #6046 from hjelmn/v4.0.x_fix_a_memory_barrier_bug_that_is_totally_related_to_6014_but_in_the_pmix_code
pmix3x: fix potential memory barrier bug with __atomic builtin atomics
2018-11-07 10:46:35 -06:00
Geoff Paulsen
f17dcd5961
Merge pull request #6027 from jsquyres/pr/v4.0.0-text-updates
v4.0.0: text updates
2018-11-07 10:45:34 -06:00
Jeff Squyres
9a7320fdab README: Clarify that only IB->openib is deprecated
Per feedback from https://github.com/open-mpi/ompi/pull/6028, remove
"+ob1" from the sentence to emphasize that it's only IB usage through
openib that is deprecated/superseded (i.e., ob1 is definitely not
deprecated).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 6cb4159826)
2018-11-06 10:07:32 -08:00
Jeff Squyres
7cb6cbc80f README: More updates for v4.0.0
Move the UCX and MXM text up to flow better with the rest of the
text+content.  Also emphasize that MXM is deprecated.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 4ec8e6fe22)
2018-11-06 10:07:15 -08:00
Jeff Squyres
740567ff92 README: Add extensive information about deleted MPI-1 syms
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit e2ab41efac)
2018-11-06 10:07:05 -08:00
Jeff Squyres
f149f64f7e README: Update information about UCX
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 78552e81c1)
2018-11-06 10:07:05 -08:00
Jeff Squyres
d0efdfd9c8 MPI_Type_get_envelope: remove MPI-1 deleted names
Several names are now no longer returned by MPI_Type_get_envelope.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 65eb118e08)
2018-11-06 10:07:05 -08:00
Nathan Hjelm
5efc76ef44 pmix3x: fix potential memory barrier bug with __atomic builtin atomics
See open-mpi/ompi#6014 for more information.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-11-06 10:37:14 -07:00
Nathan Hjelm
e57c3fb3c9 opal/asm: work around possible gcc compiler bug
It seems in some cases (gcc older than v6.0.0) the __atomic_thread_fence is a
no-op with __ATOMIC_ACQUIRE. This appears to be the case with X86_64 so go
ahead and use __ATOMIC_SEQ_CST for the x86_64 read memory barrier. This should
not cause any performance issues as it is equivalent to the memory barrier
in the hand-written atomics.

References #6014

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 30119ee339)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-11-06 10:28:13 -07:00
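
The workaround described in the commit above can be illustrated with a minimal sketch. It assumes a GCC- or Clang-compatible compiler that provides the `__atomic` builtins. The names `sketch_rmb` and `sketch_rmb_order` are hypothetical and are not Open MPI's actual identifiers; the sketch only shows the idea of selecting a stronger memory order on x86_64.

```c
/*
 * Hypothetical illustration only; not the actual opal/asm code.
 * On x86_64 the read barrier uses the sequentially consistent order,
 * because (per the commit message) an acquire fence may compile to a
 * no-op with some gcc versions older than v6.0.0.
 */
static inline int sketch_rmb_order(void)
{
#if defined(__x86_64__)
    return __ATOMIC_SEQ_CST;
#else
    return __ATOMIC_ACQUIRE;
#endif
}

/* Read memory barrier built on the compiler's fence builtin. */
static inline void sketch_rmb(void)
{
#if defined(__x86_64__)
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#else
    __atomic_thread_fence(__ATOMIC_ACQUIRE);
#endif
}
```

A caller would place `sketch_rmb()` between reading a flag written by another thread and reading the data that the flag guards.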
Geoff Paulsen
4008c46e84
Merge pull request #6032 from gpaulsen/topic/v4.0.x/README_lsf
README: updating LSF version supported to 9.1.1 or later
2018-11-05 14:16:01 -06:00
Geoffrey Paulsen
f5dbecd5e7 README: updating LSF version supported to 9.1.1 or later
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
(cherry picked from commit 0100595898)
2018-11-05 02:33:42 -06:00
Geoff Paulsen
e720a9d31d
Merge pull request #6018 from gpaulsen/topic/v4.0.x/api_removal_for_v4.0.0
mpi.h: restore some MPI-deprecated items to default builds
2018-11-02 16:17:04 -05:00
Howard Pritchard
949dbb4b25
Merge pull request #6017 from hppritcha/topic/swat_issue5810_4.0.x
btl/openib: fix a problem with ib query
2018-11-02 13:11:58 -06:00
Geoffrey Paulsen
2d3b4bb91a mpi.h: restore some MPI-deprecated items to default builds
Commit 89da9651b inadvertently #if'ed out both deprecated *and*
removed items from mpi.h.  The intent was only to #if out items that
have been *removed* from the MPI specification and leave all items
that are merely deprecated.

This commit also re-orders the deleted typedef+functions to be in the
same order as they are listed in MPI-3.1 chapter 17, just to make
verifying/checking the code easier.

Note that --enable-mpi1-compatibility can still be used to restore
prototypes for the items that have been removed from the MPI
specification (e.g., MPI_Address()).

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit b03a39d359)
2018-11-02 14:07:26 -05:00
Howard Pritchard
bbfde1533b btl/openib: fix a problem with ib query
Under certain circumstances, ibv_exp_query_device was
returning an error due to uninitialized fields in the
extended attributes struct.

Fixes: #5810
Fixes: #5914

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 8126779a35)
2018-11-02 10:42:17 -06:00
Sergey Oblomov
b416c8afe2 OSHMEM/AMO: code beautify
- added <cr> to split API groups to simplify human processing

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 6e78102089)
2018-11-01 16:52:01 +02:00
Sergey Oblomov
d3f08d010c OSHMEM/AMO: added missing C11 macro datatypes
- added signed datatypes for atomic_add calls
- added unsigned datatypes for atomic put/inc/get/fetch calls
- fixed incorrect SHMEM_CTX_DEFAULT macro, added
  external declaration of oshmem_ctx_default variable

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit f63d6da6d7)
2018-11-01 16:51:56 +02:00
Sergey Oblomov
aaf15a6c17 OSHMEM/PROFILE: fixed profile build
- added missing file to profile makefile
- constants SHMEM_CTX_* are shifted into public header

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 4a3e83780c)
2018-11-01 16:37:49 +02:00
Howard Pritchard
e851879081
Merge pull request #5994 from rhc54/cmr40/cleanup
Remove stale defunct tools
2018-10-31 13:29:57 -06:00
Aravind Gopalakrishnan
5a74ddb34d coll/tuned: Fix MPI_IN_PLACE processing in tuned algorithms
PR #5450 addresses MPI_IN_PLACE processing for basic collective algorithms.
But in conjunction with that, we need to check for MPI_IN_PLACE in tuned paths
as well before calling ompi_datatype_type_size() as otherwise we segfault.

MPI spec also stipulates to ignore sendcount and sendtype for Alltoall and
Allgatherv operations. So, extending the check to these algorithms as well.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit 88d781056f)
2018-10-31 11:37:29 -07:00
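
The kind of check described in the commit above might look roughly like the following sketch. It does not use the real MPI or Open MPI headers: `sketch_datatype_t`, `SKETCH_IN_PLACE`, and `sketch_element_size` are hypothetical stand-ins for the datatype handle, `MPI_IN_PLACE`, and the size query, chosen so the example compiles on its own.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not Open MPI types. */
typedef struct {
    size_t size;                       /* size of one element in bytes */
} sketch_datatype_t;

#define SKETCH_IN_PLACE ((const void *) 1)   /* plays the role of MPI_IN_PLACE */

/*
 * Return the element size to use when sizing a collective operation.
 * When the send buffer is the in-place marker, the send datatype must be
 * ignored (it may be invalid), so only the receive datatype is consulted.
 */
static size_t sketch_element_size(const void *sbuf,
                                  const sketch_datatype_t *sdtype,
                                  const sketch_datatype_t *rdtype)
{
    if (SKETCH_IN_PLACE == sbuf) {
        return rdtype->size;           /* never dereference sdtype here */
    }
    return sdtype->size;
}
```

Testing for the in-place marker before touching the send datatype is what avoids the crash: in the in-place case the caller is allowed to pass an arbitrary send datatype.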
Yossi Itigin
8a329a797c SCOLL/BASIC: Fix invalid pSync pointer passed to barrier func
mca_scoll_basic_alltoall() passed (pSync + 1) to barrier function, but
the value of _SHMEM_ALLTOALL_SYNC_SIZE is 1, which made the barrier
function use an invalid memory location. In particular, this location
was not initialized to _SHMEM_SYNC_VALUE, which broke the barrier
algorithm and it did not complete: One PE could read 0 from its peer and
assume the peer already started the barrier, and then write 1 to the
peer. Then, the peer entered the barrier and overwrote the 1 with 0, and
then it waited forever to see '1' in its pSync.

Found with shmem_verifier test suite.

(picked from master 6754bf1)

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-31 12:22:19 +02:00
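
The out-of-bounds access described in the commit above can be illustrated with a small sketch. It is not the actual fix: `sketch_barrier_sync` is a hypothetical helper that returns a pointer into a sync array only when the requested region lies inside it.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns psync + offset if `needed` elements starting at `offset` fit
 * inside an array of `psync_len` elements, and NULL otherwise.
 */
static long *sketch_barrier_sync(long *psync, size_t psync_len,
                                 size_t offset, size_t needed)
{
    if (offset > psync_len || needed > psync_len - offset) {
        return NULL;                   /* region would run past the array */
    }
    return psync + offset;
}
```

With a sync array of length 1, an offset of 1 leaves no room for even a single element, which matches the situation the commit message describes for `pSync + 1`.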
Geoff Paulsen
cd1e927e1b
Merge pull request #5995 from gpaulsen/topic/v4.0.x/fix_pgi
COMMON/UCX: added error code to log output
2018-10-30 15:11:37 -05:00
Ralph Castain
ba6ad9fe42 Remove stale defunct tools
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 05ac8fa71c)
2018-10-30 08:51:25 -07:00
Sergey Oblomov
0846c9d112 COMMON/UCX: added error code to log output
Also fixes a PGI compilation error with --enable-debug.

Signed-off-by: Geoff Paulsen <gpaulsen@users.noreply.github.com>
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 1099d5f023)
2018-10-30 09:55:25 -05:00
Ralph Castain
712ddd326f Remove the stale orte-dvm code
Users should migrate to https://github.com/pmix/prrte

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 1bd772e8eb)
2018-10-30 07:54:35 -07:00
Geoff Paulsen
660743cd3a
Merge pull request #5988 from hppritcha/topic/libevent_configure_4.0.x
Topic/libevent configure 4.0.x
2018-10-30 07:49:46 -05:00
Geoff Paulsen
21d2597afd
Merge pull request #5974 from rhc54/cmr4/mpir
v4.0: Provide deprecation warning of MPIR debugger
2018-10-29 16:29:15 -05:00
Gilles Gouaillardet
7f36dce348 event/external: fix version requirement
Only default to the external component if its version is
greater than or equal to that of the internal libevent (2.0.22)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit b205039205)
2018-10-29 15:20:48 -06:00
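
The version requirement in the commit above amounts to a three-part version comparison. The real check lives in the configure logic; the C sketch below only illustrates the comparison, and the names `sketch_version_t`, `sketch_version_ge`, and `sketch_prefer_external` are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical representation of a major.minor.patch version. */
typedef struct {
    int major;
    int minor;
    int patch;
} sketch_version_t;

/* Return true if version a is greater than or equal to version b. */
static bool sketch_version_ge(sketch_version_t a, sketch_version_t b)
{
    if (a.major != b.major) {
        return a.major > b.major;
    }
    if (a.minor != b.minor) {
        return a.minor > b.minor;
    }
    return a.patch >= b.patch;
}

/* Prefer the external libevent only if it is at least the internal 2.0.22. */
static bool sketch_prefer_external(sketch_version_t external)
{
    const sketch_version_t internal = { 2, 0, 22 };
    return sketch_version_ge(external, internal);
}
```

Under this sketch an external 2.0.21 would not be selected by default, while 2.0.22 or anything newer would.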