1
1

10443 Коммитов

Автор SHA1 Сообщение Дата
Howard Pritchard
3f752f1d4f
Merge pull request #7237 from mcoil1/pr/v4.0.x/wbailey2-fixes
v4.0.x/two fixes
2019-12-17 09:22:07 -07:00
William Bailey
c01a71fbe9 romio: fix uninitialized variable
Squash compiler warning.

ROMIO is third-party software but has an annoying compiler warning;
this is the minimum distance fix.

Signed-off-by: William Bailey <wbailey2@nd.edu>
(cherry picked from commit 30bda56bcef6f56823ac07f0418fd33e1eff837f)
2019-12-14 17:18:57 -05:00
Maxwell Coil
6fdd902d3f romio: Update ADIOI_R_Exchange_data function
Squash compiler warning due to whitespace/brace problems.

The code block from lines 829-839 was improperly indented, which led to
both the code being confusing and a compiler warning. Comparing this code to
the current version in the MPICH repo made it clear that the code was simply
improperly indented. Fixing the indentation both makes the code readable and
squashes the compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
(cherry picked from commit 8c237e268472a763ad4aa55e24d25ee7ca64b888)
2019-12-14 12:25:51 -05:00
Maxwell Coil
84a67bd6cf libnbc: fixed uninitialized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
(cherry picked from commit 52241dbbcdcbf3605c8098d0cfbcf3c5a75a1c9c)
2019-12-14 12:25:18 -05:00
Maxwell Coil
879a25c239 ompi/dpm/dpm.c: Fix uninititalized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
(cherry picked from commit 3ced33c2ebf403206c242c89bb70b6b9a0fc8968)
2019-12-14 12:24:57 -05:00
Howard Pritchard
59d8d62555
Merge pull request #7116 from ggouaillardet/topic/v4.0.x/f08_bind_c_constants_revamp
v4.0.x: fortran/use-mpi-f08: revamp mpi_f08 constants
2019-12-13 08:07:42 -07:00
Todd Kordenbrock
1f5a79bbd4 mtl-portals4: don't finalize flow control if Portals4 was not initialized
This commit fixes a segfault in mtl-portals4 finalize().  The segfault
occurs if finalize() is called without any calls to add_procs().  This
commit resolves the segfault by skipping the flow control fini() call if
Portals4 was not initialized.

Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
(cherry picked from commit e7b867c044f8b776b75f3c6d917745c06237743e)
2019-12-12 08:43:20 -06:00
William Bailey
71fe9d78e0 fcoll/two_phase: Compiler warning for wrong variable type used
Squash compiler warning. Changed output specifier to match variable type (long int -> long long int).

Signed-off-by: William Bailey <wbailey2@nd.edu>
(cherry picked from commit e2718e01961782bffeef0bdfdb19ea286da0db2a)
2019-12-08 14:15:29 -05:00
Gilles Gouaillardet
b004f4c391 schizo/ompi: correctly handle the yield_when_idle option
in schizo/ompi, sets the new OMPI_MCA_mpi_oversubscribe environment
variable according to the node oversubscription state.

This MCA parameter is used to set the default value of the
mpi_yield_when_idle parameter.

This two steps tango is needed so the mpi_yield_when_idle setting
is always honored when set in a config file.

Refs. open-mpi/ompi#6433

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry-picked from cc97c0f6116ae62feef)
2019-12-02 17:18:26 -05:00
Howard Pritchard
d4dd837a3c
Merge pull request #7194 from edgargabriel/pr/two-phase-aggr-calc-32bits-bug-v4.0.x
fcoll/two_phase: fix error in calculating aggregators in 32bit mode
2019-11-27 12:20:34 -07:00
Edgar Gabriel
02da54c174 fcoll/two_phase: fix error in calculating aggregators in 32bit mode
In fcoll_two_phase_supprot_fns.c: calculation of the aggregator index
failed for large offsets on 32bit machine, due to improper handling of
64bit offsets.

Fixes Issue #7110

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit ea1355beae918b3acd67d5c0ccc44afbcc5b7ca9)
2019-11-25 09:06:36 -06:00
Edgar Gabriel
39acc3a251 common/ompio: fix calculation in simple-grouping option
This is based on a bug reported on the mailing list using a netcdf testcase.
The problem occurs if processes are using a custom file view, but on some
of them it appears as if the default file view is being used. Because of that,
the simple-grouping option lead to different number of aggregators used on different
processes, and ultimately to a deadlock. This patch fixes the problem by not using
the file_view size anymore for the calculation in the simple-grouping option,
but the contiguous chunk size (which is identical on all processes).

Fixes issue #7109

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit ad5d0df4e91d66efa5b69e8388f7ed0b4b6a1d09)
2019-11-25 09:04:13 -06:00
Gilles Gouaillardet
02c79ac0c8 fortran/use-mpi-f08: misc fixes
- fix typos from open-mpi/ompi@b10a60a5a9
 - remove remaining references to OMPI_PROTECTED from open-mpi/ompi@df6d763a53

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@fda4d040da)
2019-11-06 10:10:22 +09:00
Gilles Gouaillardet
0ab61c9b74 fortran/use-mpi-f08: remove unused references to OMPI_PROTECTED
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(back-ported from commit open-mpi/ompi@df6d763a53)
2019-11-06 10:10:22 +09:00
Gilles Gouaillardet
23ed2f44a2 fortran/use-mpi-f08: revamp constant declarations
In order to work around an issue with flang based compilers,
avoid declaring bind(C) constants and use plain Fortran parameter
instead.

For example,
type(MPI_Comm), bind(C, name="ompi_f08_mpi_comm_world") OMPI_PROTECTED :: MPI_COMM_WORLD
is changed to
type(MPI_Comm), parameter :: MPI_COMM_WORLD = MPI_Comm(OMPI_MPI_COMM_WORLD)

Note that in order to preserve ABI compatibility, ompi/mpi/fortran/use-mpi-f08/constants.{c,h}
have been kept even if its symbols are no more referenced by Open MPI.

Refs. open-mpi/ompi#7091

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(back-ported from commit open-mpi/ompi@b10a60a5a9)
2019-11-06 10:10:18 +09:00
KAWASHIMA Takahiro
56d2865cf6 fortran/use-mpi-f08: Add C++ datatypes and MPI_NO_OP
Though the MPI standard does not have `MPI_CXX_COMPLEX`, `mpi.h`,
`mpif.h`, and `mpi.mod` have it. So I added it for consistency.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>

(cherry picked from commit open-mpi/ompi@63ecf01610)
2019-11-06 09:49:17 +09:00
KAWASHIMA Takahiro
792b5a01a5 fortran/use-mpi-f08: Remove unnecessary ;
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>

(cherry picked from commit open-mpi/ompi@e0c5bad195)
2019-11-06 09:48:58 +09:00
Gilles Gouaillardet
628883b38b fortran/use-mpi-f08: add MPI C types
Refers open-mpi/ompi#5801

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@69f1a19c5d)
2019-11-06 09:48:11 +09:00
Geoff Paulsen
524960dcdd
Merge pull request #7119 from devreal/grequestx-progress-v4.0.x
Ensure that grequestx continuously make progress (v4.0.x)
2019-11-01 14:12:48 -05:00
Howard Pritchard
608502ff85
Merge pull request #7102 from edgargabriel/pr/v4.0.x-romio321-status-set-elements-fix
MPIR_Status_set_bytes: fix for large count sizes
2019-11-01 13:08:35 -06:00
Joseph Schuchart
b7f5c17d83 Ensure that grequestx continuously make progress
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
(cherry picked from commit 37e6bbb1e12a61b4a7ac90c411752a768fede67d)
2019-10-29 10:31:23 +01:00
Edgar Gabriel
a3e1ecc14b comomn_ompio_file_read/write: fix 2GB limiting issue
individual read/write operations exceeding 2GB fail in ompio
due to improper conversions from size_t to int in two different
locations. This commit fixes an issue reported by Richard Warren
from the HDF5 group.

Fixes Issue #7045

Cherry-picked from commit a130f569df6badebe639d17c0a1a0cb79a596094

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-10-22 12:12:55 -05:00
Edgar Gabriel
6185fa1946 MPIR_Status_set_bytes: fix for large count sizes
Change the ncounts argument to MPI_Count and use
MPI_Status_set_elements_x for enabling read/write operations beyond
the 2GB limit.

Thanks to  Richard Warren from the HDF5 group for reporting the issue
and providing the suggested fix for romio.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit 8a3abbf80373bb36eb78688954118f1fb0cc2e9d)
2019-10-22 09:51:29 -05:00
Howard Pritchard
5f3dbdb5c8 mtl/ofi: replace OMPI_UNLIKELY with OPAL version
one off patch for v4.0.x.  for some reason commit on master
didn't have this problem.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2019-09-26 16:01:28 -05:00
Geoff Paulsen
32984ceb65
Merge pull request #7005 from mwheinz/REFS6976-4.0.x
v4.0.x: REF6976 Silent failure of OMPI over OFI with large messages sizes
2019-09-24 12:25:43 -05:00
Michael Heinz
89be953cfd REF6976 Silent failure of OMPI over OFI with large messages sizes
INTERNAL: STL-59403

The OFI (libfabric) MTL does not respect the maximum message size
parameter that OFI provides in the fi_info data.

This patch adds this missing max_msg_size field to the mca_ofi_module_t
structure and adds a length check to the low-level send routines.

(cherry-picked from commit 3aca4af548a3d781b6b52f89f4d6c7e66d379609)
Change-Id: Ie50445e5edfb0f30916de0836db0edc64ecf7c60
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Reviewed-by: Brendan Cunningham <brendan.cunningham@intel.com>
2019-09-23 17:19:10 -04:00
guserav
9bf1873215 Fix osc sm posts when only 32 bit atomics support
Signed-off-by: guserav <erik.zeiske@hpe.com>
(cherry picked from commit 3c9f4e682369e6fd5860b46ba81d79f2d1599a35)
2019-08-31 12:31:19 -07:00
Geoff Paulsen
2d515f747f
Merge pull request #6934 from devreal/osc-ucx-excl-lock-v4.0.x
UCX osc: properly release exclusive lock to avoid lockup (v4.0.x)
2019-08-29 13:41:03 -05:00
Joseph Schuchart
8d130e1964 UCX osc: properly release exclusive lock to avoid lockup
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
(cherry picked from commit 08cb6389e034c1a70368671f745f20904c774a1e)
2019-08-27 23:12:56 +02:00
Valentin Petrov
83a2518994 Coll/hcoll: fixes hcoll non-blocking colls support
open-mpi/ompi@0fe756d416 Introduced
    a bug in coll/hcoll component. The ompi_requests allocated by
    libhcoll would be treated as coll_base_nbc_request during
    ompi_coll_base_retain_<> call. Afterwards this would lead to a
    segv in the request cleanup.

    Fix: since libhcoll interface does not distinguish between the
    blocling/non-blocking requests use coll_base_nbc_request all the
    time and initialize it properly in
    coll/hcoll/get_coll_handle(). It is still within 2 cache lines.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2019-08-27 17:23:52 +03:00
Howard Pritchard
e4adbeefe7
Merge pull request #6905 from edgargabriel/pr/file-seek-end-fix-v4.0.x
io_ompio_file_open: fix offset calculation with SEEK_END
2019-08-23 13:11:33 -06:00
Geoff Paulsen
390e0bc5b2
Merge pull request #6863 from bosilca/topic/backport_6695
Refresh of the datatype engine from Topic/backport 6695
2019-08-21 10:49:37 -05:00
Howard Pritchard
f96994b12f
Merge pull request #6865 from rhc54/cmr40/locality
Provide locality for all procs on node
2019-08-19 13:26:59 -06:00
Howard Pritchard
7b09c15b90
Merge pull request #6892 from janjust/v4.0.x-osc_fix
v4.0.x: osc/ucx: Fix possible win creation/destruction race condition
2019-08-19 13:26:32 -06:00
George Bosilca
c9f48e2e77
Whitespace cleanup
No code or logic changes.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-08-16 10:27:43 -04:00
Edgar Gabriel
d72d39bfee io_ompio_file_open: fix offset calculation with SEEK_END
and SEEK_CUR. fixes an issue reported by Wei-keng Liao

Fixes Issue #6858

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-08-16 09:03:10 -05:00
Ralph Castain
e17203b4f7
Silence Coverity warning
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-08-12 12:42:41 -07:00
Ralph Castain
14f3fbb8c1
Provide locality for all procs on node
Update PMIx to latest master to get supporting updates. For
connect/accept (part of comm_spawn as well), lookup locality for all
participating procs on the node and compute the relative locality so it
can be used for MPI operations.

Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit d202e10c1407d2f9177e9b871eadde1f25526676)
2019-08-12 12:42:40 -07:00
Tomislav Janjusic
e9a0343780 osc/ucx: Fix possible win creation/destruction race condition
To avoid fully initializing the osc/ucx component for MPI application
that are not using One-Sided functionality, the initialization happens
at the first MPI window creation.

This commit ensures atomicity of global state modifications.

ported from: 6678ac0f557935b291ec2310216b7ea46e0c13b1
Signed-off-by: Artem Polyakov <artpol84@gmail.com>

fix alignment, and fix error path
2019-08-12 22:23:17 +03:00
Gilles Gouaillardet
39ec580b76 coll/base: only retain datatypes/op if the request has not yet completed
a non blocking collective might return ompi_request_null, so we should not
retain anything in that case.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@63d3ccde9d)
2019-08-13 00:13:40 +09:00
Gilles Gouaillardet
ae26957619 coll/base: cleanup ompi_coll_base_nbc_request_t elements
Since ompi_coll_base_nbc_request_t is to be used in an
opal_free_list_t, it must be returned into a "clean" state.
So cleanup some data in the callback completion subroutines.

This fixes a regression introduced in open-mpi/ompi@0fe756d416

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@0862c409f1)
2019-08-13 00:13:40 +09:00
Gilles Gouaillardet
b37c85dcca coll/libnbc: fixes ompi ompi_coll_libnbc_request_t parent
base ompi_coll_libnbc_request_t on top of ompi_coll_base_nbc_request_t
to correctly support the retention of datatypes/operators

This fixes a regression introduced in open-mpi/ompi@0fe756d416

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@f8eef0fde9)
2019-08-13 00:13:40 +09:00
Sergey Oblomov
2fa112c0a6 UCX: added PPN hint for UCX context
- added PPN hint for UCX context init

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 43186e494b47ca29e8d5e7a864b6b98b8e873195)

Conflicts:
	opal/mca/common/ucx/common_ucx_wpool.c
2019-08-09 11:51:30 +03:00
George Bosilca
8b794235b8
Update the datatype dump to match the actual types.
Update the comments to better reflect what is going on.
Minor indentations.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-05 09:37:47 -04:00
George Bosilca
4f754d0156
Optimized datatype description.
Move toward a base type of vector (count, type, blocklen, extent, disp)
with disp and extent applying toward the count repertition and blocklen
being a contiguous memory of type type.
Implement 2 optimizations on this description used during type_commit:
- collapse: successive similar datatype descriptions are collapsed
together with an increased count.
- fusion: fuse successive datatype descriptions in order to minimize the
number of resulting memcpy during pack/unpack.

Fixes at the OMPI datatype level including:
 - Fix the create_hindexed and vector creation.
 - Fix the handling of [get|set]_elements and _count.
 - Correctly compute the dispacement for block indexed types.
 - Support the MPI_LB and MPI_UB deprecation, aka. OMPI_ENABLE_MPI1_COMPAT.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-05 09:35:07 -04:00
George Bosilca
f68b06e9ee
Fix incorrect behavior with length == 0
Fixes #6575.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-05 09:33:28 -04:00
Howard Pritchard
e547a2b94d
Merge pull request #6838 from ggouaillardet/topic/v4.0.x/misc_fortran_bindings
v4.0.x: misc Fortran related backports
2019-08-02 13:00:31 -06:00
Howard Pritchard
31aa52f11a
Merge pull request #6846 from nysal/topic/v4.0.x/ucx_accumulate_fix
v4.0.x: osc/ucx: Fix data corruption with non-contiguous accumulates
2019-08-02 12:43:40 -06:00
Nysal Jan K.A
359cdf2b53 osc/ucx: Fix data corruption with non-contiguous accumulates
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
(cherry picked from commit 3529d447020684ab305411caa97423826bb40906)
2019-07-26 14:41:08 +05:30
Mikhail Brinskii
b9998a14dc COLL/TUNED: Minor var names/comments fixes
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 65618f8db848613c95cbe112033df94721d326a8)
2019-07-26 11:29:12 +03:00