1
1

10338 Коммитов

Автор SHA1 Сообщение Дата
Maxwell Coil
84a67bd6cf libnbc: fixed uninitialized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
(cherry picked from commit 52241dbbcdcbf3605c8098d0cfbcf3c5a75a1c9c)
2019-12-14 12:25:18 -05:00
Maxwell Coil
879a25c239 ompi/dpm/dpm.c: Fix uninititalized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
(cherry picked from commit 3ced33c2ebf403206c242c89bb70b6b9a0fc8968)
2019-12-14 12:24:57 -05:00
Howard Pritchard
59d8d62555
Merge pull request #7116 from ggouaillardet/topic/v4.0.x/f08_bind_c_constants_revamp
v4.0.x: fortran/use-mpi-f08: revamp mpi_f08 constants
2019-12-13 08:07:42 -07:00
William Bailey
71fe9d78e0 fcoll/two_phase: Compiler warning for wrong variable type used
Squash compiler warning. Changed output specifier to match variable type (long int -> long long int).

Signed-off-by: William Bailey <wbailey2@nd.edu>
(cherry picked from commit e2718e01961782bffeef0bdfdb19ea286da0db2a)
2019-12-08 14:15:29 -05:00
Howard Pritchard
d4dd837a3c
Merge pull request #7194 from edgargabriel/pr/two-phase-aggr-calc-32bits-bug-v4.0.x
fcoll/two_phase: fix error in calculating aggregators in 32bit mode
2019-11-27 12:20:34 -07:00
Edgar Gabriel
02da54c174 fcoll/two_phase: fix error in calculating aggregators in 32bit mode
In fcoll_two_phase_supprot_fns.c: calculation of the aggregator index
failed for large offsets on 32bit machine, due to improper handling of
64bit offsets.

Fixes Issue #7110

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit ea1355beae918b3acd67d5c0ccc44afbcc5b7ca9)
2019-11-25 09:06:36 -06:00
Edgar Gabriel
39acc3a251 common/ompio: fix calculation in simple-grouping option
This is based on a bug reported on the mailing list using a netcdf testcase.
The problem occurs if processes are using a custom file view, but on some
of them it appears as if the default file view is being used. Because of that,
the simple-grouping option lead to different number of aggregators used on different
processes, and ultimately to a deadlock. This patch fixes the problem by not using
the file_view size anymore for the calculation in the simple-grouping option,
but the contiguous chunk size (which is identical on all processes).

Fixes issue #7109

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit ad5d0df4e91d66efa5b69e8388f7ed0b4b6a1d09)
2019-11-25 09:04:13 -06:00
Gilles Gouaillardet
02c79ac0c8 fortran/use-mpi-f08: misc fixes
- fix typos from open-mpi/ompi@b10a60a5a9
 - remove remaining references to OMPI_PROTECTED from open-mpi/ompi@df6d763a53

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@fda4d040da)
2019-11-06 10:10:22 +09:00
Gilles Gouaillardet
0ab61c9b74 fortran/use-mpi-f08: remove unused references to OMPI_PROTECTED
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(back-ported from commit open-mpi/ompi@df6d763a53)
2019-11-06 10:10:22 +09:00
Gilles Gouaillardet
23ed2f44a2 fortran/use-mpi-f08: revamp constant declarations
In order to work around an issue with flang based compilers,
avoid declaring bind(C) constants and use plain Fortran parameter
instead.

For example,
type(MPI_Comm), bind(C, name="ompi_f08_mpi_comm_world") OMPI_PROTECTED :: MPI_COMM_WORLD
is changed to
type(MPI_Comm), parameter :: MPI_COMM_WORLD = MPI_Comm(OMPI_MPI_COMM_WORLD)

Note that in order to preserve ABI compatibility, ompi/mpi/fortran/use-mpi-f08/constants.{c,h}
have been kept even if its symbols are no more referenced by Open MPI.

Refs. open-mpi/ompi#7091

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(back-ported from commit open-mpi/ompi@b10a60a5a9)
2019-11-06 10:10:18 +09:00
KAWASHIMA Takahiro
56d2865cf6 fortran/use-mpi-f08: Add C++ datatypes and MPI_NO_OP
Though the MPI standard does not have `MPI_CXX_COMPLEX`, `mpi.h`,
`mpif.h`, and `mpi.mod` have it. So I added it for consistency.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>

(cherry picked from commit open-mpi/ompi@63ecf01610)
2019-11-06 09:49:17 +09:00
KAWASHIMA Takahiro
792b5a01a5 fortran/use-mpi-f08: Remove unnecessary ;
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>

(cherry picked from commit open-mpi/ompi@e0c5bad195)
2019-11-06 09:48:58 +09:00
Gilles Gouaillardet
628883b38b fortran/use-mpi-f08: add MPI C types
Refers open-mpi/ompi#5801

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@69f1a19c5d)
2019-11-06 09:48:11 +09:00
Geoff Paulsen
524960dcdd
Merge pull request #7119 from devreal/grequestx-progress-v4.0.x
Ensure that grequestx continuously make progress (v4.0.x)
2019-11-01 14:12:48 -05:00
Howard Pritchard
608502ff85
Merge pull request #7102 from edgargabriel/pr/v4.0.x-romio321-status-set-elements-fix
MPIR_Status_set_bytes: fix for large count sizes
2019-11-01 13:08:35 -06:00
Joseph Schuchart
b7f5c17d83 Ensure that grequestx continuously make progress
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
(cherry picked from commit 37e6bbb1e12a61b4a7ac90c411752a768fede67d)
2019-10-29 10:31:23 +01:00
Edgar Gabriel
a3e1ecc14b comomn_ompio_file_read/write: fix 2GB limiting issue
individual read/write operations exceeding 2GB fail in ompio
due to improper conversions from size_t to int in two different
locations. This commit fixes an issue reported by Richard Warren
from the HDF5 group.

Fixes Issue #7045

Cherry-picked from commit a130f569df6badebe639d17c0a1a0cb79a596094

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-10-22 12:12:55 -05:00
Edgar Gabriel
6185fa1946 MPIR_Status_set_bytes: fix for large count sizes
Change the ncounts argument to MPI_Count and use
MPI_Status_set_elements_x for enabling read/write operations beyond
the 2GB limit.

Thanks to  Richard Warren from the HDF5 group for reporting the issue
and providing the suggested fix for romio.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit 8a3abbf80373bb36eb78688954118f1fb0cc2e9d)
2019-10-22 09:51:29 -05:00
Howard Pritchard
5f3dbdb5c8 mtl/ofi: replace OMPI_UNLIKELY with OPAL version
one off patch for v4.0.x.  for some reason commit on master
didn't have this problem.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2019-09-26 16:01:28 -05:00
Geoff Paulsen
32984ceb65
Merge pull request #7005 from mwheinz/REFS6976-4.0.x
v4.0.x: REF6976 Silent failure of OMPI over OFI with large messages sizes
2019-09-24 12:25:43 -05:00
Michael Heinz
89be953cfd REF6976 Silent failure of OMPI over OFI with large messages sizes
INTERNAL: STL-59403

The OFI (libfabric) MTL does not respect the maximum message size
parameter that OFI provides in the fi_info data.

This patch adds this missing max_msg_size field to the mca_ofi_module_t
structure and adds a length check to the low-level send routines.

(cherry-picked from commit 3aca4af548a3d781b6b52f89f4d6c7e66d379609)
Change-Id: Ie50445e5edfb0f30916de0836db0edc64ecf7c60
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Reviewed-by: Brendan Cunningham <brendan.cunningham@intel.com>
2019-09-23 17:19:10 -04:00
guserav
9bf1873215 Fix osc sm posts when only 32 bit atomics support
Signed-off-by: guserav <erik.zeiske@hpe.com>
(cherry picked from commit 3c9f4e682369e6fd5860b46ba81d79f2d1599a35)
2019-08-31 12:31:19 -07:00
Geoff Paulsen
2d515f747f
Merge pull request #6934 from devreal/osc-ucx-excl-lock-v4.0.x
UCX osc: properly release exclusive lock to avoid lockup (v4.0.x)
2019-08-29 13:41:03 -05:00
Joseph Schuchart
8d130e1964 UCX osc: properly release exclusive lock to avoid lockup
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
(cherry picked from commit 08cb6389e034c1a70368671f745f20904c774a1e)
2019-08-27 23:12:56 +02:00
Valentin Petrov
83a2518994 Coll/hcoll: fixes hcoll non-blocking colls support
open-mpi/ompi@0fe756d416 Introduced
    a bug in coll/hcoll component. The ompi_requests allocated by
    libhcoll would be treated as coll_base_nbc_request during
    ompi_coll_base_retain_<> call. Afterwards this would lead to a
    segv in the request cleanup.

    Fix: since libhcoll interface does not distinguish between the
    blocling/non-blocking requests use coll_base_nbc_request all the
    time and initialize it properly in
    coll/hcoll/get_coll_handle(). It is still within 2 cache lines.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2019-08-27 17:23:52 +03:00
Howard Pritchard
e4adbeefe7
Merge pull request #6905 from edgargabriel/pr/file-seek-end-fix-v4.0.x
io_ompio_file_open: fix offset calculation with SEEK_END
2019-08-23 13:11:33 -06:00
Geoff Paulsen
390e0bc5b2
Merge pull request #6863 from bosilca/topic/backport_6695
Refresh of the datatype engine from Topic/backport 6695
2019-08-21 10:49:37 -05:00
Howard Pritchard
f96994b12f
Merge pull request #6865 from rhc54/cmr40/locality
Provide locality for all procs on node
2019-08-19 13:26:59 -06:00
Howard Pritchard
7b09c15b90
Merge pull request #6892 from janjust/v4.0.x-osc_fix
v4.0.x: osc/ucx: Fix possible win creation/destruction race condition
2019-08-19 13:26:32 -06:00
George Bosilca
c9f48e2e77
Whitespace cleanup
No code or logic changes.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-08-16 10:27:43 -04:00
Edgar Gabriel
d72d39bfee io_ompio_file_open: fix offset calculation with SEEK_END
and SEEK_CUR. fixes an issue reported by Wei-keng Liao

Fixes Issue #6858

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-08-16 09:03:10 -05:00
Ralph Castain
e17203b4f7
Silence Coverity warning
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-08-12 12:42:41 -07:00
Ralph Castain
14f3fbb8c1
Provide locality for all procs on node
Update PMIx to latest master to get supporting updates. For
connect/accept (part of comm_spawn as well), lookup locality for all
participating procs on the node and compute the relative locality so it
can be used for MPI operations.

Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit d202e10c1407d2f9177e9b871eadde1f25526676)
2019-08-12 12:42:40 -07:00
Tomislav Janjusic
e9a0343780 osc/ucx: Fix possible win creation/destruction race condition
To avoid fully initializing the osc/ucx component for MPI application
that are not using One-Sided functionality, the initialization happens
at the first MPI window creation.

This commit ensures atomicity of global state modifications.

ported from: 6678ac0f557935b291ec2310216b7ea46e0c13b1
Signed-off-by: Artem Polyakov <artpol84@gmail.com>

fix alignment, and fix error path
2019-08-12 22:23:17 +03:00
Gilles Gouaillardet
39ec580b76 coll/base: only retain datatypes/op if the request has not yet completed
a non blocking collective might return ompi_request_null, so we should not
retain anything in that case.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@63d3ccde9d)
2019-08-13 00:13:40 +09:00
Gilles Gouaillardet
ae26957619 coll/base: cleanup ompi_coll_base_nbc_request_t elements
Since ompi_coll_base_nbc_request_t is to be used in an
opal_free_list_t, it must be returned into a "clean" state.
So cleanup some data in the callback completion subroutines.

This fixes a regression introduced in open-mpi/ompi@0fe756d416

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@0862c409f1)
2019-08-13 00:13:40 +09:00
Gilles Gouaillardet
b37c85dcca coll/libnbc: fixes ompi ompi_coll_libnbc_request_t parent
base ompi_coll_libnbc_request_t on top of ompi_coll_base_nbc_request_t
to correctly support the retention of datatypes/operators

This fixes a regression introduced in open-mpi/ompi@0fe756d416

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@f8eef0fde9)
2019-08-13 00:13:40 +09:00
Sergey Oblomov
2fa112c0a6 UCX: added PPN hint for UCX context
- added PPN hint for UCX context init

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 43186e494b47ca29e8d5e7a864b6b98b8e873195)

Conflicts:
	opal/mca/common/ucx/common_ucx_wpool.c
2019-08-09 11:51:30 +03:00
George Bosilca
8b794235b8
Update the datatype dump to match the actual types.
Update the comments to better reflect what is going on.
Minor indentations.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-05 09:37:47 -04:00
George Bosilca
4f754d0156
Optimized datatype description.
Move toward a base type of vector (count, type, blocklen, extent, disp)
with disp and extent applying toward the count repertition and blocklen
being a contiguous memory of type type.
Implement 2 optimizations on this description used during type_commit:
- collapse: successive similar datatype descriptions are collapsed
together with an increased count.
- fusion: fuse successive datatype descriptions in order to minimize the
number of resulting memcpy during pack/unpack.

Fixes at the OMPI datatype level including:
 - Fix the create_hindexed and vector creation.
 - Fix the handling of [get|set]_elements and _count.
 - Correctly compute the dispacement for block indexed types.
 - Support the MPI_LB and MPI_UB deprecation, aka. OMPI_ENABLE_MPI1_COMPAT.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-05 09:35:07 -04:00
George Bosilca
f68b06e9ee
Fix incorrect behavior with length == 0
Fixes #6575.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-05 09:33:28 -04:00
Howard Pritchard
e547a2b94d
Merge pull request #6838 from ggouaillardet/topic/v4.0.x/misc_fortran_bindings
v4.0.x: misc Fortran related backports
2019-08-02 13:00:31 -06:00
Howard Pritchard
31aa52f11a
Merge pull request #6846 from nysal/topic/v4.0.x/ucx_accumulate_fix
v4.0.x: osc/ucx: Fix data corruption with non-contiguous accumulates
2019-08-02 12:43:40 -06:00
Nysal Jan K.A
359cdf2b53 osc/ucx: Fix data corruption with non-contiguous accumulates
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
(cherry picked from commit 3529d447020684ab305411caa97423826bb40906)
2019-07-26 14:41:08 +05:30
Mikhail Brinskii
b9998a14dc COLL/TUNED: Minor var names/comments fixes
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 65618f8db848613c95cbe112033df94721d326a8)
2019-07-26 11:29:12 +03:00
Mikhail Brinskii
3d5b7b4a1b COLL/TUNED: Update alltoall selection rule for mlx
Use linear with sync alltoall algorithm for certain message/comm size
ranges. Does not affect default fixed decision, unless HPCX (with its
custom parameters) is used or corresponding mca is set.

Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 404c4800688548b021bda68bdf10792424e6b1c5)
2019-07-26 11:28:47 +03:00
KAWASHIMA Takahiro
1ffb9b10bb pcollreq/mpif-h: fix MPIX_Alltoallw_init() binding
These issues were introduced in the recent commit b71af0eca0.
This commit fixes Coverity CID 1451661 and 1451660.

Though `c_info` part was an actual bug, the `c_sendtypes` part was not.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>

(cherry picked from commit open-mpi/ompi@facf8c5e98)
2019-07-24 17:12:10 +09:00
Gilles Gouaillardet
13ba2b0d75 pcollreq/mpif-h: fix MPIX_Alltoallw_init() binding
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@b71af0eca0)
2019-07-24 17:11:44 +09:00
Gilles Gouaillardet
5ab26e490a fortran/mpif-h: fix [i]alltoallw bindings
Fix a regression introduced in open-mpi/ompi@cdaed89d04

Fixes CID 1451610, 1451611 and 1451612

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@ed703bec1b)
2019-07-24 17:10:58 +09:00
Gilles Gouaillardet
fbf7d31fd1 fortran/mpif-h: fix MPI_[I]Alltoallw() binding
- ignore sendcounts, sendispls and sendtypes arguments when MPI_IN_PLACE is used
 - use the right size when an inter-communicator is used.

Thanks Markus Geimer for reporting this.

Refs. open-mpi/ompi#5459

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@cdaed89d04)
2019-07-24 17:10:27 +09:00