
10350 Commits

Author SHA1 Message Date
Matias Cabral
0601b3e982
Merge pull request #6325 from aravindksg/fix_help_reference
mtl/ofi: Fix reference to help text object
2019-02-05 07:22:51 -08:00
Jeff Squyres
4c64322db4
Merge pull request #6334 from jsquyres/pr/make-mpi-h-a-little-more-c++-friendly
mpi.h.in: use C++ static_cast<> where appropriate
2019-01-31 07:14:34 -05:00
Jeff Squyres
30afdcead9 mpi.h.in: use C++ static_cast<> where appropriate
When compiling mpi.h with a modern C++ compiler and a high degree of
pickiness (e.g., -Wold-style-cast), casting using (void*) in the
OMPI_PREDEFINED_GLOBAL and MPI_STATUS*_IGNORE macros will emit
warnings.  So if we're compiling with a C++ compiler, use C++'s
static_cast<> instead of (void*).

Thanks to @shadow-fax for identifying the issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-01-31 03:22:26 -08:00
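To illustrate the change described in the commit above, here is a minimal sketch of a header macro that selects the cast style by language. The names EXAMPLE_VOID_CAST, example_global_object and example_global_handle are hypothetical and are not taken from mpi.h.in; the real OMPI_PREDEFINED_GLOBAL and MPI_STATUS*_IGNORE definitions may look different.

```c
#include <stddef.h>

/* Hypothetical macro, for illustration only: choose the cast syntax
 * according to the language the header is being compiled as. */
#if defined(__cplusplus)
/* C++: static_cast<> does not trigger -Wold-style-cast. */
#define EXAMPLE_VOID_CAST(ptr) (static_cast<void *>(ptr))
#else
/* C: the traditional cast is the only option. */
#define EXAMPLE_VOID_CAST(ptr) ((void *) (ptr))
#endif

/* Hypothetical stand-in for a predefined global object. */
static int example_global_object = 42;

/* Return the address of the global as an untyped pointer, going
 * through the language-dependent cast macro. */
void *example_global_handle(void)
{
    return EXAMPLE_VOID_CAST(&example_global_object);
}
```

Because the choice is made by the preprocessor, the same header stays valid C while a C++ compiler only ever sees the static_cast<> form.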
Thananon Patinyasakdikul
782ec851ea
Merge pull request #6319 from thananon/pr/allow_overtake
pml/ob1: fix deadlock with communicator flag ALLOW_OVERTAKE.
2019-01-30 15:32:04 -05:00
Jeff Squyres
2203f8d900
Merge pull request #6185 from ggouaillardet/topic/hwloc_macros
hwloc: remove public hwloc macros from opal_config.h
2019-01-30 07:32:22 -05:00
Gilles Gouaillardet
0aeb27f776 topo/treematch: silence a hwloc related warning
treematch/km_partitioning.c #include "config.h",
but there is no such file when the embedded treematch is used.

In order to prevent the embedded treematch from incorrectly using
the config.h from the embedded hwloc, generate a dummy config.h.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-01-30 14:51:38 +09:00
Aravind Gopalakrishnan
9cabcfdbba mtl/ofi: Fix reference to help text object
When we exceed the threshold number of contexts created, print appropriate help
text

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2019-01-29 15:10:06 -08:00
Thananon Patinyasakdikul
0263456cf4 pml/ob1: fix deadlock with communicator flag ALLOW_OVERTAKE.
We missed an assert to check whether ALLOW_OVERTAKE is set before
validating the sequence number, which causes a deadlock.

Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
2019-01-29 14:55:06 -05:00
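The log does not show the ob1 code itself, so the following is only a sketch, under stated assumptions, of the kind of check the commit above describes. The struct example_peer and the function example_can_match_now are hypothetical; the point is simply that the sequence number is not validated when overtaking is allowed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-peer matching state. */
struct example_peer {
    uint16_t expected_sequence; /* next in-order sequence number */
};

/* Decide whether an incoming fragment may be matched right away.
 * When the communicator allows overtaking, ordering is not enforced,
 * so the sequence number is not validated at all. */
bool example_can_match_now(const struct example_peer *peer,
                           uint16_t frag_sequence,
                           bool allow_overtake)
{
    if (allow_overtake) {
        return true;
    }
    /* Ordered case: only the expected fragment may be matched now. */
    return frag_sequence == peer->expected_sequence;
}
```

In this model, a fragment that arrives out of order on an overtaking communicator is matched immediately instead of being held back for a sequence number that may never be treated as "next".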
Nathan Hjelm
f9338dac93
Merge pull request #6312 from ggouaillardet/topic/op
ompi/op: fix support of non predefined datatypes with predefined oper…
2019-01-29 10:55:00 -07:00
Brian Barrett
23da9fac23
Merge pull request #6294 from bwbarrett/mtl-ofi-no-device-warning
mtl/ofi: Print descriptive error message on modex failure
2019-01-29 08:32:49 -08:00
Brian Barrett
1bb7a73a9c
Merge pull request #6302 from bwbarrett/feature/ofi-av-count
mtl/ofi: Provide av count hint during initialization
2019-01-29 08:32:24 -08:00
Edgar Gabriel
7023357843
Merge pull request #6286 from edgargabriel/pr/floating-point-division-problem
common/ompio: fix a floating point division problem
2019-01-29 10:07:09 -06:00
Gilles Gouaillardet
bc1cab5498 ompi/op: fix support of non predefined datatypes with predefined operators
ACCUMULATE, unlike REDUCE, can be used with derived
datatypes with predefined operations, with some
restrictions outlined in MPI-3:11.3.4.  The derived
datatype must be composed entirely from one predefined
datatype (so you can do all the construction you want,
but at the bottom, you can only use one datatype, say,
MPI_INT).

Refs. open-mpi/ompi#6275

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-01-29 09:33:39 +09:00
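The restriction quoted in the commit above (a derived datatype built, at the bottom, from exactly one predefined datatype) can be modelled with a small tree walk. This is a toy model, not Open MPI's datatype engine: struct example_type, the EXAMPLE_* constants and example_single_base_type are all hypothetical names introduced for illustration.

```c
#include <stddef.h>

#define EXAMPLE_NOT_PREDEFINED (-1) /* node is a derived datatype */
#define EXAMPLE_MIXED          (-2) /* more than one predefined type found */
#define EXAMPLE_EMPTY          (-3) /* no predefined leaf found */

/* Hypothetical datatype node: either a predefined leaf or a derived
 * type built from child types. */
struct example_type {
    int predefined_id;  /* >= 0 for a predefined type */
    size_t child_count; /* number of children of a derived type */
    const struct example_type *const *children;
};

/* Return the id of the single predefined type underneath `type`,
 * EXAMPLE_MIXED if several different ones occur, or EXAMPLE_EMPTY
 * if the type has no predefined leaf at all. */
int example_single_base_type(const struct example_type *type)
{
    if (type->predefined_id >= 0) {
        return type->predefined_id;
    }
    int base = EXAMPLE_EMPTY;
    for (size_t i = 0; i < type->child_count; ++i) {
        int child_base = example_single_base_type(type->children[i]);
        if (child_base == EXAMPLE_MIXED) {
            return EXAMPLE_MIXED;
        }
        if (child_base == EXAMPLE_EMPTY) {
            continue;
        }
        if (base == EXAMPLE_EMPTY) {
            base = child_base;
        } else if (base != child_base) {
            return EXAMPLE_MIXED;
        }
    }
    return base;
}
```

In this model, a predefined operation would be accepted for a derived datatype only when the function returns a non-negative id.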
Gilles Gouaillardet
45fb69b2b9 ompi/datatype: fix how we compute the space needed for the args
Refs. open-mpi/ompi#6275

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-01-28 15:26:11 +09:00
Brian Barrett
44be7f139a mtl/ofi: Provide av count hint during initialization
Provide the av_attr.count hint (number of addresses that will be
inserted into the address vector through the life of the process)
at initialization of the address vector.  It's ok to be a bit
wrong, but some endpoints (RxR) can benefit by not going through
the slow growth realloc churn.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2019-01-24 15:47:24 -08:00
Edgar Gabriel
c0f8ce0fff common/ompio: fix a floating point division problem
This commit fixes a problem reported on the mailing list with
individual writes larger than 512 MB.

The culprit is a floating point division of two large, close values.
Changing the datatypes from float to double (which is what is being
used in the fcoll components) fixes the problem.

See issue #6285 and

 https://forum.hdfgroup.org/t/cannot-write-more-than-512-mb-in-1d/5118

Thanks to Axel Huebl and René Widera for reporting the issue.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-01-21 17:59:12 -06:00
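The effect described in the commit above can be reproduced in isolation. The sketch below is not the ompio code; the function names are hypothetical and it assumes a platform where float is IEEE 754 single precision (24 significant bits), so byte counts just above 512 MB cannot all be represented exactly.

```c
#include <stdint.h>

/* Ratio computed in single precision: both operands are rounded to
 * 24 significant bits before the division takes place. */
float example_ratio_float(int64_t total_bytes, int64_t chunk_bytes)
{
    return (float) total_bytes / (float) chunk_bytes;
}

/* Same ratio in double precision: 53 significant bits are enough to
 * represent byte counts of this magnitude exactly. */
double example_ratio_double(int64_t total_bytes, int64_t chunk_bytes)
{
    return (double) total_bytes / (double) chunk_bytes;
}
```

With chunk_bytes = 536870912 (512 MB) and total_bytes one byte larger, the single-precision version rounds both operands to the same value and returns exactly 1.0, while the double-precision version returns a value slightly greater than 1. Any logic that rounds this ratio up therefore comes out one unit short in the float case.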
Brian Barrett
fe25097194 mtl/ofi: Print descriptive error message on modex failure
With MTLs, there's no "other transport" when the remote side
does not have an active NIC, so we should print a useful error
message when the modex failed (indicating lack of a NIC on
the remote side).

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2019-01-21 23:50:31 +00:00
KAWASHIMA Takahiro
352b667323
Merge pull request #6210 from kawashima-fj/pr/mpiext-use-mod
Use mpi_f08 module in mpi_f08_ext module
2019-01-21 11:56:41 +09:00
René Widera
a91fab80a1 common/ompio: possible rounding issue
Similar to #6286: converting a number of bytes to a single-precision floating-point value in order to round up the result of a division is risky because of rounding errors.

- remove floating point operations for `round up`
- remove floating point conversion for round down (native behavior of integer division)

Signed-off-by: René Widera <r.widera@hzdr.de>
2019-01-18 14:05:23 +01:00
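A minimal sketch of the two integer-only operations named in the commit above. The helper names are hypothetical and the actual ompio change may be written differently (for example as (bytes + unit - 1) / unit).

```c
#include <assert.h>
#include <stddef.h>

/* Round-down division: plain integer division already truncates. */
size_t example_div_round_down(size_t bytes, size_t unit)
{
    assert(unit > 0);
    return bytes / unit;
}

/* Round-up division without floating point. Written as quotient plus
 * a remainder test so that no intermediate sum can overflow for very
 * large byte counts. */
size_t example_div_round_up(size_t bytes, size_t unit)
{
    assert(unit > 0);
    return bytes / unit + (bytes % unit != 0 ? 1 : 0);
}
```

Because every step stays in integer arithmetic, the result is exact for any byte count that fits in size_t, including values well beyond the 24 significant bits that a single-precision float can hold.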
Yossi Itigin
387b2ff56f
Merge pull request #6260 from hoopoepg/topic/removed-fca
COLL: removed FCA component
2019-01-17 00:05:07 +08:00
KAWASHIMA Takahiro
b380dd58b5 config/ompi_ext: use mpi module in mpi_ext module
If MPI extensions are enabled, all
`ompi/mpiext/pcollreq/use-mpi/mpiext_*_usempi.h` are included in
`ompi/mpi/fortran/mpiext-use-mpi/mpi-ext-module.F90` and all
`ompi/mpiext/pcollreq/use-mpi/mpiext_*_usempif08.h` are included in
`ompi/mpi/fortran/mpiext-use-mpi-f08/mpi-f08-ext-module.F90` using
`#include` directives.

In `mpiext_*_usempi.h` and `mpiext_*_usempif08.h`, some MPI extensions
may want to use constants or handles defined in the `mpi` module and
the `mpi_f08` module. For example, if you want to define a new
datatype in `mpi_f08_ext`, you'll need the definition of
`type(mpi_datatype)`. However, putting a `use mpi_f08` line in their
`mpiext_*_usempif08.h` may cause a compilation error if more than
one MPI extension is enabled because the `use` statement must be
put prior to any variable declarations.

To resolve this problem, this commit puts `use mpi` and `use mpi_f08`
as first lines of `mpi-ext-module.F90` and `mpi-f08-ext-module.F90`
respectively.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2019-01-16 11:55:55 +09:00
KAWASHIMA Takahiro
2220623f34 config/ompi_ext: Don't include mpiext_*_mpifh.h in mpi_f08_ext
Including `mpiext_*_mpifh.h` in the source file of the `mpi_f08_ext`
module is not always appropriate. For example, if you want to define
a new datatype in an MPI extension, the `include 'mpif-ext.h'` binding
defines the datatype as `integer` but the `use mpi_f08_ext` binding
defines it as `type(mpi_datatype)`. They conflict.

This commit allows each MPI extension to declare whether it wants to
include its `mpiext_*_mpifh.h` in `mpi_f08` and `mpi_f08_ext`
respectively. The default (no declaration) is 'want'.

See `ompi/mpiext/example/configure.m4` for an example.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2019-01-16 11:55:55 +09:00
Aravind Gopalakrishnan
37f9aff2a0 mtl/ofi: Add MCA variables to enable SEP and to request number of OFI contexts
Move to a model where users actively _enable_ the SEP feature rather than
opening SEP by default whenever the provider supports it. This allows us to
not regress (either functionally or for performance reasons) any apps that were
working correctly on regular endpoints.

Also, provide an MCA variable to specify the number of OFI contexts to create,
and default this value to 1. (Given that btl/ofi also creates one by default,
this reduces the incidence of a scenario where we allocate all available
contexts by default and, if btl/ofi asks for one more, the provider breaks
because it doesn't support it.)

While at it, spruce up README on SEP content.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2019-01-14 09:58:36 -08:00
Ralph Castain
d1fd1f4cce
Merge pull request #6151 from nrspruit/ns_ompi_mtl_ofi_specializations
MTL_OFI: Generation of specialized functions at build time
2019-01-14 09:31:54 -08:00
Sergey Oblomov
0759bb8561 COLL: removed FCA component
- removed FCA collectives from coll/scoll

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-01-09 16:51:40 +02:00
Risto Toijala
f14a0f4fc9 mpi/fortran: Fix valgrind warnings for type create
Valgrind warns that *newtype is uninitialized when calling from
Fortran as e.g.
    use mpi
    integer :: t, err
    call MPI_Type_create_f90_integer(5, t, err)

Since newtype is intent(out), this should not happen. There is
no reason to convert the type using PMPI_Type_f2c, only to
overwrite it immediately afterwards. The other type_create_* functions
did not convert newtype.

The valgrind warnings:
==28441== Conditional jump or move depends on uninitialised value(s)
==28441==    at 0x581B555: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441==    by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441==    by 0x400BA1: MAIN__ (in [...])
==28441==    by 0x400C46: main (in [...])
==28441==
==28441== Conditional jump or move depends on uninitialised value(s)
==28441==    at 0x581B563: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441==    by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441==    by 0x400BA1: MAIN__ (in [..])
==28441==    by 0x400C46: main (in [...])
==28441==
==28441== Use of uninitialised value of size 8
==28441==    at 0x581B577: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441==    by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441==    by 0x400BA1: MAIN__ (in [...])
==28441==    by 0x400C46: main (in [...])
==28441==

Signed-off-by: Risto Toijala <risto.toijala@gmail.com>
2019-01-08 22:00:00 +02:00
Aurelien Bouteiller
e54496bf2a
Merge pull request #6087 from ICLDisco/export/errors_cid
Manage errors in communicator creations (cid)
2018-12-31 15:01:55 -05:00
Jeff Squyres
17be4c6d1f
Merge pull request #6229 from jsquyres/pr/fix-enable-grequest-extension-in-a-tarball
romio321: ensure to distribute ompi_grequestx.h
2018-12-28 16:15:23 -05:00
Jeff Squyres
62321be186 romio321: ensure to distribute ompi_grequestx.h
Refs https://github.com/open-mpi/ompi/issues/6227.  Thanks to
@georgemarselis for reporting.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-12-27 15:39:47 -08:00
bosilca
96f88052e9
Merge pull request #5948 from mkurnosov/coll-ireduce-silence-coverity
coll/libnbc/ireduce: silence Coverity warning CID 1440360
2018-12-24 12:59:16 -05:00
bosilca
593db292da
Merge pull request #5644 from mkurnosov/coll-iallreduce-rabenseifner
coll/libnbc: add Rabenseifner's algorithm for MPI_Iallreduce
2018-12-24 12:58:21 -05:00
Jeff Squyres
efcaef74d8 MPI_Type_set_name: fix string length at target
opal_string_copy() takes care of all the string computations.
Specifically: when we converted to opal_string_copy(), we accidentally
left the *source* length as the argument, not the *target* length,
which resulted in one less character being copied than intended (as
was showing up in MTT C++ testing results).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-12-23 13:00:01 -08:00
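The off-by-one described in the commit above is easy to reproduce with any bounded copy routine. The helper below is a hypothetical stand-in, not opal_string_copy() itself; it assumes the common contract of copying at most dest_len - 1 characters and always NUL-terminating.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded copy: dest_len is the size of the *target*
 * buffer, including room for the terminating NUL. */
void example_string_copy(char *dest, const char *src, size_t dest_len)
{
    assert(dest != NULL && src != NULL && dest_len > 0);
    size_t n = strlen(src);
    if (n > dest_len - 1) {
        n = dest_len - 1; /* truncate to what fits in the target */
    }
    memcpy(dest, src, n);
    dest[n] = '\0';
}
```

Calling example_string_copy(name, src, sizeof(name)) copies the whole string when it fits. Passing strlen(src) as the third argument instead makes the helper reserve the last position for the terminator, so one character fewer than intended is copied.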
Aurelien Bouteiller
bd0d2b832e
Merge pull request #6086 from ICLDisco/export/errors_nbc
Manage errors in NBC collective ops
2018-12-21 02:34:00 -05:00
Jeff Squyres
1be5358834
Merge pull request #6212 from jsquyres/pr/fix-treematch-common-symbol
treematch: fix global common symbol
2018-12-20 15:20:41 -05:00
Jeff Squyres
e9a6246b90 treematch: fix global common symbol
Despite its name, this symbol doesn't need to be global.  So just make
it static.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-12-20 11:06:14 -08:00
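As a small illustration of the fix above: the commit does not name the symbol, so example_verbose_level and example_bump_verbosity are hypothetical. A file-scope variable declared without static has external linkage, and with some toolchain defaults an uninitialized one is emitted as a "common" symbol that can clash with a same-named symbol elsewhere in the link. Declaring it static keeps it private to its source file.

```c
/* Internal linkage: visible only inside this translation unit, so it
 * cannot collide with a same-named global in another library. */
static int example_verbose_level = 0;

/* The variable is still fully usable from functions in the same file. */
int example_bump_verbosity(void)
{
    example_verbose_level += 1;
    return example_verbose_level;
}
```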
Jeff Squyres
81bfb5f5e5 Remove some IMPI attributes that were never implemented.
This is a holdover from LAM/MPI that was never implemented here in
Open MPI (and never will be).  Might as well remove this dead code.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-12-20 10:12:32 -08:00
Nathan Hjelm
4944508603
Merge pull request #6136 from hjelmn/opal_cleanup
opal: clean up init/finalize
2018-12-18 15:23:32 -07:00
Nathan Hjelm
a39cb747dd ompi/datatype: don't call opal_datatype_finalize directly
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-12-18 14:37:04 -07:00
Nathan Hjelm
06baa518f7 rte/pmix: fill in opal_process_info when using prrte/pmix
This commit fixes a bug when launching with prun where the process
info structures used by the btls are not populated.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-12-13 16:04:31 -07:00
bosilca
804a517929
Merge pull request #6146 from bosilca/topic/treematch_update
Update to the latest TreeMatch (v1.3).
2018-12-13 13:26:40 -05:00
Spruit, Neil R
bef5f50a42 MTL_OFI: Generation of specialized functions at build time
-> Added new targets in Makefile.am to call a new build script
   generate-opt-funcs.pl to generate specialized functions for
   each *.pm file.

-> Added new perl module *.pm files for send,isend,irecv,iprobe,improbe
   which are loaded by generate-opt-funcs.pl to create new source files
   that correspond to the name of the .pm file to be used as part of
   MTL OFI.

-> Added mtl_ofi_opt.pm.template and updated README with details on the
   specialization features and how to add additional specialization
   support.

-> Added new opt_common/mtl_ofi_opt_common.pm containing common
   functions for generating the specialized functions used by
   all other *.pm modules.

-> Added new mtl_ofi.h which includes the definitions for the
   function symbol table for storing the specialized functions along
   with the definitions for the initialization functions for the
   corresponding function pointers.

-> Based off the OFI provider capabilities the specialized function
   pointers are assigned at mtl_ofi_component_init to the corresponding
   MTL OFI function.

-> mca_mtl_ofi_module_t has been updated with the symbol table
   struct which is assigned at component init.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
2018-12-13 00:35:19 -08:00
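The mechanism described in the commit above (a symbol table of specialized functions, filled in once at component init according to provider capabilities) can be sketched as follows. All names here are hypothetical: the real table lives in mtl_ofi.h and mca_mtl_ofi_module_t, and its layout and selection criteria are not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>

/* Signature shared by all variants of one operation. */
typedef int (*example_send_fn)(const void *buf, size_t len);

/* Hypothetical variants, as if emitted by a build-time generator.
 * The return value only identifies which variant ran. */
int example_send_generic(const void *buf, size_t len)
{
    (void) buf;
    (void) len;
    return 0;
}

int example_send_fast(const void *buf, size_t len)
{
    (void) buf;
    (void) len;
    return 1;
}

/* Symbol table holding one pointer per specialized operation. */
struct example_symbol_table {
    example_send_fn send;
};

struct example_module {
    struct example_symbol_table sym;
};

/* Called once at component init: select the variant that matches the
 * provider capability, so the hot path never re-checks it. */
void example_component_init(struct example_module *module,
                            bool provider_has_fast_path)
{
    module->sym.send = provider_has_fast_path ? example_send_fast
                                              : example_send_generic;
}
```

After initialization, callers invoke module->sym.send(buf, len) directly; the capability test is paid once instead of on every message.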
Aravind Gopalakrishnan
e5e19dfcf7 Fix for SEP when num local procs is greater than available contexts
For cases when the number of local processes is greater than the number of
available contexts, the SEP initialization phase would calculate the number of
contexts to provision for each rank to be 0 and would eventually crash.

Fix the issue here by using regular endpoints in the event the number of local
processes is more than available contexts. This fixes issue #6182.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2018-12-12 16:49:04 -08:00
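The failure mode in the commit above comes down to integer division: with more local processes than contexts, contexts / processes is 0. The sketch below shows the fallback under stated assumptions; the struct, the function name and the choice to report 1 context for a regular endpoint are hypothetical and not taken from the mtl/ofi source.

```c
#include <stdbool.h>

/* Hypothetical result of the endpoint planning step. */
struct example_endpoint_plan {
    bool use_scalable_endpoint; /* false means a regular endpoint */
    int contexts_per_rank;      /* contexts provisioned for each rank */
};

struct example_endpoint_plan
example_plan_endpoints(int available_contexts, int local_procs)
{
    /* Default: regular endpoint, treated here as a single context. */
    struct example_endpoint_plan plan = { false, 1 };

    /* Not enough contexts to give every local rank at least one:
     * the division below would yield 0, so stay on regular endpoints. */
    if (local_procs <= 0 || local_procs > available_contexts) {
        return plan;
    }

    plan.use_scalable_endpoint = true;
    plan.contexts_per_rank = available_contexts / local_procs;
    return plan;
}
```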
KAWASHIMA Takahiro
adc05f705e
Merge pull request #6174 from kawashima-fj/pr/f08-missing-handles
fortran/use-mpi-f08: Add C++ datatypes and MPI_NO_OP
2018-12-12 14:13:36 +09:00
Brian Barrett
6e15128d96 mtl/ofi: Fix crash if no providers found
Commit 109d0569ffd introduced a crash when an error occurred
before ofi_ctxt was allocated, including when no providers
passed the selection logic.  Properly check that the pointer
is not NULL in the error cleanup code before dereferencing
the pointer.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-12-11 15:46:18 -08:00
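A minimal sketch of the pattern the commit above describes: error cleanup that may run before the context has been allocated must check the pointer first. The types and function names are hypothetical; only the NULL check before the dereference reflects the commit message.

```c
#include <stdlib.h>

/* Hypothetical per-context resources. */
struct example_context {
    int *queues;
};

/* Hypothetical module state; ctxt stays NULL until init allocates it. */
struct example_state {
    struct example_context *ctxt;
};

/* Error-path cleanup that is safe to call at any stage of init. */
void example_cleanup(struct example_state *state)
{
    if (NULL != state->ctxt) {     /* may still be NULL on early errors */
        free(state->ctxt->queues); /* free(NULL) is a no-op */
        free(state->ctxt);
        state->ctxt = NULL;
    }
}

/* Returns 0 on success, -1 on failure. A provider_count of 0 models
 * the case where no provider passed the selection logic. */
int example_init(struct example_state *state, int provider_count)
{
    state->ctxt = NULL;
    if (provider_count <= 0) {
        example_cleanup(state); /* runs before ctxt exists */
        return -1;
    }
    state->ctxt = malloc(sizeof(*state->ctxt));
    if (NULL == state->ctxt) {
        example_cleanup(state);
        return -1;
    }
    state->ctxt->queues = NULL;
    return 0;
}
```

Without the check in example_cleanup, the early-error path would dereference a NULL pointer, which matches the crash the commit reports.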
Jeff Squyres
6f7fbd1676
Merge pull request #6158 from ggouaillardet/topic/mpiext-path-updates
mpiext: updates for header file locations
2018-12-11 13:01:46 -05:00
KAWASHIMA Takahiro
63ecf01610 fortran/use-mpi-f08: Add C++ datatypes and MPI_NO_OP
Though the MPI standard does not have `MPI_CXX_COMPLEX`, `mpi.h`,
`mpif.h`, and `mpi.mod` have it. So I added it for consistency.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2018-12-11 13:08:29 +09:00
KAWASHIMA Takahiro
e0c5bad195 fortran/use-mpi-f08: Remove unnecessary ;
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2018-12-11 09:06:21 +09:00
Matias Cabral
cdb952f66d
Merge pull request #6170 from matcabral/remove_psm2_lower_p
MTL/PSM2: add missing default priority
2018-12-07 16:11:45 -08:00
Matias A Cabral
c76c6d8b28 MTL/PSM2: add missing default priority
Missing default priority after PR #6153

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2018-12-07 14:46:34 -08:00
Matias Cabral
0b821f2184
Merge pull request #6153 from matcabral/remove_psm2_lower_p
MTL/PSM2: Do not lower the priority when all processes are local.
2018-12-07 10:19:53 -08:00