1
1

10684 Коммитов

Автор SHA1 Сообщение Дата
George Bosilca
7fac7a07ab
allow thread support in the pessimist vprotocol.
Add MCA parameter to allow support for message logging even when the MPI
library is initialized with support for thread multiple.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-01-08 08:00:06 -05:00
George Bosilca
271deed68e
Fix the vprotocol initialization.
Add a comment about the stages in the vprotocols initialization and why
we cant use the threading provided during the PML initialization.
Instead, use the OMPI internal threading status to prevent the use of
message logging in multi-threaded applications.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-01-07 18:06:45 -05:00
George Bosilca
684b91a1bb
We can only specify one single PML as MCA params.
Make sure the MCA parameter for the PML selection only contains a single
value.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-01-07 18:06:45 -05:00
Eisuke Kawashima
d26d4e1d63
Fix typo and update URLs (https, redirection) [skip ci]
Signed-off-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
2020-01-07 03:52:25 +09:00
Howard Pritchard
b17cd9049f
Merge pull request #7238 from rwespetal/ofi-prov-name-fix
mtl/ofi: ignore case when comparing provider names
2020-01-02 12:23:39 -07:00
Jeff Squyres
8b424c3863
Merge pull request #7232 from bosilca/hjelmn_neighbor_alltoall_fix
Neighbor alltoall fix
2019-12-17 17:24:05 -05:00
Robert Wespetal
9b72e9465d mtl/ofi: ignore case when comparing provider names
Change the provider include and exclude list name comparison check to
ignore case. The UDP provider's name is uppercase and was being selected
despite being in the exclude list.

Signed-off-by: Robert Wespetal <wesper@amazon.com>
2019-12-16 13:05:00 -08:00
George Bosilca
97eb5d0cf2 Remove unused variable.
It got left over during the Fortran rework.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-12-11 16:59:50 -05:00
George Bosilca
86acdee460
Fix the communication ordering for all cartesian neighbor collectives.
This work is rooted in the [MPI Forum issue
153](https://github.com/mpi-forum/mpi-issues/issues/153).

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-12-11 12:40:38 -05:00
Todd Kordenbrock
1af6dbe277
Merge pull request #7066 from tkordenbrock/topic/master/portals4.fix.flowcontrol.bugs
portals4: fix flow control bugs
2019-12-11 06:31:26 -06:00
Jeff Squyres
73ecece55d
Merge pull request #7211 from mcoil1/pr/fixing-compiler-warnings
Fix a few compiler warnings
2019-12-09 07:26:15 -05:00
Maxwell Coil
52241dbbcd libnbc: fixed uninitialized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-08 14:03:48 -05:00
Maxwell Coil
3ced33c2eb ompi/dpm/dpm.c: Fix uninititalized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-08 14:03:48 -05:00
William Bailey
e2718e0196 fcoll/two_phase: Compiler warning for wrong variable type used
Squash compiler warning. Changed output specifier to match variable type (long int -> long long int).

Signed-off-by: William Bailey <wbailey2@nd.edu>
2019-12-08 13:15:30 -05:00
Maxwell Coil
8c237e2684 romio: Update ADIOI_R_Exchange_data function
Squash compiler warning due to whitespace/brace problems.

The code block from lines 829-839 was improperly indented, which led to
both the code being confusing and a compiler warning. Comparing this code to
the current version in the MPICH repo made it clear that the code was simply
improperly indented. Fixing the indentation both makes the code readable and
squashes the compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-05 18:24:02 -05:00
William Bailey
30bda56bce romio: fix uninitialized variable
Squash compiler warning.

ROMIO is third-party software but has an annoying compiler warning;
this is the minimum distance fix.

Signed-off-by: William Bailey <wbailey2@nd.edu>
2019-12-02 17:35:05 -05:00
Gilles Gouaillardet
967cf68027 configury: fix a typo in mpiext/shortfloat
remove an extra and useless comma

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-11-29 11:09:54 +09:00
Joseph Schuchart
06bbcf4fd6 Correctly set baseptr in contiguous shared memory window with local size zero
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-11-28 11:04:30 +01:00
raafatfeki
7b2d83c898 mca/fs: Remove unused functions and prototypes & reduce recurent code through all components
I removed the implementation and/or prototypes of all unused functions defined for all components.
To reduce recurrent code, I created functions under base for the management of error codes and setting of file permission and amode.
Then, I replaced these recurrent code by those function for all components.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>

add a missing header file

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-25 09:01:38 -06:00
raafatfeki
5988297bd8 fs/gpfs: Assign file creation to one process and management of error codes translation.
When O_CREAT and O_EXCL are set, open() shall fail if the file exists. Therefore, I assigned the file creation to the root process only.
I also translated the errno codes to their corresponding MPI error codes.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2019-11-25 09:01:38 -06:00
raafatfeki
a2b1a03e7b fs/gpfs: Update component to compile with latest gpfs versions
I updated the gpfs component to match the latest structures and functions definitions and remove compiler wranings.
Disabled mpi info setting for the gpfs option "Data Shipping" since it is no longer supported in the latest gpfs versions.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2019-11-25 09:01:38 -06:00
Edgar Gabriel
715a6e8c27 fs/gpfs: update component and module file
to match what we are doing meanwhile in the other fs compomenents.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-25 09:01:38 -06:00
Edgar Gabriel
8d31ba3a09 fs/gpfs: update configure logic
update the configure logic of the gpfs component
based on what we learned from user feedback over the last
two years for the other components

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-25 09:01:38 -06:00
XuanWang1982
cd76e53479 Changed parameter name from siox to gpfs
Adding get_info function for gpfs module
annotate printf information

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-25 09:01:38 -06:00
Christoph Niethammer
d3c11396fd First code review round of the GPFS module.
Delete check for amode which should go to a higler layer, e.g. ompi_file_open.
Only perform Info value check if key is found.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-25 09:01:38 -06:00
XuanWang1982
b1dc58eeb2 First version for GPFS module. To be tested
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-25 09:01:38 -06:00
Howard Pritchard
ebf003581b
Merge pull request #7183 from hppritcha/topic/quiet_lustre
lustre: squash some compiler warnings
2019-11-22 19:19:08 -07:00
Barton Chittenden
47816ef83b Remove mxm, yalla and ikrit
Signed-off-by: Barton Chittenden <bartonski@gmail.com>
2019-11-22 13:40:16 -08:00
Michael Lass
67490118ad dims_create: fix calculation of factors for odd squares
Until now sqrt(n) was missed as a factor for odd square numbers n. This
lead to suboptimal results of MPI_Dims_create for input numbers like 9,
25, 49, ... Fix the results by including sqrt(n) in the search for
factors.

Refs: #7186

Signed-off-by: Michael Lass <bevan@bi-co.net>
2019-11-22 20:12:58 +01:00
Edgar Gabriel
ea1355beae fcoll/two_phase: fix error in calculating aggregators in 32bit mode
In fcoll_two_phase_supprot_fns.c: calculation of the aggregator index
failed for large offsets on 32bit machine, due to improper handling of
64bit offsets.

Fixes Issue #7110

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-22 09:45:19 -06:00
Howard Pritchard
e66a7cef11 lustre: squash some compiler warnings
Compiling OMPI on cray systems using latest Cray compilers (clang based)
yielded some compiler warnings from ompio/lustre.  Squash these warnings.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2019-11-21 14:14:40 -06:00
Geoff Paulsen
9e17f028b5
Merge pull request #5507 from markalle/prot_table
Adding -mca comm_method to print table of communication methods
2019-11-14 14:24:06 -06:00
Austen Lauria
edcd6d8aeb
Merge pull request #7146 from bgoglin/master
fix typos hlwoc->hwloc
2019-11-13 15:38:00 -05:00
Yossi Itigin
40e2fbb093
Merge pull request #7114 from brminich/topic/mlx_scat_tuning
COLL/TUNED: Add linear scatter using isend for mlnx platform
2019-11-07 13:38:21 +02:00
Mikhail Brinskii
f2cbd4806e COLL/TUNED: Add linear scatter using isend for mlnx platform
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-11-07 11:04:39 +02:00
Brice Goglin
5c6bd7ea4e fix typos hlwoc->hwloc
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2019-11-06 10:42:36 +01:00
Mark Allen
6855ebb84b Adding -mca comm_method to print table of communication methods
This is closely related to Platform-MPI's old -prot feature.

The long-format of the tables it prints could look like this:
>   Host 0 [myhost001] ranks 0 - 1
>   Host 1 [myhost002] ranks 2 - 3
>   Host 2 [myhost003] ranks 4
>   Host 3 [myhost004] ranks 5
>   Host 4 [myhost005] ranks 6
>   Host 5 [myhost006] ranks 7
>   Host 6 [myhost007] ranks 8
>   Host 7 [myhost008] ranks 9
>   Host 8 [myhost009] ranks 10
>
>    host | 0    1    2    3    4    5    6    7    8
>   ======|==============================================
>       0 : sm   tcp  tcp  tcp  tcp  tcp  tcp  tcp  tcp
>       1 : tcp  sm   tcp  tcp  tcp  tcp  tcp  tcp  tcp
>       2 : tcp  tcp  self tcp  tcp  tcp  tcp  tcp  tcp
>       3 : tcp  tcp  tcp  self tcp  tcp  tcp  tcp  tcp
>       4 : tcp  tcp  tcp  tcp  self tcp  tcp  tcp  tcp
>       5 : tcp  tcp  tcp  tcp  tcp  self tcp  tcp  tcp
>       6 : tcp  tcp  tcp  tcp  tcp  tcp  self tcp  tcp
>       7 : tcp  tcp  tcp  tcp  tcp  tcp  tcp  self tcp
>       8 : tcp  tcp  tcp  tcp  tcp  tcp  tcp  tcp  self
>
>   Connection summary:
>     on-host:  all connections are sm or self
>     off-host: all connections are tcp

In this example hosts 0 and 1 had multiple ranks so "sm" was more
meaningful than "self" to identify how the ranks on the host are
talking to each other. While host 2..8 were one rank per host so
"self" was more meaningful as their btl.

Above a certain number of hosts (12 by default) the above table gets too big
so we shrink to a more abbreviated looking table that has the same data:
>    host | 0 1 2 3 4       8
>   ======|====================
>       0 : A C C C C C C C C
>       1 : C A C C C C C C C
>       2 : C C B C C C C C C
>       3 : C C C B C C C C C
>       4 : C C C C B C C C C
>       5 : C C C C C B C C C
>       6 : C C C C C C B C C
>       7 : C C C C C C C B C
>       8 : C C C C C C C C B
>   key: A == sm
>   key: B == self
>   key: C == tcp

Then above 36 hosts we stop printing the 2d table entirely and just print the
summary:
>   Connection summary:
>     on-host:  all connections are sm or self
>     off-host: all connections are tcp

The options to control it are
    -mca comm_method 1   :   print the above table at the end of MPI_Init
    -mca comm_method 2   :   print the above table at the beginning of MPI_Finalize
    -mca comm_method_max <n> :  number of hosts <n> for which to print a full size 2d
    -mca comm_method_brief 1 :  only print summary output, no 2d table
    -mca comm_method_fakefile <filename> :  for debugging only

* printing at init vs finalize:

The most important difference between these two is that when printing the table
during MPI_Init(), we send extra messages to make sure all hosts are connected to
each other. So the table ends up working against the idea of on-demand connections
(although it's only forcing the n^2 connections in the number of hosts, not the
total ranks).  If printing at MPI_Finalize() we don't create any connections that
aren't already connected, so the table is more likely to have "n/a" entries if
some hosts never connected to each other.

* how many hosts <n> for which to print a full size 2d table

The option -mca comm_method_max <n> can be used to specify a number of hosts <n>
(default 12) that controls at what host-count the unabbreviated / abbreviated
2d tables get printed:
    1 - n      : full size 2d table
    n+1 - 3n   : shortened 2d table
    3n+1 - inf : summary only, no 2d table

* brief

The option -mca comm_method_brief 1 can be used to skip the printing of the 2d
table and only show the short summary

* fakefile

This is a debugging option that allows easeir testing of all the printout
routines by letting all the detected communication methods between the hosts
be overridden by fake data from a file.

The source of the information used in the table is the .mca_component_name

In the case of BTLs, the module always had a .btl_component linking back to the
component. The vars mca_pml_base_selected_component and ompi_mtl_base_selected_component
offer similar functionality for pml/mtl.

So with the ability to identify the component, we can then access
the component name with code like this
    mca_pml_base_selected_component.pmlm_version.mca_component_name
See the three lookup_{pml,mtl,btl}_name() functions in hook_comm_method_fns.c,
and their use in comm_method() to parse the strings and produce an integer
to represent the connection type being used.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2019-10-31 16:23:57 -04:00
Gilles Gouaillardet
631a43581f
Merge pull request #7117 from ggouaillardet/topic/f08_bind_c_constants_revamp_misc_fixes
fortran/use-mpi-f08: misc fixes
2019-10-30 10:42:33 +09:00
Edgar Gabriel
ad5d0df4e9 common/ompio: fix calculation in simple-grouping option
This is based on a bug reported on the mailing list using a netcdf testcase.
The problem occurs if processes are using a custom file view, but on some
of them it appears as if the default file view is being used. Because of that,
the simple-grouping option lead to different number of aggregators used on different
processes, and ultimately to a deadlock. This patch fixes the problem by not using
the file_view size anymore for the calculation in the simple-grouping option,
but the contiguous chunk size (which is identical on all processes).

Fixes issue #7109

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-10-29 12:30:41 -05:00
Gilles Gouaillardet
fda4d040da fortran/use-mpi-f08: misc fixes
- fix typos from open-mpi/ompi@b10a60a5a9
 - remove remaining references to OMPI_PROTECTED from open-mpi/ompi@df6d763a53

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-10-29 15:00:51 +09:00
Joseph Schuchart
02dd877d8a RDMA osc: perform CAS in shared memory if possible
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-10-28 11:24:16 +01:00
Gilles Gouaillardet
51e23f8cb6 fortran/use-mpi-f08: remove bind(C) constants.
Remove unused bind(C) constants in ompi/mpi/fortran/use-mpi-f08/constants.{c,h}
(and break ABI compatibility).

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-10-28 10:28:17 +09:00
Gilles Gouaillardet
df6d763a53 configury: remove references to unused OMPI_PROTECTED
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-10-28 10:28:17 +09:00
Gilles Gouaillardet
b10a60a5a9 fortran/use-mpi-f08: revamp constant declarations
In order to work around an issue with flang based compilers,
avoid declaring bind(C) constants and use plain Fortran parameter
instead.

For example,
type(MPI_Comm), bind(C, name="ompi_f08_mpi_comm_world") OMPI_PROTECTED :: MPI_COMM_WORLD
is changed to
type(MPI_Comm), parameter :: MPI_COMM_WORLD = MPI_Comm(OMPI_MPI_COMM_WORLD)

Note that in order to preserve ABI compatibility, ompi/mpi/fortran/use-mpi-f08/constants.{c,h}
have been kept even if its symbols are no more referenced by Open MPI.

Refs. open-mpi/ompi#7091

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-10-28 10:01:17 +09:00
Austen Lauria
aa8be9c12d
Merge pull request #6284 from devreal/ompi-rdma-memalign
Ensure proper alignment of memory provided by MPI
2019-10-25 12:27:58 -04:00
Austen Lauria
ecd990a67c
Merge pull request #6933 from devreal/osc-ucx-excl-lock
UCX osc: properly release exclusive lock to avoid lockup
2019-10-25 09:16:51 -04:00
Edgar Gabriel
dce203ffc6
Merge pull request #7057 from edgargabriel/topic/romio321-status-set-elements-fix
MPIR_Status_set_bytes: fix for large counts
2019-10-18 08:16:36 -05:00
Mark Browning
77b3ff9d38 Remove the stale cr MPI extension
Also removed text block from line 883 of README.

Signed-off-by: Mark Browning <marksbrowning3@gmail.com>
2019-10-10 13:24:30 -04:00
Todd Kordenbrock
e7b867c044 mtl-portals4: don't finalize flow control if Portals4 was not initialized
This commit fixes a segfault in mtl-portals4 finalize().  The segfault
occurs if finalize() is called without any calls to add_procs().  This
commit resolves the segfault by skipping the flow control fini() call if
Portals4 was not initialized.

Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
2019-10-09 17:07:50 -05:00
Geoff Paulsen
4e1e6f8972
Merge pull request #6993 from awlauria/fix_warnings_master
Fix miscellaneous compiler warnings.
2019-10-09 09:17:02 -05:00