1
1

7 Коммитов

Автор SHA1 Сообщение Дата
Edgar Gabriel
eeee011ac0 common/ompio: use avg. file view size in the aggregator selection logic
This is a fix  based on a bugreport on github/mailing list from CGNS.
The core of the problem was that different processes entered different branches of
our aggregator selection logic, due to the fact that in some cases processes had
a matching file_view size and contiguous chunk size (thus assuming 1-D distribution),
and some processes did not (thus assuming 2-D distribution). The fix is to calculate
the avg. file view size across all processes and use this value, thus ensuring that
all processes enter the same branch.

Fixes issue #7809

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit 4a8a330bbaf9fe5ea07cd01146afb83b569f3138)
2020-06-16 10:21:59 -05:00
Edgar Gabriel
39acc3a251 common/ompio: fix calculation in simple-grouping option
This is based on a bug reported on the mailing list using a netcdf testcase.
The problem occurs if processes are using a custom file view, but on some
of them it appears as if the default file view is being used. Because of that,
the simple-grouping option lead to different number of aggregators used on different
processes, and ultimately to a deadlock. This patch fixes the problem by not using
the file_view size anymore for the calculation in the simple-grouping option,
but the contiguous chunk size (which is identical on all processes).

Fixes issue #7109

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
(cherry picked from commit ad5d0df4e91d66efa5b69e8388f7ed0b4b6a1d09)
2019-11-25 09:04:13 -06:00
Harald Klimach
16e1d74c8f Suggestion to fix division by zero in file view.
In common_ompi_aggregators calc_cost routine:
do not cast the real division to an int intermediately.
This patch removes the obsolete int variable c and assigns
the result of the P_a/P_x division directly to n_as.

With the intermediate int c variable, n_as gets 0 if P_a < P_x,
resulting in a division by 0 when computing n_s.

Signed-off-by: Harald Klimach <harald.klimach@uni-siegen.de>
(cherry picked from commit e222a04ae57e5d09b8559f3c111de1f10a47246a)
2019-06-25 09:29:08 -06:00
René Widera
e30e5b95c6 common/ompio: possible rounding issue
Similar to #6286 rounding number of bytes into a single precision floating point value to round up the result of a division is a potential risk due to rounding errors.

- remove floating point operations for `round up`
- removes floating point conversion for round down (native behavior of integer division)

Signed-off-by: René Widera <r.widera@hzdr.de>
(cherry picked from commit a91fab80a1e55e1df15f649e18d247e5d4654eb9)
2019-01-30 12:31:39 -06:00
Edgar Gabriel
fd8c5fba4e common/ompio: fix the fview based grouping options
a bug sneaked into constructing the list of aggregators
processes when using the fileview based grouping options

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-22 14:01:31 -05:00
Gilles Gouaillardet
cd45c7abb6 ompio: misc renames
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-14 09:41:10 +09:00
Gilles Gouaillardet
36b35ae0db ompio: fix abstraction
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-14 09:41:10 +09:00