1
1

9984 Коммитов

Автор SHA1 Сообщение Дата
Gilles Gouaillardet
4272b57089 ROMIO 3.2.1 refresh: prepare new romio directory ompi/mca/io/romio321
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:13 +09:00
Mikhail Kurnosov
66bc86a25b Change the tree_next to a flexible array member
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-06-19 13:01:26 -06:00
Mikhail Kurnosov
6547b58316 coll/base: add knomial tree algorithm for MPI_Bcast
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-06-19 13:01:26 -06:00
Edgar Gabriel
c3ac06dc1b sharedfp/sm and lockedfile: fix coverty warnings
this commit fixes the coverty warnings CID 1437402 and
CID 1437401

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-19 10:04:51 -05:00
Edgar Gabriel
9986a15b57 sharedfp/individual: only complain about fseek if sharedfp operations are really in use
this component can only be used in very specific scenarios. However, since some file systems do not support file locking and processes might be distributed over multiple nodes (hence the sm sharedfp component is also inelligible), the component might be selected in some scenarios, even if an application does not intend to use shared file pointers.

Since the fseek_shared function is involved as part of the File_set_view operation, only complain about the inability to perform the seek_shared operation if actual shared file pointer operations are being used. This avoid spurious error values being returned.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-18 18:25:29 -05:00
Edgar Gabriel
bb1522472f
Merge pull request #5286 from edgargabriel/topic/sharedfp-revamp
sharedfp/all components: revamp internal operations
2018-06-18 16:09:54 -05:00
Edgar Gabriel
bc0f60dfd9 sharedfp/all components: revamp internal operations
this commit revamps the internal operations of the sharedfp components.
Specifically, it is focused around removing the second file_open
operation for shared file pointers. This makes the code more efficient.
Because of that, there is no necessity anymore for the sharedfp_lazy_open
mca parameter.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-18 14:34:05 -05:00
Yossi Itigin
733cac864a
Merge pull request #5282 from yosefe/topic/pml-ucx-opal-mem-hooks
pml_ucx: add option to use opal memhooks instead of ucx internal hooks
2018-06-18 19:07:01 +03:00
Gilles Gouaillardet
3f874c9857 spc: remove ompi_spc_get_count() prototype from ompi_spc.h
This function is only used in ompi_spc.c and is hence declared as static.
Remove its prototype from the header file in order to silence compiler warnings who will typically consider ompi_spc_get_count() as a declared but not defined function.

Fixes open-mpi/ompi#5279
Fixes open-mpi/ompi#5273

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-18 16:07:11 +09:00
Yossi Itigin
564f80d362 pml_ucx: add option to use opal memhooks instead of ucx internal hooks
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-06-17 15:30:44 +03:00
Gilles Gouaillardet
2caf1bf0e5
Merge pull request #5263 from ggouaillardet/topic/ompio_abstraction
ompio: fix abstraction
2018-06-16 23:29:29 +09:00
Matias A Cabral
e6674556aa MTL OFI: add support for FI_REMOTE_CQ_DATA.
Extend number of supported ranks with providers that support
FI_REMOTE_CQ_DATA. Add README file to OFI MTL
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2018-06-14 17:17:38 -07:00
Edgar Gabriel
d5bdcf8595 fs/pvfs2: fix compilation problem
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-14 09:30:45 -05:00
Howard Pritchard
7dcab6e4a4
Merge pull request #5269 from hppritcha/topic/squash_gcc7.3.0_warnings
topo/treematch - quash compiler warning
2018-06-13 21:13:04 -05:00
Gilles Gouaillardet
cd45c7abb6 ompio: misc renames
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-14 09:41:10 +09:00
Gilles Gouaillardet
36b35ae0db ompio: fix abstraction
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-14 09:41:10 +09:00
Howard Pritchard
64de269cc3 topo/treematch - quash compiler warning
quash a compiler warning showing up with gcc 7.3

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-06-13 16:34:17 -05:00
Thananon Patinyasakdikul
390d72addd
Merge pull request #4885 from davideberius/spc_pr
Initial Software-based Performance Counters PR
2018-06-12 14:04:49 -07:00
David Eberius
d377a6b6f4 Added Software-based Performance Counters driver code along with several counters.
This code is the implementation of Software-base Performance Counters as described in the paper 'Using Software-Base Performance Counters to Expose Low-Level Open MPI Performance Information' in EuroMPI/USA '17 (http://icl.cs.utk.edu/news_pub/submissions/software-performance-counters.pdf).  More practical usage information can be found here: https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI.

All software events functions are put in macros that become no-ops when SOFTWARE_EVENTS_ENABLE is not defined.  The internal timer units have been changed to cycles to avoid division operations which was a large source of overhead as discussed in the paper.  Added a --with-spc configure option to enable SPCs in the Open MPI build.  This defines SOFTWARE_EVENTS_ENABLE.  Added an MCA parameter, mpi_spc_enable, for turning on specific counters.  Added an MCA parameter, mpi_spc_dump_enabled, for turning on and off dumping SPC counters in MPI_Finalize.  Added an SPC test and example.

Signed-off-by: David Eberius <deberius@vols.utk.edu>
2018-06-11 22:48:16 -04:00
Jeff Squyres
84701cd2b0
Merge pull request #5204 from markalle/info_snprintf
fix info-subscribe to use snprintf() and warn on long key
2018-06-08 15:22:55 -04:00
Edgar Gabriel
2d8a769bfd fcoll/static: remove component
now that we have a shiny new fcoll component, no need
to keep the static component around. No use for it anymore.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-08 07:39:46 -05:00
Edgar Gabriel
b27a40cdf9
Merge pull request #5246 from edgargabriel/topic/ibm-testsuite-fixes
Topic/ibm testsuite fixes
2018-06-08 06:06:49 -05:00
Yossi Itigin
fd12540751
Merge pull request #5227 from hoopoepg/topic/pml-ucx-hang-on-finalize
PML/UCX: fixed hand on MPI_Finalize
2018-06-08 13:19:49 +03:00
KAWASHIMA Takahiro
317e53f83f
Merge pull request #5243 from t-kurita/pr/mpiext-mpi-f08-logical
Fortran: Enable using `LOGICAL` parameter in MPI extensions.
2018-06-08 13:11:13 +09:00
Edgar Gabriel
a1484ec69a io/ompio: check error conditions before executing file_sync
check for pending I/O operations and invalid modes
and return proper error codes before executing MPI_File_sync

makes the e_sync_1 test from the ibm testsuite pass.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-07 19:30:27 -05:00
Edgar Gabriel
5f1e88d265 mpi/c: check for valid datatype in file_get_type_extend
the interface if file_get_type_extent did not check
whether the input datatype is valid or not.

Makes the e_get_type_extend_2 test from the ibm testsuite pass.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-07 19:30:27 -05:00
Edgar Gabriel
14bd114973 common/ompio: return error code from file_delete operation in file_close
in case the user opened a file using the DELETE_ON_CLOSE flag,
return the error code generated in the delete operation.

Note, that this is however just a partial fix to the e_close_1 test
from the ibm testsuite, since the object destructor that triggers
the file_close function does not have a mechanism right now to recognize
and return an error code.

Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
2018-06-07 19:30:14 -05:00
Edgar Gabriel
f7cae7731c io/ompio: return error code for invalid offset
in file_get_byte_offset, return an error code if the offset
leads to an invalid position in file.

Makes the e_get_byte_offset_1 test from the ibm testsuite pass.

Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
2018-06-07 18:46:17 -05:00
Edgar Gabriel
deaeaa60de fcoll/vulcan: minor bugfix
when creating the groups_per_proc arrays

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-07 17:52:32 -05:00
Edgar Gabriel
8feb497dbe io/ompio: cleanup the aggregator selection logic
and some internal structure elements/components. Along the way,
add support for the cb_nodes Info object.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-07 16:47:10 -05:00
Edgar Gabriel
529d882ff0 io/ompio and common/ompio: relocate ompio_request code to common
since the request code is now being accessed also from the vulcan fcoll
component, the request code was relocated into the common/ompio
directory to avoid ld load problems.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-07 16:13:12 -05:00
raafatfeki
5ecb4a56e3 fcoll/vulcan: Support of asynchronous write in collective writeAll
We introduced a new mca_vulcan parameter that specify the I/O synchronization
type (Async/sync I/O) applied within the collective write operation.
The user can explicitly choose to use async or sync write operation or make
the choice automatically made.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-06-07 16:13:12 -05:00
raafatfeki
4f7172ddf6 fcoll/vulcan: Support of larger offsets
For very large offsets, the data chunk size to be written by each aggregator
exceeds the capacity of an integer variable. Besides, some variables were
not large enough to hold intermediate values.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-06-07 16:13:12 -05:00
raafatfeki
4670fe50d7 fcoll/vulcan: Remove unnecessary calls to write
Identify the index of each aggregator process in order to restrict the call to write_init function by the specific aggregator.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-06-07 16:13:12 -05:00
raafatfeki
bc6431bee9 fcoll/vulcan: use hindexed constructor on the sender side
Instead of using a temporary buffer and copy data into the temp buffer before sending, use a derived datatype to describe the data that needs to be sent during a cycle in the collective I/O operation.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-06-07 16:13:12 -05:00
Edgar Gabriel
1c2c110824 fcoll/vulcan: add new fcoll component
import of the new vulcan component. It is an enhanced version
of the two_phase component, which uses however the ompio internal
codes/loops to assemble the data arrays. It is therefore more inline
with the dynamic and dynamic_gen2 component, and will be easier to
maintain.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-07 16:13:12 -05:00
Kurita, Takehiro
f9ae932bfd Fortran: Enable using LOGICAL parameter in MPI extensions.
If a subroutine of the Fortran `use-mpi-f08` binding in an MPI extension
have a `LOGICAL` parameter and no `TYPE(MPI_Status)` parameter,
it needs to use the `mpi_ext` module and call its corresponding subroutine
in the `mpif-h` directory, as explained in
`ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h`.
However, as shown in the figure below, the required directories are dependent
on each other, and "Can't open module file" error occurs at build time.

             ompi/mpiext/{extension name}/use-mpi-f08
                A                               |
                |                               |
                |                               V
   ompi/mpi/fortran/use-mpi-f08  <---  ompi/mpi/fortran/mpiext (mpi_ext.mod)

In order to solve this problem, change the configuration and the build order.
- divide Fortran extension directory (`ompi/mpi/fortran/mpiext`)
  into the directories for `use-mpi` and for `use-mpi-08`
    - `ompi/mpi/fortran/mpiext-use-mpi`     : for `use-mpi` (mpi_ext.mod)
    - `ompi/mpi/fortran/mpiext-use-mpi-f08` : for `use-mpi-08` (mpi_f08_ext.mod)

- change to the following build order about Fortran `use-mpi` and
  `use-mpi-f08` bindings in `ompi`
    1. mpi_ext bindings of MPI extensions (`mpiext/{extension name}/use-mpi` directory)
    2. Fortran use-mpi (`mpi/fortran/use-mpi-[ignore-]tkr` directory)
    3. Fortran extension for use-mpi (`mpi/fortran/mpiext-use-mpi` directory)
    4. Fortran use-mpi-f08 modules only (`mpi/fortran/use-mpi-f08/mod` directory)
    5. mpi_f08_ext bindings of MPI extensions (`mpiext/{extension name}/use-mpi-f08` directory)
    6. Fortran use-mpi-f08 (`mpi/fortran/use-mpi-f08` directory)
    7. Fortran extension for use-mpi-f08 (`mpi/fortran/mpiext-use-mpi-f08` directory)

Signed-off-by: Kurita, Takehiro <fj6370fp@aa.jp.fujitsu.com>
2018-06-07 15:02:17 +09:00
Nathan Hjelm
63ded4d083
Merge pull request #5224 from benmenadue/master
io/romio314: Replace deprecated MPI-1 functions
2018-06-06 15:41:53 -06:00
Ralph Castain
5853ebee1a
Merge pull request #5240 from rhc54/topic/foo
Correct typo in name comparison flags
2018-06-06 13:25:20 -07:00
Ralph Castain
86d699d42e Correct typo in name comparison flags
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-06 12:18:52 -07:00
bosilca
fa1386768f
Merge pull request #5234 from jsquyres/pr/oshmem-init-race
ompi_mpi_init: fix race condition
2018-06-06 12:14:00 -04:00
Jeff Squyres
9b9cb5fef0 to be squashed: move wait-for-init loop to ompi_mpi_init()
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-06 05:35:19 -07:00
Ralph Castain
840fb42f93 PMIx rte component does support dynamics
Minor cleanups

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-05 21:55:19 -07:00
Jeff Squyres
67ba8da76f ompi_mpi_init: fix race condition
There was a race condition in 35438ae9b5: if multiple threads invoked
ompi_mpi_init() simultaneously (which could happen from both MPI and
OSHMEM), the code did not catch this condition -- Bad Things would
happen.

Now use an atomic cmp/set to ensure that only one thread is able to
advance ompi_mpi_init from NOT_INITIALIZED to INIT_STARTED.

Additionally, change the prototype of ompi_mpi_init() so that
oshmem_init() can safely invoke ompi_mpi_init() multiple times (as
long as MPI_FINALIZE has not started) without displaying an error.  If
multiple threads invoke oshmem_init() simultaneously, one of them will
actually do the initialization, and the rest will loop waiting for it
to complete.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-05 18:09:13 -07:00
Nathan Hjelm
64a5baaa28
Merge pull request #5193 from hjelmn/osc_sm_location
Use /dev/shm for shared memory files in osc components
2018-06-05 09:42:14 -06:00
Sergey Oblomov
0a8261f3b0 PML/UCX: fixed hand on MPI_Finalize
fixes issue https://github.com/openucx/ucx/issues/2656

added flush for worker object to complete all pending operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-05 17:22:03 +03:00
Mikhail Kurnosov
3adf96fdb8 coll/base: add butterfly algorithm for MPI_Reduce_scatter
Implements butterfly algorithm for MPI_Reduce_scatter.
The algorithm can be used both by commutative and non-commutative operations, for power-of-two and non-power-of-two number of processes.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-06-05 15:53:13 +07:00
Ben Menadue
34ec0bd8ab Replace MPI_Type_extent with MPI_Type_get_extent in ROMIO.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2018-06-05 15:27:58 +10:00
Ben Menadue
756cc67221 Replace MPI_Address with MPI_Get_address in ROMIO.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2018-06-05 15:27:25 +10:00
Gilles Gouaillardet
05b3546151 java: do not use MPI1 deprecated subroutines
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-04 10:49:52 +09:00