This commit attempts to update the romio io component to not use
functions removed in MPI-3.0 (2012). This is a first cut and will
probably need to be reviewed for correctness.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(back-ported from commit open-mpi/ompi@84765001aa)
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
romio assumes that all predefined datatypes are contiguous. Because of
the (terribly named) composed datatypes MPI_SHORT_INT, MPI_DOUBLE_INT,
MPI_LONG_INT, etc., this is an incorrect assumption. The simplest way to
fix this is to override the MPI_Type_get_envelope and
MPI_Type_get_contents calls with calls that will work on these
datatypes. Note that not all calls to these MPI functions are
replaced, only the ones used when flattening a non-contiguous
datatype.
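
To illustrate why a pair type such as MPI_SHORT_INT cannot be treated as
contiguous, here is a minimal sketch in plain C. It assumes the MPI
implementation lays the pair out like the corresponding C struct; the
struct and helper names are hypothetical and are not part of romio or
Open MPI.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumed layout of the pair type: a short followed by an int. */
struct short_int {
    short value;
    int   index;
};

/* Bytes inside the struct that carry no data (alignment padding). */
static size_t short_int_padding(void)
{
    return sizeof(struct short_int) - (sizeof(short) + sizeof(int));
}

/* Print the layout; nonzero padding means the type has a hole in it. */
static void print_short_int_layout(void)
{
    printf("size=%zu offset(index)=%zu payload=%zu padding=%zu\n",
           sizeof(struct short_int),
           offsetof(struct short_int, index),
           sizeof(short) + sizeof(int),
           short_int_padding());
}
```

On common ABIs where int is aligned to 4 bytes, the padding is nonzero,
so code that flattens the type as one contiguous block would also copy
the hole.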
References #5009
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(back-ported from commit open-mpi/ompi@4d876ec6fe)
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
add support for the info objects cb_buffer_size and collective_buffering.
Also, introduce a new mca parameter that gives feedback
on whether an info object is recognized (and honored).
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
this component can only be used in very specific scenarios. However, since some file systems do not support file locking and processes might be distributed over multiple nodes (hence the sm sharedfp component is also ineligible), the component might be selected in some scenarios, even if an application does not intend to use shared file pointers.
Since the fseek_shared function is invoked as part of the File_set_view operation, only complain about the inability to perform the seek_shared operation if actual shared file pointer operations are being used. This avoids spurious error values being returned.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
this commit revamps the internal operations of the sharedfp components.
Specifically, it is focused around removing the second file_open
operation for shared file pointers. This makes the code more efficient.
Because of that, the sharedfp_lazy_open mca parameter is no longer
needed.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Extend the number of supported ranks with providers that support
FI_REMOTE_CQ_DATA. Add a README file to the OFI MTL.
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
This code is the implementation of Software-based Performance Counters as described in the paper 'Using Software-Based Performance Counters to Expose Low-Level Open MPI Performance Information' in EuroMPI/USA '17 (http://icl.cs.utk.edu/news_pub/submissions/software-performance-counters.pdf). More practical usage information can be found here: https://github.com/davideberius/ompi/wiki/How-to-Use-Software-Based-Performance-Counters-(SPCs)-in-Open-MPI.
All software event functions are put in macros that become no-ops when SOFTWARE_EVENTS_ENABLE is not defined. The internal timer units have been changed to cycles to avoid division operations, which were a large source of overhead as discussed in the paper. Added a --with-spc configure option to enable SPCs in the Open MPI build. This defines SOFTWARE_EVENTS_ENABLE. Added an MCA parameter, mpi_spc_enable, for turning on specific counters. Added an MCA parameter, mpi_spc_dump_enabled, for turning on and off dumping SPC counters in MPI_Finalize. Added an SPC test and example.
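
The compile-time no-op pattern described above can be sketched roughly
as follows. Only SOFTWARE_EVENTS_ENABLE comes from this commit; the
macro, array, and function names below are hypothetical and chosen for
illustration, not the names used in the Open MPI sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter storage: slot 0 counts calls, slot 1 counts bytes. */
static uint64_t demo_counters[2];

#ifdef SOFTWARE_EVENTS_ENABLE
/* Counters are compiled in: update the selected slot. */
#define DEMO_SPC_RECORD(id, value) (demo_counters[(id)] += (uint64_t)(value))
#else
/* Counters are compiled out: expands to nothing, arguments are not evaluated. */
#define DEMO_SPC_RECORD(id, value) ((void)0)
#endif

/* Read back a counter (always zero when counters are compiled out). */
static uint64_t demo_counter(int id)
{
    return demo_counters[id];
}

/* An instrumented code path; returns the byte count it was given. */
static size_t demo_send(size_t nbytes)
{
    DEMO_SPC_RECORD(0, 1);       /* one more call */
    DEMO_SPC_RECORD(1, nbytes);  /* bytes handled */
    return nbytes;
}
```

Building with -DSOFTWARE_EVENTS_ENABLE turns the bookkeeping on; without
it, the instrumented path carries no counter code at all.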
Signed-off-by: David Eberius <deberius@vols.utk.edu>
The `nbc_i*` functions don't start communication, but create a request.
`nbc_*_init` are appropriate names for them.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Persistent operation for `NBC_A2A_DISS` is not supported currently.
Though the algorithm is not selected at all currently, I added an
assertion so that it is not selected by mistake.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
`NBC_Copy` should not be called in `MPI_*_INIT`.
`NBC_Sched_copy` should be called instead.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Because a persistent request does not free its `schedule` object
when the communication completes, the `NBC_Progress` function cannot
determine the completion using `schedule`.
Without this change, a hang occurs when the `NBC_Progress` function
is called recursively through the `NBC_Start_round` function.
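
A rough sketch of the idea, using a simplified and hypothetical handle
structure (this is not the actual libnbc data structure, and the field
and function names are assumptions): completion is tracked with an
explicit flag instead of being inferred from a released `schedule`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request handle. */
struct demo_handle {
    void *schedule;   /* kept after completion when the request is persistent */
    bool  persistent; /* true for requests created by an *_init call */
    bool  complete;   /* explicit completion flag */
};

/* Mark completion; only a non-persistent handle gives up its schedule. */
static void demo_finish(struct demo_handle *h)
{
    h->complete = true;
    if (!h->persistent) {
        h->schedule = NULL; /* stands in for releasing the schedule */
    }
}

/* Completion test that works whether or not the schedule was released. */
static bool demo_is_complete(const struct demo_handle *h)
{
    return h->complete;
}
```

A test of the form `schedule == NULL` would never report completion for
the persistent case, which is the situation the paragraph above
describes.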
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Now libnbc COLL supports persistent collectives and all `*_init`
functions of the COLL interface are available. So let's enable the
availability check for those functions at communicator creation.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
prepare the upcoming persistent collectives by pre-factoring some code
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
fixup 808c3c62cd9475edd91ecde9d2d53b12e28b2c04
now that we have a shiny new fcoll component, no need
to keep the static component around. No use for it anymore.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
check for pending I/O operations and invalid modes
and return proper error codes before executing MPI_File_sync.
Makes the e_sync_1 test from the ibm testsuite pass.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
in case the user opened a file using the DELETE_ON_CLOSE flag,
return the error code generated in the delete operation.
Note, however, that this is only a partial fix to the e_close_1 test
from the ibm testsuite, since the object destructor that triggers
the file_close function does not have a mechanism right now to recognize
and return an error code.
Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
in file_get_byte_offset, return an error code if the offset
leads to an invalid position in file.
Makes the e_get_byte_offset_1 test from the ibm testsuite pass.
Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
and some internal structure elements/components. Along the way,
add support for the cb_nodes Info object.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
since the request code is now being accessed also from the vulcan fcoll
component, the request code was relocated into the common/ompio
directory to avoid ld load problems.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
We introduced a new mca_vulcan parameter that specifies the I/O synchronization
type (async/sync I/O) applied within the collective write operation.
The user can explicitly choose an async or sync write operation, or let
the choice be made automatically.
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
For very large offsets, the data chunk size to be written by each aggregator
exceeds the capacity of an integer variable. Besides, some variables were
not large enough to hold intermediate values.
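
The kind of overflow described above can be sketched as follows,
assuming a 32-bit int. The helper names are hypothetical and do not
appear in the fcoll sources; they only show sizes and intermediate
products being carried in 64-bit arithmetic.

```c
#include <limits.h>
#include <stdint.h>

/* Bytes each aggregator handles, rounded up, computed in 64 bits. */
static int64_t bytes_per_aggregator(int64_t total_bytes, int num_aggregators)
{
    return (total_bytes + num_aggregators - 1) / num_aggregators;
}

/* Widen before multiplying so the intermediate product cannot overflow int. */
static int64_t byte_offset(int count, int extent)
{
    return (int64_t)count * extent;
}

/* Returns 1 if the value could be stored in an int without loss. */
static int fits_in_int(int64_t v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

For example, 16 GiB spread over 4 aggregators gives 4 GiB per
aggregator, which no longer fits in a 32-bit int.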
Signed-off-by: raafatfeki <fekiraafat@gmail.com>