Commit Graph

72 Commits

Author SHA1 Message Date
Edgar Gabriel
d955753cb8 common/ompio: abstraction for different convertor types
introduce separate convertors for memory vs. file representation. Adjust the interfaces for decode_datatype to provide the convertor to be used for that.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-05-20 13:35:38 -05:00
Edgar Gabriel
cf5cdad40f fcoll: make vulcan the default component
make vulcan the default component except for Lustre file systems.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-22 14:12:02 -05:00
Edgar Gabriel
0757cb11a8 fcoll/all components: minor updates
two minor updates:
 - in all components: use the fh->f_bytes_per_agg value
   (which might have been set by an info object) instead
   of re-reading the mca parameter
 - vulcan and dynamic_gen2: replace one allgather operation
   by an allreduce, since it is used to determine the sum
   of an array.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-20 07:47:29 -05:00
Gilles Gouaillardet
cd45c7abb6 ompio: misc renames
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-14 09:41:10 +09:00
Gilles Gouaillardet
36b35ae0db ompio: fix abstraction
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-14 09:41:10 +09:00
Edgar Gabriel
8feb497dbe io/ompio: cleanup the aggregator selection logic
and some internal structure elements/components. Along the way,
add support for the cb_nodes Info object.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-07 16:47:10 -05:00
Nathan Hjelm
5f7ff5307e fcoll/two_phase: do not use removed function (MPI_Address)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-23 08:43:24 -06:00
Edgar Gabriel
da640f98df fcoll/two_phase: data sieving has to occur at offset 0 as well
data sieving has to occur for any provided offset that is greater than
or equal to zero for this implementation to work correctly.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-10 11:23:09 -06:00
Edgar Gabriel
1f151be6d2 io/ompio: introduce a new function to retrieve mca parameter values
ompio has the unique problem that mca parameters set in the io/ompio component
have to be accessible from other frameworks as well. This is mostly done to avoid
replicating parameter names and to reduce the number of mca parameters that
an end-user has to worry about.

This commit introduces a generic function to retrieve ompio mca parameters, the function pointer
is stored on the file handle. It replaces two functions that used the same concept already for
one parameter each.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-12-01 10:00:23 -06:00
Edgar Gabriel
75ab006ec0 io/ompio: add a new option to disable amode overwriting
ompio has historically changed the WRONLY flag provided by the application
to RDWR to allow for the data sieving optimization within the two-phase I/O
fcoll component. This change did not have a performance impact
on regular UNIX file systems, but seems to hurt performance on NFS (and maybe Lustre?).

So provide an option that allows keeping the WRONLY flag, and raise an error
if the fcoll/two-phase component would actually like to use data sieving.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-11-17 13:13:38 -06:00
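
The behaviour described in this commit message can be illustrated with a minimal sketch. Everything here is hypothetical: the `EX_MODE_*` flags, `ex_effective_amode`, and `ex_check_data_sieving` are made-up names for illustration and are not the ompio API or the MPI constants. The sketch only assumes what the message states: write-only access is upgraded to read-write unless the option disables that, and data sieving (a read-modify-write) fails on a file that stayed write-only.

```c
#include <stdbool.h>

/* Hypothetical access-mode flags (NOT the real MPI_MODE_* constants). */
enum { EX_MODE_RDONLY = 1, EX_MODE_WRONLY = 2, EX_MODE_RDWR = 4 };

/* Return the access mode actually used to open the file.
 * If overwriting is enabled, write-only is upgraded to read-write
 * so that a later read-modify-write (data sieving) is possible. */
int ex_effective_amode(int amode, bool overwrite_amode)
{
    if (overwrite_amode && (amode & EX_MODE_WRONLY)) {
        amode &= ~EX_MODE_WRONLY;
        amode |= EX_MODE_RDWR;
    }
    return amode;
}

/* Data sieving has to read before it writes.
 * Returns 0 if the mode allows that, -1 (an error) otherwise. */
int ex_check_data_sieving(int amode)
{
    return (amode & EX_MODE_RDWR) ? 0 : -1;
}
```

With overwriting disabled, a write-only file keeps its mode and the data sieving check reports an error instead of silently reading.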
Joshua Hursey
e1d079544b mca: Dynamic components link against project lib
* Resolves #3705
 * Components should link against the project level library to better
   support `dlopen` with `RTLD_LOCAL`.
 * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
   with the appropriate project level library:
```
MCA components in ompi/
       $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
       $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
       $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
       $(top_builddir)/oshmem/liboshmem.la
```

Note: The changes in this commit were automated by the
`libadd_mca_comp_update.py` script in the commit that precedes it.
script. Some components were not included in this change because
they are statically built only.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
Edgar Gabriel
f258036e06 fcoll/two_phase: adjust aggregator selection to new mapby flag on MPI_COMM_WORLD
adjust how the aggregator nodes are selected depending on whether processes
have been mapped by node or anything else.

Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
2017-08-15 09:50:41 -05:00
Gilles Gouaillardet
fa5cd0dbe5 use ptrdiff_t instead of OPAL_PTRDIFF_TYPE
since Open MPI now requires C99, and the ptrdiff_t type is part of C99,
there is no longer any need for the abstract OPAL_PTRDIFF_TYPE type.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-04-19 13:41:56 +09:00
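
As a small illustration of using the standard type directly, the sketch below computes a byte displacement with `ptrdiff_t` from `<stddef.h>`. The helper name `byte_displacement` is an assumption made for this example and does not appear in Open MPI.

```c
#include <stddef.h>  /* ptrdiff_t */

/* Hypothetical helper: signed byte distance from base to addr.
 * Both pointers must point into (or one past) the same object. */
ptrdiff_t byte_displacement(const void *base, const void *addr)
{
    return (const char *)addr - (const char *)base;
}
```

Because `ptrdiff_t` is signed, a displacement that points before `base` comes out negative rather than wrapping around.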
George Bosilca
366d64b7e5 Move the collective structure outside the communicator.
As we changed the ABI (forcing a major release), we can limit
the size of the predefined communicators by moving the collective
structure outside the communicator. This might have a minimal,
but unnoticeable, impact on performance. This approach has been
discussed during the January 2017 devel meeting.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-27 11:54:17 -06:00
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
Edgar Gabriel
ccf76b7791 moving the internal read/write functions to common/ompio
and update all fs/fcoll/sharedfp components to use these functions.
2016-07-21 13:08:32 -05:00
Edgar Gabriel
39ae93b87b modify the fcoll components to use the common/ompio print queues
2016-07-21 13:08:32 -05:00
Nathan Hjelm
8871bdb2f8 fcoll/two_phase: fix coverity issues
Fix CID 72296: Resource leak (RESOURCE_LEAK):

Changed code to goto exit instead of returning to ensure memory is
freed.

Fix CID 712589: Out-of-bounds read (OVERRUN):

In this loop i and j are identical and always less than
iov_count. The CID was triggered because i was incremented if i was <
iov_count. This meant that if the loop did go on the next iteration
would access an invalid index.

Fix CID 741363: Uninitialized scalar variable (UNINIT):

Allocate tmp_len with calloc to ensure every index is initialized.

Fix CID 741364: Uninitialized pointer read (UNINIT):

Allocate recv_types with calloc to ensure all indices are always
initialized. Also added a check to not loop and destroy if recv_types
is NULL.

Also added a NULL check on the allocation of decoded iov. This is not
the cause of CID 126784 but should be fixed.

Fix CID 712588: Out-of-bounds read (OVERRUN):

Similar to CID 712589. Should silence the issue.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-19 14:47:41 -06:00
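
The calloc-based fixes above follow a common pattern, sketched here under stated assumptions: `histogram` is a hypothetical function, not code from fcoll/two_phase. It shows why zero-filled allocation matters when a loop writes only some indices but later code reads all of them, and it includes the NULL check on the allocation.

```c
#include <stdlib.h>

/* Hypothetical example: count how often each value in [0, nbins) occurs.
 * calloc zero-fills the array, so bins that are never hit read as 0
 * instead of being uninitialized. Returns NULL if allocation fails;
 * the caller frees the result. */
int *histogram(const int *values, size_t n, size_t nbins)
{
    int *bins = calloc(nbins, sizeof *bins);
    if (NULL == bins) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        /* Skip out-of-range values so no invalid index is accessed. */
        if (values[i] >= 0 && (size_t)values[i] < nbins) {
            bins[values[i]]++;
        }
    }
    return bins;
}
```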
Gilles Gouaillardet
bfe8e03d9d fcoll/two_phase: use ompi_mpi_abort instead of PMPI_Abort
Thanks Jeff for the review
2015-12-07 11:34:36 +09:00
Gilles Gouaillardet
002c7b8b3a fcoll/two_phase: use PMPI_* instead of MPI_*
2015-11-20 13:46:19 +09:00
Nathan Hjelm
5122327727 fcoll/two_phase: fix new coverity errors
Fix CID 1325467: use after free

Remove extra free of aggregator_list.

Fix CID 1325466: resource leak

Fix typo in prior coverity fix.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-10-02 21:38:31 -06:00
Nathan Hjelm
09df7aa205 fcoll/two_phase: fix coverity errors
Fixes CIDs 72300, 72344, 1196764-1196768, 72300: Resource leaks

Multiple allocated arrays are going out of scope at the end of
mca_fcoll_two_phase_file_write_all. Free these arrays. Also removed
the extraneous NULL checks since free (NULL) is safe in C.

Change returns to goto exit where the allocated resources are freed.

Fixes CIDs 72285-72292, 72297, 72298: Resource leaks

Change all appropriate return statements to goto exit to ensure that
all resources are freed. Also removed the NULL checks since free
(NULL) is safe in C.

Fixes CIDs 72295, 72296: Resource leaks

Moved free of requests and recv_types to after exit label. This will
ensure these are freed on error.

Also added a loop and statement to free send_buf which is going out of
scope at the end of the function.

Fixes CIDs 72336-72240, 735197, 735198: Resource leaks

Moved the exit label to before the resources are released and
changed all appropriate return statements to goto exit. Also removed
extraneous NULL checks because free (NULL) is safe in C.

Fixes CIDs 72341, 72343, 1196805-1196809: Resource leaks

Free all resources after exit label and change return statements to
goto exit to ensure all resources are freed on error.

Fixes CID 1269973: Unused value

Check return code of ompi_request_wait_all. If it fails jump to the
exit.

Fixes CID 714119: Dereference before NULL check

Wrong value checked in conditional.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-01 14:38:09 -06:00
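
The "goto exit" cleanup pattern that this commit applies throughout can be sketched as follows. The function `copy_via_scratch` and its buffers are hypothetical and exist only for illustration. The sketch relies on two facts stated in the message: every error path jumps to a single label where resources are released, and `free(NULL)` is safe in C, so no NULL checks are needed before freeing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: copy len bytes from src to dst through two
 * temporary buffers. Returns 0 on success, -1 on any error. */
int copy_via_scratch(const char *src, char *dst, size_t len)
{
    int ret = 0;
    char *stage1 = NULL;   /* initialized so the cleanup is always valid */
    char *stage2 = NULL;

    if (NULL == src || NULL == dst || 0 == len) {
        ret = -1;
        goto exit;
    }
    stage1 = malloc(len);
    if (NULL == stage1) {
        ret = -1;
        goto exit;
    }
    stage2 = malloc(len);
    if (NULL == stage2) {
        ret = -1;
        goto exit;          /* stage1 is still freed below */
    }
    memcpy(stage1, src, len);
    memcpy(stage2, stage1, len);
    memcpy(dst, stage2, len);

exit:
    /* Single release point for success and error paths alike. */
    free(stage1);
    free(stage2);
    return ret;
}
```

A plain `return -1` after the second `malloc` fails would leak `stage1`, which is the kind of resource leak the CIDs above report.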
Edgar Gabriel
01fcfb08fe do not set the contiguous flag in two_phase_file_read_all. This optimization
needs some more debugging for the two_phase component, and is disabled
for two_phase_file_write_all as well.
2015-09-18 09:30:50 -05:00
Gilles Gouaillardet
fe351f6801 io: do not cast away the const modifier when this is not necessary
update the io framework and mpi c bindings
2015-09-09 09:18:58 +09:00
Edgar Gabriel
072b18e197 Code cleanup for the time breakdown feature in ompio/fcoll
- make the internal structure follow the Open MPI naming convention
 - provide a single flag/macro which controls the compilation/utilization of this
   feature, to avoid that somebody using this has to modify every single
   fcoll component. A configure option could be added later if desired.
2015-08-14 08:53:04 -05:00
Ralph Castain
869041f770 Purge whitespace from the repo
2015-06-23 20:59:57 -07:00
Edgar Gabriel
cc219281ba checkpoint of the current work, since I need to resync with master to fix the compilation problems
2015-06-18 05:20:07 -05:00
Edgar Gabriel
100515e321 remove split collective interfaces from fcoll and their fake implementations. Not required anymore
2015-06-18 05:20:07 -05:00
Edgar Gabriel
19cac73a9b first part of the changes required to support non-blocking collective io operations
2015-06-18 05:20:07 -05:00
Gilles Gouaillardet
0f17cdfc57 fcoll: fix misc memory leaks
as reported by Coverity with CIDs 72293,72294 and 1269894
2015-06-17 11:17:52 +09:00
Nathan Hjelm
033894b493 Merge pull request #541 from hjelmn/c99_components
C99 component initialization
2015-04-20 10:45:39 -06:00
Howard Pritchard
3339274136 Merge pull request #542 from hppritcha/topic/coverity_714118
fcoll/two_phase: coverity fix
2015-04-20 05:42:12 -06:00
Howard Pritchard
de215addc6 fcoll/two_phase: coverity fix
fix CID 714118

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-04-18 14:34:48 -06:00
Nathan Hjelm
df75d0382f ompi: use C99 subobject naming for component initialization
This commit helps future-proof ompi components by initializing each
component member by name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-18 10:29:58 -06:00
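
Initializing members by name uses C99 designated initializers. The sketch below uses a hypothetical `struct example_component`, not the real MCA component structure, to show why this helps: the initializer does not depend on member order, and members that are not named are zero-initialized.

```c
#include <stddef.h>

/* Hypothetical component descriptor, for illustration only. */
struct example_component {
    const char *name;
    int major_version;
    int minor_version;
    int (*open)(void);
    int (*close)(void);
};

int example_open(void)
{
    return 0;
}

/* Each member is set by name. Reordering or adding members in the
 * struct does not silently shift values into the wrong fields. */
const struct example_component example_two_phase = {
    .name          = "two_phase",
    .major_version = 2,
    .open          = example_open,
    /* .minor_version and .close are omitted: they become 0 and NULL. */
};
```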
Mangala Jyothi Bhaskar
c4de46e284 Fix number of aggregators used in two phase fcoll
2015-04-16 10:39:10 -05:00
Nathan Hjelm
b68d66bb9b MCA: Add the project/project version to the MCA base component
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.

All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:04 -06:00
Jeff Squyres
c7d8563d8d fcoll_two_phase: fix trivial compiler warning
2015-02-23 08:58:43 -08:00
Howard Pritchard
bf89131f9e add owner files to opal/ompi/orte mca directories
This commit adds an owner file in each of the component directories
for each framework.  This allows for a simple script to parse
the contents of the files and generate, among other things, tables
to be used on the project's wiki page.  Currently there are two
"fields" in the file, an owner and a status.  A tool to parse
the files and generate tables for the wiki page will be added
in a subsequent commit.
2015-02-22 15:10:23 -07:00
mjbhaskar
39f9880759 Fixed the data type argument in an all reduce operation to fix a bug
seen on 32 bit machines.
2015-01-08 14:18:54 -06:00
Edgar Gabriel
7e41e0e62b fix a segfault in the two-phase I/O algorithm for fileviews of 0 byte size.
2014-12-01 15:59:00 -06:00
Edgar Gabriel
0758d7570e part 1 of the fix to get rid of the missing symbols that prevent the sub-modules to be loaded.
2014-11-29 20:01:36 -06:00
Gilles Gouaillardet
64c18686b7 fix ompi_request_wait vs ompi_request_wait_all and
MPI_STATUS_IGNORE vs MPI_STATUSES_IGNORE
2014-11-04 12:16:30 +09:00
Edgar Gabriel
3a5f4f72da make the zero byte read/write scenarios work without the contiguous flag.
This commit was SVN r32690.
2014-09-09 16:26:14 +00:00
Edgar Gabriel
ed02927767 - do not set the contiguous memory option in the collective operations. It
should not be stored on the file handle anyway, since it is not a property of
the file.
- protect a realloc for zero byte scenarios.

This commit was SVN r32678.
2014-09-07 18:09:43 +00:00
Edgar Gabriel
0f59ce6591 use the fbtl return value as originally intended, namely to retrieve the
number of bytes written and read. Status now contains the actual number of
bytes written for individual operations. For collective operations, this is
unfortunately not possible.

This commit was SVN r32674.
2014-09-07 15:14:57 +00:00
Edgar Gabriel
52eac0146d cleanup of the fbtl interfaces: remove the *sorted optimization flag, since it
was not used anyway in the last two years. Simplifies the code significantly.

This commit was SVN r32602.
2014-08-25 18:04:24 +00:00
Vishwanath Venkatesan
b176787d0f Remove unwanted spaces + Test commit
This commit was SVN r32576.
2014-08-22 05:11:17 +00:00
Edgar Gabriel
d4f83ab929 clean up of the MCA parameters of the fcoll framework. Most parameters are now
set/retrieved in ompio instead of the fcoll components.

This commit was SVN r32294.
2014-07-23 19:03:14 +00:00
Brian Barrett
8b778903d8 Fix longstanding issue with our multi-project support. Rather than using
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi.  This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.

This commit was SVN r30140.
2014-01-07 22:11:15 +00:00