Commit graph

5935 commits

Author SHA1 Message Date
2f447b2c4c bml/r2: use the bml framework output and set verbosity level to info
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-17 11:48:06 -06:00
98b300e1bb mtl/ofi: Require proper ordering by OFI provider. 2015-08-14 16:36:10 -07:00
072b18e197 Code cleanup for the time breakdown feature in ompio/fcoll
- make the internal structure follow the Open MPI naming convention
 - provide a single flag/macro which controls the compilation/utilization of this
   feature, to avoid that somebody using this has to modify every single
   fcoll component. A configure option could be added later if desired.
2015-08-14 08:53:04 -05:00
4bfc6ae798 Performance tuning: incorporate the usage of non-blocking operations in our array group-communication operations. 2015-08-13 20:05:18 -05:00
6118236f1a Merge pull request #796 from ggouaillardet/topic/hcoll_config
configury: fix hcoll, fca and mxm detection and revamp yalla Makefile.am
Thanks to David Shrader and Ake Sandgren for bringing this issue to our attention
2015-08-14 08:55:46 +09:00
9f369ba515 move the inclusion of the lustre_user and liblustreapi header files to the fs_lustre.h file. 2015-08-13 15:36:16 -05:00
6b2fe9120e yalla: fix Makefile.am LDFLAGS 2015-08-13 17:33:52 +09:00
1a238d3a4f configury: fix fca detection
* do not add -I/.../include/fca -I /.../include/fca_core to CPPFLAGS
 * allow configure --with-fca
 * search fca libs in both DIR/lib and DIR/lib64
 * fix the description of the --with-fca option
2015-08-13 11:09:15 +09:00
df98a73131 configury: fix hcoll detection
* do not add -I/.../include/hcoll -I /.../include/hcoll/api to CPPFLAGS
 * allow configure --with-hcoll
 * search hcoll libs in both DIR/lib and DIR/lib64
 * fix the description of the --with-hcoll option
2015-08-13 11:08:56 +09:00
27520b99b8 mtl/ofi: add include/exclude list MCA vars.
mtl_ofi_provider_include (resp. mtl_ofi_provider_exclude) can be used
to specify which provider(s) the OFI MTL can select (resp. ignore).

e.g. --mca mtl_ofi_provider_include "psm,sockets"

By default, mtl_ofi_provider_exclude is set to "sockets,mxm".

This deprecates the old MCA var named "mtl_ofi_provider".
2015-08-12 13:52:04 -07:00
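The include/exclude behavior described in commit 27520b99b8 can be pictured as a small filter over comma-separated provider names. The C sketch below is illustrative only: `list_contains` and `provider_allowed` are hypothetical names, and the precedence rule (a non-empty include list overrides the exclude list) is an assumption, not something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `name` is one item of the
 * comma-separated `list` (exact match, no whitespace trimming). */
static bool list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;

    while (p != NULL && *p != '\0') {
        const char *comma = strchr(p, ',');
        size_t item_len = (comma != NULL) ? (size_t)(comma - p) : strlen(p);

        if (item_len == name_len && strncmp(p, name, name_len) == 0) {
            return true;
        }
        p = (comma != NULL) ? comma + 1 : NULL;
    }
    return false;
}

/* Assumed semantics: a non-empty include list decides on its own;
 * otherwise a provider is allowed unless the exclude list names it. */
bool provider_allowed(const char *name, const char *include, const char *exclude)
{
    if (include != NULL && *include != '\0') {
        return list_contains(include, name);
    }
    if (exclude != NULL && *exclude != '\0') {
        return !list_contains(exclude, name);
    }
    return true;
}
```

Under these assumptions, an include list of "psm,sockets" admits only those two providers, while the default exclude list "sockets,mxm" rejects both of its entries and admits everything else.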
e9b7203ece treematch: ensure hwloc support is enabled
This commit does the following:

* s/ompi_check_treematch/ompi_topo_treematch/ (i.e., abide by the
  prefix rule)
* change the value of ompi_topo_treematch_happy from yes/no to 0/1, so
  that we can use -eq for numerical comparisons (vs. string
  comparisons).  It's the little things in life, no?
* Check the value of $OPAL_HAVE_HWLOC to ensure that hwloc support is
  enabled.  If not, disqualify treematch from building.
* Fixes a few places that were underquoted
* Convert from "test ... -a ..." to "test ... && test ..."

Fixes open-mpi/ompi#797
2015-08-12 12:23:12 -07:00
55f0e1a1f8 fix the lustre compilation problems for older lustre versions. Add the prototype for the static function to avoid a warning message. 2015-08-12 09:45:07 -05:00
3be125afff op base: whitespace cleanup
No logical code changes.
2015-08-12 05:35:11 -07:00
a2addbafed op base: move return statement to correct level
This fixes CID 71945.
2015-08-12 05:35:11 -07:00
624a4a0f82 Merge pull request #699 from hjelmn/libnbc_fixes
coll/libnbc: rewrite parts of libnbc
2015-08-10 14:51:42 -06:00
87db836800 Merge pull request #788 from yburette/topic/deprioritize_some_providers
mtl/ofi: Deprioritize some OFI providers.
2015-08-10 14:45:59 -04:00
d42e0968b1 coll/libnbc: rewrite parts of libnbc
This commit rewrites parts of libnbc to fix issues identified by
coverity and myself. The changes are as follows:

 - libnbc functions would return invalid error codes (internal to
   libnbc) to the mpi layer. These code names are of the form
   NBC_. They do not match up with the error codes expected by the mpi
   layer. I purged the use of all these error codes with the exception
   of NBC_OK and NBC_CONTINUE in progress. These codes are used to
   identify when a request handle is complete.

 - Handles and schedules were leaked by all collective routines on
   error. A new routine was added to return a collective handle
   (NBC_Return_handle).

 - Temporary buffers containing in/out neighbors for neighborhood
   collectives were always leaked.

 - Neighborhood collectives contained code to handle MPI_IN_PLACE which
   is never a valid input for the send or receive buffer. Stripped this
   code out.

 - Files were inconsistently named. Most are nbc_isomething.c but one
   was named coll_libnbc_ireduce_scatter_block.c.

 - Made the NBC_Schedule "structure" an object so it can be
   retained/released. This may enable the use of schedule caching at a
   later time. More testing will be needed to ensure the caching code
   works. If it doesn't the code should be stripped out completely.

 - Added code to simplify the common case of scheduling send/recv +
   barrier.

 - Code cleanup for readability.

The code now passes the clang static analyzer.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-10 11:53:25 -06:00
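The leak fix described in commit d42e0968b1 (a single routine that returns a handle on every error path) follows a common C cleanup pattern. The sketch below illustrates that pattern only; every `demo_` and `DEMO_` name is hypothetical and none of it is libnbc's actual API or error-code set.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT libnbc's real names or codes. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_OUT_OF_RESOURCE = -2, DEMO_ERR_BAD_PARAM = -5 };

typedef struct {
    int *schedule;   /* stand-in for a collective schedule */
} demo_handle_t;

static int demo_live_handles = 0;   /* handles allocated but not yet returned */

static demo_handle_t *demo_get_handle(void)
{
    demo_handle_t *h = calloc(1, sizeof(*h));
    if (h != NULL) {
        demo_live_handles++;
    }
    return h;
}

/* Single release routine: frees the schedule and the handle together. */
static void demo_return_handle(demo_handle_t *h)
{
    if (h == NULL) {
        return;
    }
    free(h->schedule);
    free(h);
    demo_live_handles--;
}

/* Every error path returns the handle before reporting a caller-facing
 * code, so nothing leaks and no internal code escapes. */
int demo_start_collective(int count, demo_handle_t **out)
{
    demo_handle_t *h = demo_get_handle();
    if (h == NULL) {
        return DEMO_ERR_OUT_OF_RESOURCE;
    }
    if (count <= 0) {
        demo_return_handle(h);
        return DEMO_ERR_BAD_PARAM;
    }
    h->schedule = malloc((size_t)count * sizeof(int));
    if (h->schedule == NULL) {
        demo_return_handle(h);
        return DEMO_ERR_OUT_OF_RESOURCE;
    }
    *out = h;
    return DEMO_SUCCESS;
}
```

Routing all failures through one release function keeps the handle and its schedule tied together, which is also what makes a retain/release object model for schedules easier to add later.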
0a91d7af4d Fix issues identified by Coverity. 2015-08-08 16:41:30 -04:00
bd5bf4a224 Merge pull request #781 from hppritcha/topic/suppress_picky_warning
mca/topo: suppress picky warning
2015-08-08 06:14:52 -04:00
88038b5261 mtl/ofi: Deprioritize some OFI providers.
Some OFI providers such as "sockets" are used for debugging
purposes mostly. For these providers, other components usually
offer better performance -- e.g. for sockets, the BTL/TCP would
be a better choice.
Thus, we chose to ignore some providers unless explicitly asked
by the user on the command line:

e.g. --mca mtl_ofi_provider sockets
2015-08-07 16:09:51 -07:00
d719497f82 Performance tuning: increase the priority of the sm sharedfp component to ensure that it is selected if it can run. 2015-08-07 16:32:53 -05:00
9e29edf15c remove an erroneous parenthesis which prevents the compilation of the lustre adio 2015-08-07 15:22:41 -05:00
1293d9c69b free memory correctly in case of an error. Fixes CID 131540 and CID 1315419 2015-08-07 13:30:50 -05:00
0aa3049bfc Performance tuning: change the default behavior of ompio to *not* segment individual read/write operations.
In most cases, performance seems to be better if not segmented.
2015-08-07 13:06:39 -05:00
db5af26de7 Performance tuning. make sure we catch if the user wants to set the default fileview and replace it with our optimized default file view. Otherwise, performance will suffer. file_get_view should still return the correct filetype, not our optimized default file view. This is the correct version compared to ffa67b9693, which unfortunately broke
some test cases in mpi_test_suite. Thanks to @ggouaillardet for reporting this!
2015-08-07 12:49:58 -05:00
6f6c01ee8d free the datatypes that were created using type_dup during file_set_view 2015-08-07 11:50:25 -05:00
1ae4f8c7e6 Revert "Performance tuning. make sure we catch if the user wants to set the default fileview and replace it with"
This reverts commit ffa67b9693.
2015-08-07 09:53:07 -05:00
907c095f66 Merge pull request #779 from edgargabriel/topic/fcoll_fixes
Topic/fcoll fixes
2015-08-07 09:14:31 +09:00
10aac8037f mca/topo: suppress picky warning
When configured with --enable-picky

topo_base_lazy_init.c compiles with a warning:

  CC       base/topo_base_lazy_init.lo
base/topo_base_lazy_init.c:46:67: warning: implicit conversion from enumeration type 'enum mca_base_register_flag_t' to different enumeration type 'mca_base_open_flag_t' (aka 'enum mca_base_open_flag_t') [-Wenum-conversion]
        err = mca_base_framework_open (&ompi_topo_base_framework, MCA_BASE_REGISTER_DEFAULT);

This commit fixes this implicit conversion problem.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-08-05 16:11:04 -06:00
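The warning quoted in commit 10aac8037f arises when a constant from one enumeration type is passed where a different enumeration type is expected. The commit message does not say how the fix was made; the C sketch below shows one plausible shape of such a fix, using hypothetical `demo_` types rather than the real MCA definitions.

```c
/* Hypothetical enums standing in for a "register" and an "open" flag type. */
typedef enum { DEMO_REGISTER_DEFAULT = 0, DEMO_REGISTER_ALL = 1 } demo_register_flag_t;
typedef enum { DEMO_OPEN_DEFAULT = 0, DEMO_OPEN_FIND_COMPONENTS = 1 } demo_open_flag_t;

/* Takes the "open" flag type; returns the flag value for demonstration. */
int demo_framework_open(demo_open_flag_t flags)
{
    return (int)flags;
}

int demo_lazy_init(void)
{
    /* Passing DEMO_REGISTER_DEFAULT here would still compile, because C
     * converts between enum types implicitly, but clang reports it under
     * -Wenum-conversion. Using the constant of the matching type avoids
     * the warning. */
    return demo_framework_open(DEMO_OPEN_DEFAULT);
}
```

Both constants have the value 0 in this sketch, so the mismatch would be harmless at run time; the warning is still worth fixing because the two enumerations could diverge later.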
16d4171f6b the individual component should call internal ompio functions directly. The reason is that otherwise
the redirection to the ompi_file_t structure (and back to the ompio internal structure) is ambiguous and wrong
for the shared file pointer scenario.
2015-08-05 14:31:11 -05:00
02a4eb2f13 add the ompi_file_t pointer correctly on the ompio file handle for the sm and individual component. 2015-08-05 14:28:27 -05:00
a36d7e6026 treematch: __FUNCTION__ -> __func__ fixes 2015-08-05 05:39:38 -07:00
a0ebbee6ef libnbc: __FUNCTION__ -> __func__ fixes 2015-08-05 05:27:23 -07:00
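The two `__FUNCTION__ -> __func__` commits above replace a compiler extension with the predefined identifier that C99 standardizes. A minimal sketch of how `__func__` behaves follows; `DEMO_TRACE` and `demo_compute` are hypothetical names, not code from either component.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "<function>: <msg>" into buf.
 * __func__ expands in the function where the macro is used. */
#define DEMO_TRACE(buf, size, msg) \
    snprintf((buf), (size), "%s: %s", __func__, (msg))

/* Returns the number of characters snprintf would have written. */
int demo_compute(char *buf, size_t size)
{
    return DEMO_TRACE(buf, size, "starting");
}
```

Because `__func__` is defined by the language standard rather than by a particular compiler, it builds cleanly with pedantic warning flags on any C99 compiler.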
3d1780f1a2 sharedfp: set f_fh when opening a shared file 2015-08-05 15:07:21 +09:00
047eccef8d Merge pull request #725 from bosilca/treematch
Add a new topo module: Treematch
2015-07-31 15:17:54 -04:00
8649a9f6ef Merge pull request #757 from roblatham00/lustre-excl-open-fix
hint processing should not open files
2015-07-31 12:16:14 -06:00
a9b10cfbf0 Merge pull request #761 from jithinjosepkl/master
Fix warnings in direct (pml-cm,mtl-ofi) build
2015-07-31 09:15:30 -07:00
ffa67b9693 Performance tuning. make sure we catch if the user wants to set the default fileview and replace it with
our optimized default file view. Otherwise, performance will suffer. file_get_view should still return the correct filetype, not our optimized default file view
2015-07-30 19:15:00 -05:00
93a303ba89 Performance tuning: make sure the individual component is selected for 1 and 2 process communicators (important for some benchmarks) 2015-07-30 17:31:16 -05:00
9b2a7e41f0 make sure the final number of aggregators is recorded correctly when not using
our aggregator selection logic.
2015-07-30 17:24:01 -05:00
6e9cbe397f hint processing should not open files
move opening of files from hint processing and into open routines.

This is MPICH commit 92f1c69f0de8 and 22a77dceda11

see https://trac.mpich.org/projects/mpich/ticket/2261
Ref: https://github.com/open-mpi/ompi/issues/158

Signed-off-by: Pavan Balaji <balaji@anl.gov>
2015-07-30 12:25:20 -05:00
bc4e8b7e73 Fix warnings in direct (pml-cm,mtl-ofi) build
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-07-29 15:49:37 -07:00
477083bca3 the memory chunk that has to be allocated for the llapi_get_stripe function seems to have changed compared to earlier version. This implementation now follows the code snippet from the man pages. 2015-07-29 17:13:55 -05:00
217dcca853 - the memory chunk that has to be allocated for the llapi_get_stripe function seems to have changed compared to earlier version. This implementation now follows the code snippet from the man pages.
- implementation of file_get_size and set_size
2015-07-29 17:10:39 -05:00
6eba52a121 mtl/ofi: add missing return. 2015-07-29 14:14:34 -07:00
023936e84b Silence coverity warnings 2015-07-29 07:28:08 -07:00
a3327fe299 Merge pull request #756 from edgargabriel/pr/nb-sharedfp-splitcoll2
- make the split collective shared file pointer operations work
2015-07-28 19:53:27 -05:00
3780089ce0 clean up the usage of opal_output vs. printf 2015-07-28 18:27:31 -05:00
377bad18bd Merge pull request #747 from hppritcha/topic/ofi_progress_fix
mtl/ofi: don't inline ofi progress method
2015-07-28 09:42:01 -06:00
824d488709 - make the split collective shared file pointer operations work
- minor code restructuring in io/ompio required for that.
2015-07-28 09:05:05 -05:00