Commit graph

5599 commits

Author SHA1 Message Date
--quiet
1e9227765a ofi mtl: also link in mtl_ofi_LIBS in the static case 2015-08-20 10:40:46 -07:00
Edgar Gabriel
4be20b119f bring the addproc component up to date with the fileview changes 2015-08-20 09:30:58 -05:00
Edgar Gabriel
8b84da5e35 bring the lockedfile component up to date with the fileview changes. 2015-08-20 09:26:30 -05:00
Edgar Gabriel
b0461f8d3c the back pointer from the ompio_file structure to the ompi_file_t structure
has to be set earlier in case the user disables the lazy_open option.
2015-08-19 17:11:42 -05:00
Edgar Gabriel
096fe78d73 the offset provided to the read_at/write_at routines has to be a multiple of the etype. 2015-08-19 17:11:42 -05:00
Edgar Gabriel
7e370948c1 first cut on the fileview for shared filepointers fix. 2015-08-19 17:11:42 -05:00
yohann
bcc10fbcd4 mtl/ofi: remove redundant code. 2015-08-19 13:13:59 -07:00
Yossi Itigin
f9e2ede47f Merge pull request #816 from yosefe/topic/yalla-fix-on-demand-map
yalla: fix passing on-demand mapping config to mxm.
2015-08-19 17:25:30 +03:00
Gilles Gouaillardet
646b9943e8 topo/treematch: initialize the global_bl symbol 2015-08-19 10:39:17 +09:00
Edgar Gabriel
1b45712595 bring the addproc component up to date with support for split collectives. No pr required
for this commit, since the addproc component is not part of v2.x
2015-08-18 12:17:46 -05:00
Todd Kordenbrock
10cf64373a osc-portals4: allow atomic ops on datatypes that are max_fetch_atomic_size bytes in length
Portals4 supports atomic ops on datatypes less than or equal to
max_fetch_atomic_size bytes.  This commit fixes a bug that required
the datatype to be less than max_fetch_atomic_size bytes.
2015-08-18 11:51:16 -05:00
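The boundary condition described in this commit can be sketched as follows. This is a minimal illustration only: the helper name `demo_atomic_size_ok` is hypothetical and is not taken from the osc-portals4 sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper for illustration; not the osc-portals4 code. */
bool demo_atomic_size_ok(size_t datatype_size, size_t max_fetch_atomic_size)
{
    /* Inclusive bound: a datatype of exactly max_fetch_atomic_size bytes
     * qualifies. The bug described above corresponds to using '<' here. */
    return datatype_size <= max_fetch_atomic_size;
}
```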
Nathan Hjelm
145bac088d Merge pull request #753 from hjelmn/verbose_standard
Standardize verbosity levels
2015-08-18 09:43:28 -06:00
yosefe
85580ad055 yalla: fix passing on-demand mapping config to mxm. 2015-08-18 15:00:59 +03:00
Edgar Gabriel
5ef0632f9d cleanup the usage of printf vs. opal_output 2015-08-17 14:55:12 -05:00
Nathan Hjelm
2f447b2c4c bml/r2: use the bml framework output and set verbosity level to info
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-17 11:48:06 -06:00
yohann
98b300e1bb mtl/ofi: Require proper ordering by OFI provider. 2015-08-14 16:36:10 -07:00
Edgar Gabriel
072b18e197 Code cleanup for the time breakdown feature in ompio/fcoll
- make the internal structure follow the Open MPI naming convention
 - provide a single flag/macro which controls the compilation/utilization of this
   feature, so that anyone using it does not have to modify every single
   fcoll component. A configure option could be added later if desired.
2015-08-14 08:53:04 -05:00
Edgar Gabriel
4bfc6ae798 Performance tuning: incorporate the usage of non-blocking operations in our array group-communication operations. 2015-08-13 20:05:18 -05:00
Gilles Gouaillardet
6118236f1a Merge pull request #796 from ggouaillardet/topic/hcoll_config
configury: fix hcoll, fca and mxm detection and revamp yalla Makefile.am
Thanks to David Shrader and Ake Sandgren for bringing this issue to our attention
2015-08-14 08:55:46 +09:00
Edgar Gabriel
9f369ba515 move the inclusion of the lustre_user and liblustreapi header files to the fs_lustre.h file. 2015-08-13 15:36:16 -05:00
Gilles Gouaillardet
6b2fe9120e yalla: fix Makefile.am LDFLAGS 2015-08-13 17:33:52 +09:00
Gilles Gouaillardet
1a238d3a4f configury: fix fca detection
* do not add -I/.../include/fca -I /.../include/fca_core to CPPFLAGS
 * allow configure --with-fca
 * search fca libs in both DIR/lib and DIR/lib64
 * fix the description of the --with-fca option
2015-08-13 11:09:15 +09:00
Gilles Gouaillardet
df98a73131 configury: fix hcoll detection
* do not add -I/.../include/hcoll -I /.../include/hcoll/api to CPPFLAGS
 * allow configure --with-hcoll
 * search hcoll libs in both DIR/lib and DIR/lib64
 * fix the description of the --with-hcoll option
2015-08-13 11:08:56 +09:00
yohann
27520b99b8 mtl/ofi: add include/exclude list MCA vars.
mtl_ofi_provider_include (resp. mtl_ofi_provider_exclude) can be used
to specify which provider(s) the OFI MTL can select (resp. ignore).

e.g. --mca mtl_ofi_provider_include "psm,sockets"

By default, mtl_ofi_provider_exclude is set to "sockets,mxm".

This deprecates the old MCA var named "mtl_ofi_provider".
2015-08-12 13:52:04 -07:00
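One way an include/exclude pair of comma-separated lists could be evaluated is sketched below. The function names are hypothetical, and the rule that a non-NULL include list takes precedence over the exclude list is an assumption made for this sketch; the commit message does not say how the OFI MTL resolves the two variables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` is one of the comma-separated entries in `list`. */
bool demo_list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = (end != NULL) ? (size_t) (end - p) : strlen(p);

        if (len == name_len && strncmp(p, name, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return false;
}

/* Hypothetical selection rule (an assumption, not the OFI MTL logic):
 * a non-NULL include list decides alone; otherwise the exclude list
 * filters providers out. */
bool demo_provider_allowed(const char *name,
                           const char *include_list,
                           const char *exclude_list)
{
    if (include_list != NULL) {
        return demo_list_contains(include_list, name);
    }
    if (exclude_list != NULL) {
        return !demo_list_contains(exclude_list, name);
    }
    return true;
}
```

With the default exclude list quoted above ("sockets,mxm") and no include list, this sketch rejects "sockets" and accepts "psm".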
Jeff Squyres
e9b7203ece treematch: ensure hwloc support is enabled
This commit does the following:

* s/ompi_check_treematch/ompi_topo_treematch/ (i.e., abide by the
  prefix rule)
* change the value of ompi_topo_treematch_happy from yes/no to 0/1, so
  that we can use -eq for numerical comparisons (vs. string
  comparisons).  It's the little things in life, no?
* Check the value of $OPAL_HAVE_HWLOC to ensure that hwloc support is
  enabled.  If not, disqualify treematch from building.
* Fixes a few places that were underquoted
* Convert from "test ... -a ..." to "test ... && test ..."

Fixes open-mpi/ompi#797
2015-08-12 12:23:12 -07:00
Edgar Gabriel
55f0e1a1f8 fix the lustre compilation problems for older lustre versions. Add the prototype for the static function to avoid a warning message. 2015-08-12 09:45:07 -05:00
Jeff Squyres
3be125afff op base: whitespace cleanup
No logical code changes.
2015-08-12 05:35:11 -07:00
Jeff Squyres
a2addbafed op base: move return statement to correct level
This fixes CID 71945.
2015-08-12 05:35:11 -07:00
Nathan Hjelm
624a4a0f82 Merge pull request #699 from hjelmn/libnbc_fixes
coll/libnbc: rewrite parts of libnbc
2015-08-10 14:51:42 -06:00
Jeff Squyres
87db836800 Merge pull request #788 from yburette/topic/deprioritize_some_providers
mtl/ofi: Deprioritize some OFI providers.
2015-08-10 14:45:59 -04:00
Nathan Hjelm
d42e0968b1 coll/libnbc: rewrite parts of libnbc
This commit rewrites parts of libnbc to fix issues identified by
coverity and myself. The changes are as follows:

 - libnbc function would return invalid error codes (internal to
   libnbc) to the mpi layer. These code names are of the form
   NBC_. They do not match up with the error codes expected by the mpi
   layer. I purged the use of all these error codes with the exception
   of NBC_OK and NBC_CONTINUE in progress. These codes are used to
   identify when a request handle is complete.

 - Handles and schedules were leaked by all collective routines on
   error. A new routine was added to return a collective handle
   (NBC_Return_handle).

 - Temporary buffers containing in/out neighbors for neighborhood
   collectives were always leaked.

 - Neighborhood collectives contained code to handle MPI_IN_PLACE which
   is never a valid input for the send or receive buffer. Stripped this
   code out.

 - Files were inconsistently named. Most are nbc_isomething.c but one
   was named coll_libnbc_ireduce_scatter_block.c.

 - Made the NBC_Schedule "structure" an object so it can be
   retained/released. This may enable the use of schedule caching at a
   later time. More testing will be needed to ensure the caching code
   works. If it doesn't the code should be stripped out completely.

 - Added code to simplify the common case of scheduling send/recv +
   barrier.

 - Code cleanup for readability.

The code now passes the clang static analyzer.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-10 11:53:25 -06:00
George Bosilca
0a91d7af4d Fix issues identified by Coverity. 2015-08-08 16:41:30 -04:00
Jeff Squyres
bd5bf4a224 Merge pull request #781 from hppritcha/topic/suppress_picky_warning
mca/topo: suppress picky warning
2015-08-08 06:14:52 -04:00
yohann
88038b5261 mtl/ofi: Deprioritize some OFI providers.
Some OFI providers such as "sockets" are used for debugging
purposes mostly. For these providers, other components usually
offer better performance -- e.g. for sockets, the BTL/TCP would
be a better choice.
Thus, we chose to ignore some providers unless explicitly asked
by the user on the command line:

e.g. --mca mtl_ofi_provider sockets
2015-08-07 16:09:51 -07:00
Edgar Gabriel
d719497f82 Performance tuning: increase the priority of the sm sharedfp component to ensure that it is selected if it can run. 2015-08-07 16:32:53 -05:00
Edgar Gabriel
9e29edf15c remove an erroneous parenthesis which prevents the compilation of the lustre adio 2015-08-07 15:22:41 -05:00
Edgar Gabriel
1293d9c69b free memory correctly in case of an error. Fixes CID 131540 and CID 1315419 2015-08-07 13:30:50 -05:00
Edgar Gabriel
0aa3049bfc Performance tuning: change the default behavior of ompio to *not* segment individual read/write operations.
In most cases, performance seems to be better if not segmented.
2015-08-07 13:06:39 -05:00
Edgar Gabriel
db5af26de7 Performance tuning. make sure we catch if the user wants to set the default fileview and replace it with our optimized default file view. Otherwise, performance will suffer. file_get_view should still return the correct filetype, not our optimized default file view. This is the correct version compared to ffa67b9693, which unfortunately broke
some test cases in mpi_test_suite. Thanks to @ggouaillardet for reporting this!
2015-08-07 12:49:58 -05:00
Edgar Gabriel
6f6c01ee8d free the datatypes that were created using type_dup during file_set_view 2015-08-07 11:50:25 -05:00
Edgar Gabriel
1ae4f8c7e6 Revert "Performance tuning. make sure we catch if the user wants to set the default fileview and replace it with"
This reverts commit ffa67b9693.
2015-08-07 09:53:07 -05:00
Gilles Gouaillardet
907c095f66 Merge pull request #779 from edgargabriel/topic/fcoll_fixes
Topic/fcoll fixes
2015-08-07 09:14:31 +09:00
Howard Pritchard
10aac8037f mca/topo: suppress picky warning
When configured with --enable-picky

topo_base_lazy_init.c compiles with a warning:

  CC       base/topo_base_lazy_init.lo
base/topo_base_lazy_init.c:46:67: warning: implicit conversion from enumeration type 'enum mca_base_register_flag_t' to different enumeration type 'mca_base_open_flag_t' (aka 'enum mca_base_open_flag_t') [-Wenum-conversion]
        err = mca_base_framework_open (&ompi_topo_base_framework, MCA_BASE_REGISTER_DEFAULT);

This commit fixes this implicit conversion problem.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-08-05 16:11:04 -06:00
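The kind of mismatch behind this warning can be reproduced with two unrelated enumeration types. The sketch below uses hypothetical `demo_` names rather than the real MCA definitions; only the shape of the problem is taken from the warning text quoted above.

```c
/* Two distinct enumeration types whose default values happen to be equal.
 * Both are hypothetical stand-ins, not the real MCA types. */
typedef enum { DEMO_REGISTER_DEFAULT = 0 } demo_register_flag_t;
typedef enum { DEMO_OPEN_DEFAULT = 0 } demo_open_flag_t;

/* Stand-in for a framework-open routine that expects the "open" flag type. */
int demo_framework_open(demo_open_flag_t flags)
{
    return (flags == DEMO_OPEN_DEFAULT) ? 0 : -1;
}

int demo_lazy_init(void)
{
    /* Passing DEMO_REGISTER_DEFAULT here would still compile in C, but it is
     * the kind of call clang's -Wenum-conversion reports. Using the constant
     * of the expected type avoids the implicit conversion. */
    return demo_framework_open(DEMO_OPEN_DEFAULT);
}
```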
Edgar Gabriel
16d4171f6b the individual component should call internal ompio functions directly. The reason is that otherwise
the redirection to the ompi_file_t structure (and back to the ompio internal structure) is ambiguous and wrong
for the shared file pointer scenario.
2015-08-05 14:31:11 -05:00
Edgar Gabriel
02a4eb2f13 add the ompi_file_t pointer correctly on the ompio file handle for the sm and individual component. 2015-08-05 14:28:27 -05:00
Jeff Squyres
a36d7e6026 treematch: __FUNCTION__ -> __func__ fixes 2015-08-05 05:39:38 -07:00
Jeff Squyres
a0ebbee6ef libnbc: __FUNCTION__ -> __func__ fixes 2015-08-05 05:27:23 -07:00
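For context on the two `__FUNCTION__` -> `__func__` commits above: `__func__` is the predefined identifier standardized in C99, while `__FUNCTION__` is a compiler extension. A minimal sketch, with a hypothetical function name:

```c
/* __func__ is implicitly declared inside every function body (C99) and
 * holds the name of the enclosing function as a string. */
const char *demo_current_function(void)
{
    return __func__;
}
```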
Gilles Gouaillardet
3d1780f1a2 sharedfp: set f_fh when opening a shared file 2015-08-05 15:07:21 +09:00
Jeff Squyres
047eccef8d Merge pull request #725 from bosilca/treematch
Add a new topo module: Treematch
2015-07-31 15:17:54 -04:00
Howard Pritchard
8649a9f6ef Merge pull request #757 from roblatham00/lustre-excl-open-fix
hint processing should not open files
2015-07-31 12:16:14 -06:00
rhc54
a9b10cfbf0 Merge pull request #761 from jithinjosepkl/master
Fix warnings in direct (pml-cm,mtl-ofi) build
2015-07-31 09:15:30 -07:00
Edgar Gabriel
ffa67b9693 Performance tuning. make sure we catch if the user wants to set the default fileview and replace it with
our optimized default file view. Otherwise, performance will suffer. file_get_view should still return the correct filetype, not our optimized default file view
2015-07-30 19:15:00 -05:00
Edgar Gabriel
93a303ba89 Performance tuning: make sure the individual component is selected for 1 and 2 process communicators (important for some benchmarks) 2015-07-30 17:31:16 -05:00
Edgar Gabriel
9b2a7e41f0 make sure the final number of aggregators is recorded correctly when not using
our aggregator selection logic.
2015-07-30 17:24:01 -05:00
Rob Latham
6e9cbe397f hint processing should not open files
move opening of files from hint processing and into open routines.

This is MPICH commit 92f1c69f0de8 and 22a77dceda11

see https://trac.mpich.org/projects/mpich/ticket/2261
Ref: https://github.com/open-mpi/ompi/issues/158

Signed-off-by: Pavan Balaji <balaji@anl.gov>
2015-07-30 12:25:20 -05:00
Jithin Jose
bc4e8b7e73 Fix warnings in direct (pml-cm,mtl-ofi) build
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-07-29 15:49:37 -07:00
Edgar Gabriel
477083bca3 the memory chunk that has to be allocated for the llapi_get_stripe function seems to have changed compared to earlier versions. This implementation now follows the code snippet from the man pages. 2015-07-29 17:13:55 -05:00
Edgar Gabriel
217dcca853 - the memory chunk that has to be allocated for the llapi_get_stripe function seems to have changed compared to earlier versions. This implementation now follows the code snippet from the man pages.
- implementation of file_get_size and set_size
2015-07-29 17:10:39 -05:00
yohann
6eba52a121 mtl/ofi: add missing return. 2015-07-29 14:14:34 -07:00
Ralph Castain
023936e84b Silence coverity warnings 2015-07-29 07:28:08 -07:00
Edgar Gabriel
a3327fe299 Merge pull request #756 from edgargabriel/pr/nb-sharedfp-splitcoll2
- make the split collective shared file pointer operations work
2015-07-28 19:53:27 -05:00
Edgar Gabriel
3780089ce0 clean up the usage of opal_output vs. printf 2015-07-28 18:27:31 -05:00
Howard Pritchard
377bad18bd Merge pull request #747 from hppritcha/topic/ofi_progress_fix
mtl/ofi: don't inline ofi progress method
2015-07-28 09:42:01 -06:00
Edgar Gabriel
824d488709 - make the split collective shared file pointer operations work
- minor code restructuring in io/ompio required for that.
2015-07-28 09:05:05 -05:00
Edgar Gabriel
e380f8c235 - fix the delete priority of the ompio component
- some applications use MPI_File_delete as a collective function (e.g. IOR), which I think is not really covered by the standard. Right now, one process succeeds and the other ones return an error code. Fix that by not returning an error if the file that we try to delete no longer exists, so that these applications work.
2015-07-27 15:53:40 -05:00
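The "missing file is not an error" behaviour described in this commit can be sketched with plain POSIX calls. This is an illustration under assumptions: `demo_file_delete` is a hypothetical name, and the real ompio code path is not shown in the log.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical delete helper: returns 0 on success and also when the file
 * is already gone (for example because another process removed it first),
 * and -1 for any other failure. */
int demo_file_delete(const char *path)
{
    if (unlink(path) == 0) {
        return 0;
    }
    if (errno == ENOENT) {
        /* The file no longer exists: report success instead of an error. */
        return 0;
    }
    return -1;
}
```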
Edgar Gabriel
3fb0614566 mark the request as ACTIVE 2015-07-27 12:43:45 -05:00
Edgar Gabriel
5e166c81a1 Merge pull request #745 from edgargabriel/pr/sharedfp-sm-logic3
Pr/sharedfp sm logic3
2015-07-27 12:04:53 -05:00
Howard Pritchard
f5c43c1185 mtl/ofi: retain inline progress function
Retain inline progress function for ofi
mtl, but have a non-inlined progress function
which is registered with the opal progress
mechanism.

 @jithinjosepkl

I've bad news about the psm provider.  I still notice
segfaults - not always - but frequently at finalize
when using the psm provider.  I don't notice this
when using the sockets provider.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-07-27 09:16:52 -06:00
Gilles Gouaillardet
318a1a40a4 coll/libnbc: ireduce_scatter_block
silence malloc(0) warning reported by Lisandro
2015-07-27 16:23:08 +09:00
George Bosilca
e239de581b Create a new topology framework using the TreeMatch library developed
at Inria Bordeaux. This allows us to take advantage of the remap
capability of MPI to rearrange the ranks based on the weights
provided by the application.

Fix the indentation and protect with __DEBUG__ one fprintf.

Add the Cecill-B license to the imported library.

Fix a compiler warning.

Restrict the TreeMatch dependencies.

The TreeMatch software is released under BSD3 (as indicated by their
copyright information @
https://gforge.inria.fr/scm/viewvc.php/COPYING?view=markup&root=treematch).

Update the README.
2015-07-25 13:30:42 -04:00
Jeff Squyres
3e6694f7ea sharedfp: whitespace cleanup
No code changes.

Replace tabs with spaces and do other whitespace cleanup (via emacs).
2015-07-25 05:46:37 -07:00
Jeff Squyres
868a84d4da sharedfp: have sm_data->mutex always point to the right mutex
Even if the mutex is actually located in
sm_data->sm_offset_ptr->mutex, have sm_data->mutex point to it.  This
avoids a few #if blocks that are otherwise identical.
2015-07-25 05:42:57 -07:00
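The idea of keeping one pointer that always refers to the active mutex can be sketched as below. All type and function names are hypothetical stand-ins, the mutex is a dummy struct rather than a real semaphore, and the choice is made with a runtime flag here, whereas the commit describes a compile-time (#if) distinction.

```c
/* Dummy mutex type used only for this sketch. */
typedef struct { int locked; } demo_mutex_t;

/* Stand-in for the shared offset structure that may embed the mutex. */
typedef struct {
    demo_mutex_t mutex;
    long offset;
} demo_offset_t;

typedef struct {
    demo_offset_t *offset_ptr;  /* shared segment */
    demo_mutex_t own_mutex;     /* used when the mutex is not in the segment */
    demo_mutex_t *mutex;        /* always points at whichever mutex is in use */
} demo_sm_data_t;

void demo_sm_data_init(demo_sm_data_t *data, demo_offset_t *shared,
                       int mutex_in_shared_segment)
{
    data->offset_ptr = shared;
    data->own_mutex.locked = 0;
    shared->mutex.locked = 0;
    data->mutex = mutex_in_shared_segment ? &shared->mutex : &data->own_mutex;
}

/* Callers only ever touch data->mutex, so no per-call branching is needed. */
void demo_lock(demo_sm_data_t *data)   { data->mutex->locked = 1; }
void demo_unlock(demo_sm_data_t *data) { data->mutex->locked = 0; }
```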
Edgar Gabriel
4f85e0d833 add the configure logic to check for sem_open and sem_init.
Change the code to rely on HAVE_SEM_OPEN etc. instead of my internal macro.
2015-07-24 10:23:43 -05:00
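Code that relies on configure-generated feature macros typically looks like the sketch below. HAVE_SEM_OPEN is named in the commit; HAVE_SEM_INIT is assumed from the "etc." and from the sem_init check, and the preference order between the two branches is an assumption of this sketch.

```c
/* Reports which semaphore flavour the build would use, based on macros that
 * a configure script would define. With neither macro defined, the
 * fallback branch is compiled. */
const char *demo_semaphore_kind(void)
{
#if defined(HAVE_SEM_OPEN)
    return "named";      /* sem_open() is available */
#elif defined(HAVE_SEM_INIT)
    return "unnamed";    /* sem_init() is available */
#else
    return "none";       /* no POSIX semaphore support detected */
#endif
}
```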
Edgar Gabriel
d1d23054c6 rename the sm_offset structure to mca_sharedfp_sm_offset to obey the Open MPI naming convention 2015-07-24 10:10:41 -05:00
Edgar Gabriel
c91cb67787 fix a bug in the unnamed semaphore section that was introduced when I tried to unify the named and unnamed semaphore logic. 2015-07-24 10:05:07 -05:00
Edgar Gabriel
57c301f25a remove an erroneous free statement. 2015-07-24 09:44:27 -05:00
Jeff Squyres
6929aca1b7 topo/basic: also remove .windows from Makefile.am 2015-07-22 09:20:43 -04:00
Jeff Squyres
24ca887bd8 topo/basic: remove stale (empty) .windows file 2015-07-22 09:10:50 -04:00
Edgar Gabriel
b484784dca make ompio return gracefully in case something goes wrong early in file_open. 2015-07-20 10:03:16 -05:00
Edgar Gabriel
86c3000e18 fix the delete selection logic in io/base. With the previous version, there was a mismatch
in the version number and no component was selected for file_delete.
2015-07-20 10:01:30 -05:00
Howard Pritchard
466c8b0159 Merge pull request #697 from edgargabriel/pr/nb-coll-part2
pr/nb collective I/O part2
2015-07-14 14:00:39 -06:00
Edgar Gabriel
e355db005e fix the logic for setting stripe size and stripe count in the lustre fs module. Takes now also the MPI_Info object into consideration. 2015-07-14 10:53:19 -05:00
Ralph Castain
683efcb850 Rename the current opal_event_base to opal_sync_event_base in preparation for adding an async progress thread to opal. No functional changes made here - just a simple rename. 2015-07-11 10:08:19 -07:00
Edgar Gabriel
f2af8e94ff - first cut on the io interface changes
- add the C interfaces for the new non-blocking collective I/O functions of MPI 3.1
2015-07-09 10:58:13 -05:00
yosefe
103cac5bd9 yalla: fix mxm configuration parsing.
Take configuration from MXM_MPI_xx instead of MXM_PML_xx, same as mtl
mxm.
2015-07-08 19:18:23 +03:00
Gilles Gouaillardet
9e89985f3d restore whitespaces into the pdf files 2015-07-07 09:17:00 +09:00
Rolf vandeVaart
30a872b478 Add the ability to send host buffers through one sized staging buffers and CUDA buffers through different sized buffers. Fixes performance issues 2015-07-02 11:11:15 -04:00
Jeff Squyres
13425e759c bml r2: very minor cleanups
Delete stale comments, use C99 struct initialization.
2015-06-25 15:54:16 -07:00
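"C99 struct initialization" refers to designated initializers. A minimal sketch with a hypothetical structure (the fields are illustrative and not taken from the bml r2 sources):

```c
/* Hypothetical component descriptor used only for this sketch. */
typedef struct {
    const char *name;
    int priority;
    int verbose;
} demo_component_t;

demo_component_t demo_make_component(void)
{
    /* Designated initializers name the fields explicitly; any field that is
     * not listed (here: verbose) is zero-initialized. */
    demo_component_t component = {
        .name = "demo",
        .priority = 10,
    };
    return component;
}
```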
Ralph Castain
265cd14f60 Purge whitespace 2015-06-25 13:27:56 -07:00
rhc54
8b62b63786 Merge pull request #666 from rhc54/topic/libfab
Add a common/libfabric component
2015-06-25 12:46:34 -07:00
Yohann Burette
7fd5ded327 mtl/ofi: message truncation is now indicated by FI_ETRUNC. 2015-06-25 11:06:41 -07:00
Yohann Burette
483ff23db1 mtl/ofi: cancels are now tracked by an error entry. 2015-06-25 11:06:41 -07:00
Ralph Castain
ea0e21bb06 Add a common/libfabric component to the opal layer where we can place common functions 2015-06-25 11:04:00 -07:00
Nathan Hjelm
ee36d813dc Merge pull request #657 from hjelmn/c99
more c99 updates
2015-06-25 11:21:09 -06:00
Howard Pritchard
f45914db9b Merge pull request #670 from hppritcha/topic/ownership_update
ownership: update ownership files
2015-06-25 11:02:45 -06:00
Nathan Hjelm
4d92c9989e more c99 updates
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-25 10:14:13 -06:00
Howard Pritchard
e49a37c034 ownership: update ownership files
per discussions at OMPI devel workshop

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-06-25 10:04:42 -06:00
Jeff Squyres
0bb3fd0a10 coll hierarch: remove last stale file 2015-06-25 08:40:50 -07:00
George Bosilca
dc1b125b12 There is no destructor for the base requests. 2015-06-24 14:29:45 -07:00
bosilca
1b8556f926 Merge pull request #653 from hjelmn/moar_ob1_fixes
pml/ob1: fix bugs in static request objects
2015-06-24 14:28:11 -07:00