Aurelien Bouteiller
efbc6ff6a5
Merge pull request #7798 from abouteiller/mpi-next/unbounderr-self
...
MPI-4 error handling: 'unbound' errors to MPI_COMM_SELF
2020-08-03 15:59:14 -04:00
Tomislav Janjusic
cbfc9a3263
opal/mca/common/ucx: Use new TSD api
...
Co-authored-by: Artem Y. Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-30 00:21:26 +03:00
Tomislav Janjusic
27ba4b612f
ompi/osc/ucx: Remove workerpool's global thread storage tables.
...
Co-authored-by: Artem Y. Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-30 00:21:26 +03:00
Aurélien Bouteiller
b37202c74e
Add compliance mode with MPI-4 routing of errors to MPI_COMM_SELF by
...
default
And other streamlining of aborting behavior.
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Remove OMPI_COMM_ERRORS and use NOHANDLE macros instead.
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
route unbound errors to self error handler
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Do not raise the error handler from within components
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-23 05:09:29 -04:00
Joseph Schuchart
eebc451ec8
osc/rdma: fail query_btls if no endpoint for non-local peer is found
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-07-16 17:06:35 +02:00
Austen Lauria
9b86f1442a
Merge pull request #7823 from jsquyres/pr/put-osc-pt2pt-back
...
Fix typos in OSC RDMA BTL allowlist
2020-06-30 10:55:16 -04:00
Austen Lauria
a26e494953
Merge pull request #7882 from devreal/osc-rdma-noncontig-requests
...
osc rdma: check for outstanding fragments before completing a request (II)
2020-06-29 09:51:47 -04:00
Joseph Schuchart
caed3b2eed
osc rdma: check for outstanding fragments before completing a request in ompi_osc_rdma_put_complete_flush as well
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-26 22:19:21 +02:00
Joseph Schuchart
2c36d37033
Merge pull request #7871 from devreal/osc-ucx-rget-rput-fetch-alignment
...
OSC UCX: make sure no-op fetch in rget/rput is properly aligned
2020-06-26 15:58:51 +02:00
Joseph Schuchart
1314ef7668
OSC UCX: Remove stale free from merge conflict
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-25 19:01:53 +02:00
Joseph Schuchart
634f67b216
Merge pull request #7843 from devreal/clang-tidy-free
...
Some fixups for issues detected by clang-tidy
2020-06-25 17:30:04 +02:00
Artem Polyakov
907f4e196a
Merge pull request #6980 from devreal/ucx-acc-singel-intrinsics
...
UCX osc: add support for acc_single_intrinsic
2020-06-25 07:39:42 -07:00
Joseph Schuchart
c1f7776341
OSC UCX: make sure no-op fetch in rget/rput is properly aligned
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-25 16:16:58 +02:00
Austen Lauria
7814f4195c
Merge pull request #7845 from devreal/stack-fixes
...
Fix unexpected optimizations detected by STACK
2020-06-25 08:15:09 -04:00
Joseph Schuchart
e3b417c776
Add missing copyright header
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
e215eff43d
UCX osc: atomic fetch-and-op only on 32 and 64bit values
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
434c9055ee
UCX osc: fall back to get-compare-put for unsupported datatypes
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
7d5a6e3e8b
UCX osc: safely load/store 64bit integer from variable size pointer
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
5f786bcce4
UCX osc: make MPI_Fetch_and_op non-blocking if possible
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
d8696aa8c4
UCX osc: centralize decision on whether to use AMOs
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
427d4bd226
UCX osc: do not acquire accumulate lock if exclusive lock was taken
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
471d76777a
UCX osc: fence active operations before releasing accumulate lock and free memory if required
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
4d7a3856fa
UCX osc: Use accumulate for operations/datatypes that are not covered by UCX
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
899f58cef5
UCX osc: simplify output address computation
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
d888b4fd76
UCX osc: correctly handle MPI_NO_OP
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
7cfc0e71da
UCX osc: allow to asynchronously compare-and-swap
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
557ae80858
UCX osc: allow for overlap with (some) request-based atomic operations
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
1a3c6bbf35
UCX osc: re-use value returned by cswap to save additional get
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
8606a02b87
UCX osc: fix macro parameter name usage in OMPI_OSC_UCX_REQUEST_RETURN
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
d448efd49c
UCX osc: properly clean up requests in case of errors
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
73a183408f
UCX osc: add support for acc_single_intrinsic info key / mca param
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
d9d18acd49
Fix unintended optimizations detected by STACK
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-22 10:32:22 +02:00
Joseph Schuchart
e23dcca448
Add missing free calls to osc/ucx
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-19 14:30:07 +02:00
Joseph Schuchart
ede3c0840a
Add missing free calls to osc/sm component_select
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-19 12:33:34 +02:00
Jeff Squyres
18cfcc8b70
osc/rdma: update supported BTL list
...
"openib" no longer exists.
"tcp" had a typo.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-16 09:11:01 -07:00
Joseph Schuchart
85ed26f2f8
osc rdma: check for outstanding fragments before completing a request
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-16 17:45:00 +02:00
Michael Heinz
dbbdb8f2e2
Merge pull request #7621 from jsquyres/pr/remove-osc-pt2pt
...
Remove OSC pt2pt component
2020-05-08 12:43:57 -04:00
Austen Lauria
2e22a247bb
Merge pull request #7650 from devreal/fix-7617-oscpt2pt-leak
...
PT2PT osc: don't extra retain datatype
2020-04-24 08:55:28 -04:00
Joseph Schuchart
de67ada442
RDMA osc: remove extra retain on pending_op
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-04-21 22:49:48 +02:00
Joseph Schuchart
07d1011afe
OSC base: fix typos in documentation
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-04-21 21:53:36 +02:00
Joseph Schuchart
154cf571b6
OSC base: do not retain datatype by default
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-04-21 21:53:10 +02:00
Jeff Squyres
8999bae25e
Remove OSC pt2pt component
...
Per https://github.com/open-mpi/ompi/wiki/5.0.x-FeatureList , remove
the OSC pt2pt component.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-04-13 12:29:54 -07:00
Nathan Hjelm
160ff188b8
Merge pull request #7169 from hjelmn/fix_what_wg21_calls_our_problem_not_theirs_seriously__in_some_ways_they_are_correct_but_wtf
...
configure: use -iquote for non-system include paths
2020-03-30 09:22:54 -07:00
Noah Evans
ee3517427e
Add threads framework
...
Add a framework to support different types of threading models including
user space thread packages such as Qthreads and argobot:
https://github.com/pmodels/argobots
https://github.com/Qthreads/qthreads
The default threading model is pthreads. Alternate thread models are
specificed at configure time using the --with-threads=X option.
The framework is static. The theading model to use is selected at
Open MPI configure/build time.
mca/threads: implement Argobots threading layer
config: fix thread configury
- Add double quotations
- Change Argobot to Argobots
config: implement Argobots check
If the poll time is too long, MPI hangs.
This quick fix just sets it to 0, but it is not good for the
Pthreads version. Need to find a good way to abstract it.
Note that even 1 (= 1 millisecond) causes disastrous performance
degradation.
rework threads MCA framework configury
It now works more like the ompi/mca/rte configury,
modulo some edge items that are special for threading package
linking, etc.
qthreads module
some argobots cleanup
Signed-off-by: Noah Evans <noah.evans@gmail.com>
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-03-27 10:15:45 -06:00
Gilles Gouaillardet
69bc2e8372
misc: fix <> vs "" includes throught the ompi codebase
...
This commit fixes an issue with the include usage in some
ompi source files. These source files are using the <> form
of include when the "" form is correct (as these are internal,
**not** system headers).
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-03-09 21:13:49 -04:00
Austen Lauria
04a3a28a74
Some memchecker cleanup and others.
...
- Port memchecker call from a1d502c.
- Remove unused memcheck macro variables.
- Some code readability improvements.
- Remove some stray +1's in dynamic comm cleanup.
- Re-add OPAL_ENABLE_DEBUG macro to osc header.
- Cleanup some printf's, and includes.
- Refactor cleanup of dpm_disconnect_objs.
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-05 16:44:18 -05:00
Gilles Gouaillardet
fc2516457b
osc/pt2pt: silence valgrind warnings
...
explicitly add and initialize padding to keep valgrind happy
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
bosilca
c4d36859ec
Merge pull request #7228 from devreal/progress-returns
...
Harmonize return values of progress callbacks
2020-02-28 20:15:37 -05:00
Ralph Castain
86de81baca
Silence a bunch of warnings
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-22 13:05:28 -08:00
Austen Lauria
dd5991f513
Merge pull request #7204 from devreal/shmwin_contig
...
Correctly set baseptr in contiguous shared memory window with local size zero
2020-02-20 13:22:40 -05:00