bosilca
b6a06ca37b
Merge pull request #7974 from abouteiller/bugfix/vader_des_tag
...
bug fix: des->tag = hdr->frag, should be hdr->tag
2020-08-11 11:13:14 -04:00
Josh Hursey
1d07933a78
Merge pull request #7992 from mkurnosov/fix-parsing-locality-str
...
opal/hwloc: fix a typo in parsing locality string
2020-08-11 08:58:59 -05:00
Mikhail Kurnosov
4708458d6b
Fix a typo in parsing locality string: L0 changed to L1
...
(`prte_hwloc_base_get_locality_string` never returns locality string with L0).
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2020-08-11 08:43:47 +07:00
Jeff Squyres
9a0f661a66
Merge pull request #7975 from wckzhang/btlcommonlist
...
btl/ofi: Use common provider include/exclude list
2020-08-10 14:41:53 -04:00
Nathan Hjelm
a44914cb6b
Merge pull request #7915 from bosilca/fix/intel_2330_warning_take2
...
Second take on fixing the Intel _Atomic atomic operation warning
2020-08-08 06:30:15 -06:00
Jeff Squyres
f5cb1a49b1
Merge pull request #7897 from cniethammer/cmd_line_fixes
...
Minor fix in cmd line parser help
2020-08-08 07:13:22 -04:00
Aurelien Bouteiller
8a2127bcd3
Merge pull request #7984 from abouteiller/bugfix/java-errh
...
Missing function to populate java errors_abort handler
2020-08-06 11:47:31 -04:00
Aurelien Bouteiller
e6c7731d9b
Missing function to populate java error handler abort
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-08-06 10:29:42 -04:00
Aurelien Bouteiller
efbc6ff6a5
Merge pull request #7798 from abouteiller/mpi-next/unbounderr-self
...
MPI-4 error handling: 'unbound' errors to MPI_COMM_SELF
2020-08-03 15:59:14 -04:00
Aurelien Bouteiller
ee149fcfcb
MPI3 (unchanged in 4) says that errors after MPI_REQUEST_FREE are FATAL
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-31 17:49:38 -04:00
Aurelien Bouteiller
bec7dfc1b1
Errors in non-api calls remain fatal
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-31 17:49:35 -04:00
Aurelien Bouteiller
e0df0f4bd9
Make errors_mpi3 compat a global mpi-3 compatibility flag
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-31 17:48:47 -04:00
Aurelien Bouteiller
7dfe6c1adc
Thread-shift errors reported by PMIx to the main MPI progress engine
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
make things happen before the terminal call
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-31 17:48:44 -04:00
William Zhang
9b8f463a76
btl/ofi: Use common provider include/exclude list
...
The btl/ofi does not currently utilize the common ofi include/exclude
list. Added verification code similar to the mtl/ofi that will check if
the info object is in the include or exclude list. If it isn't in the
include list or is in the exclude list, validate_info will return
OPAL_ERROR. The btl/ofi will no longer pass a provider name as a hint
when calling getinfo, instead filtering the provider during
validate_info.
This patch also moves the is_in_list MTL function into common code and
adds additional debugging output to the BTL to match the MTL standard.
Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-07-31 12:13:00 -07:00
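The include/exclude filtering described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual btl/ofi code: the function names `sketch_name_in_list` and `sketch_validate_provider`, the comma-separated list format, and the 0 / -1 return values (standing in for success / OPAL_ERROR) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `name` is an exact entry of the
 * comma-separated `list`. A NULL list matches nothing. */
bool sketch_name_in_list(const char *list, const char *name)
{
    assert(name != NULL);
    if (list == NULL) {
        return false;
    }
    size_t name_len = strlen(name);
    const char *p = list;
    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = (end != NULL) ? (size_t)(end - p) : strlen(p);
        if (len == name_len && strncmp(p, name, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;
        }
        p = end + 1;
    }
    return false;
}

/* Hypothetical validation step: reject a provider that is missing from
 * the include list or present in the exclude list.
 * Returns 0 when the provider is accepted, -1 when it is filtered out. */
int sketch_validate_provider(const char *include_list,
                             const char *exclude_list,
                             const char *provider)
{
    if (include_list != NULL && !sketch_name_in_list(include_list, provider)) {
        return -1;
    }
    if (exclude_list != NULL && sketch_name_in_list(exclude_list, provider)) {
        return -1;
    }
    return 0;
}
```

With this shape, the caller can request every provider and drop unwanted ones afterwards, which mirrors the commit's move from passing a provider-name hint to filtering during validation.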
Artem Polyakov
dfb0ae748f
Merge pull request #7681 from janjust/master-tls-refactor_v3
...
ompi/osc/ucx: remove global TLS tables
2020-07-31 10:39:53 -07:00
Aurelien Bouteiller
8e0cb1d49d
des->tag = hdr->frag, should be hdr->tag
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-30 14:02:22 -04:00
Tommy Janjusic
2c8da2c0a9
Further code reduction and simplifications.
...
Co-authored-by: Artem Polyakov <artpol84@gmail.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-30 20:00:22 +03:00
Tomislav Janjusic
cbfc9a3263
opal/mca/common/ucx: Use new TSD api
...
Co-authored-by: Artem Y. Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-30 00:21:26 +03:00
Tomislav Janjusic
72296e12f4
opal/common/ucx:
...
-mutex lock/unlock suggestions
-common destructor/cleanup
Co-authored-with: Artem Y. Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-30 00:21:26 +03:00
Tomislav Janjusic
27ba4b612f
ompi/osc/ucx: Remove workerpool's global thread storage tables.
...
Co-authored-by: Artem Y. Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-30 00:21:26 +03:00
Brian Barrett
41df122083
Merge pull request #7730 from wckzhang/newdefaults
...
coll/tuned: Change the default collective algorithm selection
2020-07-28 15:27:46 -07:00
William Zhang
ce40cfbaa5
coll/tuned: Change the default collective algorithm selection
...
The default algorithm selections were out of date and not performing
well. After gathering data from OMPI developers, new default algorithm
decisions were selected for:
allgather
allgatherv
allreduce
alltoall
alltoallv
barrier
bcast
gather
reduce
reduce_scatter_block
reduce_scatter
scatter
These results were gathered using the ompi-collectives-tuning package
and then averaged amongst the results gathered from multiple OMPI
developers on their clusters.
You can access the graphs and averaged data here:
https://drive.google.com/drive/folders/1MV5E9gN-5tootoWoh62aoXmN0jiWiqh3
Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-07-28 10:41:48 -07:00
Austen Lauria
d0152eb51e
Merge pull request #7940 from awlauria/revert_libevent_commit
...
Revert "Address a race condition in libevent select."
2020-07-28 11:34:59 -04:00
Jeff Squyres
c07d77fbf2
Merge pull request #7957 from bosilca/fix/avx_alignment
...
Use the unaligned SSE memory access primitive.
2020-07-27 15:50:40 -04:00
Artem Polyakov
e5ef80fe8c
Merge pull request #7936 from janjust/master-new-tsd-thread-api
...
Master: new thread-specific-data (tsd) api
2020-07-24 14:58:03 -07:00
Ralph Castain
863a058f8d
Merge pull request #7964 from rhc54/topic/sync
...
Sync to PRRTE master
2020-07-24 14:57:32 -07:00
Ralph Castain
8c0269cd4f
Sync to PRRTE master
...
Pickup the FT and libev cleanups
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-07-24 14:11:34 -07:00
Tomislav Janjusic
d809f6ba27
New TSD API interface fix for various components
...
Co-authored-by: Artem Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-24 18:29:40 +03:00
Tomislav Janjusic
cba5a0e117
Rename tsd interface function calls
...
Co-authored-by: Artem Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-24 18:29:07 +03:00
Tomislav Janjusic
cb1955bb53
Fix renamed interface functions for argo, q, and pthreads
...
Co-authored-by: Artem Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-24 18:29:07 +03:00
Tomislav Janjusic
07dc86eb3a
opal/thread: New TSD API
...
Co-authored-by: Artem Polyakov <artemp@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-07-24 18:29:07 +03:00
Ralph Castain
06c585c316
Merge pull request #7962 from rhc54/topic/sync
...
Sync to PMIx and PRRTE master
2020-07-23 16:22:32 -07:00
Ralph Castain
c0bc89dc50
Sync to PMIx and PRRTE master
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-07-23 12:35:17 -07:00
Aurelien Bouteiller
06c563625a
Add a test for mpi_errors_mpi3 behavior and non-catastrophic errors
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-23 05:09:29 -04:00
Aurélien Bouteiller
b37202c74e
Add compliance mode with MPI-4 routing of errors to MPI_COMM_SELF by default
...
And other streamlining of aborting behavior.
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Remove OMPI_COMM_ERRORS and use NOHANDLE macros instead.
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
route unbound errors to self error handler
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Do not raise the error handler from within components
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-23 05:09:29 -04:00
George Bosilca
c4e88a43a3
Check unaligned ops for correctness.
...
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-07-22 11:26:07 -04:00
Joshua Ladd
366e92ce54
Merge pull request #7860 from vspetrov/hcoll_reduce_scatter
...
Coll/Hcoll: reduce_scatter(block) interface
2020-07-22 09:45:34 -04:00
George Bosilca
b6d71aa893
Use the unaligned SSE memory access primitive.
...
Alter the test to validate misaligned data.
Fixes #7954.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-07-22 01:19:12 -04:00
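The difference between aligned and unaligned SSE access that the commit above refers to can be illustrated with a small sketch. It is not the code from the commit: the function name `sketch_add4_i32` is hypothetical, and the scalar fallback exists only so the example also builds on targets without SSE2. The relevant fact is that `_mm_load_si128` and `_mm_store_si128` require 16-byte-aligned addresses, while `_mm_loadu_si128` and `_mm_storeu_si128` accept any address.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical example: out[i] = a[i] + b[i] for four 32-bit integers.
 * The pointers only need int32_t alignment, not 16-byte alignment. */
void sketch_add4_i32(const int32_t *a, const int32_t *b, int32_t *out)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__SSE2__)
    /* Unaligned loads and store: safe for user buffers at any offset. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Scalar fallback for targets without SSE2. */
    for (int i = 0; i < 4; i++) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

Calling it on `buf + 1` of an `int32_t` array exercises the misaligned case, which is the kind of input the altered test in the commit is meant to validate.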
Jeff Squyres
30ba603c2c
Merge pull request #7953 from cniethammer/configure-leak-fix
...
Fix memory leak in configure, which prevents leak sanitizer usage
2020-07-21 16:34:27 -04:00
Christoph Niethammer
6564c1b942
Fix memory leak in configure, which prevents leak sanitizer usage
...
When Open MPI is built with sanitizers, e.g.
$ configure CC=clang CFLAGS=-fsanitize=address ....
the configure test programs are also built with the sanitizers and will
report errors, causing configure to fail.
Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
2020-07-21 21:28:29 +02:00
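The failure mode described above can be sketched with a stand-in for a configure test program. This is an illustration only: the function name `sketch_conftest_body` and the buffer size are hypothetical, and the actual test touched by the commit is not shown in this log. The point is that a test built with the leak sanitizer enabled must release what it allocates, because a reported leak makes the program exit with a non-zero status, which configure interprets as a failed check.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical body of a configure-style test program.
 * Returns 0 on success, as a conftest `main` would. */
int sketch_conftest_body(void)
{
    char *buf = malloc(16);
    if (buf == NULL) {
        return 1;
    }
    memset(buf, 0, 16);
    int ok = (buf[15] == 0);

    /* Without this free(), a leak sanitizer would report the allocation
     * at exit and the test would fail for reasons unrelated to the
     * feature being probed. */
    free(buf);
    return ok ? 0 : 1;
}
```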
Aurelien Bouteiller
816acbdfb1
Merge pull request #7840 from abouteiller/mpi-next/init-errh
...
MPI-4: Initial error handler
2020-07-21 11:55:14 -04:00
Joseph Schuchart
60aa97b301
Merge pull request #7948 from devreal/osc-rdma-check-endpoints
...
osc/rdma: fail query_btls if no endpoint for non-local peer is found
2020-07-20 15:14:25 +02:00
bosilca
1139d9ecae
Merge pull request #7931 from bosilca/fix/7928
...
Fix the BTL API conversion for the SMCUDA BTL
2020-07-18 17:35:39 -04:00
Joseph Schuchart
eebc451ec8
osc/rdma: fail query_btls if no endpoint for non-local peer is found
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-07-16 17:06:35 +02:00
Aurelien Bouteiller
7118755ae8
Add a tester for the initial error handler
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-16 03:10:32 -04:00
Aurelien Bouteiller
5f1f7fe313
route errors to self/initial error handler depending upon the state of MPI initialization
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-16 03:10:32 -04:00
Aurélien Bouteiller
bed909c3ba
Read the info key mpi_initial_errhandler from spawn/spawn_multiple
...
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Use the same env to transmit the initial error handler to spawnees
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-16 03:10:32 -04:00
Aurélien Bouteiller
83d0f92152
Set the initial error handler onto predefined communicators
...
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
update to the predefined initial error handler selection
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-16 03:10:32 -04:00
Aurélien Bouteiller
3cd85a9ec5
Add the initial_errhandler info key to MPI_INFO_ENV and populate the value from prun populated parameters
...
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Allow errhandlers to invoke the initial error handler before MPI_INIT
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Indentation
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-16 03:10:32 -04:00
Aurélien Bouteiller
703b8c356f
Make error_class and error_string callable before/after MPI_INIT/FINALIZE
...
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
make lazy initialization opal unlikely
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-07-16 03:10:32 -04:00