Artem Polyakov
907f4e196a
Merge pull request #6980 from devreal/ucx-acc-singel-intrinsics
...
UCX osc: add support for acc_single_intrinsic
2020-06-25 07:39:42 -07:00
Joseph Schuchart
73a183408f
UCX osc: add support for acc_single_intrinsic info key / mca param
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Joseph Schuchart
d9d18acd49
Fix unintended optimizations detected by STACK
...
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-22 10:32:22 +02:00
Tomislav Janjusic
d5f6b088ae
osc/ucx: Fix error path
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-08-12 21:54:01 +03:00
Nysal Jan K.A
14808922cf
osc/ucx: Add support for the no_locks info key
...
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
2019-07-18 17:29:01 +05:30
Artem Polyakov
6678ac0f55
osc/ucx: Fix possible win creation/destruction race condition
...
To avoid fully initializing the osc/ucx component for MPI applications
that do not use one-sided functionality, initialization is deferred
until the first MPI window is created.
This commit ensures that modifications of the global state are atomic.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-06-20 09:05:03 -07:00
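The deferred initialization described in the commit above can be illustrated with a minimal sketch. It is not the osc/ucx code: every name here (`component_lock`, `window_create`, `expensive_component_init`, and so on) is hypothetical, and a plain pthread mutex is assumed as the way to make the global state changes atomic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical global component state; all names are illustrative only. */
static pthread_mutex_t component_lock = PTHREAD_MUTEX_INITIALIZER;
static bool component_initialized = false;
static int num_windows = 0;
static int init_calls = 0; /* counts how often the expensive setup ran */

/* Stand-in for the expensive one-time component setup. */
static int expensive_component_init(void)
{
    init_calls++;
    return 0;
}

/* Create a window; the first caller also initializes the component.
 * The check, the initialization, and the counter update all happen
 * under one lock, so concurrent creators cannot initialize twice. */
int window_create(void)
{
    int rc = 0;

    pthread_mutex_lock(&component_lock);
    if (!component_initialized) {
        rc = expensive_component_init();
        if (rc == 0) {
            component_initialized = true;
        }
    }
    if (rc == 0) {
        num_windows++;
    }
    pthread_mutex_unlock(&component_lock);
    return rc;
}

/* Destroy a window; the counter is updated under the same lock. */
void window_free(void)
{
    pthread_mutex_lock(&component_lock);
    assert(num_windows > 0);
    num_windows--;
    pthread_mutex_unlock(&component_lock);
}
```

Applications that never call `window_create` never pay for the setup, and holding a single lock around both the flag and the counter keeps creation and destruction from racing with each other.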
Artem Polyakov
0857742624
osc/ucx: Fix worker pool finalization
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-06-20 09:05:03 -07:00
Mikhail Brinskii
2ef5bd8b36
SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
...
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job and returns immediately. The source and target
buffers are reusable only after the routine has completed.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either with atomic
operations such as shmem_atomic_fetch or with point-to-point synchronization
routines such as shmem_wait_until and shmem_test.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-26 14:47:58 +03:00
Artem Polyakov
19e2ae2efb
opal/common/ucx: Switch to opal/tsd
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
7984d7d997
opal/common/ucx: Remove unused debugging macro
...
Will be reintroduced later if needed and after adaptation to the OMPI
infrastructure.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
c6de09940f
ompi/osc/ucx: Switch osc/ucx code to use Worker Pool.
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Sergey Oblomov
2d230b3aac
OSC/UCX: set max level value to 60
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-27 14:20:28 +02:00
Sergey Oblomov
e91f214982
OSC/UCX: added UCX version evaluation
...
- added UCX version evaluation to set OSC UCX priority
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-14 10:03:13 +02:00
Sergey Oblomov
36934a8bb2
OSC: set UCX module used by default
...
- OSC/UCX module set priority to 200 to be used by default
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-12 15:08:22 +02:00
Sergey Oblomov
1099d5f023
COMMON/UCX: added error code to log output
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-21 11:37:25 +03:00
Sergey Oblomov
df765595e3
COMMON/UCX: suppressed coverity warnings
...
- suppressed Coverity warnings
- added log messages on failed calls
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-17 16:11:03 +03:00
Yossi Itigin
b8e1af6fcb
osc_ucx: add worker flush before osc module free
...
Make sure all pending communications are done on all ranks before
closing the window. This way it will be safe to close the endpoints when
closing the component.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-10 20:47:16 +03:00
Yossi Itigin
bcc48515e4
Revert "osc_ucx: fix hang/timeout in component finalize"
...
This reverts commit 438d13b4ca1e7333b789ca3fb536fda17b0feb38.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-10 20:47:13 +03:00
Yossi Itigin
a012ee91d8
Merge pull request #5886 from yosefe/topic/osc-ucx-fix-finalize-hang
...
osc_ucx: fix hang/timeout in component finalize
2018-10-10 16:29:29 +03:00
Yossi Itigin
dc6809495d
osc_ucx: fix hang/timeout in component finalize
...
Add barrier to make sure all endpoints are destroyed before destroying
the worker.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-10 14:38:06 +03:00
Sergey Oblomov
ae6f81983f
OSC/UCX: fixed zero-size window processing
...
- added processing of zero-size MPI window
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-10 13:08:01 +03:00
Brian Barrett
e9e4d2a4bc
Handle asprintf errors with opal_asprintf wrapper
...
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error. However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined. Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
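The guarantee described in the commit above (the output pointer is always NULL on error) can be sketched as follows. This is not the opal_asprintf source: `safe_asprintf` is a hypothetical name, and to stay within portable C99 the sketch builds the string with two `vsnprintf` passes instead of wrapping the platform's `asprintf`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper illustrating the BSD-like contract:
 * returns the string length on success, or -1 on failure,
 * and on failure *out is always NULL. */
int safe_asprintf(char **out, const char *fmt, ...)
{
    va_list ap, ap2;
    char *buf;
    int len;

    assert(out != NULL);
    *out = NULL; /* callers may rely on NULL on every error path */

    va_start(ap, fmt);
    va_copy(ap2, ap);

    /* First pass: measure the formatted length without writing. */
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return -1;
    }

    buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        va_end(ap2);
        return -1;
    }

    /* Second pass: format into the allocated buffer. */
    len = vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);
    if (len < 0) {
        free(buf);
        return -1;
    }

    *out = buf;
    return len;
}
```

With this contract a caller can test either the return value or the pointer, and calling `free` on the result is safe even after a failure.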
Sergey Oblomov
b72dd83f05
MCA/COMMON/UCX: added synonyms for common ucx variables
...
- added synonyms for atomic/osc modules
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-26 18:25:21 +03:00
Sergey Oblomov
fa33e322e7
OSC/UCX: code deduplication
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-19 12:39:15 +03:00
Sergey Oblomov
6f0a7a2005
OSC/UCX: opal progress register/unregister optimization
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-19 12:07:26 +03:00
Sergey Oblomov
55b934bacf
OSC/UCX: enable progress when at least one window is allocated
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 17:52:30 +03:00
Sergey Oblomov
a081fba046
OSC/UCX: fixed hang on OSC init
...
- worker progress was not being called on startup, which caused a hang
on one of the ranks
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 17:01:53 +03:00
Xin Zhao
74ef51af1b
OMPI/OSC/UCX: move memory hooks init in osc to win creation.
...
Move memory hooks init (for request-based operations) in osc ucx to window
creation time, to avoid a performance issue in MPI initialization.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-07-12 15:03:02 -07:00
Yossi Itigin
e77e31b50b
Merge pull request #5378 from hoopoepg/topic/unify-ucx-logging
...
MCA/COMMON/UCX: unified logging across all UCX modules
2018-07-08 12:45:26 +03:00
Sergey Oblomov
eb7010933d
OSC/UCX: suppressed compilation warnings
...
- suppressed signed/unsigned comparison warnings
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-06 10:58:09 +03:00
Sergey Oblomov
bef47b792c
MCA/COMMON/UCX: unified logging across all UCX modules
...
- added common logging infrastructure for all
UCX modules
- all UCX modules are switched to new infra
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
502d04bf12
UCX/PML/SPML: fixed few coverity issues
...
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for worker flush failure
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Yossi Itigin
ee873f4f79
Merge pull request #5322 from hoopoepg/topic/mca-ucx-common
...
MCA/UCX: added common module
2018-06-26 13:54:12 +03:00
Sergey Oblomov
63e7ba6843
MCA/COMMON/UCX: added parameter for UCX/opal progress
...
- added parameter to set UCX/opal progresses
- minor refactoring of request wait routines
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 11:00:12 +03:00
Yossi Itigin
c2fbf3a3e8
osc_ucx: register progress on-demand
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-06-19 12:47:08 +03:00
Joshua Ladd
32ddc6af7e
Merge pull request #5094 from xinzhao3/topic/osc-win-fix-master
...
OMPI/OSC/UCX: fix issue in impl of MPI_Win_create_dynamic/MPI_Win_attach/MPI_Win_detach
2018-05-02 17:42:34 -04:00
Xin Zhao
3f5ac97649
OMPI/OSC/UCX: set priority to 0.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-05-02 21:40:06 +03:00
Xin Zhao
53bdfd1dcb
OMPI/OSC/UCX: fix issue in impl of MPI_Win_create_dynamic/MPI_Win_attach/MPI_Win_detach
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-04-24 23:09:52 +03:00
Xin Zhao
2aa5292dbf
Add UCX component for ompi/mca/osc for MPI one-sided communication.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-07-19 19:45:40 +03:00