Sergey Oblomov
1099d5f023
COMMON/UCX: added error code to log output
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-21 11:37:25 +03:00
Sergey Oblomov
df765595e3
COMMON/UCX: suppressed coverity warnings
...
- suppressed coverity warnings
- added log messages on failed calls
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-17 16:11:03 +03:00
Yossi Itigin
b8e1af6fcb
osc_ucx: add worker flush before osc module free
...
Make sure all pending communications are done on all ranks before
closing the window. This way it will be safe to close the endpoints when
closing the component.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-10 20:47:16 +03:00
Yossi Itigin
bcc48515e4
Revert "osc_ucx: fix hang/timeout in component finalize"
...
This reverts commit 438d13b4ca1e7333b789ca3fb536fda17b0feb38.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-10 20:47:13 +03:00
Yossi Itigin
a012ee91d8
Merge pull request #5886 from yosefe/topic/osc-ucx-fix-finalize-hang
...
osc_ucx: fix hang/timeout in component finalize
2018-10-10 16:29:29 +03:00
Yossi Itigin
dc6809495d
osc_ucx: fix hang/timeout in component finalize
...
Add barrier to make sure all endpoints are destroyed before destroying
the worker.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-10 14:38:06 +03:00
Sergey Oblomov
ae6f81983f
OSC/UCX: fixed zero-size window processing
...
- added processing of zero-size MPI window
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-10 13:08:01 +03:00
Brian Barrett
e9e4d2a4bc
Handle asprintf errors with opal_asprintf wrapper
...
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error. However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined. Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
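For illustration, a minimal sketch of the kind of wrapper the commit above describes. This is not the actual opal_asprintf() source: the name safe_asprintf and the vsnprintf-based implementation are assumptions made for this example. It only demonstrates the stated guarantee, namely that the output pointer is NULL whenever the call fails.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper (not Open MPI's real code): formats into a newly
 * allocated string and guarantees *ptr == NULL on any failure. */
static int safe_asprintf(char **ptr, const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    int rc = -1;

    *ptr = NULL;                        /* defined value on every error path */

    va_start(ap, fmt);
    va_copy(ap2, ap);                   /* the arguments are consumed twice */
    len = vsnprintf(NULL, 0, fmt, ap);  /* first pass: measure the length */
    va_end(ap);

    if (len >= 0) {
        char *buf = malloc((size_t)len + 1);
        if (buf != NULL) {
            /* second pass: write the formatted text into the buffer */
            vsnprintf(buf, (size_t)len + 1, fmt, ap2);
            *ptr = buf;
            rc = len;
        }
    }
    va_end(ap2);

    return rc;                          /* string length, or -1 on error */
}
```

With a wrapper of this shape, a caller can test either the return value or the pointer, and calling free() on the pointer after a failed call is harmless because it is NULL.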
Sergey Oblomov
b72dd83f05
MCA/COMMON/UCX: added synonyms for common ucx variables
...
- added synonyms for atomic/osc modules
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-26 18:25:21 +03:00
Sergey Oblomov
fa33e322e7
OSC/UCX: code deduplication
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-19 12:39:15 +03:00
Sergey Oblomov
6f0a7a2005
OSC/UCX: opal progress register/unregister optimization
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-19 12:07:26 +03:00
Sergey Oblomov
55b934bacf
OSC/UCX: enable progress when at least one window is allocated
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 17:52:30 +03:00
Sergey Oblomov
a081fba046
OSC/UCX: fixed hang on OSC init
...
- worker progress was missing on startup, which caused a hang
on one of the ranks
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 17:01:53 +03:00
Xin Zhao
74ef51af1b
OMPI/OSC/UCX: move memory hooks init in osc to win creation.
...
Move memory hooks init (for request based operation) in osc ucx to window
creation time, to avoid performance issue in MPI initialization.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-07-12 15:03:02 -07:00
Yossi Itigin
e77e31b50b
Merge pull request #5378 from hoopoepg/topic/unify-ucx-logging
...
MCA/COMMON/UCX: unified logging across all UCX modules
2018-07-08 12:45:26 +03:00
Sergey Oblomov
eb7010933d
OSC/UCX: suppressed compilation warnings
...
- suppressed signed/unsigned-compare warnings
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-06 10:58:09 +03:00
Sergey Oblomov
bef47b792c
MCA/COMMON/UCX: unified logging across all UCX modules
...
- added common logging infrastructure for all
UCX modules
- all UCX modules are switched to new infra
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
502d04bf12
UCX/PML/SPML: fixed few coverity issues
...
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for worker flush failure
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Yossi Itigin
ee873f4f79
Merge pull request #5322 from hoopoepg/topic/mca-ucx-common
...
MCA/UCX: added common module
2018-06-26 13:54:12 +03:00
Sergey Oblomov
63e7ba6843
MCA/COMMON/UCX: added parameter for UCX/opal progress
...
- added parameter to set UCX/opal progresses
- minor refactoring of request wait routines
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 11:00:12 +03:00
Yossi Itigin
c2fbf3a3e8
osc_ucx: register progress on-demand
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-06-19 12:47:08 +03:00
Joshua Ladd
32ddc6af7e
Merge pull request #5094 from xinzhao3/topic/osc-win-fix-master
...
OMPI/OSC/UCX: fix issue in impl of MPI_Win_create_dynamic/MPI_Win_attach/MPI_Win_detach
2018-05-02 17:42:34 -04:00
Xin Zhao
3f5ac97649
OMPI/OSC/UCX: set priority to 0.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-05-02 21:40:06 +03:00
Xin Zhao
53bdfd1dcb
OMPI/OSC/UCX: fix issue in impl of MPI_Win_create_dynamic/MPI_Win_attach/MPI_Win_detach
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-04-24 23:09:52 +03:00
Xin Zhao
2aa5292dbf
Add UCX component for ompi/mca/osc for MPI one-sided communication.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-07-19 19:45:40 +03:00