Sergey Oblomov
66e18563bf
SPML/UCX: fixed hang in SHMEM_FINALIZE
...
- used MPI _Barrier to synchronize processes
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 182023febb6f8f31ce34dc54c8aa409ad7e44fa2)
2019-08-22 11:41:52 +03:00
Howard Pritchard
a42977f1c2
Merge pull request #6707 from hoopoepg/topic/alloc-with-hint-realloc-inplace-v4.0
...
ALLOC_WITH_HINT: added inplace realloc - v4.0
2019-06-06 10:16:57 -07:00
Sergey Oblomov
69923e78c7
SPML/UCX: added synchronized flush on quiet
...
- added synchronized flush operation on quiet call.
- flush is implemented using get operation
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 0b108411f89727a68cd622f3b04c783efa359b8e)
2019-05-30 18:08:33 +03:00
Sergey Oblomov
f75d46faa9
ALLOC_WITH_HINT: added implace realloc
...
- in some cases realloc operation may be completed without
allocation of new buffer (and without additional data copy)
- added logic to reallocate buffer inplace if possible
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 277c2a9e5c7711098be826e6c154253747fdad9a)
2019-05-27 11:44:18 +03:00
Geoff Paulsen
c22326e59a
Merge pull request #6652 from yosefe/topic/alloc-with-hint-impl-master-v4.0.x
...
OSHMEM: Add support for shmemx_malloc_with_hint() - v4.0.x
2019-05-17 15:48:35 -05:00
Mikhail Brinskii
ff9ecc183f
SPML/UCX: Fix coverity error
...
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit d81dc533f6f7ebba0b00c1652190975f0aca9e06)
2019-05-15 14:20:05 +03:00
Yossi Itigin
fc41c16134
OSHMEM: Add support for shmemx_malloc_with_hint()
...
- added multiple segments processing
- added shmemx_malloc_with_hint call + set of hints
(picked from master 94b5e91)
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2019-05-12 11:42:59 +03:00
Mikhail Brinskii
6861a68de6
SPML/UCS: CR comments p2
...
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit d4843b165172a9cffbdc967ec92ec1d60d053903)
2019-05-02 21:27:15 +03:00
Mikhail Brinskii
1c56f49a44
SPML/UCX: CR comments p1
...
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit c4c99457db6577f85b5b1c4c6849b5a04c226a84)
2019-05-02 21:26:55 +03:00
Mikhail Brinskii
e4ee56d1f3
SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
...
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either using atomic
operations such as shmem_atomic_fetch or can use point-to-point synchronization
routines such as shmem_wait_until and shmem_test.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 2ef5bd8b3671f1e10caf00d06d66d120eac9c5be)
2019-05-02 21:25:59 +03:00
Xin Zhao
69a80fce9f
ompi/oshmem/spml/ucx: use lockfree array to optimize spml_ucx_progress/delete oshmem_barrier in shmem_ctx_destroy
...
ompi/oshmem/spml/ucx: optimize spml ucx progress
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
(cherry picked from commit 9c3d00b144641d2929f830279dcc9d163c38e9e1)
2019-03-21 23:59:58 +02:00
Xin Zhao
580b584179
ompi/oshmem/spml/ucx:delete oob path of getting rkeys in spml ucx
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
(cherry picked from commit e0414006b0c0a8e9918a4cf8ac4bb819b977ec91)
2019-03-21 23:59:46 +02:00
Xin Zhao
596997c194
ompi/oshmem/spml/ucx: defer clean up shmem_ctx to shmem_finalize
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
(cherry picked from commit e1c1ab020227fc18d145379ab29ea86a3cdb66b1)
2019-03-21 23:58:23 +02:00
Xin Zhao
ce54b63b90
ompi/oshmem: add spml_context back to sshmem_type in memheap, to keep track of ucx_ctx_default's rkeys
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
(cherry picked from commit 48033ac1f43159c053241b65e74a39777e5e31e4)
2019-03-20 23:30:21 +02:00
Xin Zhao
91793484ed
OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
(cherry picked from commit 289595e45dc3ebfe5ae1a9dc6f347b1b2d569c4a)
2019-03-20 23:29:53 +02:00
Xin Zhao
f666d75322
ompi/oshmem/spml/ucx: fix eps destroy in shmem_ctx_destroy().
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
(cherry picked from commit 79ba7526677bd1641239bb77559a2999c8cd3a4a)
2019-03-20 23:29:38 +02:00
Sergey Oblomov
14c271f993
PML/SPML/UCX: added evaluation of mmap events
...
- there was a set of UCX related issues reported which caused
by mmap API hooks conflicts. We added diagnostic of such
problems to simplify bug-resolving pipeline
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit d8e3562bae700d84873c1d5ca9c45c846d7387ed)
2019-03-14 16:48:25 +02:00
Sergey Oblomov
3cace87749
MCA/COMMON/UCX: del_procs calls are unified to common module
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 920cc2e0d9994dfd49062822c89cb502274eb464)
2018-09-19 10:47:27 +03:00
Boris Karasev
8873d901e8
pmix: added check for pmix fence status
...
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
(cherry picked from commit 57683366ca300fe353e91c52dc9aa0f657120d4d)
Conflicts:
opal/mca/common/ucx/common_ucx.c
opal/mca/common/ucx/common_ucx.h
Modified:
ompi/mca/pml/ucx/pml_ucx.c
oshmem/mca/spml/ucx/spml_ucx.c
2018-08-17 21:33:50 +06:00
Sergey Oblomov
b64502977a
PML/SPML/UCX: init global objects using C99 style
...
- to avoid value mix used C99 style of object initializations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 2806504290ff2fdd10f254f878e92cae6e90c854)
2018-07-28 16:47:43 +03:00
Xin Zhao
c429900cd9
OMPI/OSHMEM: add new functionality of OpenSHMEM v1.4.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-07-16 12:55:25 -07:00
Sergey Oblomov
240670152e
MCA/COMMON/UCX: code beautify - alignment
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-06 19:40:58 +03:00
Sergey Oblomov
bef47b792c
MCA/COMMON/UCX: unified logging across all UCX modules
...
- added common logging infrastructure for all
UCX modules
- all UCX modules are switched to new infra
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
8080283b3d
MCA/COMMON/UCX: changed return type for wait_request
...
- for now wait_request returns OMPI status
- updated callers
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 23:29:38 +03:00
Sergey Oblomov
13331ba4d8
MCA/COMMON/UCX: code beautify + build fix
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 16:37:03 +03:00
Sergey Oblomov
8a793bb279
MCA/COMMON/UCX: fixed build issues
...
- fixed fuild issues when used older UCX
- added non-blocking call of ucp_put call
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:58:08 +03:00
Sergey Oblomov
c2bd6af9f2
MCA/COMMON/UCX: minor unification of del_proces calls
...
- some common functionality of del_procs calls is moved into
mca_common module
- blocking ucp_put call is replaced by non-blocking routine
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:10:53 +03:00
Sergey Oblomov
952fa8ade7
PML/UCX: method mca_spml_ucx_get_mkey_slow is renamed to get_mkey_slow
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-01 20:44:19 +03:00
Sergey Oblomov
c55db78e93
SPML/UCX: get mkey call refactoring
...
- method mca_spml_ucx_get_mkey_slow is moved into .c module,
added pointer to this method into mca_spml_ucx_t structure
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-01 18:58:14 +03:00
Sergey Oblomov
910e08f5ef
MCA/ATOMICS/UCX: workaround for abstraction violation
...
- some spml calls are marked as inline to exclude cross-module
dependency
- updated get-key call to get link to local module
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-29 16:32:04 +03:00
Sergey Oblomov
502d04bf12
UCX/PML/SPML: fixed few coverity issues
...
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for woker flush failure
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Sergey Oblomov
63e7ba6843
MCA/COMMON/UCX: added parameter for UCX/opal progress
...
- added parameter to set UCX/opal progresses
- minor refactoring of request wait routines
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 11:00:12 +03:00
Sergey Oblomov
d57ae62dee
MCA/UCX: added common module
...
- implemented non-blocking routines for flush operations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-22 16:41:09 +03:00
Mikhail Brinskii
8e9d401938
OSHMEM/SMPL/UCX: Add real fence support
...
+ Add quiet method to SPML, so it can have different implementation with
fence.
+ Use ucp_worker_fence for spml_fence method of UCX SPML
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2018-05-25 22:43:06 +03:00
Yossi Itigin
385f38ab4e
ucx: improve error messages during connection establishment
...
Also, unite common code calling ucp_ep_create()
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-04-30 15:45:05 +03:00
Gilles Gouaillardet
88e26c63e0
spml/ucx: fix a double free() issue
...
in mca_spml_ucx_add_procs() error path
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-22 13:42:16 +09:00
Yossi Itigin
1193e1eb83
spml_ucx: fix rkey leak
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-12-26 20:47:26 +02:00
Yossi Itigin
3081576124
spml_ucx: fix typo in shmem_quiet() error message.
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-09-24 19:20:55 +03:00
Alex Mikheev
1b5df76f8b
oshmem: shmem_ptr() implementation
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-08-03 13:56:34 +03:00
Alex Mikheev
c63137e1c0
oshmem: sshmem ucx: minor code cleanup
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-02-22 17:48:00 +02:00
Alex Mikheev
132fbd9ae9
oshmem: sshmem: add UCX allocator
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-02-22 17:48:00 +02:00
Alex Mikheev
ea3ea4835b
oshmem: mem use hook: apply code review fixes
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
(cherry picked from commit a422154a141f0be5b92d2b6c26d7b2b4176dfe18)
2017-01-30 11:30:20 +02:00
Alex Mikheev
9da9e6260d
oshmem: spml ucx: on error print ucx error string
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-01-29 10:28:24 +02:00
Alex Mikheev
986ca000f8
oshmem: spml: add memory allocation hook
...
The hook is called from memheap when memory range
is going to be allocated by smalloc(), realloc() and others.
ucx spml uses this hook to call ucp_mem_advise in order to speedup
non blocking memory mapping.
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-01-26 16:41:39 +02:00
Alina Sklarevich
e9d2d029c6
PML/SPML/UCX: Adapt to the API changes in the UCX lib.
...
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2016-12-08 11:33:29 +02:00
Yossi Itigin
0241a2697d
spml_ucx: allow registering the heap in non-blocking mode.
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2016-11-25 15:09:22 +02:00
Ralph Castain
1e2019ce2a
Revert "Update to sync with OMPI master and cleanup to build"
...
This reverts commit cb55c88a8b7817d5891ff06a447ea190b0e77479.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b
Update to sync with OMPI master and cleanup to build
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
Alex Mikheev
ff5095e533
OSHMEM: adds support for mkey caching by spml
...
It improves cpu cache hit ratio.
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2016-11-06 11:56:43 +02:00
Pavel Shamis (Pasha)
92b0ebd7c3
For UCX it is legal to return UCS_INPROGRESS (1) code for non-blocking function
...
calls, which means that the operation was successfully started but not
immediately completed. This is a "good" return code that should not be handled
as an error.
Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>
2016-11-03 15:36:13 -05:00