Sergey Oblomov
a0a9306066
COMMON/UCX: init memhooks infra on external hooks only
...
- initialize the memory hooks infrastructure only if
external memory hooks are requested
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 20:13:16 +03:00
Mikhail Brinskii
2ef5bd8b36
SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
...
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either with atomic
operations such as shmem_atomic_fetch or with point-to-point synchronization
routines such as shmem_wait_until and shmem_test.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-26 14:47:58 +03:00
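The completion pattern described in the commit above (start the transfer, return immediately, then observe a counter that is updated once the data has landed) can be sketched as follows. This is a single-process model under stated assumptions, not the OpenSHMEM or SPML/UCX API: nb_xfer_t, nb_xfer_start, nb_progress and nb_wait_until_ge are hypothetical names, and a plain memcpy stands in for the network transfer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor of one pending transfer (illustration only). */
typedef struct {
    void        *dest;
    const void  *src;
    size_t       size;
    atomic_long *counter;  /* incremented once the data is in dest */
    int          pending;
} nb_xfer_t;

/* Record the transfer and return immediately; no data moves yet. */
static void nb_xfer_start(nb_xfer_t *x, void *dest, const void *src,
                          size_t size, atomic_long *counter)
{
    assert(x != NULL && counter != NULL);
    x->dest = dest;
    x->src = src;
    x->size = size;
    x->counter = counter;
    x->pending = 1;
}

/* Drive the transfer: copy the data first, then update the counter. */
static void nb_progress(nb_xfer_t *x)
{
    if (!x->pending) {
        return;
    }
    memcpy(x->dest, x->src, x->size);
    x->pending = 0;
    atomic_fetch_add(x->counter, 1);
}

/* Poll the counter, making progress until it reaches the given value. */
static void nb_wait_until_ge(nb_xfer_t *x, atomic_long *counter, long value)
{
    while (atomic_load(counter) < value) {
        nb_progress(x);
    }
}

/* Returns 1 if the buffers behave as described, 0 otherwise. */
static int demo_counter_completion(void)
{
    char src[8] = "payload";
    char dst[8] = {0};
    atomic_long counter;
    nb_xfer_t x;

    atomic_init(&counter, 0);
    nb_xfer_start(&x, dst, src, sizeof(src), &counter);

    /* Right after the start call nothing has completed yet. */
    int untouched = (dst[0] == '\0' && atomic_load(&counter) == 0);

    nb_wait_until_ge(&x, &counter, 1);

    return untouched && strcmp(dst, "payload") == 0 &&
           atomic_load(&counter) == 1;
}
```

In this model the destination buffer is only safe to read after the counter has been observed at the expected value, which mirrors the rule that the buffers are reusable only after completion.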
Xin Zhao
9c3d00b144
ompi/oshmem/spml/ucx: use lockfree array to optimize spml_ucx_progress/delete oshmem_barrier in shmem_ctx_destroy
...
ompi/oshmem/spml/ucx: optimize spml ucx progress
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e1c1ab0202
ompi/oshmem/spml/ucx: defer clean up shmem_ctx to shmem_finalize
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:37 +02:00
Sergey Oblomov
c319cf9ade
COMMON/UCX: rewording of hooks suggestion
...
- also updated output macro
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-14 11:00:57 +02:00
Sergey Oblomov
d8e3562bae
PML/SPML/UCX: added evaluation of mmap events
...
- a number of reported UCX-related issues were caused by
conflicts between mmap API hooks. We added diagnostics for such
problems to simplify bug resolution
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-12 21:14:27 +02:00
Artem Polyakov
91d6115d99
opal/common/ucx: Adjust the thresholds for periodical flushes
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
3aadc2b5e1
opal/common/ucx: Fix periodical flush in the worker pool
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
84dfe1277c
opal/common/ucx: Rename wpool recv_worker to dflt_worker
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
8a990c2b64
opal/common/ucx: Add comments clarifying data structures
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
19e2ae2efb
opal/common/ucx: Switch to opal/tsd
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
7984d7d997
opal/common/ucx: Remove unused debugging macro
...
Will be reintroduced later if needed and after adaptation to the OMPI
infrastructure.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
43f16d8796
opal/common/ucx: Remove common_ucx_int.h
...
Place the content of common_ucx_int.h back to the common_ucx.h and
include common_ucx_wpool.h explicitly.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
bb7d360621
opal/common/ucx: add refcnt in tlocal_ctx_tbl entry to keep track of usage
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
101036651b
opal/common/ucx: Fix the bug in wpool's periodical flush
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
bcb52ecade
opal/common/ucx: add winfo ptr into req
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
33517428a1
opal/common/ucx: add periodical flush and counter to opal directory.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
1fa7054041
opal/common/ucx: use trylock in opal_common_progress
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
2d3cffe1a3
opal/common/ucx: replace opal_mutex_t with opal_recursive_mutex_t
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
aa26a724ed
opal/common/ucx: introduce internal UCX request in wpool.
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
07cb4134be
opal/common/ucx: Set of bug fixes in wpool
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
344bb641a1
opal/common/ucx: Minor changes in wpool
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
9fb9cfbe8e
opal/common/ucx: Simplify Worker Pool TLS structure
...
Get rid of unneeded context and memory region identifiers
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
1e7bf7085d
opal/common/ucx: Improve/fix debug output macros
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
fd98ee14eb
opal/common/ucx: Code cleanup
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
f38c9f3e5f
opal/common/ucx: Simplify Worker Pool memory handler
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
6b7acdf21f
opal/common/ucx: Simplify Worker Pool context management
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Xin Zhao
8b7fa927ba
opal/common/ucx: Add fetch primitives to wpool
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Xin Zhao
bfbf818fe1
opal/common/ucx: Complete initialization of the Worker Pool
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
e28fadb048
opal/common/ucx: Introduce Worker Pool (wpool) functionality
...
Worker Pool is an object containing/managing a set of UCX workers
and providing access to those workers through a small interface
to allow multi-threaded applications to access multiple HW contexts.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
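The idea in the commit above (a pool that owns several workers and hands them out through a small interface) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: worker_t is a stand-in for a UCX worker, and wpool_t, wpool_init and wpool_get_worker are hypothetical names, not the opal/common/ucx interface. The sketch caches one worker per thread in thread-local storage and supports a single pool only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Stand-in for a UCX worker (illustration only). */
typedef struct {
    int id;
} worker_t;

/* Hypothetical pool: a fixed set of workers plus an assignment cursor. */
typedef struct {
    worker_t    workers[POOL_SIZE];
    atomic_uint next;   /* index of the next worker to hand out */
} wpool_t;

/* Each thread remembers the worker it was given. */
static _Thread_local worker_t *tls_worker = NULL;

static void wpool_init(wpool_t *p)
{
    assert(p != NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        p->workers[i].id = i;
    }
    atomic_init(&p->next, 0);
}

/* The first call from a thread assigns a worker round-robin;
 * later calls from the same thread return the cached worker. */
static worker_t *wpool_get_worker(wpool_t *p)
{
    if (tls_worker == NULL) {
        unsigned idx = atomic_fetch_add(&p->next, 1) % POOL_SIZE;
        tls_worker = &p->workers[idx];
    }
    return tls_worker;
}

/* Returns 1 if repeated lookups from one thread yield the same worker. */
static int demo_wpool(void)
{
    static wpool_t pool;
    wpool_init(&pool);

    worker_t *a = wpool_get_worker(&pool);
    worker_t *b = wpool_get_worker(&pool);

    return a == b && a->id == 0;
}
```

With more threads than POOL_SIZE, this sketch lets several threads share a worker, so a real pool would also need per-worker locking.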
Yossi Itigin
83cca9d52a
ucx: add owner.txt for components
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-12-01 17:14:03 +02:00
Sergey Oblomov
1099d5f023
COMMON/UCX: added error code to log output
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-21 11:37:25 +03:00
Sergey Oblomov
df765595e3
COMMON/UCX: suppressed coverity warnings
...
- suppressed Coverity warnings
- added log messages on failed calls
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-17 16:11:03 +03:00
Gilles Gouaillardet
db65dbd9a8
ucx: use the c99 __func__ macro instead
...
__FUNCTION__ macro was never standardized and should not be used.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-09-25 11:19:18 +09:00
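As a small illustration of the change above: C99 defines __func__ as a predefined identifier holding the name of the enclosing function, so a logging macro can rely on it portably. The macro and function names below are hypothetical and are not taken from the UCX components.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical logging helper: prefixes the message with the name of
 * the function in which the macro is expanded, using C99 __func__. */
#define LOG_TO(buf, len, msg) \
    snprintf((buf), (len), "%s(): %s", __func__, (msg))

/* Returns 1 if the formatted line carries this function's own name. */
static int demo_func_identifier(void)
{
    char line[64];
    int n = LOG_TO(line, sizeof(line), "request failed");

    assert(n > 0);
    return strcmp(line, "demo_func_identifier(): request failed") == 0;
}
```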
Yossi Itigin
68206a5635
Merge pull request #5569 from hoopoepg/topic/optimize-blocked-calls
...
PML/UCX: blocked calls optimizations
2018-08-29 14:19:09 +03:00
Yossi Itigin
4bb6845888
Merge pull request #5570 from hoopoepg/topic/opal-mem-hooks-syno
...
MCA/COMMON/UCX: added synonym to opal_mem_hook variable
2018-08-29 14:16:33 +03:00
Sergey Oblomov
2cd9e04166
PML/UCX: optimization of mprobe call - renamed vars
...
- renamed internal variables
- used unsigned datatypes
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:39 +03:00
Sergey Oblomov
38e908f83e
PML/UCX: optimization of mprobe call
...
- refactoring of opal/UCX progress calls
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:38 +03:00
Sergey Oblomov
b0f87f2235
PML/UCX: blocked calls optimizations
...
- added UCX progress priority
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:38 +03:00
Sergey Oblomov
b72dd83f05
MCA/COMMON/UCX: added synonyms for common ucx variables
...
- added synonyms for atomic/osc modules
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-26 18:25:21 +03:00
Jeff Squyres
fe0852bcb4
Miscellaneous compiler warning stomps.
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-24 07:39:14 -07:00
Sergey Oblomov
6a7f66d9c2
MCA/COMMON/UCX: renamed synonym to opal_mem_hook variable
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-22 14:12:33 +03:00
Sergey Oblomov
e00f7a68ba
MCA/COMMON/UCX: added synonym to opal_mem_hook variable
...
- added a synonym to the opal_mem_hook variable so that
it is printed by opal_info -a
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-21 15:05:12 +03:00
Boris Karasev
57683366ca
pmix: added check for pmix fence status
...
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-08-06 15:01:57 +06:00
Yossi Itigin
29812494f2
Merge pull request #5402 from hoopoepg/topic/common-del-procs
...
MCA/COMMON/UCX: del_procs calls are unified to common module
2018-07-19 11:19:45 +03:00
Sergey Oblomov
920cc2e0d9
MCA/COMMON/UCX: del_procs calls are unified to common module
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 07:37:25 +03:00
Sergey Oblomov
a4b8253fa2
MCA/COMMON/UCX: fixed initialization of malloc hooks
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-17 20:09:50 +03:00
Sergey Oblomov
1c7ae22dfb
MCA/COMMON/UCX: shift opal memhooks into common UCX
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-17 13:46:38 +03:00
Sergey Oblomov
bef47b792c
MCA/COMMON/UCX: unified logging across all UCX modules
...
- added common logging infrastructure for all
UCX modules
- all UCX modules are switched to new infra
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
8080283b3d
MCA/COMMON/UCX: changed return type for wait_request
...
- wait_request now returns an OMPI status
- updated callers
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 23:29:38 +03:00