1
1

243 Коммитов

Автор SHA1 Сообщение Дата
Xin Zhao
bb7d360621 opal/common/ucx: add refcnt in tlocal_ctx_tbl entry to keep track of usage
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
101036651b opal/common/ucx: Fix the bug in wpool's periodical flush
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
bcb52ecade opal/common/ucx: add winfo ptr into req
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
33517428a1 opal/common/ucx: add periodical flush and counter to opal directory.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
1fa7054041 opal/common/ucx: use trylock in opal_common_progress
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
2d3cffe1a3 opal/common/ucx: replace opal_mutex_t with opal_recursive_mutex_t
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
aa26a724ed opal/common/ucx: introduce internal UCX request in wpool.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
07cb4134be opal/common/ucx: Set of bug fixes in wpool
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-19 14:22:07 -08:00
Xin Zhao
344bb641a1 opal/common/ucx: Minor changes in wpool
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
9fb9cfbe8e opal/common/ucx: Simplify Worker Pool TLS structure
Get rid of unneeded context and memory region identifiers

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:07 -08:00
Artem Polyakov
1e7bf7085d opal/common/ucx: Improve/fix debug output macro's
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
fd98ee14eb opal/common/ucx: Code cleanup
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
f38c9f3e5f opal/common/ucx: Simplify Worker Pool memory handler
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
6b7acdf21f opal/common/ucx: Somplify Worker Pool context management
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Xin Zhao
8b7fa927ba opal/common/ucx: Add fetch primitives to wpool
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Xin Zhao
bfbf818fe1 opal/common/ucx: Complete initialization of the Worker Pool
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Artem Polyakov
e28fadb048 opal/common/ucx: Introduce Worker Pool (wpool) functionality
Worker Pool is an object containing/managing a set of UCX workers
and providing access to those workers through a smal interface
to allow Multi-Threaded applicatoins to access multiple HW contexts.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2019-02-19 14:22:06 -08:00
Jeff Squyres
dd20174532 Remove opal/mca/common/ofi.
It never lived up to its purpose (and has caused amorphous indirect
errors such as https://github.com/open-mpi/ompi/issues/2519), so
delete it.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-02-07 06:29:58 -08:00
Jeff Squyres
3f4af8e51c opal/common: remove stale common components
The verbs and verbs_usnic components are now no longer necessary / no
longer used anywhere in the code base.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-02-07 05:36:06 -08:00
Yossi Itigin
83cca9d52a ucx: add owner.txt for components
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-12-01 17:14:03 +02:00
Sergey Oblomov
1099d5f023 COMMON/UCX: added error code to log output
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-21 11:37:25 +03:00
Sergey Oblomov
df765595e3 COMMON/UCX: suppressed coverity warnings
- suppressed coverity warnings - added log messages on failed calls

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-17 16:11:03 +03:00
Brian Barrett
e9e4d2a4bc Handle asprintf errors with opal_asprintf wrapper
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error.  However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined.  Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
Gilles Gouaillardet
db65dbd9a8 ucx: use the c99 __func__ macro instead
__FUNCTION__ macro was never standardized and should not be used.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-09-25 11:19:18 +09:00
Nathan Hjelm
000f9eed4d opal: add types for atomic variables
This commit updates the entire codebase to use specific opal types for
all atomic variables. This is a change from the prior atomic support
which required the use of the volatile keyword. This is the first step
towards implementing support for C11 atomics as that interface
requires the use of types declared with the _Atomic keyword.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:48:55 -06:00
Jeff Squyres
05e5f61fe1 common/verbs-usnic: check that it will actually compile
If someone specifies --with-verbs-usnic, actually do a configury check
to ensure that it will compile (vs. assuming that it will compile if
someone asks for it).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-30 13:10:49 -07:00
Yossi Itigin
68206a5635
Merge pull request #5569 from hoopoepg/topic/optimize-blocked-calls
PML/UCX: blocked calls optimizations
2018-08-29 14:19:09 +03:00
Yossi Itigin
4bb6845888
Merge pull request #5570 from hoopoepg/topic/opal-mem-hooks-syno
MCA/COMMON/UCX: added synonym to opal_mem_hook variable
2018-08-29 14:16:33 +03:00
Sergey Oblomov
2cd9e04166 PML/UCX: optimization of mprobe call - renamed vars
- renamed of internal variable names
- used unsigned datatypes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:39 +03:00
Sergey Oblomov
38e908f83e PML/UCX: optimization of mprobe call
- refactoring of opal/UCX progress calls

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:38 +03:00
Sergey Oblomov
b0f87f2235 PML/UCX: blocked calls optimizations
- added UCX progress priority

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:38 +03:00
Sergey Oblomov
b72dd83f05 MCA/COMMON/UCX: added synonims for common ucx variables
- added synonims for atomic/osc modules

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-26 18:25:21 +03:00
Jeff Squyres
fe0852bcb4 Miscellaneous compiler warning stomps.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-24 07:39:14 -07:00
Sergey Oblomov
6a7f66d9c2 MCA/COMMON/UCX: renamed synonim to opal_mem_hook variable
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-22 14:12:33 +03:00
Sergey Oblomov
e00f7a68ba MCA/COMMON/UCX: added synonim to opal_mem_hook variable
- added synonim to opal_mem_hook variable to allow
  to print it in opal_info -a

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-21 15:05:12 +03:00
Boris Karasev
57683366ca pmix: added check for pmix fence status
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-08-06 15:01:57 +06:00
Yossi Itigin
29812494f2
Merge pull request #5402 from hoopoepg/topic/common-del-procs
MCA/COMMON/UCX: del_procs calls are unified to common module
2018-07-19 11:19:45 +03:00
Sergey Oblomov
920cc2e0d9 MCA/COMMON/UCX: del_procs calls are unified to common module
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 07:37:25 +03:00
Sergey Oblomov
a4b8253fa2 MCA/COMMON/UCX: fixed initialization of malloc hooks
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-17 20:09:50 +03:00
Sergey Oblomov
1c7ae22dfb MCA/COMMON/UCX: shift opal memhooks into common UCX
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-17 13:46:38 +03:00
Sergey Oblomov
bef47b792c MCA/COMMON/UCX: unified logging across all UCX modules
- added common logging infrastructure for all
  UCX modules
- all UCX modules are switched to new infra

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
8080283b3d MCA/COMMON/UCX: changed return type for wait_request
- for now wait_request returns OMPI status
- updated callers

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 23:29:38 +03:00
Sergey Oblomov
f574c14e3a ATOMICS/UCX: redefine atomic module API
- now it accepts integer values directily instead of
  pointers

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 14:41:45 +03:00
Sergey Oblomov
c2bd6af9f2 MCA/COMMON/UCX: minor unification of del_proces calls
- some common functionality of del_procs calls is moved into
  mca_common module
- blocking ucp_put call is replaced by non-blocking routine

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:10:53 +03:00
Sergey Oblomov
624d59604b MCA/COMMON/UCX: minor optimization of build scripts
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-28 12:58:07 +03:00
Sergey Oblomov
de8568c822 MCA/COMMON/UCX: enabled fallback into older UCX API
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-27 19:59:40 +03:00
Sergey Oblomov
1223b05811 MCA/COMMON/UCX: fixed build scripts
- updated evaluation of UCX lib - used call from UCX v1.3
- updated makefile compilation flags

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-27 11:10:25 +03:00
Sergey Oblomov
bf7fd480e9 MCA/COMMON/UCX: added non-blocking implementations of atomics
- added implementation of swap/cswap/fadd operations
- blocking add64 is replaced by non-blocking routine

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 12:25:31 +03:00
Sergey Oblomov
63e7ba6843 MCA/COMMON/UCX: added parameter for UCX/opal progress
- added parameter to set UCX/opal progresses
- minor refactoring of request wait routines

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 11:00:12 +03:00
Sergey Oblomov
d57ae62dee MCA/UCX: added common module
- implemented non-blocking routines for flush operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-22 16:41:09 +03:00