1
1
Граф коммитов

336 Коммитов

Автор SHA1 Сообщение Дата
Sergey Oblomov
5838760a3a OSHMEM/COLL/BCAST: removed unnecessary bcast call
- removed unnecessary bcast call on zero-length request

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit c93927e27a)
2018-11-27 14:26:56 +02:00
Sergey Oblomov
0a064d8c8d OSHMEM/COLL: optimization on zero-length ops
- removed barrier call on zero-length operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit ff2fd0679e)
2018-11-27 14:26:52 +02:00
Sergey Oblomov
dea9cf6b63 OSHMEM: added processing of zero-length collectives
- according spec 1.4, annex C shmem collectives should process
  calls where number of elements is zero independently from pointer
  value
- added zero-count processing - it just call barrier to
  sync ranks

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 9de128afaf)
2018-11-27 14:26:44 +02:00
Yossi Itigin
8a329a797c SCOLL/BASIC: Fix invalid pSync pointer passed to barrier func
mca_scoll_basic_alltoall() passed (pSync + 1) to barrier function, but
the value of _SHMEM_ALLTOALL_SYNC_SIZE is 1, which made the barrier
function use an invalid memory location. In particular, this location
was not initialized to _SHMEM_SYNC_VALUE, which broke the barrier
algorithm and it did not complete: One PE could read 0 from its peer and
assume the peer already started the barrier, and then write 1 to the
peer. Then, the peer entered the barrier and overwrote the 1 with 0, and
then it waited forever to see '1' in its pSync.

Found with shmem_verifier test suite.

(picked from master 6754bf1)

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-31 12:22:19 +02:00
Sergey Oblomov
3cace87749 MCA/COMMON/UCX: del_procs calls are unified to common module
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 920cc2e0d9)
2018-09-19 10:47:27 +03:00
Sergey Oblomov
028bcb8a73 MCA/COMMON/UCX: added synonim to opal_mem_hook variable
- added synonim to common ucx variables to allow
  to print it in opal_info -a

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit e00f7a68ba)
2018-08-29 15:17:00 +03:00
Boris Karasev
8873d901e8 pmix: added check for pmix fence status
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
(cherry picked from commit 57683366ca)

Conflicts:
	opal/mca/common/ucx/common_ucx.c
	opal/mca/common/ucx/common_ucx.h

Modified:
	ompi/mca/pml/ucx/pml_ucx.c
	oshmem/mca/spml/ucx/spml_ucx.c
2018-08-17 21:33:50 +06:00
Sergey Oblomov
b64502977a PML/SPML/UCX: init global objects using C99 style
- to avoid value mix used C99 style of object initializations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 2806504290)
2018-07-28 16:47:43 +03:00
Sergey Oblomov
58b7786b70 MCA/ATOMIC: atomic_init renamed to atomic_startup
- there is C11 naming conflict - atomic_init is C macro
  which cause building issue

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 3295b23800)
2018-07-24 17:23:42 +03:00
Xin Zhao
c429900cd9 OMPI/OSHMEM: add new functionality of OpenSHMEM v1.4.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-07-16 12:55:25 -07:00
Yossi Itigin
9d0b3a42aa
Merge pull request #5423 from hoopoepg/topic/bitwise-atomics-renaming
ATOMICS: renamed atomic calls to unsigned datatypes
2018-07-16 19:08:02 +03:00
Sergey Oblomov
bd84165277 ATOMICS: renamed atomic calls to unsigned datatypes
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-13 15:32:16 +03:00
Sergey Oblomov
d51426ff0a ATOMIC/MXM: fixed abstraction violation
- applied workaround for incorrect dynamic module dependency

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-13 14:30:12 +03:00
Sergey Oblomov
0212410187 OSHMEM/ATOMICS: fixed comments errors
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-10 10:13:40 +03:00
Sergey Oblomov
64212a9ff1 OSHMEM/ATOMICS: added C implementation of and/or/xor ops
- added implementation and/or/xor operations for post and
  fetch-op notations
- implemented basic and UCX transports, mxm added
  NON-IMPLEMENTED wrapper
- updated C interfaces only (fortran will be added later)
- existing API is not updated to spec v1.4

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-09 16:18:47 +03:00
Sergey Oblomov
240670152e MCA/COMMON/UCX: code beautify - alignment
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-06 19:40:58 +03:00
Sergey Oblomov
bef47b792c MCA/COMMON/UCX: unified logging across all UCX modules
- added common logging infrastructure for all
  UCX modules
- all UCX modules are switched to new infra

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
8080283b3d MCA/COMMON/UCX: changed return type for wait_request
- for now wait_request returns OMPI status
- updated callers

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 23:29:38 +03:00
Sergey Oblomov
586c16be70 ATOMIC/UCX: renamed ATOMIC_PTR_2_INT to OSHMEM_ATOMIC_PTR_2_INT
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 15:07:39 +03:00
Sergey Oblomov
54d97efa84 ATOMIC/UCX: fixed atomic swap/cswap issues
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 14:41:48 +03:00
Sergey Oblomov
5eb8c99cd7 ATOMIC/UCX: optimization for cswap
- used uint64_t output datatype to avoid branches in
  implementations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 14:41:46 +03:00
Sergey Oblomov
f574c14e3a ATOMICS/UCX: redefine atomic module API
- now it accepts integer values directily instead of
  pointers

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 14:41:45 +03:00
Sergey Oblomov
a0ea368464 ATOMIC/UCX: minor optimization and code cleaning
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 14:41:43 +03:00
Sergey Oblomov
eb0abfcf92 SHMEM/ATOMIC: refactoring of module API
- removed atomic-basic-specific operand from module API
- added own calls for add and swap operations
- minor optimization for UCX atomics

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 14:40:14 +03:00
Sergey Oblomov
13331ba4d8 MCA/COMMON/UCX: code beautify + build fix
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 16:37:03 +03:00
Sergey Oblomov
8a793bb279 MCA/COMMON/UCX: fixed build issues
- fixed fuild issues when used older UCX
- added non-blocking call of ucp_put call

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:58:08 +03:00
Sergey Oblomov
c2bd6af9f2 MCA/COMMON/UCX: minor unification of del_proces calls
- some common functionality of del_procs calls is moved into
  mca_common module
- blocking ucp_put call is replaced by non-blocking routine

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:10:53 +03:00
Sergey Oblomov
952fa8ade7 PML/UCX: method mca_spml_ucx_get_mkey_slow is renamed to get_mkey_slow
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-01 20:44:19 +03:00
Sergey Oblomov
c55db78e93 SPML/UCX: get mkey call refactoring
- method mca_spml_ucx_get_mkey_slow is moved into .c module,
  added pointer to this method into mca_spml_ucx_t structure

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-01 18:58:14 +03:00
Sergey Oblomov
910e08f5ef MCA/ATOMICS/UCX: workaround for abstraction violation
- some spml calls are marked as inline to exclude cross-module
  dependency
- updated get-key call to get link to local module

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-29 16:32:04 +03:00
Sergey Oblomov
502d04bf12 UCX/PML/SPML: fixed few coverity issues
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for woker flush failure

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Sergey Oblomov
63e7ba6843 MCA/COMMON/UCX: added parameter for UCX/opal progress
- added parameter to set UCX/opal progresses
- minor refactoring of request wait routines

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 11:00:12 +03:00
Sergey Oblomov
d57ae62dee MCA/UCX: added common module
- implemented non-blocking routines for flush operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-22 16:41:09 +03:00
Sergey Oblomov
319bb376f9 MCA/UCX: branch optimization in cswap call
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-30 07:14:43 +03:00
Sergey Oblomov
daad71f036 MCA/UCX: switch/case optimization
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 22:33:33 +03:00
Sergey Oblomov
6be4066e23 MCA/UCX: cswap call if updated to non-blocking API
- minor fixes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 20:07:51 +03:00
Sergey Oblomov
b668e19cd1 Merge remote-tracking branch 'wg/master' into topic/amo-non-blocking-ucp
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 19:08:48 +03:00
Yossi Itigin
976cd5e307
Merge pull request #5186 from hoopoepg/topic/ucx-amo-error-msg
MCA/UCX: fixed error messages for incorrect msg size
2018-05-29 15:54:02 +03:00
Sergey Oblomov
0c3ed93ef0 MCA/UCX: added opal progress call to wait request routine
- added opal_progress call to wait function to avoid
  possible [dead]lock issues
- wait call is declared as inline
- minor fixes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 11:34:40 +03:00
Yossi Itigin
705c8a7b9b
Merge pull request #5198 from brminich/shmem_fence
OSHMEM/SMPL/UCX: Add real fence support
2018-05-27 11:25:42 +03:00
Mikhail Brinskii
8e9d401938 OSHMEM/SMPL/UCX: Add real fence support
+ Add quiet method to SPML, so it can have different implementation with
fence.
+ Use ucp_worker_fence for spml_fence method of UCX SPML

Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2018-05-25 22:43:06 +03:00
Brian Barrett
9fff40647d oshmem: disable if no spmls build
This patch disables the oshmem layer if there are no SPMLs that
will build.  With the limited set of SPMLs available to support
oshmem, many builds end up installing an oshmem library that we
know will not work.  There has been a bit of customer confusion
over oshmem, hopefully this will lead customers in the right
direction.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-05-25 08:48:50 -07:00
Sergey Oblomov
bbaffd3681 MCA/UCX: atomic add/swap are moved to new UCX atomic API
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-22 22:23:31 +03:00
Sergey Oblomov
4495da5cb9 MCA/UCX: fixed error messages for incorrect msg size
- supported 4 or 8 bytes only

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-22 19:53:23 +03:00
Yossi Itigin
385f38ab4e ucx: improve error messages during connection establishment
Also, unite common code calling ucp_ep_create()

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-04-30 15:45:05 +03:00
Joshua Ladd
15d5e2937a
Merge pull request #4996 from xinzhao3/topic/shmem-cswap
ompi/oshmem: fix cswap bug in mca/atomic/mxm.
2018-04-04 08:28:57 -04:00
Xin Zhao
a5b72cc2e4 ompi/oshmem: fix cswap bug in mca/atomic/mxm.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-03-30 03:17:01 -05:00
Xin Zhao
af32c305de ompi/oshmem: fix bug in shmem_alltoall in mca/scoll/basic.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-03-29 14:54:36 -05:00
Boris Karasev
3796307a57 timings: added new timing points
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-21 05:16:25 +02:00
Alex Mikheev
cca67a69ea oshmem: scoll: fixes strided alltoall
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-02-19 09:41:21 +02:00