621 Commits

Author SHA1 Message Date
Sergey Oblomov
421a7fd47d SPML/UCX: fixed a few compilation warnings
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-20 14:40:24 +03:00
Gilles Gouaillardet
5c14f8439a oshmem: remove useless reference to orte header
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-05-17 15:38:31 +09:00
Yossi Itigin
f7086746e9 OSHMEM/MMAP/SYSV: Return ERR_NOT_IMPLEMENTED if segment hint != 0
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2019-05-15 16:12:19 +03:00
Mikhail Brinskii
d81dc533f6 SPML/UCX: Fix coverity error
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-05-14 22:34:01 +03:00
Sergey Oblomov
4df8c1b3e3 OSHMEM/free: suppressed coverity issue
- removed dead code

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-13 11:59:02 +03:00
Yossi Itigin
94b5e91194 OSHMEM: Add support for shmemx_malloc_with_hint()
- added multiple segments processing
- added shmemx_malloc_with_hint call + set of hints

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-10 20:04:57 +03:00
Mikhail Brinskii
d4843b1651 SPML/UCS: CR comments p2
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-30 16:49:11 +03:00
Mikhail Brinskii
c4c99457db SPML/UCX: CR comments p1
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-30 16:26:45 +03:00
Mikhail Brinskii
2ef5bd8b36 SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either using atomic
operations such as shmem_atomic_fetch or can use point-to-point synchronization
routines such as shmem_wait_until and shmem_test.

Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-26 14:47:58 +03:00
Valentin Petrov
2fa93332ce Fixes the O(N^2) loop in the mca_scoll_mpi_comm_query
The new proc group is created from the "world_group" based on the
ranks mapping, which can be taken directly from proc_name->vpid.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2019-04-08 17:49:00 +03:00
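
The idea in commit 2fa93332ce can be illustrated with a minimal sketch. It assumes, as that message states, that a process's rank in the world group can be read directly from its vpid. The `demo_proc_t` type and both function names are hypothetical and are not taken from the Open MPI sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a process descriptor; only the vpid is modeled. */
typedef struct {
    uint32_t vpid;
} demo_proc_t;

/* Quadratic variant: for every group member, scan the world list for a match.
 * Returns 0 on success, -1 if a member is missing from the world list. */
int fill_ranks_by_search(const demo_proc_t *group, size_t group_size,
                         const demo_proc_t *world, size_t world_size,
                         int *ranks)
{
    for (size_t i = 0; i < group_size; i++) {
        size_t j;
        for (j = 0; j < world_size; j++) {
            if (world[j].vpid == group[i].vpid) {
                ranks[i] = (int)j;
                break;
            }
        }
        if (j == world_size) {
            return -1;
        }
    }
    return 0;
}

/* Linear variant: assumes world rank == vpid, so no search is needed. */
void fill_ranks_by_vpid(const demo_proc_t *group, size_t group_size, int *ranks)
{
    for (size_t i = 0; i < group_size; i++) {
        ranks[i] = (int)group[i].vpid;
    }
}
```

Both variants produce the same rank array when the world list is ordered by vpid; the second does so in a single pass over the group.
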
Ben Menadue
063596b828 Add missing #include to oshmem/shmem/c/shmem_context.c.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2019-04-03 15:58:13 +11:00
Geoff Paulsen
44b3aa244b
Merge pull request #6510 from sam6258/int4_cswap_fix
shmem/fortran: Fix invalid datatype size in call to atomic cswap
2019-03-25 11:49:00 -05:00
Xin Zhao
9c3d00b144 ompi/oshmem/spml/ucx: use lockfree array to optimize spml_ucx_progress/delete oshmem_barrier in shmem_ctx_destroy
ompi/oshmem/spml/ucx: optimize spml ucx progress

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e0414006b0 ompi/oshmem/spml/ucx:delete oob path of getting rkeys in spml ucx
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e1c1ab0202 ompi/oshmem/spml/ucx: defer clean up shmem_ctx to shmem_finalize
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:37 +02:00
Scott Miller
6b294e0641 shmem/fortran: Fix invalid datatype size in call to atomic cswap
Signed-off-by: Scott Miller <scott.miller1@ibm.com>
2019-03-20 21:57:08 -04:00
Xin Zhao
48033ac1f4 ompi/oshmem: add spml_context back to sshmem_type in memheap, to keep track of ucx_ctx_default's rkeys
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:48:21 +02:00
Xin Zhao
9a06000962 ompi/oshmem/spml/ucx: let shmem_finalize clean up any ctx left
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:48:07 +02:00
Xin Zhao
289595e45d OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:47:50 +02:00
Xin Zhao
79ba752667 ompi/oshmem/spml/ucx: fix eps destroy in shmem_ctx_destroy().
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:47:38 +02:00
Xin Zhao
b00209e1f5 Revert "OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx."
This reverts commit f1b095c784de6d1908fa40dcf76e733110cbeaf2.

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:46:56 +02:00
Sergey Oblomov
d8e3562bae PML/SPML/UCX: added evaluation of mmap events
- a set of UCX-related issues was reported that were caused
  by mmap API hook conflicts. We added diagnostics for such
  problems to simplify the bug-resolving pipeline

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-12 21:14:27 +02:00
Joshua Ladd
e57e18f6cc
Merge pull request #6290 from xinzhao3/topic/oshmem_mkeys
OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
2019-02-25 13:09:44 -05:00
Xin Zhao
f1b095c784 OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-25 16:19:08 +02:00
Gilles Gouaillardet
e0e924c4ed oshmem/wrappers: only install ORTE based wrappers if ORTE is built
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-02-20 13:55:55 +09:00
Gilles Gouaillardet
10cb9f6f9e oshmem: remove unnecessary dependencies to ORTE
either use OPAL or OMPI layers, since ORTE layer
is not present when PMIx RTE is used

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-02-20 13:55:55 +09:00
Jeff Squyres
3e82449dbe sshmem/verbs: So long / farewell / it's time to say goodnight
So long sshmem/verbs!  After many years of (mostly) faithful service,
it is time to remove the sshmem verbs component.  It has been fully
replaced by other components, such as the UCX PML and OFI MTL.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-02-07 05:34:19 -08:00
Sergey Oblomov
0759bb8561 COLL: removed FCA component
- removed FCA collectives from coll/scoll

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-01-09 16:51:40 +02:00
Yossi Itigin
939162ed33 oshmem/scoll: fix shmem_collect32/64 for zero-size length
Fixes scoll_basic failures with shmem_verifier, caused by recent changes
in handling of zero-size collectives.

- Check for zero-size length only for fixed size collect (shmem_fcollect),
  but not for variable-size collect (shmem_collect)
- Add 'nlong_type' parameter to internal broadcast function, to indicate
  whether the 'nlong' parameter is valid on non-root PEs, since it's
  used by shmem_collect algorithm. Before this change, some components
  assumed it's true (scoll_mpi) while others assumed it's false
  (scoll_basic).
- In scoll_basic, if nlong_type==false, do not exit if nlong==0, since
  this parameter may not be the same on all PEs.
- In scoll_mpi, fallback to scoll_basic if nlong_type==false, since MPI
  requires the 'count' argument of MPI_Bcast to be valid on all ranks.

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2019-01-01 20:43:32 +02:00
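
The decision rules listed in commit 939162ed33 can be sketched as two small predicates. This is only an illustration of the logic described in the message: the function names and the `demo_backend_t` enum are hypothetical, and the real broadcast functions take many more parameters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: a broadcast may return early on a zero length only
 * when nlong is known to be valid on every PE (nlong_type == true). With
 * nlong_type == false the value may differ between PEs, so a PE that sees
 * zero must still take part in the operation. */
bool demo_bcast_can_skip(size_t nlong, bool nlong_type)
{
    return nlong_type && nlong == 0;
}

typedef enum { DEMO_USE_MPI, DEMO_USE_BASIC } demo_backend_t;

/* Hypothetical helper: an MPI-backed broadcast needs a valid count on all
 * ranks, so fall back to the basic algorithm when that is not guaranteed. */
demo_backend_t demo_pick_backend(bool nlong_type)
{
    return nlong_type ? DEMO_USE_MPI : DEMO_USE_BASIC;
}
```
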
Sergey Oblomov
cfa9150934 OSHMEM: added missing API for get/put operations
- added calls for datatypes int/uint/8/16/32/size/ptrdiff
  for shmem_g/get/iget/get_nbi/_p/put/iput/put_nbi

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-12-17 11:34:03 +02:00
Gilles Gouaillardet
5ea939aa54 oshmem: fix macro usage in pshmem.h
pshmem.h now includes shmem.h (since open-mpi/ompi@f46130cd20) and some macros were removed at that time.

Use the OSHMEM_HAVE_C11 macro (defined in shmem.h) instead of the
OSHMEMP_HAVE_C11 macro previously defined in pshmem.h

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-12-12 14:03:16 +09:00
Yossi Itigin
c28ba954b9
Merge pull request #6132 from bertwesarg/pshmem-includes-shmem
OSHMEM: Let `pshmem.h` include `shmem.h` to be stand-alone again
2018-12-10 18:01:40 +02:00
Yossi Itigin
09e13ad7b0 sshmem_ucx: add owner.txt
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-12-02 11:25:30 +02:00
Yossi Itigin
83cca9d52a ucx: add owner.txt for components
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-12-01 17:14:03 +02:00
Bert Wesarg
f46130cd20 OSHMEM: Let pshmem.h include shmem.h to be stand-alone again
See #6093

Signed-off-by: Bert Wesarg <bert.wesarg@tu-dresden.de>
2018-11-28 21:24:05 +01:00
Yossi Itigin
39daf7a436
Merge pull request #6104 from hoopoepg/topic/oshmem-zero-len-coll
OSHMEM: added processing of zero-length collectives
2018-11-27 13:28:41 +02:00
Jeff Squyres
dbe064af97
Merge pull request #5653 from bmwiedemann/userhost
Allow to override build user and host
2018-11-26 17:48:37 -05:00
Sergey Oblomov
c93927e27a OSHMEM/COLL/BCAST: removed unnecessary bcast call
- removed unnecessary bcast call on zero-length request

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-26 16:53:48 +02:00
Sergey Oblomov
ff2fd0679e OSHMEM/COLL: optimization on zero-length ops
- removed barrier call on zero-length operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-26 10:41:33 +02:00
Sergey Oblomov
9de128afaf OSHMEM: added processing of zero-length collectives
- according to spec 1.4, annex C, shmem collectives should process
  calls where the number of elements is zero independently of the pointer
  value
- added zero-count processing: it just calls barrier to
  sync ranks

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-23 14:34:46 +02:00
Sergey Oblomov
4c071da565 OSHMEM/AMO: added int/uint/32/64 atomics calls
- added int/uint/32/64 atomics calls
- added SHMEM_SYNC_SIZE macro

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-22 10:18:16 +02:00
Yossi Itigin
241b424bd3
Merge pull request #6000 from hoopoepg/topic/added-missing-amo-datatypes
OSHMEM/AMO: added missing C11 macro datatypes
2018-11-01 15:29:56 +02:00
Sergey Oblomov
6e78102089 OSHMEM/AMO: code beautify
- added <cr> to split API groups to simplify human processing

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-11-01 11:33:34 +02:00
Yossi Itigin
bbe5da483a
Merge pull request #5985 from hoopoepg/topic/fixed-oshmem-profile-build
OSHMEM/PROFILE: fixed oshmem profile build
2018-10-31 14:23:46 +02:00
Sergey Oblomov
f63d6da6d7 OSHMEM/AMO: added missing C11 macro datatypes
- added signed datatypes for atomic_add calls
- added unsigned datatypes for atomic put/inc/get/fetch calls
- fixed incorrect SHMEM_CTX_DEFAULT macro, added
  external declaration of oshmem_ctx_default variable

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-30 23:15:26 +02:00
Sergey Oblomov
4a3e83780c OSHMEM/PROFILE: fixed profile build
- added missing file to profile makefile
- constants SHMEM_CTX_* are shifted into public header

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-29 23:50:26 +02:00
Yossi Itigin
6754bf1465 SCOLL/BASIC: Fix invalid pSync pointer passed to barrier func
mca_scoll_basic_alltoall() passed (pSync + 1) to barrier function, but
the value of _SHMEM_ALLTOALL_SYNC_SIZE is 1, which made the barrier
function use an invalid memory location. In particular, this location
was not initialized to _SHMEM_SYNC_VALUE, which broke the barrier
algorithm and it did not complete: One PE could read 0 from its peer and
assume the peer already started the barrier, and then write 1 to the
peer. Then, the peer entered the barrier and overwrote the 1 with 0, and
then it waited forever to see '1' in its pSync.

Found with shmem_verifier test suite.

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-29 12:21:34 +02:00
Bernhard M. Wiedemann
bc23993dea Allow to override build user and host
using the standard $USER and $HOSTNAME environment variables
to make reproducible builds possible.
See https://reproducible-builds.org/ for why this is good.

This helps improve issue #3759

Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
2018-10-20 09:27:00 -04:00
Brian Barrett
e9e4d2a4bc Handle asprintf errors with opal_asprintf wrapper
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error.  However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined.  Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-10-08 16:43:53 -07:00
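
The contract described in commit e9e4d2a4bc (the output pointer is always NULL on error) can be sketched in portable C. This is not the opal_asprintf() source: `safe_asprintf` is a hypothetical name, and the sketch builds the string with two `vsnprintf` calls instead of wrapping the platform's `asprintf`, so that it does not depend on any non-standard function.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical asprintf-like helper. On success it stores a malloc'ed
 * string in *ptr and returns its length; on any failure it returns -1
 * and *ptr is guaranteed to be NULL. */
int safe_asprintf(char **ptr, const char *fmt, ...)
{
    va_list ap;
    va_list ap_copy;
    char *buf;
    int len;

    *ptr = NULL;                             /* NULL on every failure path */

    va_start(ap, fmt);
    va_copy(ap_copy, ap);
    len = vsnprintf(NULL, 0, fmt, ap_copy);  /* first pass: measure only */
    va_end(ap_copy);
    if (len < 0) {
        va_end(ap);
        return -1;
    }

    buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        va_end(ap);
        return -1;
    }

    len = vsnprintf(buf, (size_t)len + 1, fmt, ap);  /* second pass: format */
    va_end(ap);
    if (len < 0) {
        free(buf);
        return -1;
    }

    *ptr = buf;
    return len;
}
```

A caller can then test either the return value or the pointer, and passing the pointer to `free()` after a failure is harmless because it is NULL.
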
Yossi Itigin
41011502c8 shmem/lock: progress communications while waiting for shmem_lock
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-09-16 18:46:36 +03:00