1
1
Граф коммитов

66 Коммитов

Автор SHA1 Сообщение Дата
Sergey Oblomov
6fe0a73861 PML/UCX: fixed ucp request free on persistent request completion
- in sine cases persistent request was deleted during completion
  callback, this cause double free of linked UCX request (assert
  in debug build or hang in release build)
- UCX request is freed prior completion calback

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-20 19:32:20 +03:00
Yossi Itigin
29812494f2
Merge pull request #5402 from hoopoepg/topic/common-del-procs
MCA/COMMON/UCX: del_procs calls are unified to common module
2018-07-19 11:19:45 +03:00
Sergey Oblomov
920cc2e0d9 MCA/COMMON/UCX: del_procs calls are unified to common module
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 07:37:25 +03:00
Sergey Oblomov
1c7ae22dfb MCA/COMMON/UCX: shift opal memhooks into common UCX
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-17 13:46:38 +03:00
Sergey Oblomov
240670152e MCA/COMMON/UCX: code beautify - alignment
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-06 19:40:58 +03:00
Sergey Oblomov
bef47b792c MCA/COMMON/UCX: unified logging across all UCX modules
- added common logging infrastructure for all
  UCX modules
- all UCX modules are switched to new infra

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
8080283b3d MCA/COMMON/UCX: changed return type for wait_request
- for now wait_request returns OMPI status
- updated callers

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 23:29:38 +03:00
Sergey Oblomov
c2bd6af9f2 MCA/COMMON/UCX: minor unification of del_proces calls
- some common functionality of del_procs calls is moved into
  mca_common module
- blocking ucp_put call is replaced by non-blocking routine

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:10:53 +03:00
Sergey Oblomov
074f30ba27 PML/UCX: suppressed compilation warning
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-27 12:05:07 +03:00
Sergey Oblomov
502d04bf12 UCX/PML/SPML: fixed few coverity issues
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for woker flush failure

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Yossi Itigin
ee873f4f79
Merge pull request #5322 from hoopoepg/topic/mca-ucx-common
MCA/UCX: added common module
2018-06-26 13:54:12 +03:00
Sergey Oblomov
d57ae62dee MCA/UCX: added common module
- implemented non-blocking routines for flush operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-22 16:41:09 +03:00
Gilles Gouaillardet
edd02b7144 pml/ucx: silence a warning
declare 'fenced' volatile in order to silence CID 1437465

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-22 13:11:42 +09:00
Sergey Oblomov
5f03628560 PML/UCX: removed uneeded flush
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 12:40:46 +03:00
Sergey Oblomov
2745da7dcc PML/UCX: use non-blocking fence instead of async progress
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 09:46:03 +03:00
Sergey Oblomov
10f2d831ec PML/UCX: fixed hang on MPI_Finalize
- added async UCX progress thread to allow
  pending requests to complete

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-20 16:12:05 +03:00
Yossi Itigin
564f80d362 pml_ucx: add option to use opal memhooks instead of ucx internal hooks
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-06-17 15:30:44 +03:00
Sergey Oblomov
0a8261f3b0 PML/UCX: fixed hand on MPI_Finalize
fixes issue https://github.com/openucx/ucx/issues/2656

added flush for worker object to complete all pending operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-05 17:22:03 +03:00
Sergey Oblomov
5ec26914a6 PML/UCX: do not set offset on ordered data recv
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-21 19:40:07 +03:00
Sergey Oblomov
19607daa32 PML/UCX: create convertor clone instead of stack reset
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 16:39:13 +03:00
Sergey Oblomov
7c5de01c57 PML/UCX: reset converter stack on unordered messages
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 13:11:02 +03:00
Yossi Itigin
385f38ab4e ucx: improve error messages during connection establishment
Also, unite common code calling ucp_ep_create()

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-04-30 15:45:05 +03:00
Sergey Oblomov
7a5811d0a8 request/state: update state for canceled request
- fixed issue in set state for canceled request

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-01-29 18:26:20 +02:00
Yossi Itigin
f2851fd502
Merge pull request #4724 from alex-mikheev/topic/ucx_as_default
ompi/oshmem: ucx is selected over yalla/ikrit by default
2018-01-17 17:41:49 +02:00
Alex Mikheev
640e945b9c ompi: pml/ucx: blocking send using ucp_tag_send_nbr
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:54:18 +02:00
Alex Mikheev
ae326546f4
ompi/oshmem: ucx is selected over yalla/ikrit by default
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:08:04 +02:00
Alex Mikheev
e7bf0617cf
ompi: pml ucx: improve recv latency
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-12-26 16:24:16 +02:00
Yossi Itigin
14a93a5992 pml_ucx: fix tag/context_id layout and upper bounds.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-08-27 17:15:48 +03:00
Joshua Hursey
e1d079544b mca: Dynamic components link against project lib
* Resolves #3705
 * Components should link against the project level library to better
   support `dlopen` with `RTLD_LOCAL`.
 * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
   with the appropriate project level library:
```
MCA components in ompi/
       $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
       $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
       $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
       $(top_builddir)/oshmem/liboshmem.la"
```

Note: The changes in this commit were automated by the script in
the commit that proceeds it with the `libadd_mca_comp_update.py`
script. Some components were not included in this change because
they are statically built only.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
Joshua Ladd
c27beea3a1 Merge pull request #3962 from karasevb/ucx_detect
configure: detect UCX support by default
2017-08-03 16:33:57 -04:00
Boris Karasev
d917d54ddc configure: detect UCX support by default
Adds detecting UCX from following paths: "/usr /usr/local /opt/ucx"

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-25 23:48:49 +03:00
KAWASHIMA Takahiro
0cbdbe32f7 ompi/request: Support non-PML persistent requests
This commit adds the `req_start` member to the `ompi_request_t` struct.
The `MPI_START` and `MPI_STARTALL` routines call this callback function
instead of `MCA_PML_CALL(start(...))`. So components that return
persistent request must set this member to their request objects.

`mca_pml_base_module_t::pml_start` is not deleted because
`MCA_PML_CALL(start(...))` is still used elsewhere across OMPI.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-02 13:08:17 +09:00
Alina Sklarevich
49913c692a PML UCX: unite the code for all the sending modes.
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-04-26 13:17:06 +03:00
Alina Sklarevich
d93b67257b PML UCX: handle a synchronous send.
MCA_PML_BASE_SEND_SYNCHRONOUS

Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-04-13 18:11:55 +03:00
Alina Sklarevich
eec310c99c PML/UCX/YALLA: Fix the message release call.
Set message to MPI_MESSAGE_NULL.

Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-04-13 14:41:13 +03:00
Xin Zhao
ee952fcccd Passing estimated_num_procs to UCX init in PML and SPML.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-03-27 20:36:52 +03:00
Xin Zhao
6a99c60fbd Add multithreading support in PML UCX framework.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-03-20 19:55:00 +02:00
Alex Mikheev
c081239f88
ompi: pml ucx: fix persistant request init CR changes
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-03-08 13:26:29 +02:00
Alex Mikheev
c113c37a7a
ompi: pml ucx: fix persistant request initialization
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-03-08 10:59:41 +02:00
Artem Polyakov
9448814c40 ompi/pml/ucx: Fix uninitialized UCX request field.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-03-05 03:06:30 +07:00
Alex Mikheev
152f77df59
ompi: pml ucx: fix datatype packing error in bsend
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-03-01 16:18:19 +02:00
Yossi
fb67c966a8 Merge pull request #2944 from alex-mikheev/topic/pml_ucx_bsend
ompi: pml ucx: add support for the buffered send
2017-02-22 12:21:03 +02:00
Alex Mikheev
b015c8bb48 ompi: pml ucx: add support for the buffered send
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-02-21 17:19:22 +02:00
Gilles Gouaillardet
4184c01be5 Merge pull request #2393 from bosilca/topic/no_predefined_ddt_refcount
Don't refcount the predefined datatypes.
2017-02-21 09:38:11 +09:00
George Bosilca
c2cd717f82 Don't refcount the predefined datatypes.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-01-11 16:48:59 -05:00
Xin Zhao
2d77912c19 Revert "PML/SPML/UCX: add UCX MT support to PML and SPML."
This reverts commit 0ecf3c951c.

Signed-off-by: Xin Zhao <xinz@mellanox.com>
2016-12-19 18:57:48 +02:00
Xin Zhao
0ecf3c951c PML/SPML/UCX: add UCX MT support to PML and SPML.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2016-12-15 23:59:15 +02:00
Alina Sklarevich
e9d2d029c6 PML/SPML/UCX: Adapt to the API changes in the UCX lib.
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2016-12-08 11:33:29 +02:00
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00