Sergey Oblomov
6fe0a73861
PML/UCX: fixed ucp request free on persistent request completion
...
- in sine cases persistent request was deleted during completion
callback, this cause double free of linked UCX request (assert
in debug build or hang in release build)
- UCX request is freed prior completion calback
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-20 19:32:20 +03:00
Yossi Itigin
29812494f2
Merge pull request #5402 from hoopoepg/topic/common-del-procs
...
MCA/COMMON/UCX: del_procs calls are unified to common module
2018-07-19 11:19:45 +03:00
Sergey Oblomov
920cc2e0d9
MCA/COMMON/UCX: del_procs calls are unified to common module
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 07:37:25 +03:00
Sergey Oblomov
1c7ae22dfb
MCA/COMMON/UCX: shift opal memhooks into common UCX
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-17 13:46:38 +03:00
Sergey Oblomov
240670152e
MCA/COMMON/UCX: code beautify - alignment
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-06 19:40:58 +03:00
Sergey Oblomov
bef47b792c
MCA/COMMON/UCX: unified logging across all UCX modules
...
- added common logging infrastructure for all
UCX modules
- all UCX modules are switched to new infra
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
8080283b3d
MCA/COMMON/UCX: changed return type for wait_request
...
- for now wait_request returns OMPI status
- updated callers
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 23:29:38 +03:00
Sergey Oblomov
c2bd6af9f2
MCA/COMMON/UCX: minor unification of del_proces calls
...
- some common functionality of del_procs calls is moved into
mca_common module
- blocking ucp_put call is replaced by non-blocking routine
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:10:53 +03:00
Sergey Oblomov
074f30ba27
PML/UCX: suppressed compilation warning
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-27 12:05:07 +03:00
Sergey Oblomov
502d04bf12
UCX/PML/SPML: fixed few coverity issues
...
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for woker flush failure
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Yossi Itigin
ee873f4f79
Merge pull request #5322 from hoopoepg/topic/mca-ucx-common
...
MCA/UCX: added common module
2018-06-26 13:54:12 +03:00
Sergey Oblomov
d57ae62dee
MCA/UCX: added common module
...
- implemented non-blocking routines for flush operations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-22 16:41:09 +03:00
Gilles Gouaillardet
edd02b7144
pml/ucx: silence a warning
...
declare 'fenced' volatile in order to silence CID 1437465
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-22 13:11:42 +09:00
Sergey Oblomov
5f03628560
PML/UCX: removed uneeded flush
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 12:40:46 +03:00
Sergey Oblomov
2745da7dcc
PML/UCX: use non-blocking fence instead of async progress
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 09:46:03 +03:00
Sergey Oblomov
10f2d831ec
PML/UCX: fixed hang on MPI_Finalize
...
- added async UCX progress thread to allow
pending requests to complete
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-20 16:12:05 +03:00
Yossi Itigin
564f80d362
pml_ucx: add option to use opal memhooks instead of ucx internal hooks
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-06-17 15:30:44 +03:00
Sergey Oblomov
0a8261f3b0
PML/UCX: fixed hand on MPI_Finalize
...
fixes issue https://github.com/openucx/ucx/issues/2656
added flush for worker object to complete all pending operations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-05 17:22:03 +03:00
Sergey Oblomov
5ec26914a6
PML/UCX: do not set offset on ordered data recv
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-21 19:40:07 +03:00
Sergey Oblomov
19607daa32
PML/UCX: create convertor clone instead of stack reset
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 16:39:13 +03:00
Sergey Oblomov
7c5de01c57
PML/UCX: reset converter stack on unordered messages
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 13:11:02 +03:00
Yossi Itigin
385f38ab4e
ucx: improve error messages during connection establishment
...
Also, unite common code calling ucp_ep_create()
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-04-30 15:45:05 +03:00
Sergey Oblomov
7a5811d0a8
request/state: update state for canceled request
...
- fixed issue in set state for canceled request
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-01-29 18:26:20 +02:00
Yossi Itigin
f2851fd502
Merge pull request #4724 from alex-mikheev/topic/ucx_as_default
...
ompi/oshmem: ucx is selected over yalla/ikrit by default
2018-01-17 17:41:49 +02:00
Alex Mikheev
640e945b9c
ompi: pml/ucx: blocking send using ucp_tag_send_nbr
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:54:18 +02:00
Alex Mikheev
ae326546f4
ompi/oshmem: ucx is selected over yalla/ikrit by default
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:08:04 +02:00
Alex Mikheev
e7bf0617cf
ompi: pml ucx: improve recv latency
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-12-26 16:24:16 +02:00
Yossi Itigin
14a93a5992
pml_ucx: fix tag/context_id layout and upper bounds.
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-08-27 17:15:48 +03:00
Joshua Hursey
e1d079544b
mca: Dynamic components link against project lib
...
* Resolves #3705
* Components should link against the project level library to better
support `dlopen` with `RTLD_LOCAL`.
* Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
with the appropriate project level library:
```
MCA components in ompi/
$(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
$(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
$(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
$(top_builddir)/oshmem/liboshmem.la"
```
Note: The changes in this commit were automated by the script in
the commit that proceeds it with the `libadd_mca_comp_update.py`
script. Some components were not included in this change because
they are statically built only.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
Joshua Ladd
c27beea3a1
Merge pull request #3962 from karasevb/ucx_detect
...
configure: detect UCX support by default
2017-08-03 16:33:57 -04:00
Boris Karasev
d917d54ddc
configure: detect UCX support by default
...
Adds detecting UCX from following paths: "/usr /usr/local /opt/ucx"
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-25 23:48:49 +03:00
KAWASHIMA Takahiro
0cbdbe32f7
ompi/request: Support non-PML persistent requests
...
This commit adds the `req_start` member to the `ompi_request_t` struct.
The `MPI_START` and `MPI_STARTALL` routines call this callback function
instead of `MCA_PML_CALL(start(...))`. So components that return
persistent request must set this member to their request objects.
`mca_pml_base_module_t::pml_start` is not deleted because
`MCA_PML_CALL(start(...))` is still used elsewhere across OMPI.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-02 13:08:17 +09:00
Alina Sklarevich
49913c692a
PML UCX: unite the code for all the sending modes.
...
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-04-26 13:17:06 +03:00
Alina Sklarevich
d93b67257b
PML UCX: handle a synchronous send.
...
MCA_PML_BASE_SEND_SYNCHRONOUS
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-04-13 18:11:55 +03:00
Alina Sklarevich
eec310c99c
PML/UCX/YALLA: Fix the message release call.
...
Set message to MPI_MESSAGE_NULL.
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-04-13 14:41:13 +03:00
Xin Zhao
ee952fcccd
Passing estimated_num_procs to UCX init in PML and SPML.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-03-27 20:36:52 +03:00
Xin Zhao
6a99c60fbd
Add multithreading support in PML UCX framework.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-03-20 19:55:00 +02:00
Alex Mikheev
c081239f88
ompi: pml ucx: fix persistant request init CR changes
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-03-08 13:26:29 +02:00
Alex Mikheev
c113c37a7a
ompi: pml ucx: fix persistant request initialization
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-03-08 10:59:41 +02:00
Artem Polyakov
9448814c40
ompi/pml/ucx: Fix uninitialized UCX request field.
...
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-03-05 03:06:30 +07:00
Alex Mikheev
152f77df59
ompi: pml ucx: fix datatype packing error in bsend
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-03-01 16:18:19 +02:00
Yossi
fb67c966a8
Merge pull request #2944 from alex-mikheev/topic/pml_ucx_bsend
...
ompi: pml ucx: add support for the buffered send
2017-02-22 12:21:03 +02:00
Alex Mikheev
b015c8bb48
ompi: pml ucx: add support for the buffered send
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-02-21 17:19:22 +02:00
Gilles Gouaillardet
4184c01be5
Merge pull request #2393 from bosilca/topic/no_predefined_ddt_refcount
...
Don't refcount the predefined datatypes.
2017-02-21 09:38:11 +09:00
George Bosilca
c2cd717f82
Don't refcount the predefined datatypes.
...
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-01-11 16:48:59 -05:00
Xin Zhao
2d77912c19
Revert "PML/SPML/UCX: add UCX MT support to PML and SPML."
...
This reverts commit 0ecf3c951c
.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2016-12-19 18:57:48 +02:00
Xin Zhao
0ecf3c951c
PML/SPML/UCX: add UCX MT support to PML and SPML.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2016-12-15 23:59:15 +02:00
Alina Sklarevich
e9d2d029c6
PML/SPML/UCX: Adapt to the API changes in the UCX lib.
...
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2016-12-08 11:33:29 +02:00
Ralph Castain
1e2019ce2a
Revert "Update to sync with OMPI master and cleanup to build"
...
This reverts commit cb55c88a8b
.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b
Update to sync with OMPI master and cleanup to build
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00