Mikhail Brinskii
751d88192d
PML/UCX: Use net worker address for remote peers
...
For peers on remote nodes, pack a smaller worker address that contains
network device addresses only. This reduces the amount of OOB traffic
during startup.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-02-14 18:06:36 +02:00
Yossi Itigin
83cca9d52a
ucx: add owner.txt for components
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-12-01 17:14:03 +02:00
Yossi Itigin
f36eeef4c5
pml_ucx: initialize req_mpi_object.comm for error handler
...
without this fix, an error handler invoked on pml_ucx request would
segfault while trying to dereference requests[i]->req_mpi_object.comm
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-11-25 19:37:54 +02:00
Sergey Oblomov
1099d5f023
COMMON/UCX: added error code to log output
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-10-21 11:37:25 +03:00
Yossi Itigin
b71e85b8d5
pml_ucx: fix return code from mca_pml_ucx_init() error flow
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-11 18:48:54 +03:00
Yossi Itigin
40ac9e4771
pml_ucx: fix return code from mca_pml_ucx_init()
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-10 14:41:05 +03:00
Yossi Itigin
4763822a64
pml_ucx: add ompi datatype attribute to release ucp_datatype
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-10-09 17:34:34 +03:00
Yossi Itigin
68206a5635
Merge pull request #5569 from hoopoepg/topic/optimize-blocked-calls
...
PML/UCX: blocked calls optimizations
2018-08-29 14:19:09 +03:00
Yossi Itigin
4bb6845888
Merge pull request #5570 from hoopoepg/topic/opal-mem-hooks-syno
...
MCA/COMMON/UCX: added synonym to opal_mem_hook variable
2018-08-29 14:16:33 +03:00
Sergey Oblomov
c201c0abb3
PML/UCX: blocked calls optimizations: removed reset progress count
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:39 +03:00
Sergey Oblomov
2cd9e04166
PML/UCX: optimization of mprobe call - renamed vars
...
- renamed internal variables
- used unsigned datatypes
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:39 +03:00
Sergey Oblomov
38e908f83e
PML/UCX: optimization of mprobe call
...
- refactoring of opal/UCX progress calls
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:38 +03:00
Sergey Oblomov
b0f87f2235
PML/UCX: blocked calls optimizations
...
- added UCX progress priority
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-27 09:50:38 +03:00
Jeff Squyres
fe0852bcb4
Miscellaneous compiler warning stomps.
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-08-24 07:39:14 -07:00
Sergey Oblomov
e00f7a68ba
MCA/COMMON/UCX: added synonym to opal_mem_hook variable
...
- added synonym to opal_mem_hook variable so that
it is printed by opal_info -a
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-08-21 15:05:12 +03:00
Sergey Oblomov
d204b8a678
PML/SPML/UCX/COMPONENT: applied C99 initialization
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-28 09:44:03 +03:00
Sergey Oblomov
2806504290
PML/SPML/UCX: init global objects using C99 style
...
- used C99-style object initialization to avoid mixing up values
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-25 14:52:45 +03:00
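The motivation stated in the commit above (avoiding mixed-up values) can be illustrated with a minimal sketch. The `demo_component_t` type and its fields are hypothetical and are not taken from the PML/SPML UCX sources; only the C99 designated-initializer syntax itself is standard.

```c
#include <stddef.h>

/* Hypothetical component descriptor; field names are illustrative only. */
typedef struct demo_component {
    const char *name;
    int         priority;
    int         num_disconnect;
    int       (*init)(void);
} demo_component_t;

static int demo_init(void)
{
    return 0;
}

/* Positional form: each value is matched to a field by declaration order,
 * so reordering or inserting a field silently reassigns the values. */
static demo_component_t demo_positional = { "demo", 51, 1, demo_init };

/* C99 designated form: each value names its field, the order is free,
 * and any field left out (num_disconnect here) is zero-initialized. */
static demo_component_t demo_designated = {
    .init     = demo_init,
    .name     = "demo",
    .priority = 51,
};
```

With the designated form, adding a field in the middle of the struct does not change which value lands in `priority` or `init`.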
Sergey Oblomov
6fe0a73861
PML/UCX: fixed ucp request free on persistent request completion
...
- in some cases the persistent request was deleted during the completion
callback, which caused a double free of the linked UCX request (assert
in debug build or hang in release build)
- the UCX request is now freed prior to the completion callback
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-20 19:32:20 +03:00
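The ordering described in the commit above can be sketched as follows. This is a simplified model under stated assumptions, not the pml_ucx code: `demo_preq_t`, `demo_release_linked`, and the other `demo_*` names are hypothetical, and a plain `malloc` buffer stands in for the linked UCX request.

```c
#include <stdlib.h>

/* Hypothetical persistent request; linked_req stands in for the UCX request. */
typedef struct demo_preq {
    void  *linked_req;
    void (*on_complete)(struct demo_preq *preq);
} demo_preq_t;

static int demo_free_count = 0;  /* counts releases of the linked request */

/* Release the linked request once and clear the pointer. */
static void demo_release_linked(demo_preq_t *preq)
{
    if (preq->linked_req != NULL) {
        free(preq->linked_req);
        preq->linked_req = NULL;
        ++demo_free_count;
    }
}

/* A completion callback that destroys the persistent request itself. */
static void demo_destroy_cb(demo_preq_t *preq)
{
    demo_release_linked(preq);  /* no-op: the pointer was already cleared */
    free(preq);
}

/* Completion path: release the linked request BEFORE the callback runs,
 * because the callback may delete preq. */
static void demo_complete(demo_preq_t *preq)
{
    void (*cb)(demo_preq_t *) = preq->on_complete;

    demo_release_linked(preq);
    if (cb != NULL) {
        cb(preq);
    }
    /* preq must not be touched after this point. */
}

/* Small driver: returns how many times the linked request was released. */
static int demo_run(void)
{
    demo_preq_t *preq = malloc(sizeof(*preq));

    if (preq == NULL) {
        return -1;
    }
    preq->linked_req  = malloc(16);
    preq->on_complete = demo_destroy_cb;
    demo_free_count   = 0;
    demo_complete(preq);
    return demo_free_count;
}
```

Releasing first and clearing the pointer means the destroy path in the callback finds nothing left to free, so the linked request is released exactly once.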
Yossi Itigin
29812494f2
Merge pull request #5402 from hoopoepg/topic/common-del-procs
...
MCA/COMMON/UCX: del_procs calls are unified to common module
2018-07-19 11:19:45 +03:00
Sergey Oblomov
920cc2e0d9
MCA/COMMON/UCX: del_procs calls are unified to common module
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-18 07:37:25 +03:00
Sergey Oblomov
1c7ae22dfb
MCA/COMMON/UCX: shift opal memhooks into common UCX
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-17 13:46:38 +03:00
Sergey Oblomov
240670152e
MCA/COMMON/UCX: code beautify - alignment
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-06 19:40:58 +03:00
Sergey Oblomov
bef47b792c
MCA/COMMON/UCX: unified logging across all UCX modules
...
- added common logging infrastructure for all
UCX modules
- all UCX modules are switched to new infra
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-05 16:25:39 +03:00
Sergey Oblomov
8080283b3d
MCA/COMMON/UCX: changed return type for wait_request
...
- wait_request now returns OMPI status
- updated callers
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-04 23:29:38 +03:00
Sergey Oblomov
c2bd6af9f2
MCA/COMMON/UCX: minor unification of del_procs calls
...
- some common functionality of del_procs calls is moved into
mca_common module
- blocking ucp_put call is replaced by non-blocking routine
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-07-02 15:10:53 +03:00
Sergey Oblomov
074f30ba27
PML/UCX: suppressed compilation warning
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-27 12:05:07 +03:00
Sergey Oblomov
502d04bf12
UCX/PML/SPML: fixed few coverity issues
...
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for worker flush failure
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Yossi Itigin
ee873f4f79
Merge pull request #5322 from hoopoepg/topic/mca-ucx-common
...
MCA/UCX: added common module
2018-06-26 13:54:12 +03:00
Sergey Oblomov
d57ae62dee
MCA/UCX: added common module
...
- implemented non-blocking routines for flush operations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-22 16:41:09 +03:00
Gilles Gouaillardet
edd02b7144
pml/ucx: silence a warning
...
declare 'fenced' volatile in order to silence CID 1437465
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-22 13:11:42 +09:00
Sergey Oblomov
5f03628560
PML/UCX: removed unneeded flush
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 12:40:46 +03:00
Sergey Oblomov
2745da7dcc
PML/UCX: use non-blocking fence instead of async progress
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 09:46:03 +03:00
Sergey Oblomov
10f2d831ec
PML/UCX: fixed hang on MPI_Finalize
...
- added async UCX progress thread to allow
pending requests to complete
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-20 16:12:05 +03:00
Yossi Itigin
564f80d362
pml_ucx: add option to use opal memhooks instead of ucx internal hooks
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-06-17 15:30:44 +03:00
Sergey Oblomov
0a8261f3b0
PML/UCX: fixed hang on MPI_Finalize
...
fixes issue https://github.com/openucx/ucx/issues/2656
added flush for worker object to complete all pending operations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-05 17:22:03 +03:00
Sergey Oblomov
5ec26914a6
PML/UCX: do not set offset on ordered data recv
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-21 19:40:07 +03:00
Sergey Oblomov
19607daa32
PML/UCX: create convertor clone instead of stack reset
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 16:39:13 +03:00
Sergey Oblomov
7c5de01c57
PML/UCX: reset converter stack on unordered messages
...
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 13:11:02 +03:00
Yossi Itigin
385f38ab4e
ucx: improve error messages during connection establishment
...
Also, unite common code calling ucp_ep_create()
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-04-30 15:45:05 +03:00
Sergey Oblomov
7a5811d0a8
request/state: update state for canceled request
...
- fixed issue in set state for canceled request
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-01-29 18:26:20 +02:00
Yossi Itigin
f2851fd502
Merge pull request #4724 from alex-mikheev/topic/ucx_as_default
...
ompi/oshmem: ucx is selected over yalla/ikrit by default
2018-01-17 17:41:49 +02:00
Alex Mikheev
640e945b9c
ompi: pml/ucx: blocking send using ucp_tag_send_nbr
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:54:18 +02:00
Alex Mikheev
ae326546f4
ompi/oshmem: ucx is selected over yalla/ikrit by default
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:08:04 +02:00
Alex Mikheev
e7bf0617cf
ompi: pml ucx: improve recv latency
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-12-26 16:24:16 +02:00
Yossi Itigin
14a93a5992
pml_ucx: fix tag/context_id layout and upper bounds.
...
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-08-27 17:15:48 +03:00
Joshua Hursey
e1d079544b
mca: Dynamic components link against project lib
...
* Resolves #3705
* Components should link against the project level library to better
support `dlopen` with `RTLD_LOCAL`.
* Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
with the appropriate project level library:
```
MCA components in ompi/
$(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
$(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
$(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
$(top_builddir)/oshmem/liboshmem.la
```
Note: The changes in this commit were automated by the
`libadd_mca_comp_update.py` script, which was added in the commit that
precedes it. Some components were not included in this change because
they are statically built only.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
Joshua Ladd
c27beea3a1
Merge pull request #3962 from karasevb/ucx_detect
...
configure: detect UCX support by default
2017-08-03 16:33:57 -04:00
Boris Karasev
d917d54ddc
configure: detect UCX support by default
...
Adds detection of UCX in the following paths: "/usr /usr/local /opt/ucx"
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-25 23:48:49 +03:00
KAWASHIMA Takahiro
0cbdbe32f7
ompi/request: Support non-PML persistent requests
...
This commit adds the `req_start` member to the `ompi_request_t` struct.
The `MPI_START` and `MPI_STARTALL` routines call this callback function
instead of `MCA_PML_CALL(start(...))`. So components that return
persistent requests must set this member on their request objects.
`mca_pml_base_module_t::pml_start` is not deleted because
`MCA_PML_CALL(start(...))` is still used elsewhere across OMPI.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-02 13:08:17 +09:00
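The dispatch change described in the commit above can be sketched roughly as below. Only the member name `req_start` comes from the commit message; the `demo_*` types, the callback signature, and the return codes are assumptions made for illustration and are not the actual `ompi_request_t` definition.

```c
#include <stddef.h>

typedef struct demo_request demo_request_t;

/* Assumed callback shape: start `count` requests owned by one component. */
typedef int (*demo_start_fn_t)(size_t count, demo_request_t **requests);

struct demo_request {
    demo_start_fn_t req_start;  /* plays the role of the new req_start member */
    int             started;    /* illustrative state only */
};

/* What a component returning persistent requests might install. */
static int demo_component_start(size_t count, demo_request_t **requests)
{
    for (size_t i = 0; i < count; ++i) {
        requests[i]->started = 1;
    }
    return 0;
}

/* Start routine: dispatch through the request's own callback rather than
 * through one fixed PML entry point. */
static int demo_start(demo_request_t *request)
{
    if (request == NULL || request->req_start == NULL) {
        return -1;  /* the component did not set the member */
    }
    return request->req_start(1, &request);
}
```

Because the callback travels with the request object, a non-PML component can hand out persistent requests and still be started correctly, provided it fills in the member when it creates the request.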
Alina Sklarevich
49913c692a
PML UCX: unite the code for all the sending modes.
...
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-04-26 13:17:06 +03:00