Commit graph

28760 commits

Author SHA1 Message Date
Jeff Squyres
57bc657e7f btl/tcp: fix hash map usage
Fix two facepalms:

1. The "uint32" in the hash map functions refers to the *key* size, not
   the *value* size.  The values are always 64 bits.
2. Pass the straight value to the "set" functions -- not the pointer
   to the value.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-28 15:29:41 -07:00
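
The two points in the commit above can be illustrated with a small sketch. This is not Open MPI's hash table; `toy_map_t`, `toy_map_set_value_uint32`, `toy_map_get_value_uint32`, `store_index`, and `load_index` are hypothetical names invented for this example. It assumes a map whose function names carry the key width ("uint32") while the value slot is always pointer-sized, and it stores a small integer directly in that slot instead of storing the address of a local variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAP_SLOTS 16 /* hypothetical fixed capacity, for illustration only */

/* Hypothetical toy map: 32-bit keys, pointer-sized values, linear probing. */
typedef struct {
    uint32_t keys[TOY_MAP_SLOTS];
    void    *values[TOY_MAP_SLOTS];
    int      used[TOY_MAP_SLOTS];
} toy_map_t;

static void toy_map_init(toy_map_t *m)
{
    for (size_t i = 0; i < TOY_MAP_SLOTS; ++i) {
        m->used[i] = 0;
    }
}

/* "uint32" in the name describes the KEY; the value is always a void *. */
static int toy_map_set_value_uint32(toy_map_t *m, uint32_t key, void *value)
{
    for (size_t n = 0; n < TOY_MAP_SLOTS; ++n) {
        size_t i = (key + n) % TOY_MAP_SLOTS;
        if (!m->used[i] || m->keys[i] == key) {
            m->used[i] = 1;
            m->keys[i] = key;
            m->values[i] = value;
            return 0;
        }
    }
    return -1; /* map is full */
}

static int toy_map_get_value_uint32(const toy_map_t *m, uint32_t key, void **value)
{
    for (size_t n = 0; n < TOY_MAP_SLOTS; ++n) {
        size_t i = (key + n) % TOY_MAP_SLOTS;
        if (!m->used[i]) {
            return -1; /* empty slot reached: key is not present */
        }
        if (m->keys[i] == key) {
            *value = m->values[i];
            return 0;
        }
    }
    return -1;
}

/* Pass the value itself (cast through uintptr_t), not a pointer to it. */
static int store_index(toy_map_t *m, uint32_t kernel_index, uint64_t local_index)
{
    assert(local_index <= UINTPTR_MAX);
    return toy_map_set_value_uint32(m, kernel_index, (void *)(uintptr_t)local_index);
}

static int load_index(const toy_map_t *m, uint32_t kernel_index, uint64_t *local_index)
{
    void *raw;
    if (toy_map_get_value_uint32(m, kernel_index, &raw) != 0) {
        return -1;
    }
    *local_index = (uint64_t)(uintptr_t)raw;
    return 0;
}
```

Storing `&local_index` instead would leave the map holding the address of a variable that may go out of scope, which is the kind of mistake the second point describes.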
Yossi Itigin
3a7271ef4e
Merge pull request #5344 from hoopoepg/topic/mca-common-ucx-fixed-build
MCA/COMMON/UCX: fixed build scripts
2018-06-28 15:14:04 +03:00
Sergey Oblomov
624d59604b MCA/COMMON/UCX: minor optimization of build scripts
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-28 12:58:07 +03:00
Thananon Patinyasakdikul
304cf97ab5
Merge pull request #5334 from thananon/ofi_progress_fix
btl/ofi: progress now happens after a threshold.
2018-06-27 12:51:33 -07:00
Jeff Squyres
c1ccbece2f
Merge pull request #5347 from jsquyres/pr/fix-f90-removed-interfaces
F90 removed interfaces: add missing "end interface"
2018-06-27 13:54:02 -04:00
Jeff Squyres
768b800533 F90 removed interfaces: add missing "end interface"
Thanks to @fsciortino for reporting.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-27 13:02:16 -04:00
Sergey Oblomov
de8568c822 MCA/COMMON/UCX: enabled fallback into older UCX API
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-27 19:59:40 +03:00
Yossi Itigin
aca61a6bfb
Merge pull request #5238 from hoopoepg/topic/fixed-coverity-issues-ucx-pml
UCX/PML: fixed few coverity issues
2018-06-27 11:14:06 +03:00
Sergey Oblomov
1223b05811 MCA/COMMON/UCX: fixed build scripts
- updated evaluation of UCX lib - used call from UCX v1.3
- updated makefile compilation flags

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-27 11:10:25 +03:00
Nathan Hjelm
4c230683e7 osc/sm: fix a typo
This commit fixes a typo where a bcast is used instead of the intended
collective (barrier).

References #5262

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-06-26 12:53:12 -06:00
Thananon Patinyasakdikul
be76896f7c btl/ofi: progress now happens after a threshold.
This commit changes the way btl/ofi calls progress. Before, we forced
progression with every rdma/atomic call. This gave a performance boost in
some cases and a slowdown in others. Now we only force progression after
some number of rdma calls, which results in better performance overall.

Also added new MCA parameter 'mca_btl_ofi_progress_threshold' to set
the threshold number. The new default is 64.

Also:
Added FI_DELIVERY_COMPLETE to tx_rtx flags to ensure that the completion
is generated after the message has been received on the remote side.

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-06-26 10:39:45 -07:00
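
The threshold idea in the commit above can be sketched as a simple counter. This is an illustration under stated assumptions, not the btl/ofi code: `rdma_ctx_t`, `post_rdma`, and `force_progress` are hypothetical names, and whether the real component compares with `>=` or `>` is not stated in the commit message.

```c
#include <assert.h>

/* Hypothetical per-context bookkeeping for threshold-based progress. */
typedef struct {
    int threshold;      /* force progress after this many posted operations */
    int posted_since;   /* operations posted since the last forced progress */
    int progress_calls; /* number of forced progress calls (for illustration) */
} rdma_ctx_t;

static void rdma_ctx_init(rdma_ctx_t *ctx, int threshold)
{
    assert(threshold > 0);
    ctx->threshold = threshold;
    ctx->posted_since = 0;
    ctx->progress_calls = 0;
}

/* Stand-in for draining the completion queue. */
static void force_progress(rdma_ctx_t *ctx)
{
    ctx->progress_calls++;
    ctx->posted_since = 0;
}

/* Post one operation; only force progress once the threshold is reached. */
static void post_rdma(rdma_ctx_t *ctx)
{
    /* ... the actual RDMA operation would be posted here ... */
    if (++ctx->posted_since >= ctx->threshold) {
        force_progress(ctx);
    }
}
```

With a threshold of 64 (the default named in the commit), the first 63 calls to `post_rdma` do not force progress and the 64th does.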
Nathan Hjelm
b0ac6276a6 btl/ugni: improve multi-threaded RDMA performance
This commit improves the injection rate and latency for RDMA
operations. This is done by the following improvements:

 - If C11's _Thread_local keyword is available then always use the
   same virtual device index for the same thread when using RDMA. If
   the keyword is not available then attempt to use any device that
   isn't already in use. The binding support is enabled by default but
   can be disabled via the btl_ugni_bind_devices MCA variable.

 - When posting FMA and RDMA operations always attempt to reap
   completions after posting the operation. This allows us to
   better balance the work of reaping completions across all
   application threads.

 - Limit the total number of outstanding BTE transactions. This
   fixes a performance bug when using many threads.

 - Split out RDMA and local SMSG completion queue sizes. The RDMA
   queue size is better tuned for performance with RMA-MT.

 - Split out put and get FMA limits. The old btl_ugni_fma_limit MCA
   variable is deprecated. The new variable names are:
   btl_ugni_fma_put_limit and btl_ugni_fma_get_limit.

 - Change how post descriptors are handled. They are no longer
   allocated separately from the RDMA endpoints.

 - Some cleanup to move error code out of the critical path.

 - Disable the FMA sharing flag on the CDM when we detect that there
   should be enough FMA descriptors for the number of virtual devices
   we plan to create. If the user sets this flag we will not unset
   it. This change should improve the small-message RMA performance by
   ~ 10%.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-06-26 11:31:35 -06:00
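
The first bullet of the commit above (always using the same virtual device index for the same thread) can be sketched with C11 thread-local storage. This is a minimal illustration, not the btl/ugni implementation: `NUM_VIRTUAL_DEVICES`, `next_device`, and `get_device_index` are hypothetical, and the round-robin assignment on first use is an assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>

#define NUM_VIRTUAL_DEVICES 4 /* hypothetical device count */

/* Shared counter used to hand out device indexes on first use. */
static atomic_int next_device;

/* Each thread remembers the index it was bound to; -1 means "not bound yet". */
static _Thread_local int bound_device = -1;

/* Return the calling thread's device index, binding it on the first call. */
static int get_device_index(void)
{
    if (bound_device < 0) {
        /* Assumption: round-robin assignment across the available devices. */
        bound_device = atomic_fetch_add(&next_device, 1) % NUM_VIRTUAL_DEVICES;
    }
    assert(bound_device >= 0 && bound_device < NUM_VIRTUAL_DEVICES);
    return bound_device;
}
```

Repeated calls from one thread return the same index, so that thread's RDMA operations keep going through the same device.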
Ralph Castain
0ddbc75ce5
Merge pull request #4930 from kizill/fix-ipv6
fixed ipv6 OOB connection problems (fix issue #1585)
2018-06-26 09:13:53 -07:00
Sergey Oblomov
502d04bf12 UCX/PML/SPML: fixed few coverity issues
- fixed incorrect pointer manipulation/free
- cleaned dead code
- minor optimization on process delete routine
- fixed error handling - free pointers
- added debug output for worker flush failure

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-26 18:52:39 +03:00
Nathan Hjelm
abb87f9137
Merge pull request #5338 from ggouaillardet/topic/uct
btl/uct: misc fixes
2018-06-26 08:56:40 -06:00
Yossi Itigin
ee873f4f79
Merge pull request #5322 from hoopoepg/topic/mca-ucx-common
MCA/UCX: added common module
2018-06-26 13:54:12 +03:00
Gilles Gouaillardet
b40b835a70 btl/uct: remove debug code
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-26 16:03:16 +09:00
Gilles Gouaillardet
552d0809aa btl/uct: add missing include file
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-26 14:53:02 +09:00
Gilles Gouaillardet
e609cf7bc3
Merge pull request #5337 from ggouaillardet/topic/generalized_requests
ompi/requests: implement generalized request extensions
2018-06-26 13:01:04 +09:00
KAWASHIMA Takahiro
a8da78eeaa
Merge pull request #4618 from ggouaillardet/topic/pcoll
Add the persistent collectives feature
2018-06-26 12:36:34 +09:00
Gilles Gouaillardet
5c394377d0 io/romio312: use Grequest extensions provided by Open MPI
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-26 10:52:18 +09:00
Gilles Gouaillardet
f72922b8b1 io/romio321: do not use removed MPI1 primitives
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-26 10:52:18 +09:00
Gilles Gouaillardet
383f23bf35 ompi/request: implement MPI Generalized request extensions
so latest ROM-IO can be used with Open MPI.

Note this first and naive implementation does not use the wait_fn callback.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-26 10:52:18 +09:00
Gilles Gouaillardet
1e5404873f io/romio321: update .gitignore
and remove two files that should never have been committed

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-26 10:52:17 +09:00
Nathan Hjelm
6c089518e7 btl/uct: make uct endpoints array a flexible array member
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-06-25 18:14:58 -06:00
Nathan Hjelm
c5c5b42307 btl: add a new btl for the UCT layer in OpenUCX
This commit adds a new btl for one-sided and two-sided. This btl
uses the uct layer in OpenUCX. This btl makes use of multiple uct
contexts and per-thread device pinning to provide good performance
when using threads and osc/rdma. This btl has been tested extensively
with osc/rdma and passes all MTT tests on aries and IB hardware.

For now this new component disables itself but can be enabled by
setting the btl_ucx_transports MCA variable with a comma-delimited
list of supported memory domains/transport layers. For example:
--mca btl_uct_memory_domains ib/mlx5_0. The specific transports used
can be selected using --mca btl_uct_transports. The default is to use
any available transport.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-06-25 18:14:58 -06:00
Joshua Ladd
256ad707f1
Merge pull request #5293 from yosefe/topic/osc-ucx-on-demand-progress
osc_ucx: register progress on-demand
2018-06-25 15:09:11 -04:00
Joshua Ladd
98afc838aa
Merge pull request #5294 from yosefe/topic/coll-hcoll-progress-fn
coll_hcoll: register progress callback directly without a proxy
2018-06-25 15:07:26 -04:00
Nathan Hjelm
e4989714c2 osc/rdma: fix data race on teardown
The osc/rdma module did not wait for all pending atomics to complete
before tearing down. This could lead to weird issues as the target
location may no longer be registered or allocated.

This commit also fixes an offset calculation issue in
ompi_osc_get_data_blocking ().

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-06-25 11:47:34 -06:00
Nathan Hjelm
c9e58cedc1 mpi.h: fix warning with gcc
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-06-25 11:45:36 -06:00
Ralph Castain
0efd07623a
Merge pull request #5327 from rhc54/topic/cov
Silence coverity warnings, remove/ignore build product
2018-06-25 08:51:27 -07:00
Ralph Castain
3b2390e5d5 Silence coverity warnings, remove/ignore build product
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-25 08:01:28 -07:00
Jeff Squyres
538528f659
Merge pull request #5326 from jsquyres/pr/tcp-btl-use-opal-hash-map-for-kindex
btl/tcp: use a hash map for kernel IP interface indexes
2018-06-25 10:50:50 -04:00
Sergey Oblomov
bf7fd480e9 MCA/COMMON/UCX: added non-blocking implementations of atomics
- added implementation of swap/cswap/fadd operations
- blocking add64 is replaced by non-blocking routine

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 12:25:31 +03:00
Sergey Oblomov
63e7ba6843 MCA/COMMON/UCX: added parameter for UCX/opal progress
- added parameter to set UCX/opal progresses
- minor refactoring of request wait routines

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-25 11:00:12 +03:00
Yossi Itigin
e3ee11608b coll_hcoll: register progress callback directly without a proxy
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-06-24 18:06:07 +03:00
Jeff Squyres
3767ce27c0 btl/tcp: trivial whitespace clean
No code/logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 08:04:12 -07:00
Jeff Squyres
9034717876 btl/tcp: use a hash map for kernel IP interface indexes
The giant size of the TCP proc struct is causing a problem in some
environments (because it is allocated on the stack), and it was too
big, anyway.

Instead, use a hash map.  That way, it starts small and can grow if it
needs to.  It also makes no assumptions about the values of the kernel
interface indexes.

Fixes #5292.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 08:03:30 -07:00
Ralph Castain
259d9bd4fe
Merge pull request #5325 from jsquyres/pr/compiler-warning-stomps
pmix3/pmix_server.c: minor compiler warning stomp
2018-06-23 07:39:27 -07:00
Jeff Squyres
e3d6c5ce3a pmix3/pmix_server.c: minor compiler warning stomp
Submitted upstream https://github.com/pmix/pmix/pull/776.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 06:35:09 -07:00
Edgar Gabriel
edfdcb6e82
Merge pull request #5324 from edgargabriel/pr/minor-fixes
Pr/minor fixes
2018-06-22 17:20:02 -05:00
Howard Pritchard
8babaad35c
Merge pull request #4520 from ggouaillardet/refresh/romio321
io/romio321: refresh ROMIO based on latest stable MPICH 3.2.1
2018-06-22 16:58:46 -05:00
Edgar Gabriel
cf5cdad40f fcoll: make vulcan the default component
make vulcan the default component except for Lustre file systems.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-22 14:12:02 -05:00
Edgar Gabriel
fd8c5fba4e common/ompio: fix the fview based grouping options
a bug had crept into the construction of the list of aggregator
processes when using the fileview-based grouping options

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-22 14:01:31 -05:00
Sergey Oblomov
d57ae62dee MCA/UCX: added common module
- implemented non-blocking routines for flush operations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-22 16:41:09 +03:00
Gilles Gouaillardet
45b6e785aa
Merge pull request #5320 from ggouaillardet/topic/ucx_volatile
pml/ucx: silence a warning
2018-06-22 14:00:44 +09:00
Gilles Gouaillardet
edd02b7144 pml/ucx: silence a warning
declare 'fenced' volatile in order to silence CID 1437465

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-22 13:11:42 +09:00
Edgar Gabriel
d5dd008193
Merge pull request #5319 from edgargabriel/pr/ibm-testsuite-fixes2
Pr/ibm testsuite fixes2
2018-06-21 19:46:22 -05:00
Edgar Gabriel
743e0dff5a common/ompio: fix zero size fview issue
handle the situation where the user requests a non-zero amount
of data but has a zero-size fileview. My instinct would have been
to return an error code, but according to the test that I used
it should be MPI_SUCCESS and zero bytes. It is definitely better
than segfaulting :-)

This makes another test from the IBM testsuite pass.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 17:02:13 -05:00
Edgar Gabriel
7643ccfbcf sharedfp/sm and sharedfp/lockedfile: fix seek offset calculation
the seek offset calculation did not treat the offset as a multiple
of the etype provided. Fixing this makes some more ibm tests pass.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 14:26:36 -05:00
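
The fix described in the commit above amounts to scaling the requested offset by the etype size before applying it. The sketch below shows that arithmetic only; `whence_t` and `new_position_bytes` are hypothetical names, and the three seek modes are modelled on the usual set/current/end semantics rather than taken from the sharedfp sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical seek modes: from file start, from current position, from end. */
typedef enum { SEEK_FROM_START, SEEK_FROM_CURRENT, SEEK_FROM_END } whence_t;

/*
 * Compute the new position in bytes. The caller's offset is counted in
 * etypes, so it is multiplied by the etype size before being applied.
 */
static int64_t new_position_bytes(whence_t whence,
                                  int64_t offset_in_etypes,
                                  int64_t etype_size,
                                  int64_t current_bytes,
                                  int64_t end_bytes)
{
    assert(etype_size > 0);
    int64_t offset_bytes = offset_in_etypes * etype_size;

    switch (whence) {
    case SEEK_FROM_START:
        return offset_bytes;
    case SEEK_FROM_CURRENT:
        return current_bytes + offset_bytes;
    case SEEK_FROM_END:
        return end_bytes + offset_bytes;
    }
    return -1; /* unreachable for valid whence values */
}
```

For an 8-byte etype, an offset of 3 from the start lands at byte 24; treating the offset as a byte count would land at byte 3, which is the kind of error the commit describes.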