openmpi

Author	SHA1	Message	Date
Sergey Oblomov	8080283b3d	MCA/COMMON/UCX: changed return type for wait_request - for now wait_request returns OMPI status - updated callers Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-04 23:29:38 +03:00
Sergey Oblomov	f574c14e3a	ATOMICS/UCX: redefine atomic module API - now it accepts integer values directly instead of pointers Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-04 14:41:45 +03:00
Yossi Itigin	4962651567	Merge pull request #5366 from hoopoepg/topic/mca-common-ucx-unify-2 MCA/COMMON/UCX: minor unification of del_proces calls	2018-07-04 14:38:37 +03:00
Nathan Hjelm	bd5cd62df9	btl/ugni: fix up some warnings Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-03 16:30:44 -06:00
Nathan Hjelm	d8916a4672	btl/ugni: fix race condition in completing frags The descriptor flags field in a fragment was being read after the fragment may have been freed. This commit reads the flags before calling the user callback. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-03 10:48:54 -06:00
Nathan Hjelm	87d41da62b	btl/vader: add support for atomics and emulated rdma This commit adds support for atomic operations as well as rdma for systems without rdma support. This support is implemented using an internal send tag. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-02 13:57:11 -06:00
Nathan T. Weeks	08f9ae97ee	btl/ugni: update BTL_VERBOSE argument list Signed-off-by: Nathan T. Weeks <weeks@iastate.edu>	2018-07-02 09:23:30 -06:00
Sergey Oblomov	c2bd6af9f2	MCA/COMMON/UCX: minor unification of del_proces calls - some common functionality of del_procs calls is moved into mca_common module - blocking ucp_put call is replaced by non-blocking routine Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-02 15:10:53 +03:00
Jeff Squyres	7b0dd03e92	tcp/btl: fix a cast The current cast is functional, but isn't really the way it should be done. This commit makes the cast the way it should be done. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-29 07:25:46 -07:00
Jeff Squyres	57bc657e7f	btl/tcp: fix hash map usage Fix two facepalms: 1. The "uint32" in the hash map functions refer to the key size, not the value size. The values are always 64 bits. 2. Pass the straight value to the "set" functions -- not the pointer to the value. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-28 15:29:41 -07:00
Yossi Itigin	3a7271ef4e	Merge pull request #5344 from hoopoepg/topic/mca-common-ucx-fixed-build MCA/COMMON/UCX: fixed build scripts	2018-06-28 15:14:04 +03:00
Sergey Oblomov	624d59604b	MCA/COMMON/UCX: minor optimization of build scripts Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-28 12:58:07 +03:00
Thananon Patinyasakdikul	304cf97ab5	Merge pull request #5334 from thananon/ofi_progress_fix btl/ofi: progress now happens after a threshold.	2018-06-27 12:51:33 -07:00
Sergey Oblomov	de8568c822	MCA/COMMON/UCX: enabled fallback into older UCX API Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-27 19:59:40 +03:00
Sergey Oblomov	1223b05811	MCA/COMMON/UCX: fixed build scripts - updated evaluation of UCX lib - used call from UCX v1.3 - updated makefile compilation flags Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-27 11:10:25 +03:00
Thananon Patinyasakdikul	be76896f7c	btl/ofi: progress now happens after a threshold. This commit changed the way btl/ofi call progress. Before, we force progression with every rdma/atomic call. This gives performance boost in some case and slow down on others. Now we only force progression after some number of rdma calls which result in better performance overall. Also added new MCA parameter 'mca_btl_ofi_progress_threshold' to set the threshold number. The new default is 64. Also: Added FI_DELIVERY_COMPLETE to tx_rtx flags to ensure that the completion is generated after the message has been received on the remote side. Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>	2018-06-26 10:39:45 -07:00
Nathan Hjelm	b0ac6276a6	btl/ugni: improve multi-threaded RDMA performance This commit improves the injection rate and latency for RDMA operations. This is done by the following improvements: - If C11's _Thread_local keyword is available then always use the same virtual device index for the same thread when using RDMA. If the keyword is not available then attempt to use any device that isn't already in use. The binding support is enabled by default but can be disabled via the btl_ugni_bind_devices MCA variable. - When posting FMA and RDMA operations always attempt to reap completions after posting the operation. This allows us to better balance the work of reaping completions across all application threads. - Limit the total number of outstanding BTE transactions. This fixes a performance bug when using many threads. - Split out RDMA and local SMSG completion queue sizes. The RDMA queue size is better tuned for performance with RMA-MT. - Split out put and get FMA limits. The old btl_ugni_fma_limit MCA variable is deprecated. The new variable names are: btl_ugni_fma_put_limit and btl_ugni_fma_get_limit. - Change how post descriptors are handled. They are no longer allocated seperately from the RDMA endpoints. - Some cleanup to move error code out of the critical path. - Disable the FMA sharing flag on the CDM when we detect that there should be enough FMA descriptors for the number of virtual devices we plan will create. If the user sets this flag we will not unset it. This change should improve the small-message RMA performance by ~ 10%. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-06-26 11:31:35 -06:00
Ralph Castain	0ddbc75ce5	Merge pull request #4930 from kizill/fix-ipv6 fixed ipv6 OOB connection problems (fix issue #1585)	2018-06-26 09:13:53 -07:00
Nathan Hjelm	abb87f9137	Merge pull request #5338 from ggouaillardet/topic/uct btl/uct: misc fixes	2018-06-26 08:56:40 -06:00
Yossi Itigin	ee873f4f79	Merge pull request #5322 from hoopoepg/topic/mca-ucx-common MCA/UCX: added common module	2018-06-26 13:54:12 +03:00
Gilles Gouaillardet	b40b835a70	btl/uct: remove debug code Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-06-26 16:03:16 +09:00
Gilles Gouaillardet	552d0809aa	btl/uct: add missing include file Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-06-26 14:53:02 +09:00
Nathan Hjelm	6c089518e7	btl/uct: make uct endpoints array a flexible array member Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-06-25 18:14:58 -06:00
Nathan Hjelm	c5c5b42307	btl: add a new btl for the UCT layer in OpenUCX This commit adds a new btl for one-sided and two-sided. This btl uses the uct layer in OpenUCX. This btl makes use of multiple uct contexts and per-thread device pinning to provide good performance when using threads and osc/rdma. This btl has been tested extensively with osc/rdma and passes all MTT tests on aries and IB hardware. For now this new component disables itself but can be enabled by setting the btl_ucx_transports MCA variable with a comma-delimited list of supported memory domains/transport layers. For example: --mca btl_uct_memory_domains ib/mlx5_0. The specific transports used can be selected using --mca btl_uct_transports. The default is to use any available transport. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-06-25 18:14:58 -06:00
Sergey Oblomov	bf7fd480e9	MCA/COMMON/UCX: added non-blocking implementations of atomics - added implementation of swap/cswap/fadd operations - blocking add64 is replaced by non-blocking routine Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-25 12:25:31 +03:00
Sergey Oblomov	63e7ba6843	MCA/COMMON/UCX: added parameter for UCX/opal progress - added parameter to set UCX/opal progresses - minor refactoring of request wait routines Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-25 11:00:12 +03:00
Jeff Squyres	3767ce27c0	btl/tcp: trivial whitespace clean No code/logic changes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-23 08:04:12 -07:00
Jeff Squyres	9034717876	btl/tcp: use a hash map for kernel IP interface indexes The giant size of the TCP proc struct is causing a problem in some environments (because it is allocated on the stack), and it was too big, anyway. Instead, use a hash map. That way, it starts small and can grow if it needs to. It also makes no assumptions about the values of the kernel interface indexes. Fixes #5292. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-23 08:03:30 -07:00
Jeff Squyres	e3d6c5ce3a	pmix3/pmix_server.c: minor compiler warning stomp Submitted upstream https://github.com/pmix/pmix/pull/776. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-23 06:35:09 -07:00
Sergey Oblomov	d57ae62dee	MCA/UCX: added common module - implemented non-blocking routines for flush operations Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-22 16:41:09 +03:00
Jeff Squyres	4603852740	orterun: use consistent CLI option name for --bind-to Since the new binding option is tied to the --cpu-list orterun CLI option, make the --bind-to option reflect the same name (vs. the --cpu-set CLI option, which is entirely different). For example: mpirun --bind-to cpu-list:ordered ... Note that "--bind-to cpulist:ordered" is accepted as a synonym, because people will be lazy. Also add some minor updates to the orterun.1in man page for clarification. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-21 08:22:00 -07:00
Ralph Castain	f17d47087a	Define a new binding method and qualifier Allow users to request that procs be bound to a cpu in a given cpu-list based on their corresponding local rank Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-06-20 21:26:09 -07:00
Ralph Castain	5ac2ce6346	Cover all the PMIx data types Cover all data types for OPAL-to-PMIx conversion, generating error logs when we hit something we don't support Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-06-20 09:06:19 -07:00
Boris Karasev	39c9cb12bb	pmix/ext2x: fixed detection PMIx v2.0 by pmix component Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-06-20 13:23:51 +03:00
Ralph Castain	08707c9762	Sync to updated PMIx v3.0.0rc Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-06-19 21:25:43 -07:00
Noah Evans	a64abadf97	Fix mca_base_var_files separator In the opal list parsing behavior paths should be separated by ':' while files are separated by ','. In the opal and pmix code (the pmix fix is in a separate commit) there was a mistake in the parsing such that files were being separated by ':' when they should be separated by ','s. This commit attempts to address this mismatch. Signed-off-by: Noah Evans <noah.evans@gmail.com>	2018-06-19 12:59:07 -06:00
Ralph Castain	f0a0d606a0	Correct accounting for tools Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 1be080f7b92bad39745f42628a8cb6afefad2d2a)	2018-06-18 13:24:25 -07:00
Thananon Patinyasakdikul	13f58f3191	Merge pull request #5274 from thananon/ofi_sep btl/ofi: add scalable endpoint support.	2018-06-18 08:41:06 -07:00
Jeff Squyres	266d5b2110	Merge pull request #5277 from jsquyres/pr/cygwin-patch external libevent: fix for Cygwin	2018-06-18 11:08:30 -04:00
Ralph Castain	7981818b84	Update PMIx atomics Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-06-17 10:03:49 -07:00
Ralph Castain	fa18ba395d	Sync to latest PMIx v3.0rc Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-06-17 02:41:46 -07:00
Ralph Castain	ac7bb15505	Fix other typo in help message Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-06-16 16:30:52 -07:00
Ralph Castain	8cfce583c0	Correct typo to properly check for PMIx v4 Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-06-16 16:29:05 -07:00
Jeff Squyres	07c8ec6a3c	external libevent: fix for Cygwin Fix from Marco Atzeri for building on Cygwin. Signed-off-by: Marco Atzeri <marco.atzeri@gmail.com> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-16 09:08:58 -07:00
Ralph Castain	e2e6da4379	Merge pull request #5258 from rhc54/topic/pmix4 Sync to PMIx v3.0rc and add ext4x	2018-06-15 09:41:08 -07:00
Thananon Patinyasakdikul	dae3c9447c	btl/ofi: add scalable endpoint support. This commit adds support for scalable endpoints to enhance multithreaded application performance. The BTL will detect the support from the ofi provider and will fall back to normal usage if scalable endpoints are not supported. NEW MCA parameters: - mca_btl_ofi_disable_sep: force the btl to not use scalable endpoint. - mca_btl_ofi_num_contexts_per_module: number of communication contexts to create (should be the same as the number of threads). Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>	2018-06-14 15:44:29 -07:00
Howard Pritchard	7dcab6e4a4	Merge pull request #5269 from hppritcha/topic/squash_gcc7.3.0_warnings topo/treematch - quash compiler warning	2018-06-13 21:13:04 -05:00
Howard Pritchard	64de269cc3	topo/treematch - quash compiler warning quash a compiler warning showing up with gcc 7.3 Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2018-06-13 16:34:17 -05:00
Thananon Patinyasakdikul	390d72addd	Merge pull request #4885 from davideberius/spc_pr Initial Software-based Performance Counters PR	2018-06-12 14:04:49 -07:00
Jeff Squyres	591b225527	Merge pull request #5245 from PeterGottesman/hwloc-fix Ensure required hwloc directories are in dist tarballs	2018-06-12 13:08:35 -04:00

3471 Commits