openmpi

Author	SHA1	Message	Date
Ralph Castain	3eef3d1d8f	Update to PMIx 3.0.1 Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-08-20 14:00:41 -07:00
Geoff Paulsen	8483eb4bf7	Merge pull request #5471 from hjelmn/v4.0.x_uct_btl_fix btl/uct: fix compile warnings/errors	2018-08-15 16:30:40 -05:00
Geoff Paulsen	98bd571cc8	Merge pull request #5472 from ggouaillardet/topic/v4.0.x/prefer-externals v4.0.x: Prefer external hwloc and libevent	2018-08-15 16:27:33 -05:00
Nathan Hjelm	b4f80e4e36	btl/vader: move memory barrier to where it belongs The write memory barrier was intended to precede setting a fast-box header but instead follows it. This commit moves the memory barrier to the intended location. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov> (cherry picked from commit dca3516765a4b5927b1877ca59d952baec42bc4a)	2018-08-14 09:19:48 -07:00
Jeff Squyres	72e5766a56	hwloc201/configure.m4: make it safe when used with hwloc:external The Autoconf AC_CONFIG_* macros can only be instantiated exactly once for any given file, and they must be in a code execution path at run time for the target file to be generated at the end of configure. For example, if you want to generate file ABC at the end of configure, you must invoke the AC_CONFIG_FILES(ABC) macro in a code path that will get executed when configure is run. That's pretty straightforward. What's not straightforward is two corner cases: 1. You cannot invoke the AC_CONFIG_FILES(ABC) macro for the same file more than once. If you do, autoreconf will fail (even before you can run configure). 2. If AC_CONFIG_FILES(ABC) is not in a code path that is executed by configure, the file ABC is not registered properly, and ABC will not be generated at the end of configure. This applies to hwloc because hwloc's HWLOC_SETUP_CORE macro calls both AC_CONFIG_FILES and AC_CONFIG_HEADER to set up its Makefiles (etc.) so that targets like "make distclean" and "make distcheck" will work properly. Hence, we have to invoke HWLOC_SETUP_CORE. However, the MCA_opal_hwloc_hwloc201_CONFIG macro has a few side effects. It would be nice to be able to do something like this: ``` if hwloc:extern is going to be used: Invoke minimal HWLOC_SETUP_CORE (with no side effects) else Invoke full HWLOC_SETUP_CORE (with side effects) fi ``` But we can't, because autoreconf will detect that AC_CONFIG_FILES has been invoked on the same files more than once (regardless of whether those code paths will be executed at run time or not). Kaboom.
Similarly, we can't do this: ``` if hwloc:extern is not going to be used: Invoke full HWLOC_SETUP_CORE (with side effects) fi ``` Because then hwloc's AC_CONFIG_FILES won't be registered properly when hwloc:external is used (i.e., when the HWLOC_SETUP_CORE macro is not in a code path that is executed at run time), and targets like "make distclean" will fail because hwloc's Makefiles won't have been set up. Kaboom. But remember that the hwloc framework is a bit special: there will only ever be 2 components: external and internal. External is guaranteed to be configured first because of its priority. So the internal component (i.e., this component) immediately knows if it is going to be used or not based on whether the external component configuration succeeded or failed. Specifically: regardless of whether the internal component (i.e., this component) is going to be used, we have to invoke HWLOC_SETUP_CORE. But we can manage the side effects: allow the side effects when this/internal component is going to be used, and avoid the side effects when this/internal component is not going to be used. This is a little less clean than I would have liked, but because of Autoconf's oddity about its AC_CONFIG_* macros, this is the only solution I could come up with. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit 01e4570af759b113b965b63df7bfc72a78d69654)	2018-08-11 12:00:35 -07:00
Jeff Squyres	714f203985	libevent2022/configure.m4: always invoke sub-configure In order to make "make distclean" (and friends) work, we need to always invoke the embedded configure script -- even if we know that we're not going to use this component. But in cases where we know we're not going to use this component, we also need to avoid the side effects of the code path that is used when we do want to use this component. So split the two possibilities into two different macros: 1. MCA_opal_event_libevent2022_FAKE_CONFIG: which does almost nothing except invoke the underlying "configure" script. 2. MCA_opal_event_libevent2022_REAL_CONFIG: which does all the real work (including invoking the underlying "configure" script). Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit 69aa46e1676c00bb41e54b95bbcf1df3b00dc9c1)	2018-08-11 12:00:35 -07:00
Jeff Squyres	5c5246f655	libevent2022/configure.m4: trivial cleanup Put argument to AM_CONDITIONAL inside []. No code or logic changes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit 80df3f040be2b26162df9cc73a0cfd9e3d11732d)	2018-08-11 12:00:35 -07:00
Jeff Squyres	63d68ded48	libevent2022/configure.m4: minor comment cleanup Change # -> dnl. No code or logic changes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit 17aa64e43825dd27148fc47d3f8f5694fc091d38)	2018-08-11 12:00:35 -07:00
Jeff Squyres	6cb3d61dd1	libevent2022: only configure if event:external fails We know that event:external will be configured first (because of its priority). Take advantage of that here in libevent2022 by having it refuse to configure / politely fail if event:external succeeded. Also print out some additional lines in configure output indicating what is going on (i.e., event:external succeeded, so this component will be skipped, or event:external failed, so this component will be used). Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit b063cb6b0f251052dc72d2496e5745ce83d6b869)	2018-08-09 06:47:08 -07:00
Jeff Squyres	eca16720de	hwloc201: only configure if hwloc:external fails We know that hwloc:external will be configured first (because of its priority). Take advantage of that here in hwloc201 by having it refuse to configure / politely fail if hwloc:external succeeded. Also print out some additional lines in configure output indicating what is going on (i.e., hwloc:external succeeded, so this component will be skipped, or hwloc:external failed, so this component will be used). Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit 4e5f432786d1ab99304e40dadec40ac45e93e76f)	2018-08-09 06:47:08 -07:00
Ralph Castain	2c2f9b8169	Leave opal_event_external_support exposed as global var Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit open-mpi/ompi@5cab823979)	2018-07-24 09:53:00 +09:00
Jeff Squyres	1ea021933f	event/external: prefer external event component Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit open-mpi/ompi@a70ecf5267)	2018-07-24 09:52:58 +09:00
Jeff Squyres	6f5a453492	event: trivial comment change Switch from #-style to dnl-style. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit open-mpi/ompi@83e4a45a9f)	2018-07-24 09:52:56 +09:00
Gilles Gouaillardet	aa7a4d0f6f	hwloc: prefer external hwloc component Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit open-mpi/ompi@ce2c9fffd4)	2018-07-24 09:52:53 +09:00
Nathan Hjelm	b6bd3d33f1	btl/uct: fix compile warnings/errors Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov> (cherry picked from commit 47ed8e8830749b6b59c84592c15b7576ea164f0c) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-23 14:05:17 -06:00
Ralph Castain	508c3f391f	Default to internal PMIx if newer than external Per https://github.com/open-mpi/ompi/issues/5031, if the user didn't specify a particular PMIx installation, then default back to the internal version if it is newer than the discovered external one. PMIx doesn't yet provide a full signature so we have to just get as close as possible for now. Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 1e6aaf7f226f5a4d940e544079e3977229746c11)	2018-07-19 11:59:17 -07:00
Howard Pritchard	4447738098	Merge pull request #5414 from hppritcha/topic/iwarp_only_by_default btl/openib: only look for iwarp/roce by default	2018-07-17 20:08:00 -06:00
Howard Pritchard	bc8134dae1	Merge pull request #5448 from thananon/ofi_context btl/ofi: Added FI_CONTEXT as requirement.	2018-07-17 19:51:13 -06:00
Howard Pritchard	6818272392	btl/openib: only look for iwarp/roce by default Due to decreasing support by vendors/other orgs for the OpenIB BTL, only look for iWarp/RoCE devices by default. Allow IB HCAs with ports configured for ethernet. Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2018-07-17 19:11:37 -06:00
Thananon Patinyasakdikul	033c364ee0	btl/ofi: Added FI_CONTEXT as requirement. OFI BTL uses context for completion but never asks for it in fi_getinfo(3). This commit makes sure that we always ask for FI_CONTEXT to eliminate any potential error. Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>	2018-07-17 12:18:43 -07:00
Sergey Oblomov	a4b8253fa2	MCA/COMMON/UCX: fixed initialization of malloc hooks Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-17 20:09:50 +03:00
Sergey Oblomov	1c7ae22dfb	MCA/COMMON/UCX: shift opal memhooks into common UCX Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-17 13:46:38 +03:00
Ralph Castain	4a596d35f7	Remove the PMIx ext4x component Update configury to redirect anything at or above v3 to the ext3x component Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-07-13 19:51:50 -07:00
Nathan Hjelm	9d3a79925b	btl/vader: fix bugs in rma emulation This commit fixes two bugs in the RMA/atomic emulation code: 1) Fix a fragment leak when using AMO emulation. 2) Always initialize the single-copy emulation code. This is required to use the AMO support. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-12 15:50:50 -06:00
Ralph Castain	fdca304268	Default to external PMIx installation Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-07-10 16:12:52 -07:00
Nathan Hjelm	8b090103e2	opal/fifo: fix 128-bit atomic fifo on Power9 This commit updates the atomic fifo code to fix a consistency issue observed on Power9 systems when builtin atomics are used. The cause was two things: 1) a missing write memory barrier in fifo push, and 2) a read ordering issue when reading the fifo head non-atomically. This commit fixes both issues and appears to correct the inconsistency. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-10 15:37:11 -06:00
KAWASHIMA Takahiro	3e179ba95f	hwloc/external: Suppress missing-include-dirs warning If OMPI is configured with `--with-hwloc=external` or `--with-hwloc=DIR` and gfortran is used, I see a lot of warnings when compiling files under the `ompi/mpi/fortran` directory. ``` f951: Warning: Nonexistent include directory 'BUILD_DIR/opal/mca/hwloc/external/hwloc/include' [-Wmissing-include-dirs] ``` There is no such `include` directory in the source tree and `configure`-created tree. I think these lines in the `configure.m4` file are wrongly copied from that for the embedded `hwlocXXX` component in the past. The `-Wmissing-include-dirs` option is enabled in gfortran by default but it is not enabled by default (or even with `-Wall`) in gcc and g++. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2018-07-09 10:55:33 +09:00
Yossi Itigin	e77e31b50b	Merge pull request #5378 from hoopoepg/topic/unify-ucx-logging MCA/COMMON/UCX: unified logging across all UCX modules	2018-07-08 12:45:26 +03:00
Ralph Castain	17c4cf0db8	Install PMIx v3.0.0 release Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-07-06 06:38:02 -07:00
Sergey Oblomov	bef47b792c	MCA/COMMON/UCX: unified logging across all UCX modules - added common logging infrastructure for all UCX modules - all UCX modules are switched to new infra Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-05 16:25:39 +03:00
Sergey Oblomov	8080283b3d	MCA/COMMON/UCX: changed return type for wait_request - for now wait_request returns OMPI status - updated callers Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-04 23:29:38 +03:00
Sergey Oblomov	f574c14e3a	ATOMICS/UCX: redefine atomic module API - now it accepts integer values directly instead of pointers Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-04 14:41:45 +03:00
Yossi Itigin	4962651567	Merge pull request #5366 from hoopoepg/topic/mca-common-ucx-unify-2 MCA/COMMON/UCX: minor unification of del_procs calls	2018-07-04 14:38:37 +03:00
Nathan Hjelm	bd5cd62df9	btl/ugni: fix up some warnings Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-03 16:30:44 -06:00
Nathan Hjelm	d8916a4672	btl/ugni: fix race condition in completing frags The descriptor flags field in a fragment was being read after the fragment may have been freed. This commit reads the flags before calling the user callback. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-03 10:48:54 -06:00
Nathan Hjelm	87d41da62b	btl/vader: add support for atomics and emulated rdma This commit adds support for atomic operations as well as rdma for systems without rdma support. This support is implemented using an internal send tag. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-07-02 13:57:11 -06:00
Nathan T. Weeks	08f9ae97ee	btl/ugni: update BTL_VERBOSE argument list Signed-off-by: Nathan T. Weeks <weeks@iastate.edu>	2018-07-02 09:23:30 -06:00
Sergey Oblomov	c2bd6af9f2	MCA/COMMON/UCX: minor unification of del_procs calls - some common functionality of del_procs calls is moved into mca_common module - blocking ucp_put call is replaced by non-blocking routine Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-02 15:10:53 +03:00
Jeff Squyres	7b0dd03e92	tcp/btl: fix a cast The current cast is functional, but isn't really the way it should be done. This commit makes the cast the way it should be done. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-29 07:25:46 -07:00
Jeff Squyres	57bc657e7f	btl/tcp: fix hash map usage Fix two facepalms: 1. The "uint32" in the hash map functions refer to the key size, not the value size. The values are always 64 bits. 2. Pass the straight value to the "set" functions -- not the pointer to the value. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-06-28 15:29:41 -07:00
Yossi Itigin	3a7271ef4e	Merge pull request #5344 from hoopoepg/topic/mca-common-ucx-fixed-build MCA/COMMON/UCX: fixed build scripts	2018-06-28 15:14:04 +03:00
Sergey Oblomov	624d59604b	MCA/COMMON/UCX: minor optimization of build scripts Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-28 12:58:07 +03:00
Thananon Patinyasakdikul	304cf97ab5	Merge pull request #5334 from thananon/ofi_progress_fix btl/ofi: progress now happens after a threshold.	2018-06-27 12:51:33 -07:00
Sergey Oblomov	de8568c822	MCA/COMMON/UCX: enabled fallback into older UCX API Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-27 19:59:40 +03:00
Sergey Oblomov	1223b05811	MCA/COMMON/UCX: fixed build scripts - updated evaluation of UCX lib - used call from UCX v1.3 - updated makefile compilation flags Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-06-27 11:10:25 +03:00
Thananon Patinyasakdikul	be76896f7c	btl/ofi: progress now happens after a threshold. This commit changes the way btl/ofi calls progress. Before, we forced progression with every rdma/atomic call. This gave a performance boost in some cases and a slowdown in others. Now we only force progression after some number of rdma calls, which results in better performance overall. Also added new MCA parameter 'mca_btl_ofi_progress_threshold' to set the threshold number. The new default is 64. Also: Added FI_DELIVERY_COMPLETE to tx_rtx flags to ensure that the completion is generated after the message has been received on the remote side. Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>	2018-06-26 10:39:45 -07:00
Nathan Hjelm	b0ac6276a6	btl/ugni: improve multi-threaded RDMA performance This commit improves the injection rate and latency for RDMA operations. This is done by the following improvements: - If C11's _Thread_local keyword is available then always use the same virtual device index for the same thread when using RDMA. If the keyword is not available then attempt to use any device that isn't already in use. The binding support is enabled by default but can be disabled via the btl_ugni_bind_devices MCA variable. - When posting FMA and RDMA operations always attempt to reap completions after posting the operation. This allows us to better balance the work of reaping completions across all application threads. - Limit the total number of outstanding BTE transactions. This fixes a performance bug when using many threads. - Split out RDMA and local SMSG completion queue sizes. The RDMA queue size is better tuned for performance with RMA-MT. - Split out put and get FMA limits. The old btl_ugni_fma_limit MCA variable is deprecated. The new variable names are: btl_ugni_fma_put_limit and btl_ugni_fma_get_limit. - Change how post descriptors are handled. They are no longer allocated separately from the RDMA endpoints. - Some cleanup to move error code out of the critical path. - Disable the FMA sharing flag on the CDM when we detect that there should be enough FMA descriptors for the number of virtual devices we plan to create. If the user sets this flag we will not unset it. This change should improve the small-message RMA performance by ~ 10%. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-06-26 11:31:35 -06:00
Ralph Castain	0ddbc75ce5	Merge pull request #4930 from kizill/fix-ipv6 fixed ipv6 OOB connection problems (fix issue #1585)	2018-06-26 09:13:53 -07:00
Nathan Hjelm	abb87f9137	Merge pull request #5338 from ggouaillardet/topic/uct btl/uct: misc fixes	2018-06-26 08:56:40 -06:00
Yossi Itigin	ee873f4f79	Merge pull request #5322 from hoopoepg/topic/mca-ucx-common MCA/UCX: added common module	2018-06-26 13:54:12 +03:00
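
Note on d8916a4672 (btl/ugni: fix race condition in completing frags): the commit message describes reading a fragment's flags before invoking the user callback, because the callback may free the fragment. The following is a minimal sketch of that ordering only, not the actual btl/ugni code. Every name in it (`frag_t`, `FRAG_FLAG_TRANSPORT_OWNED`, `frag_complete`, `user_callback`, `demo_complete_two_frags`) is hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flag: the transport (not the user) must release the fragment. */
#define FRAG_FLAG_TRANSPORT_OWNED 0x1u

typedef struct frag frag_t;
typedef void (*frag_cb_t)(frag_t *frag);

struct frag {
    uint32_t  flags;
    frag_cb_t cb;
};

static int freed_by_user = 0;
static int freed_by_transport = 0;

/* Illustrative user callback: frees the fragment when the user owns it. */
static void user_callback(frag_t *frag)
{
    if (!(frag->flags & FRAG_FLAG_TRANSPORT_OWNED)) {
        free(frag);
        ++freed_by_user;
    }
}

/* Completion path: snapshot the flags BEFORE the callback runs.
 * After the callback returns, frag may already be freed, so
 * dereferencing frag->flags at that point would be a use-after-free. */
static void frag_complete(frag_t *frag)
{
    assert(NULL != frag);
    const uint32_t flags = frag->flags;

    if (NULL != frag->cb) {
        frag->cb(frag);
    }

    /* Only the saved copy of the flags is consulted from here on. */
    if (flags & FRAG_FLAG_TRANSPORT_OWNED) {
        free(frag);
        ++freed_by_transport;
    }
}

/* Completes one user-owned and one transport-owned fragment.
 * Returns freed_by_user * 10 + freed_by_transport, or -1 on allocation failure. */
static int demo_complete_two_frags(void)
{
    frag_t *a = malloc(sizeof *a);
    frag_t *b = malloc(sizeof *b);
    if (NULL == a || NULL == b) {
        free(a);
        free(b);
        return -1;
    }
    freed_by_user = 0;
    freed_by_transport = 0;

    a->flags = 0;
    a->cb = user_callback;
    b->flags = FRAG_FLAG_TRANSPORT_OWNED;
    b->cb = user_callback;

    frag_complete(a);
    frag_complete(b);
    return freed_by_user * 10 + freed_by_transport;
}
```

Calling `demo_complete_two_frags()` releases each fragment exactly once: the user-owned one inside the callback and the transport-owned one in the completion path. Moving the flag test back to `frag->flags` after the callback would reintroduce the kind of race the commit message describes.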


5233 commits