openmpi

Author	SHA1	Message	Date
Joseph Schuchart	08da2f5ea5	Correctly set baseptr in contiguous shared memory window with local size zero Signed-off-by: Joseph Schuchart <schuchart@hlrs.de> (cherry picked from commit 06bbcf4fd63dd184cf22f8bcad007c4b8b991a3c)	2020-02-20 20:46:29 +01:00
Howard Pritchard	43ecbb1734	Merge pull request #7392 from awlauria/pgcc18_v4.0.x v4.0.x: Fix pgcc18 support.	2020-02-14 14:16:42 -06:00
Austen Lauria	ff6b068b93	Fix pgcc18 support. - pgcc18 defines __GNUC__ similar to Intel compilers. So we must check for pgi higher up, or else configury will mistake it for gcc. Signed-off-by: Austen Lauria <awlauria@us.ibm.com> (cherry picked from commit 14785deb3c6609cb3f6763d0e07a49e86588c4da)	2020-02-12 15:11:03 -05:00
Geoff Paulsen	a1259e6a14	Merge pull request #7356 from hppritcha/topic/pr7201_to_v40x Topic/pr7201 to v40x	2020-02-12 14:06:31 -06:00
Brice Goglin	f136804c45	hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 1.11 Build was broken by mistake in commit d40662edc41a5a4d09ae690b640cfdeeb24e15a1 Fixes #7362 Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> (cherry picked from commit 907ad854b46b42ae7cb1e9c87238691a5cc25e36)	2020-02-11 18:37:18 -06:00
Geoff Paulsen	3894e5760c	Merge pull request #7380 from gpaulsen/topic/v4.0.x/VERSION_v4.0.3rc4 Reving to VERSION v4.0.3rc4	2020-02-11 18:28:45 -06:00
Howard Pritchard	eddb0ef626	Merge pull request #7382 from gpaulsen/topic/v4.0.x/pmix_v3.1.5rc2 Adding PMIx v3.1.5rc2	2020-02-11 10:05:38 -06:00
Geoffrey Paulsen	81ad9bfdb6	Adding PMIx v3.1.5rc2 Adding PMIx v3.1.5rc2 from: https://github.com/openpmix/openpmix/releases/tag/v3.1.5rc2 Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>	2020-02-10 17:05:53 -06:00
Geoffrey Paulsen	aff4fa6c8f	Reving to VERSION v4.0.3rc4 Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>	2020-02-10 15:56:28 -06:00
Brice Goglin	6702a4febb	opal/hwloc: remove some unused variables when building with hwloc < 1.7 Refs #7362 Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> (cherry picked from commit 329d4451a6cdd544e532a29f594f6e5ee63e06da)	2020-02-10 15:54:48 -06:00
Geoff Paulsen	d79fe7fe10	Merge pull request #7376 from hoopoepg/topic/oshmem-inc-max-segments-v4.0 OSHMEM/SEGMENTS: increase max number of segments - v4.0	2020-02-10 15:52:24 -06:00
Sergey Oblomov	45a722ad6a	OSHMEM/SEGMENTS: increase number of max segments - increase number of max segments to allow applications to be launched on some Ubuntu configurations Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com> (cherry picked from commit f742f289ea32a0f3dfe5f769fb318213f1a74c37)	2020-02-10 07:44:50 +02:00
Howard Pritchard	42acf4fe6f	Merge pull request #7360 from jsquyres/pr/v4.0.x/fortran-you-win-again v4.0.x: Fortran fixes	2020-02-09 10:02:04 -06:00
Jeff Squyres	fbeebdb9a0	fortran: ensure not to use [AM_]CPPFLAGS Automake's Fortran compilation rules inexplicably use CPPFLAGS and AM_CPPFLAGS. Unfortunately, this can cause problems in some cases (e.g., picking up already-installed mpi.mod in a system-default include search path). So in relevant module-using Fortran compilation Makefile.am's, zero out CPPFLAGS and AM_CPPFLAGS. This has a side-effect of requiring that we compile the one .c file in the F08 library in a new, separate subdirectory (with its own Makefile.am that does _not_ have CPPFLAGS/AM_CPPFLAGS zeroed out). Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit ab398f4b9a340b54a88b83021b66911fe46d5862)	2020-02-04 05:15:40 -08:00
Jeff Squyres	85ce373730	fortran: remove useless CPPFLAGS assignment These -D's are for C compilation, not Fortran compilation. Remove this useless statement. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit f4a47a5a8e4e3f2c902807d75e211f7f500f802b)	2020-02-04 04:26:11 -08:00
Howard Pritchard	bed0ce70a7	fix a problem with opal_asprintf not being defined. related to #7201 Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2020-02-03 14:22:59 -07:00
Brice Goglin	82567996f7	hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 2.0 Both opal_hwloc_base_get_relative_locality() and _get_locality_string() iterate over hwloc levels to build the proc locality information. Unfortunately, NUMA nodes are not in those normal levels anymore since 2.0. We have to explicitly look at the special NUMA level to get that locality info. I am factorizing the core of the iterations inside dedicated "_by_depth" functions and calling them again for the NUMA level at the end of the loops. Thanks to Hatem Elshazly for reporting the NUMA communicator split failure at https://www.mail-archive.com/users@lists.open-mpi.org/msg33589.html It looks like only the opal_hwloc_base_get_locality_string() part is needed to fix that split, but there's no reason not to fix get_relative_locality() as well. Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> (cherry picked from commit ea80a20e108cb69efc67ad04ad968da7b85772af)	2020-02-03 13:29:41 -07:00
Howard Pritchard	a26cd349b9	Merge pull request #7355 from jsquyres/pr/v4.0.x/fortran-sentinel-linker-black-magic v4.0.x: Make C and Fortran types for MPI sentinels agree in size	2020-02-03 13:25:29 -07:00
Fangrui Song	02f3795299	Make C and Fortran types for MPI sentinels agree in size Fix the C types for the following: * MPI_UNWEIGHTED * MPI_WEIGHTS_EMPTY * MPI_ARGV_NULL * MPI_ARGVS_NULL * MPI_ERRCODES_IGNORE There is lengthy discussion on https://github.com/open-mpi/ompi/pull/7210 describing the issue; the gist of it is that the C and Fortran types for several MPI global sentinel values should agree (specifically: their sizes must(*) agree). We erroneously had several of these array-like sentinel values be "array-like" values in C. E.g., MPI_ERRCODES_IGNORE was an (int *) in C while its corresponding Fortran type was "integer, dimension(1)". On a 64 bit platform, this resulted in C expecting the symbol size to be sizeof(int *)==8 while Fortran expected the symbol size to be sizeof(INTEGER, DIMENSION(1))==4. That is incorrect -- the corresponding C type needed to be (int). Then both C and Fortran expect the size of the symbol to be the same. (*) NOTE: This code has been wrong for years. This mismatch of types typically worked because, due to Fortran's call-by-reference semantics, Open MPI was comparing the *addresses* of these instances, not their types (or sizes) -- so even if C expected the size of the symbol to be X and Fortran expected the size of the symbol to be Y (where X!=Y), all we really checked at run time was that the addresses of the symbols were the same. But it caused linker warning messages, and even caused errors in some cases. Specifically: due to a GNU ld bug (https://sourceware.org/bugzilla/show_bug.cgi?id=25236), the 5 common symbols are incorrectly versioned VER_NDX_LOCAL because their definitions in Fortran sources have smaller st_size than those in libmpi.so. This makes the Fortran library not linkable with lld in distributions that ship openmpi built with -Wl,--version-script (https://bugs.llvm.org/show_bug.cgi?id=43748): % mpifort -fuse-ld=lld /dev/null ld.lld: error: corrupt input file: version definition index 0 for symbol mpi_fortran_argv_null_ is out of bounds >>> defined in /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_usempif08.so ... If we fix the C and Fortran symbols to actually be the same size, the problem goes away and the GNU ld bug does not come into play. This commit also fixes a minor issue that MPI_UNWEIGHTED and MPI_WEIGHTS_EMPTY were not declared as Fortran arrays (not fully fixed by commit 107c0073dd11fb90d18122c521686f692a32cdd8). Fixes open-mpi/ompi#7209 Signed-off-by: Fangrui Song <i@maskray.me> Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit 5609268e90cb0ff7b2431d29041c10a700fd6996)	2020-02-02 13:57:50 -08:00
Geoff Paulsen	2f42a125be	Merge pull request #7352 from hppritcha/topic/minor_news_update_v4.0.x NEWS: tweak for v4.0.3 release	2020-01-31 13:46:48 -06:00
Howard Pritchard	89e3a2ba02	NEWS: tweak for v4.0.3 release [skip ci] Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2020-01-31 12:38:41 -07:00
Geoff Paulsen	731721119e	Merge pull request #7346 from gpaulsen/topic/v4.0.x/VERSION_4.0.3rc3_part2 Actually Updating VERSION to v4.0.3rc3	2020-01-28 16:53:28 -06:00
Geoffrey Paulsen	80950480a9	Actually Updating VERSION to v4.0.3rc3 Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>	2020-01-28 15:16:06 -06:00
Geoff Paulsen	c79e841921	Merge pull request #7342 from gpaulsen/topic/v4.0.x/VERSION_4.0.3rc3 Updating VERSION to v4.0.3rc3	2020-01-28 10:44:55 -06:00
Geoffrey Paulsen	44c1b6fb98	Updating VERSION to v4.0.3rc3 We tried doing an RC2 built without updating the greek, and found where that failed in build automation. Reving again for rc3, as we've already applied the rc2 tag. Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>	2020-01-28 10:42:04 -05:00
Geoff Paulsen	b9d54dadb6	Merge pull request #7341 from hppritcha/topic/news_for_rc4.0.3rc2 NEWS: updates for 4.0.3rc2	2020-01-27 14:52:58 -06:00
Howard Pritchard	7147a8c3bb	NEWS: updates for 4.0.3rc2 [skip ci] Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2020-01-27 13:50:35 -07:00
Howard Pritchard	0ea96ec4db	Merge pull request #7340 from jjhursey/v4-no-ssh-core plm/rsh: Fix segv on missing agent.	2020-01-27 13:08:40 -07:00
Joshua Hursey	05d003b109	plm/rsh: Fix segv on missing agent. * Additionally, fixes the case where the `NULL` option to `OMPI_MCA_plm_rsh_agent` would also lead to a segv. Now it operates as intended by disqualifying the `rsh` component and falling back onto the `isolated` component. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com> (cherry picked from commit 62d0058738e8a111cd099199bc5f1886f13aa8ec)	2020-01-27 10:34:28 -06:00
Howard Pritchard	5f40b47088	Merge pull request #7338 from hppritcha/topic/fix_6539_v4.0.x Topic/fix 6539 v4.0.x	2020-01-26 12:39:38 -07:00
Howard Pritchard	297505592a	Fix a problem with fortran configure test. Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2020-01-24 15:42:00 -06:00
Geoff Paulsen	2549ba2e47	Merge pull request #7329 from janjust/v4.0.x-oshmem-perf-multi-worker V4.0.x: oshmem/ucx: improves spml ucx performance for multi-threaded applications.	2020-01-24 13:41:13 -06:00
Howard Pritchard	d12e0fdf32	make mpifort obey disable-wrapper-runpath related to #6539 Signed-off-by: Howard Pritchard <howardp@lanl.gov> (cherry picked from commit 37b3e2f3fa7a4971dda64d8d2ff933dc4d4c807d)	2020-01-24 10:47:22 -06:00
Sergey Oblomov	91ab0e2191	SPML/UCX: fixed coverity issues - fixed sizeof(char***) by variable datatype - fixed resource leak in proc_add Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com> (cherry picked from commit 8543860689029dc09b5edfa25afafa087fe8603b)	2020-01-24 17:29:53 +02:00
Tomislav Janjusic	0daf3df384	oshmem/ucx: improves spml ucx performance for multi-threaded applications. Improves multi-threaded performance by adding the option to create multiple ucx workers in threaded applications. Co-authored with: Artem Y. Polyakov <artemp@mellanox.com>, Manjunath Gorentla Venkata <manjunath@mellanox.com> Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com> (cherry picked from commit 3d6bf9fd8ec729d1c07470600e2c92c0f1580830)	2020-01-24 17:29:53 +02:00
Howard Pritchard	0b2b9d7660	Merge pull request #7325 from hppritcha/topic/pr_7304_to_v4.0.x btl/vader: modify how the max attachment address is determined	2020-01-24 08:00:36 -07:00
Howard Pritchard	686f2debda	Merge pull request #7327 from janjust/v4.0.x-oshmem-perf-progress v4.0.x: oshmem/ucx: Fix progress in iput/iget: periodically poke progress to prevent hardware stalls when using DCT transport.	2020-01-24 07:58:54 -07:00
Howard Pritchard	0f54228535	Merge pull request #7321 from hppritcha/topic/pr_2551_to_v4.0.x Topic/pr 7283 to v4.0.x	2020-01-23 16:36:36 -07:00
Tomislav Janjusic	9e755d3803	oshmem/ucx: Improves performance for non-blocking put/get operations. Improves the performance when excess non-blocking operations are posted by periodically calling progress on ucx workers. Co-authored with: Artem Y. Polyakov <artemp@mellanox.com>, Manjunath Gorentla Venkata <manjunath@mellanox.com> Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com> (cherry picked from commit 1b58e3d07388c8c63d485fe308589009279c1f4f)	2020-01-22 21:45:32 +02:00
Nathan Hjelm	66684bbda3	btl/vader: modify how the max attachment address is determined This PR removes the constant defining the max attachment address and replaces it with the largest address that shows up in /proc/self/maps. This should address issues found on AARCH64 where the max address may differ based on the configuration. Since the calculated max address may differ between processes the max address is sent as part of the modex and stored in the endpoint data. Signed-off-by: Nathan Hjelm <hjelmn@google.com> (cherry picked from commit 728d51f9f3f2df6577e5f9729b9d6a0fe9441d37)	2020-01-19 15:05:41 -08:00
Nathan Hjelm	a64a7c8a0a	btl/vader: fix issues with xpmem registration invalidation This commit fixes an issue discovered in the XPMEM registration cache. It was possible for a registration to be invalidated by multiple threads leading to a double-free situation or re-use of an invalidated registration. This commit fixes the issue by setting the INVALID flag on a registration when it will be deleted. The flag is set while iterating over the tree to take advantage of the fact that a registration can not be removed from the VMA tree by a thread while another thread is traversing the VMA tree. References #6524 References #7030 Closes #6534 Signed-off-by: Nathan Hjelm <hjelmn@google.com> (cherry picked from commit f86f805be1145ace46b570c5c518555b38e58cee)	2020-01-19 13:42:00 -08:00
Nathan Hjelm	76002ada84	opal: make interval tree resilient to similar intervals There are cases where the same interval may be in the tree multiple times. This generally isn't a problem when searching the tree but may cause issues when attempting to delete a particular registration from the tree. The issue is fixed by breaking a low value tie by checking the high value then the interval data. If the high, low, and data of a new insertion exactly matches an existing interval then an assertion is raised. Signed-off-by: Nathan Hjelm <hjelmn@google.com> (cherry picked from commit 1145abc0b790f82ea25e24a3becad91ff502769c)	2020-01-19 13:40:57 -08:00
Geoff Paulsen	629d0efa15	Merge pull request #7314 from hppritcha/topic/NEWS_v403 NEWS: update for 4.0.3	2020-01-17 12:41:05 -06:00
Geoff Paulsen	b21c475df6	Merge pull request #7313 from hppritcha/topic/version_for_4.0.3 VERSION - update for v4.0.3	2020-01-17 12:40:52 -06:00
Howard Pritchard	9d32fedd55	NEWS: update for 4.0.3 [skip ci] Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2020-01-17 11:29:08 -07:00
Howard Pritchard	baf1b06c9e	VERSION - update for v4.0.3 Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2020-01-17 10:21:40 -07:00
Howard Pritchard	1cdcce7f89	Merge pull request #7296 from michaellass/v4.0.x-fix-dims_create dims_create: fix calculation of factors for odd squares (v4.0.x)	2020-01-14 09:10:40 -07:00
Geoff Paulsen	3da939b124	Merge pull request #7248 from wckzhang/v4.0.x MTL/OFI: Check threshold number of peers allowed per rank	2020-01-13 14:19:51 -06:00
Geoff Paulsen	6985a5560f	Merge pull request #7291 from gpaulsen/topic/v4.0.x/from_pr7190_7192 Topic/v4.0.x/from pr7190 7192	2020-01-13 14:03:16 -06:00
Michael Lass	ff85c82151	dims_create: fix calculation of factors for odd squares Until now sqrt(n) was missed as a factor for odd square numbers n. This led to suboptimal results of MPI_Dims_create for input numbers like 9, 25, 49, ... Fix the results by including sqrt(n) in the search for factors. Refs: #7186 Signed-off-by: Michael Lass <bevan@bi-co.net> (cherry picked from commit 67490118adb8372d2aefe1d2d923432e51e100cd)	2020-01-10 16:07:40 +01:00
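
The dims_create entry above turns on whether sqrt(n) is included in the factor search. The following is a minimal Python sketch of that idea, not Open MPI's actual C implementation; the function names `factors` and `balanced_dims_2d` are hypothetical and chosen only for illustration. It shows why the search bound must be inclusive for square numbers such as 9, 25, and 49.

```python
import math


def factors(n: int) -> list[int]:
    """Return all divisors of n in ascending order (illustrative helper)."""
    small, large = [], []
    root = math.isqrt(n)
    # The bound is inclusive (root + 1) so that sqrt(n) itself is tested;
    # an exclusive bound would miss it for perfect squares.
    for d in range(1, root + 1):
        if n % d == 0:
            small.append(d)
            if d != n // d:  # avoid listing sqrt(n) twice
                large.append(n // d)
    return small + large[::-1]


def balanced_dims_2d(n: int) -> tuple[int, int]:
    """Split n into two factors that are as close as possible.

    The result is in non-increasing order, which matches the ordering
    MPI_Dims_create uses for the dimensions it fills in.
    """
    # Start at floor(sqrt(n)), inclusive, and walk down to the first divisor.
    for d in range(math.isqrt(n), 0, -1):
        if n % d == 0:
            return (n // d, d)
    raise ValueError("n must be a positive integer")


if __name__ == "__main__":
    for n in (9, 25, 49, 12, 7):
        print(n, factors(n), balanced_dims_2d(n))
```

If the loop in `balanced_dims_2d` started one below the square root, an input of 9 would fall through to the divisor 1 and produce a 9 x 1 grid instead of 3 x 3, which is the kind of suboptimal result the commit message describes.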


29741 commits