openmpi

Author	SHA1	Message	Date
Jeff Squyres	60a810c88b	Merge pull request #4016 from jsquyres/pr/libnl-you-win-again opal_check_libnl: it's not a fatal error if libnl check fails	2017-09-25 11:53:38 -04:00
Jeff Squyres	db10da97e3	Merge pull request #4257 from jsquyres/pr/moar-hwloc-cuda-cleanup hwloc2a/configure.m4: be more careful in with_cuda->enable_cuda	2017-09-25 11:35:33 -04:00
Jeff Squyres	0568f4a820	opal_check_libnl: remove abort on libnl check failure Per https://github.com/open-mpi/ompi/issues/3995, it should not be a fatal error if the libnl checks fail. Instead, just fail the check and let the upper layer decide what to do. In this case, OPAL_CHECK_PACKAGE will mark this library as no good, and then propagate that upward. E.g., if libfoo fails the libnl check, and the user had specified --with-libfoo, this will eventually cause configure to fail (because the libnl check will fail with libfoo, which will cause OPAL_CHECK_PACKAGE to fail with libfoo, which will ultimately cause some upper-level logic to realize "a human asked for libfoo but we could not provide it -- abort!"). However, if libfoo fails the libnl check and the user did not specify --with-libfoo, then this will cause the upper layer to silently skip libfoo (because the libnl check will fail libfoo, which will cause OPAL_CHECK_PACKAGE to fail libfoo, but then the upper-level logic will realize "oh, we can't use libfoo, but a human didn't ask for it -- so just skip libfoo support."). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-09-25 07:49:43 -07:00
Gilles Gouaillardet	e28cc8c986	configury: reset all flags if libnl v3 cannot be used Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-09-25 17:21:47 +09:00
Brian Barrett	c768f980e1	reachable: Fix string length Coverity warning Make sure hostnames are null terminated, even when they were too long to fit in the hostname buffer. Fixes: CID 1418232 Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-24 19:38:45 -07:00
Ralph Castain	fa762973ae	Merge pull request #4259 from rhc54/topic/cleanup Minor cleanups for when using external pmix	2017-09-24 11:12:37 -07:00
Ralph Castain	fcb7a2f29b	Minor cleanups for when using external pmix Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-09-24 09:53:04 -07:00
Jeff Squyres	2ec2a329dc	hwloc2a/configure.m4: be more careful in with_cuda->enable_cuda Be a little more deliberate about converting OMPI's --with-cuda CLI value to hwloc's --enable-cuda configure option. Also, unconditionally disable hwloc NVML support (because Open MPI is not currently using it). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-09-24 05:36:23 -07:00
Ralph Castain	448a8ee953	Merge pull request #4256 from rhc54/topic/scaling Get the scaling test to properly run a scan across the #nodes	2017-09-22 21:05:32 -07:00
Ralph Castain	1dd45e0f30	Get the scaling test to properly run a scan across the #nodes Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-09-22 21:04:36 -07:00
Ralph Castain	ae01dcee7b	Merge pull request #4218 from rhc54/topic/config Attempt vpath inside component	2017-09-22 18:02:45 -07:00
Ralph Castain	4f932819aa	Update platform file Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-09-22 16:05:57 -07:00
Ralph Castain	5fed7330e7	Update the configure logic to separate the emitting of a libpmix library from with-devel-headers. Instead, we create a new --enable-install-libpmix expressly for that purpose. Continue to link the new library back to libopen-pal to resolve the renamed symbols. Update opal configure logic to set disable_dlopen when disable_mca_dso is given. Fix typos in disable_dlopen when setting variables (incorrect inclusion of quotes) Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-09-22 16:02:57 -07:00
Ralph Castain	40a25e6077	Merge pull request #4254 from rhc54/topic/fixes Silence warnings and sync to PMIx master	2017-09-22 14:59:19 -07:00
Matias Cabral	2beeaa46fa	Merge pull request #4241 from yburette/topic/fix_provider_include mtl/ofi: Fix provider selection.	2017-09-22 11:23:43 -07:00
Ralph Castain	3493c43468	Sync to PMIx master Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-09-22 10:48:00 -07:00
Ralph Castain	b4ad81da85	Silence warnings about verbose output Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 2c9655bb631742fd7693e00289d1949f4b2fc155)	2017-09-22 09:05:03 -07:00
Ralph Castain	9edea02b46	Merge pull request #4246 from rhc54/topic/spawn Fully support OMPI spawn options.	2017-09-21 11:23:34 -07:00
Jeff Squyres	9708e9dd21	Merge pull request #4245 from jsquyres/pr/disable-hwloc-cuda hwloc: do not build hwloc CUDA support if --without-cuda used (and also always disable hwloc GL and OpenCL support)	2017-09-21 13:43:01 -04:00
Ralph Castain	fe9b584c05	Fully support OMPI spawn options. Fix a bug in the round-robin mappers where we weren't adding nodes to the job map node array, and so resources were not released Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 285d8cfef74ffc899e9c51e1d9c597b7fb2ceb89)	2017-09-21 10:29:27 -07:00
Brice Goglin	84a721d17a	hwloc: disable GL and OpenCL in the hwloc component Open MPI doesn't use GL or OpenCL OS devices, so just disable them in hwloc. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-09-21 08:25:46 -07:00
Jeff Squyres	f5d51dc2f5	hwloc: do not build hwloc CUDA support if --without-cuda used Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-09-21 08:24:54 -07:00
Gilles Gouaillardet	d704712bad	Merge pull request #4242 from ggouaillardet/topic/libnl3 configury: do not use libnl-3 when it is half broken	2017-09-21 16:16:00 +09:00
Gilles Gouaillardet	94747a1d28	configury: do not use libnl-3 when it is half broken Refs. open-mpi/ompi#4211 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-09-21 15:27:59 +09:00
yohann	1f8cabc890	mtl/ofi: Fix provider selection. This allows mtl_ofi_provider_include to work with layered providers as well. e.g. --mca mtl_ofi_provider_include "providerX;ofi_rxm" Signed-off-by: yohann <yohann.burette@intel.com>	2017-09-20 16:00:50 -07:00
Jeff Squyres	a182b4fbaa	Merge pull request #3883 from jsquyres/pr/readme-pathscale-update README: Pathscale updates	2017-09-20 10:07:48 -04:00
Gilles Gouaillardet	da2966ace1	Merge pull request #2191 from ggouaillardet/topic/remove_disable_mpi_io configury: remove the --disable-mpi-io option	2017-09-20 15:55:48 +09:00
Gilles Gouaillardet	b9315edb85	configury: remove the --disable-mpi-io option Fixes open-mpi/ompi#2185 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-09-20 14:39:09 +09:00
bosilca	ab68aced23	Merge pull request #3738 from bosilca/topic/tcp_event_count Fix the TCP performance impact when BTL not used	2017-09-19 23:08:58 -04:00
Brian Barrett	2c59fb7a58	Merge pull request #4221 from AntoineD/master Fix: Outdated README link #4220	2017-09-19 19:48:13 -07:00
Gabe Saba	c6235a9a0f	reachable: add tests Add test suite for netlink and weighted reachable components. We don't have a great way of running components through unit tests today, so make them stand-alone tests that are run with mpirun and such. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Brian Barrett	ae122c4b17	reachable: Change ownership to Amazon Amazon is going to use the reachable framework to fix some connection bugs in the TCP BTL, so claim support ownership of the weighted and netlink components. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Gabe Saba	9e53605a6f	reachable: Implement netlink component Wire up the libnl utilities Jeff and Ralph added previously to the netlink reachable component so that it actually does work. The algorithm is a bit simplistic, but should work for our use cases. If there's a route, assume the two interfaces can talk. If there's no gateway, assume the two interfaces are in the same subnet, and give preference to that connection. If there's a gateway, assume there's a route, but the interfaces are not in the same subnet. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Gabe Saba	4d81006222	reachable: Add IPv6 support to libnl code Add IPv6 support to the netlink component's utility wrappers around libnl-3. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Brian Barrett	4d5bfd0429	reachable: Simplify gateway check in netlink The netlink component's libnl wrapper code returned the next hop in the route table to allow the calling code to differentiate between same and different networks, which is a fine comparison for IPv4, but is pretty expensive for IPv6 (coming soon to a netlink component near you). Rather than provide extra information (the address of the next hop), just provide whether there is a gateway or not, which is all the netlink component actually needs. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Brian Barrett	a543e7f130	reachable: remove libnl-1 support from netlink The netlink reachable component has never been released in a usable form, but had code copied from usNIC to support both libnl-1 and libnl-3. If nothing else, this code was a little buggy in handling the case where libnl-3 but not libnl-route-3 were installed. Jeff and I decided to drop libnl-1 support from the netlink reachable component, given that it's getting pretty old and the weighted component provides the same information that the TCP BTL and OOB are using today, so libnl-1 customers won't see a step backwards from where they are today. Signed-off-by: Brian Barrett <bbarrett@mazon.com>	2017-09-19 19:42:54 -07:00
Gabe Saba	3f8d294191	reachable: Enable weighted component / fix interface Based on work from usNIC, the best way to use the reachability information the reachable components return is to build a connectivity graph between the two peers and run a bipartite graph solver. Rather than returning the "best" pairing, the reachability framework now returns the entire mapping, allowing a (soon to be added) graph solver to build the "optimal" connectivity pairing. Practically, this means changing the return type of the reachable() function and rewriting the weighted_reachable() function to return the full mapping. The netlink_reachable() function still always returns NULL. At the same time, fix bit-rot in the weighted component and enable builds of the component by removing the opal_ignore. Also, add IPv6 support to the weighted component to support both use cases in the TCP BTL. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Gabe Saba	8f2df42055	reachable: Initialize / Finalize reachable framework Initialize the reachable framework during opal_init() and tear it back down during opal_finalize(). The framework was never used, so the lack of initialization didn't matter, but this is a required step in using the framework. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Brian Barrett	6048c543fa	reachable: Rename code copied from usnic Ralph and Jeff created the reachable framework and added the netlink component based on code copied from the usnic btl. However, they never renamed all the symbols from the libnl compatibility code. This patch finishes the rename. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Brian Barrett	502f383f4d	util: Add link-local check to net interface Add a check for link-local IPv6 addresses to the net interface to support better computation of network pairings in the weighted reachable component. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2017-09-19 19:42:54 -07:00
Ralph Castain	a09c090709	Merge pull request #4237 from rhc54/topic/cnct Fix tool connection logic so we properly search for default session server, perform specified number of retries, etc.	2017-09-19 14:27:43 -07:00
Ralph Castain	e575c4d6f9	Fix tool connection logic so we properly search for default session server, perform specified number of retries, etc. Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 7c755e01004f8b86c71f1729662979ea45ab1adb)	2017-09-19 13:35:46 -07:00
Howard Pritchard	bfd5ed6e98	Merge pull request #1910 from hpcraink/pr/shmem_fix_f77 Fix shmem.fh: fails to compile with F77 fixed-form compiled programs...	2017-09-19 14:28:08 -06:00
Ralph Castain	16de607607	Merge pull request #4234 from rhc54/topic/upstream Ensure we update the total_slots_alloc field on each job. Correct the client example	2017-09-19 09:03:04 -07:00
Jeff Squyres	daa48906f5	Merge pull request #4233 from jsquyres/pr/remove-extraneous-done-output-from-rmaps-base rmaps/base: remove debugging "DONE" message	2017-09-19 11:34:52 -04:00
Ralph Castain	658c3d1d51	Ensure we update the total_slots_alloc field on each job. Correct the client example Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit bcedd12a8a24dd246f04ff13b4fd2f1bbac6ce5a)	2017-09-19 08:14:14 -07:00
Jeff Squyres	7cccee9d92	rmaps/base: remove debugging "DONE" message Thanks to Ben Menadue for reporting and supplying the patch. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-09-19 07:10:00 -07:00
Ralph Castain	3b3ce243bb	Merge pull request #4214 from karasevb/pmix1_hang_fix pmix: fixed immediate request for PMIx v1.2	2017-09-19 06:51:25 -07:00
Ralph Castain	48bbf707c3	Merge pull request #4232 from rhc54/topic/local Implement support for "local" range when publishing data	2017-09-18 20:18:06 -07:00
Ralph Castain	5708872112	Implement support for "local" range when publishing data Signed-off-by: Ralph Castain <rhc@open-mpi.org> (cherry picked from commit 2d54f7e0dd3a47260b0b2634aae3361316005933)	2017-09-18 19:34:08 -07:00

27754 Commits