openmpi

Author	SHA1	Message	Date
Ralph Castain	12ecf972af	Split the pmix external component into one for the 1.1.4 release, and another for the upcoming 2.0 release. Clean up the configury so the components look for a series-specific function instead of running a program. NOTE: the changes for the 2.0 series are not yet in the PMIx master.	2016-06-01 14:15:24 -07:00
Jeff Squyres	d175fd692d	README.ompi: track patches added to hwloc Track post-v1.11.3-release patches applied to the hwloc copy embedded in Open MPI. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-06-01 07:17:05 -07:00
Jeff Squyres	3867bd3640	hwloc.m4: only check for valgrind in non-embedded mode This fixes https://github.com/open-mpi/ompi/issues/1732: i.e., the case where the outer project has its own check for <valgrind/valgrind.h>, but also supplements CPPFLAGS (to find Valgrind's header files) before doing that check. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Ideally, we would tell OMPI to disable autoconf's caching of our valgrind check result so that its check gets the right result after adding CPPFLAGS. Not sure if we can do that. For now, just disable our Valgrind code in embedded mode. This will keep the x86 backend enabled under Valgrind but it will auto-disable itself when finding identical APIC ids anyway (because CPUID returns same outputs for all PUs). Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Fixes open-mpi/ompi#1732 (cherry picked from commit open-mpi/hwloc@8b44fb1c81)	2016-06-01 06:58:53 -07:00
Gilles Gouaillardet	57978a75d0	Merge pull request #1717 from ggouaillardet/topic/lex_cleanup configury: clean the flex generated .c files	2016-06-01 13:06:21 +09:00
Nathan Hjelm	5d4bcce042	Merge pull request #1700 from shamisp/topic/cma_config CMA: Fixing logic for CMA system call detection	2016-05-31 20:33:48 -06:00
Nathan Hjelm	340152a635	Merge pull request #1720 from shamisp/topic/vader/max_addr VADER: Adjusting VADER_MAX_ADDRESS for non x86 platforms.	2016-05-31 20:33:28 -06:00
Gilles Gouaillardet	5f565dfec3	configury: clean the flex generated .c files	2016-06-01 11:13:31 +09:00
Jeff Squyres	5cfee95ea4	hwloc1113: add missing file to Makefile.am Lack of this file causes a failure when you run autogen.pl on a distribution tarball. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 09:57:50 -07:00
George Bosilca	d2abff583e	Fix race condition during BTL TCP tear-down. bot:label:bug bot:assign:@hjelmn	2016-05-30 10:47:14 -05:00
Jeff Squyres	e126d2cd18	Merge pull request #1584 from bgoglin/master Update hwloc to v1.11.3	2016-05-28 11:01:54 -04:00
Ralph Castain	55923eacd3	Stealing some pieces of Josh Hursey's PR #1583 and modifying a bit, allow the opal/pmix external component to handle both PMIx 1.1.4 and PMIx 2.0 versions. Automatically detect the version of the target external library and adjust the only two APIs that changed (PMIx_Init and PMIx_Finalize) Rename temp vars in .m4 to avoid conflict with Travis	2016-05-27 08:06:31 -07:00
Nathan Hjelm	d25b846c01	Merge pull request #1704 from hpcraink/pr/configure_framework Fix configure for FreePGI on OSX	2016-05-26 17:01:08 -06:00
Nathan Hjelm	8c9292d5d1	Merge pull request #1721 from hjelmn/xrc_fix btl/openib: fix XRC WQE calculation	2016-05-26 17:00:31 -06:00
Nathan Hjelm	56bdcd0888	btl/openib: fix XRC WQE calculation Before dynamic add_procs support was committed to master we called add_procs with every proc in the job. The XRC code in the openib btl was taking advantage of this and setting the number of work queue entries (WQE) based on all the procs on a remote node. Since that is no longer the case we can not simply increment the sd_wqe field on the queue pair. To fix the issue a new field has been added to the xrc queue pair structure to keep track of how many wqes there are total on the queue pair. If a new endpoint is added that increases the number of wqes and the xrc queue pair is already connected the code will attempt to modify the number of wqes on the queue pair. A failure is ignored because all that will happen is the number of active send work requests on an XRC queue pair will be more limited. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-26 15:58:31 -06:00
Aurelien Bouteiller	49bd28d0ac	Merge pull request #1714 from hjelmn/scif_exclusivity btl/scif: reduce default exclusivity	2016-05-26 17:53:11 -04:00
Pavel Shamis (Pasha)	60fd25f3fb	VADER: Adjusting VADER_MAX_ADDRESS for non x86 platforms. The original VADER_MAX_ADDRESS was tuned for x86_64 platforms only. For non x86_64 platforms we can use XPMEM_MAXADDR_SIZE. Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>	2016-05-26 16:38:04 -05:00
Nathan Hjelm	99627319f0	btl/ugni: reduce overhead of progress function This commit reduces the overhead of calling the ugni progress function. It does the following: - Check for new connections once every eight calls. - Do not call remote smsg progress unless we are connected to at least one remote peer. - Do not call rdma progress unless at least one rdma fragment is outstanding. - Check endpoint wait list size before obtaining a lock. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 14:27:34 -06:00
Nathan Hjelm	5caf12cd9b	btl/scif: reduce default exclusivity This commit reduces the default exclusivity so that btl/scif is not used for send/recv over other shared memory transports. Fixes open-mpi/ompi#1712 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 14:25:07 -06:00
Rainer Keller	3727cba9bb	Fix compilation for FreePGI on OSX Our checks and the ones of libevent are somewhat flawed. If adding multiple "-framework" to CXXFLAGS or CFLAGS, we strip the keyword from the command-line, not good. libevent however assumes plain gcc without testing properly that the compiler supports -Wno-deprecated-declarations.	2016-05-25 09:12:39 +02:00
Nathan Hjelm	461ca1203b	Merge pull request #1703 from hjelmn/grdma_cuda_fix rcache/grdma: fix typo in cuda code	2016-05-24 18:51:22 -06:00
bosilca	b90c83840f	Refactor the request completion (#1422) * Remodel the request. Added the wait sync primitive and integrated it into the PML and MTL infrastructure. The multi-threaded requests are now significantly less heavy and less noisy (only the threads associated with completed requests are signaled). * Fix the condition to release the request.	2016-05-24 18:20:51 -05:00
Nathan Hjelm	af52dad8f8	rcache/grdma: fix typo in cuda code Fixes open-mpi/ompi#1702 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-24 15:56:39 -06:00
Pavel Shamis (Pasha)	d984b4b3f9	CMA: Fixing logic for CMA system call detection The OPAL_CMA_NEED_SYSCALL_DEFS is always defined/set to 0 or 1. Therefore instead of checking if the macro is defined, we have to look at the value itself. Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>	2016-05-24 14:53:25 -05:00
Ralph Castain	80f4e3b872	Fix the --tune problem by searching the argv for MCA params in advance of opal_init_util. Only search the first app_context as we historically have done - we can debate whether or not to search all app_contexts	2016-05-23 21:09:44 -07:00
Nathan Hjelm	37e9e2c660	mca/base: fix typo in flag enumeration This commit fixes a typo in flag enumeration that can cause the parser to miss valid flags or crash. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-23 12:21:34 -06:00
Gilles Gouaillardet	d5a2ac6f2f	btl/openib: fix #if vs #ifdef	2016-05-23 14:27:33 +09:00
Gilles Gouaillardet	5a8cbe5a8f	btl/openib: remove obsolete reference to MEMORY_LINUX_MALLOC_ALIGN_ENABLED macro	2016-05-23 14:12:21 +09:00
Gilles Gouaillardet	8466a3daf3	pmix: update .gitignore git ignore opal/mca/pmix/pmix114/pmix/include/pmix/autogen/config.h.in git rm opal/mca/pmix/pmix114/pmix/include/pmix/autogen/config.h.in git ignore opal/mca/pmix/pmix*/...	2016-05-23 11:58:07 +09:00
Nathan Hjelm	31bfeede82	bml/r2: always add btl progress function This commit changes the behavior of bml/r2 from conditionally registering btl progress functions to always registering progress functions. Any progress function belonging to a btl that is not yet in use is registered as low-priority. As soon as a proc is added that will make use of the btl it is re-registered normally. This works around an issue with some btls. In order to progress a first message from an unknown peer both ugni and openib need to have their progress functions called. If either btl is not in use after the first call to add_procs the callback was never happening. This commit ensures the btl progress function is called at some point but the number of progress callbacks is reduced from normal to ensure lower overhead when a btl is not used. The current ratio is 1 low priority progress callback for every 8 calls to opal_progress(). Fixes open-mpi/ompi#1676 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-21 15:54:04 -04:00
Ralph Castain	4e0749f03d	Remove verbose error messages	2016-05-20 10:04:26 -07:00
Ralph Castain	42ecffb6d0	Move the registration of MCA params out of the init of the var system - put them in with the rest of the OPAL MCA param registrations Take another shot at untangling the spaghetti orterun: fix for command line parsing orte-submit calls opal_init_util () before parsing out MCA command line options (-mca, -am, etc). This prevents mpirun from setting opal MCA variables for some frameworks as well as the MCA base. This is because when a framework is opened all of its variables are set to read-only. Eventually we want to lift this restriction on some MCA variables but since -mca is affected we must parse out the MCA command line options before opal_init_util(). This commit fixes the bug by adding a new option to opal_cmd_line_parse (ignore unknown option) so orte-submit can pre-parse the command line for MCA options. Signed-off-by: Nathan Hjelm <hjelmn@me.com> Minor cleanups to avoid releasing/recreating the cmd line	2016-05-20 09:59:50 -07:00
Brice Goglin	ca621330a6	Update hwloc to v1.11.3 Remove contrib/windows/ Merge hwlocXYZ/hwloc/README-ompi.txt back into hwlocXYZ/README-ompi.txt instead of having both. Add README.txt in new automake-required directory contrib/systemd/ Keep the following patches applied since they are not in 1.11.3 linux: actually enable libudev based on the result of AC_CHECK_LIB (cherry picked from open-mpi/hwloc@9549fd59af) configure: check the actual may_alias syntax that we use (cherry picked from open-mpi/hwloc@0ab7af5e90)	2016-05-20 07:20:16 +02:00
Gilles Gouaillardet	5ec1eedbae	Merge pull request #1682 from ggouaillardet/topic/fix-ethtool-again opal/util/ethtool: use system ethtool_cmd_speed when available	2016-05-20 10:30:43 +09:00
Gilles Gouaillardet	cbbdce05b1	pmix/pmix114: silence a warning	2016-05-20 09:35:26 +09:00
Gilles Gouaillardet	ed3fd1775f	rcache/grdma: silence a warning	2016-05-20 09:30:29 +09:00
Gilles Gouaillardet	a01a5487a8	opal/util/ethtool: use system ethtool_cmd_speed when available Refs: open-mpi/ompi#1679	2016-05-20 09:05:09 +09:00
rhc54	99d3c283f5	Merge pull request #1681 from rhc54/topic/pmixupdate Update PMIx 114 to current release candidate	2016-05-19 13:50:16 -07:00
Ralph Castain	6f743f81b6	Update PMIx 114 to current release candidate	2016-05-19 12:55:05 -07:00
Jeff Squyres	87233aae49	ethtool: better handle portability Be sure to handle the case where we don't have ethtool support at all. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-19 10:57:14 -07:00
Gilles Gouaillardet	fd93d236b1	opal/util/ethtool: fix compilation on older Linux when struct ethtool_cmd has no speed_hi field Refs: open-mpi/ompi#1628	2016-05-19 11:58:04 +09:00
Jeff Squyres	66f53ec29a	Merge pull request #1628 from kmroz/wip-btl-tcp-ethtool-speed btl/tcp: autodetect bandwidth and latency if unset by the user	2016-05-18 12:12:55 -04:00
Nathan Hjelm	9371a6a52d	Merge pull request #1673 from hjelmn/fix_rcache_deadlock rcache: fix deadlock in multi-threaded environments	2016-05-18 08:32:21 -07:00
Karol Mroz	ca6ddf3270	btl/tcp: autodetect bandwidth and latency if unset Fixes open-mpi/ompi#120 Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-05-18 16:25:52 +02:00
Karol Mroz	b9c6c43c6b	btl/tcp: add default defines for bandwidth and latency Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-05-18 16:25:52 +02:00
Karol Mroz	31e33a64f9	opal/util: add function to obtain interface speed If kernel ethtool_cmd_speed() is not available, use copies if possible. Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-05-18 16:25:51 +02:00
Nathan Hjelm	ab8ed177f5	rcache: fix deadlock in multi-threaded environments This commit fixes several bugs in the registration cache code: - Fix a programming error in the grdma invalidation function that can cause an infinite loop if more than 100 registrations are associated with a munmapped region. This happens because the mca_rcache_base_vma_find_all function returns the same 100 registrations on each call. This has been fixed by adding an iterate function to the vma tree interface. - Always obtain the vma lock when needed. This is required because there may be other threads in the system even if opal_using_threads() is false. Additionally, since it is safe to do so (the vma lock is recursive) the vma interface has been made thread safe. - Avoid calling free() while holding a lock. This avoids race conditions with locks held outside the Open MPI code. Fixes open-mpi/ompi#1654. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-17 09:02:40 -06:00
Nathan Hjelm	f6938868bd	Merge pull request #1659 from hjelmn/sync_64 sync_builtin: check for 64-bit atomic support	2016-05-17 05:40:04 -07:00
rhc54	8b534e9897	Merge pull request #1668 from rhc54/topic/slurm When direct launching applications, we must allow the MPI layer to pr…	2016-05-16 12:23:19 -07:00
Howard Pritchard	1a676e5b35	pmix/cray: fix some breakage Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2016-05-16 12:45:05 -05:00
Gilles Gouaillardet	4e21933a74	memory/patcher: declare __curbrk as extern in order not to generate an (uninitialized) common symbol	2016-05-16 09:30:11 +09:00

4157 commits