openmpi

Author	SHA1	Message	Date
rhc54	b85a5e62ab	Merge pull request #1739 from rhc54/topic/pmix Split the pmix external component into one for the 1.1.4 release, and…	2016-06-01 16:24:44 -07:00
Nathan Hjelm	d844442683	Merge pull request #1738 from hjelmn/ob1_req_fix pml/ob1: fix race on pml completion of send requests	2016-06-01 15:21:52 -06:00
Ralph Castain	12ecf972af	Split the pmix external component into one for the 1.1.4 release, and another for the upcoming 2.0 release. Clean up the configury so the components look for a series-specific function instead of running a program. NOTE: the changes for the 2.0 series are not yet in the PMIx master.	2016-06-01 14:15:24 -07:00
Jeff Squyres	873cebb4c0	Merge pull request #1727 from jsquyres/pr/mpirun-timeout-and-friends mpirun.1in: add descriptions of new options	2016-06-01 17:11:44 -04:00
Nathan Hjelm	ceb2912838	Merge pull request #1736 from hjelmn/ugni_fixes ugni BTL fixes	2016-06-01 14:59:55 -06:00
Nathan Hjelm	086ffc1838	pml/ob1: fix race on pml completion of send requests The request code was setting the request as pml_complete before calling MCA_PML_OB1_SEND_REQUEST_MPI_COMPLETE. This was causing MCA_PML_OB1_SEND_REQUEST_RETURN to be called twice in some cases. The code now mirrors the recvreq code and only sets the request as pml complete if the request has not already been freed. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-01 13:36:06 -06:00
Jeff Squyres	2c3d522147	Merge pull request #1737 from jsquyres/pr/fix-hwloc-valgrind-check fix hwloc valgrind check	2016-06-01 11:14:02 -04:00
Jeff Squyres	d175fd692d	README.ompi: track patches added to hwloc Track post-v1.11.3-release patches applied to the hwloc copy embedded in Open MPI. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-06-01 07:17:05 -07:00
Jeff Squyres	3867bd3640	hwloc.m4: only check for valgrind in non-embedded mode This fixes https://github.com/open-mpi/ompi/issues/1732: i.e., the case where the outer project has its own check for <valgrind/valgrind.h>, but also supplements CPPFLAGS (to find Valgrind's header files) before doing that check. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Ideally, we would tell OMPI to disable autoconf's caching of our valgrind check result so that its check gets the right result after adding CPPFLAGS. Not sure if we can do that. For now, just disable our Valgrind code in embedded mode. This will keep the x86 backend enabled under Valgrind but it will auto-disable itself when finding identical APIC ids anyway (because CPUID returns same outputs for all PUs). Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr> Fixes open-mpi/ompi#1732 (cherry picked from commit open-mpi/hwloc@8b44fb1c81)	2016-06-01 06:58:53 -07:00
Gilles Gouaillardet	57978a75d0	Merge pull request #1717 from ggouaillardet/topic/lex_cleanup configury: clean the flex generated .c files	2016-06-01 13:06:21 +09:00
Nathan Hjelm	5d4bcce042	Merge pull request #1700 from shamisp/topic/cma_config CMA: Fixing logic for CMA system call detection	2016-05-31 20:33:48 -06:00
Nathan Hjelm	340152a635	Merge pull request #1720 from shamisp/topic/vader/max_addr VADER: Adjusting VADER_MAX_ADDRESS for non x86 platforms.	2016-05-31 20:33:28 -06:00
Gilles Gouaillardet	5f565dfec3	configury: clean the flex generated .c files	2016-06-01 11:13:31 +09:00
Jeff Squyres	cf27ec36b3	mpirun.zsh: add options to zsh shell completion Add the following to zsh shell completion: * --get-stack-traces * --report-state-upon-timeout * --timeout Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 16:33:46 -07:00
Jeff Squyres	e9ce11c6a7	help-orterun.txt: minor wordsmithing Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 16:33:46 -07:00
Jeff Squyres	347497cc7e	mpirun.1in: add descriptions of new options Add descriptions for the new --report-state-on-timeout and --get-stack-traces options. Also add --timeout, and cross-reference MPIEXEC_TIMEOUT with it. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 16:33:46 -07:00
Jeff Squyres	36f653164f	.mailmap: Updates Remove all @open-mpi-git-mirror entries; those are no longer necessary since the official migration to Git/Github. Add aliases for @users.noreply.github.com addresses. Add fixes for what look like accidental name misspellings / common-name-isms. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 19:18:24 -04:00
Jeff Squyres	1d83d594c8	AUTHORS: reformat and include all git log email addresses Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 19:18:24 -04:00
Nathan Hjelm	bf10d79914	btl/ugni: remove erroneous unlock The endpoint lock was being released twice in mca_btl_ugni_get_ep. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-31 16:52:53 -06:00
Nathan Hjelm	cc96097873	btl/ugni: fix bug when attempting unaligned get on aries This commit fixes a programming error when using an aries nic. The documentation of ugni shows that only the local alignment restriction for get was lifted on aries. There is still a remote address alignment restriction. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-31 16:52:09 -06:00
Jeff Squyres	17202e5177	Merge pull request #1733 from jsquyres/pr/hwloc1113-fix hwloc1113: add missing file to Makefile.am	2016-05-31 13:59:08 -04:00
Jeff Squyres	5cfee95ea4	hwloc1113: add missing file to Makefile.am Lack of this file causes a failure when you run autogen.pl on a distribution tarball. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 09:57:50 -07:00
rhc54	93ff4ce36d	Merge pull request #1731 from rhc54/topic/timeout Provide ETIMEDOUT as the mpirun exit code if the timeout limit was hit	2016-05-31 08:41:21 -07:00
Ralph Castain	0cd0ccb7fd	Provide ETIMEDOUT as the mpirun exit code if the timeout limit was hit	2016-05-31 07:45:31 -07:00
Gilles Gouaillardet	1bbc5fadee	ompi/win: silence another warning	2016-05-31 13:18:39 +09:00
Gilles Gouaillardet	c41321b9e5	ompi/win: silence warning	2016-05-31 13:03:20 +09:00
rhc54	0965cb3d41	Merge pull request #1730 from rhc54/topic/pmixext Patch from Gilles - modify detection of PMIx version for external libraries	2016-05-30 18:50:12 -07:00
Ralph Castain	7b115a9e0b	Patch from Gilles - modify detection of PMIx version for external libraries	2016-05-30 14:30:10 -07:00
Nathan Hjelm	60519c2b4e	cma: add support for MIPS and ARM Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-05-30 12:13:20 -06:00
George Bosilca	d2abff583e	Fix race condition during BTL TCP tear-down. bot🏷️bug bot:assign:@hjelmn	2016-05-30 10:47:14 -05:00
rhc54	876257469e	Merge pull request #1728 from rhc54/topic/sim Enable simulation of large-scale clusters	2016-05-29 21:29:16 -07:00
Ralph Castain	3913595e10	Enable simulation of large-scale clusters by allowing multiple daemons/node. Specifying the ras_base_multiplier parameter to be greater than 1 will cause ORTE to replicate each allocated node by that factor. A daemon will be spawned for each replica, thus letting ORTE function as if it were on a much larger cluster. Note that this cannot be used for MPI performance testing. It is really only useful for ORTE scaling tests. It also only works with the rsh/ssh launcher.	2016-05-29 18:56:18 -07:00
rhc54	a93c01d4f4	Merge pull request #1724 from rhc54/topic/timeout Add a timeout cmd line option and an option to report state info upon timeout to assist with debugging Jenkins tests	2016-05-28 08:36:41 -07:00
Ralph Castain	ebe159acef	Add a timeout cmd line option and an option to report state info upon timeout to assist with debugging Jenkins tests If requested, obtain stack traces for each application process and report them to stderr upon timeout stack traces: minor improvements - Also include the hostname and PID of each process for which we're sending the stack traces (vs. just including the ORTE process name) - Send a specific error message if we couldn't find "gstack" in the $PATH (e.g., on OS X) - Send a specific error message if gstack fails to run - Print a message that obtaining the stack traces may take a few seconds so that users don't wonder what's happening Signed-off-by: Jeff Squyres <jsquyres@cisco.com> help-orterun.txt: minor tweaks Trivial update: show "--timeout" (instead of "-timeout") in the help message, just to encourage the use of double-dash options. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> trivial: stacktrace -> stack trace Trivial wordsmithing. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-28 08:36:25 -07:00
Jeff Squyres	59f4a765b3	Merge pull request #1656 from hpcraink/pr/make_manpage In case, we do not build Fortran, Fortran 2008 or CXX, the regexp in …	2016-05-28 11:02:12 -04:00
Jeff Squyres	e126d2cd18	Merge pull request #1584 from bgoglin/master Update hwloc to v1.11.3	2016-05-28 11:01:54 -04:00
Nathan Hjelm	d8fd3a411a	Merge pull request #1725 from hjelmn/request_fixes ompi/request: fix bugs in MPI_Wait_some and MPI_Wait_any	2016-05-27 13:47:49 -06:00
Nathan Hjelm	0591139f49	ompi/request: fix bugs in MPI_Wait_some and MPI_Wait_any This commit fixes two bugs in MPI_Wait_any: - If all requests are inactive then the sync wait would hang forever because no requests are attached to the sync. - The request pointer was pointing to the request before the completed request which caused the wrong request to be freed or marked inactive. MPI_Wait_some had a similar issue if all the requests were pending. These issues were identified by MTT. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-27 12:36:10 -06:00
Nathan Hjelm	3974987ba3	Merge pull request #1723 from hjelmn/warning_fixes win: fix warnings	2016-05-27 12:26:04 -06:00
Nathan Hjelm	0adfb328e1	win: fix warnings Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-05-27 10:14:02 -06:00
rhc54	e5ee7adbe0	Merge pull request #1722 from rhc54/topic/pmixext Enable PMIx external support for both 1.1.4 and 2.0 versions	2016-05-27 08:59:09 -07:00
Ralph Castain	55923eacd3	Stealing some pieces of Josh Hursey's PR #1583 and modifying a bit, allow the opal/pmix external component to handle both PMIx 1.1.4 and PMIx 2.0 versions. Automatically detect the version of the target external library and adjust the only two APIs that changed (PMIx_Init and PMIx_Finalize) Rename temp vars in .m4 to avoid conflict with Travis	2016-05-27 08:06:31 -07:00
Nathan Hjelm	28dfa36a3f	btl/ugni: fix bug when attempting unaligned get on aries This commit fixes a programming error when using an aries nic. The documentation of ugni shows that only the local alignment restriction for get was lifted on aries. There is still a remote address alignment restriction. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-27 08:22:13 -06:00
Nathan Hjelm	c19426ac1b	btl/ugni: add support for additional atomic operations This commit adds support for Cray Aries atomic operations. This includes 32-bit and floating point support. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-27 08:22:13 -06:00
Nathan Hjelm	23fe19a956	btl: add support for more atomics This commit adds support for more atomic operations and types. The operations added are logical and, logical or, logical xor, swap, min, and max. New types are 32-bit int by using the MCA_BTL_ATOMIC_FLAG_32BIT flag, 64-bit float by using the MCA_BTL_ATOMIC_FLAG_FLOAT flag, and 32-bit float by using both flags. Floating point numbers are supported by packing the number in as an int64_t or int32_t. We will update the btl interface in the future to make this less confusing. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-27 08:22:13 -06:00
Nathan Hjelm	d25b846c01	Merge pull request #1704 from hpcraink/pr/configure_framework Fix configure for FreePGI on OSX	2016-05-26 17:01:08 -06:00
Nathan Hjelm	8c9292d5d1	Merge pull request #1721 from hjelmn/xrc_fix btl/openib: fix XRC WQE calculation	2016-05-26 17:00:31 -06:00
Nathan Hjelm	56bdcd0888	btl/openib: fix XRC WQE calculation Before dynamic add_procs support was committed to master we called add_procs with every proc in the job. The XRC code in the openib btl was taking advantage of this and setting the number of work queue entries (WQE) based on all the procs on a remote node. Since that is no longer the case we can not simply increment the sd_wqe field on the queue pair. To fix the issue a new field has been added to the xrc queue pair structure to keep track of how many wqes there are total on the queue pair. If a new endpoint is added that increases the number of wqes and the xrc queue pair is already connected the code will attempt to modify the number of wqes on the queue pair. A failure is ignored because all that will happen is the number of active send work requests on an XRC queue pair will be more limited. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-26 15:58:31 -06:00
Aurelien Bouteiller	49bd28d0ac	Merge pull request #1714 from hjelmn/scif_exclusivity btl/scif: reduce default exclusivity	2016-05-26 17:53:11 -04:00
Pavel Shamis (Pasha)	60fd25f3fb	VADER: Adjusting VADER_MAX_ADDRESS for non x86 platforms. The original VADER_MAX_ADDRESS was tuned for x86_64 platforms only. For non x86_64 platforms we can use XPMEM_MAXADDR_SIZE. Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>	2016-05-26 16:38:04 -05:00


25376 commits
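
Commit 23fe19a956 in the list above says that floating point operands are passed to the BTL atomics by "packing the number in as an int64_t or int32_t". The sketch below shows one common way to do that kind of packing in C, by copying the bit pattern with memcpy. The helper names (pack_double, unpack_double, pack_float, unpack_float) are hypothetical and are not part of Open MPI; how the BTL interface consumes the packed value, and how the MCA_BTL_ATOMIC_FLAG_32BIT and MCA_BTL_ATOMIC_FLAG_FLOAT flags are passed, is not shown here.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* int32_t, int64_t */
#include <string.h>   /* memcpy */

/* The packing only makes sense if the sizes match exactly. */
static_assert(sizeof(double) == sizeof(int64_t), "double must be 64 bits");
static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Hypothetical helper: copy the bit pattern of a double into an int64_t.
 * memcpy avoids the aliasing problems of casting through pointers. */
static inline int64_t pack_double(double value)
{
    int64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical helper: recover the double from its packed bit pattern. */
static inline double unpack_double(int64_t bits)
{
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Hypothetical helper: same idea for a 32-bit float and int32_t. */
static inline int32_t pack_float(float value)
{
    int32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical helper: recover the float from its packed bit pattern. */
static inline float unpack_float(int32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

The packed integer carries only the bit pattern; the receiver has to know from some side channel (in the commit's description, the flags) that the operand should be interpreted as a float rather than as an integer.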