openmpi

Author	SHA1	Message	Date
Jeff Squyres	17202e5177	Merge pull request #1733 from jsquyres/pr/hwloc1113-fix hwloc1113: add missing file to Makefile.am	2016-05-31 13:59:08 -04:00
Jeff Squyres	5cfee95ea4	hwloc1113: add missing file to Makefile.am Lack of this file causes a failure when you run autogen.pl on a distribution tarball. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-31 09:57:50 -07:00
rhc54	93ff4ce36d	Merge pull request #1731 from rhc54/topic/timeout Provide ETIMEDOUT as the mpirun exit code if the timeout limit was hit	2016-05-31 08:41:21 -07:00
Ralph Castain	0cd0ccb7fd	Provide ETIMEDOUT as the mpirun exit code if the timeout limit was hit	2016-05-31 07:45:31 -07:00
Gilles Gouaillardet	1bbc5fadee	ompi/win: silence another warning	2016-05-31 13:18:39 +09:00
Gilles Gouaillardet	c41321b9e5	ompi/win: silence warning	2016-05-31 13:03:20 +09:00
rhc54	0965cb3d41	Merge pull request #1730 from rhc54/topic/pmixext Patch from Gilles - modify detection of PMIx version for external libraries	2016-05-30 18:50:12 -07:00
Ralph Castain	7b115a9e0b	Patch from Gilles - modify detection of PMIx version for external libraries	2016-05-30 14:30:10 -07:00
George Bosilca	d2abff583e	Fix race condition during BTL TCP tear-down. bot:label:bug bot:assign:@hjelmn	2016-05-30 10:47:14 -05:00
rhc54	876257469e	Merge pull request #1728 from rhc54/topic/sim Enable simulation of large-scale clusters	2016-05-29 21:29:16 -07:00
Ralph Castain	3913595e10	Enable simulation of large-scale clusters by allowing multiple daemons/node. Specifying the ras_base_multiplier parameter to be greater than 1 will cause ORTE to replicate each allocated node by that factor. A daemon will be spawned for each replica, thus letting ORTE function as if it were on a much larger cluster. Note that this cannot be used for MPI performance testing. It is really only useful for ORTE scaling tests. It also only works with the rsh/ssh launcher.	2016-05-29 18:56:18 -07:00
rhc54	a93c01d4f4	Merge pull request #1724 from rhc54/topic/timeout Add a timeout cmd line option and an option to report state info upon timeout to assist with debugging Jenkins tests	2016-05-28 08:36:41 -07:00
Ralph Castain	ebe159acef	Add a timeout cmd line option and an option to report state info upon timeout to assist with debugging Jenkins tests If requested, obtain stacktraces for each application process and report them to stderr upon timeout stack traces: minor improvements - Also include the hostname and PID of each process for which we're sending the stack traces (vs. just including the ORTE process name) - Send a specific error message if we couldn't find "gstack" in the $PATH (e.g., on OS X) - Send a specific error message if gstack fails to run - Print a message that obtaining the stack traces may take a few seconds so that users don't wonder what's happening Signed-off-by: Jeff Squyres <jsquyres@cisco.com> help-orterun.txt: minor tweaks Trivial update: show "--timeout" (instead of "-timeout") in the help message, just to encourage the use of double-dash options. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> trivial: stacktrace -> stack trace Trivial wordsmithing. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-28 08:36:25 -07:00
Jeff Squyres	59f4a765b3	Merge pull request #1656 from hpcraink/pr/make_manpage In case, we do not build Fortran, Fortran 2008 or CXX, the regexp in …	2016-05-28 11:02:12 -04:00
Jeff Squyres	e126d2cd18	Merge pull request #1584 from bgoglin/master Update hwloc to v1.11.3	2016-05-28 11:01:54 -04:00
Nathan Hjelm	d8fd3a411a	Merge pull request #1725 from hjelmn/request_fixes ompi/request: fix bugs in MPI_Wait_some and MPI_Wait_any	2016-05-27 13:47:49 -06:00
Nathan Hjelm	0591139f49	ompi/request: fix bugs in MPI_Wait_some and MPI_Wait_any This commit fixes two bugs in MPI_Wait_any: - If all requests are inactive then the sync wait would hang forever because no requests are attached to the sync. - The request pointer was pointing to the request before the completed request which caused the wrong request to be freed or marked inactive. MPI_Wait_some had a similar issue if all the requests were pending. These issues were identified by MTT. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-27 12:36:10 -06:00
Nathan Hjelm	3974987ba3	Merge pull request #1723 from hjelmn/warning_fixes win: fix warnings	2016-05-27 12:26:04 -06:00
Nathan Hjelm	0adfb328e1	win: fix warnings Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-05-27 10:14:02 -06:00
rhc54	e5ee7adbe0	Merge pull request #1722 from rhc54/topic/pmixext Enable PMIx external support for both 1.1.4 and 2.0 versions	2016-05-27 08:59:09 -07:00
Ralph Castain	55923eacd3	Stealing some pieces of Josh Hursey's PR #1583 and modifying a bit, allow the opal/pmix external component to handle both PMIx 1.1.4 and PMIx 2.0 versions. Automatically detect the version of the target external library and adjust the only two APIs that changed (PMIx_Init and PMIx_Finalize) Rename temp vars in .m4 to avoid conflict with Travis	2016-05-27 08:06:31 -07:00
Nathan Hjelm	d25b846c01	Merge pull request #1704 from hpcraink/pr/configure_framework Fix configure for FreePGI on OSX	2016-05-26 17:01:08 -06:00
Nathan Hjelm	8c9292d5d1	Merge pull request #1721 from hjelmn/xrc_fix btl/openib: fix XRC WQE calculation	2016-05-26 17:00:31 -06:00
Nathan Hjelm	56bdcd0888	btl/openib: fix XRC WQE calculation Before dynamic add_procs support was committed to master we called add_procs with every proc in the job. The XRC code in the openib btl was taking advantage of this and setting the number of work queue entries (WQE) based on all the procs on a remote node. Since that is no longer the case we can not simply increment the sd_wqe field on the queue pair. To fix the issue a new field has been added to the xrc queue pair structure to keep track of how many wqes there are total on the queue pair. If a new endpoint is added that increases the number of wqes and the xrc queue pair is already connected the code will attempt to modify the number of wqes on the queue pair. A failure is ignored because all that will happen is the number of active send work requests on an XRC queue pair will be more limited. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-26 15:58:31 -06:00
Aurelien Bouteiller	49bd28d0ac	Merge pull request #1714 from hjelmn/scif_exclusivity btl/scif: reduce default exclusivity	2016-05-26 17:53:11 -04:00
Nathan Hjelm	f19c647f21	Merge pull request #1718 from hjelmn/config_fix config: fix typo in mxm configury	2016-05-26 13:19:23 -06:00
Joshua Ladd	1a5fd6bf83	Merge pull request #1719 from ICLDisco/ucx_request_fix Removal of ompi_request_lock from pml/ucx.	2016-05-26 15:09:57 -04:00
Thananon Patinyasakdikul	60d0fbf683	Removal of ompi_request_lock from pml/ucx.	2016-05-26 12:36:58 -04:00
Nathan Hjelm	8c2086995d	config: fix typo in mxm configury A 1 was missing when setting $1_LDFLAGS leading to erroneous items in the wrapper cflags. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-26 10:28:07 -06:00
Nathan Hjelm	87ea9be863	Merge pull request #1715 from hjelmn/ugni_overhead btl/ugni: reduce overhead of progress function	2016-05-26 10:17:00 -06:00
Gilles Gouaillardet	46710ba151	travis: fix a typo and create bogus directories to avoid compiler warnings	2016-05-26 15:28:10 +09:00
George Bosilca	90f294096e	Remove more references to the request mutex. Regarding BFO it should be mentioned that this component is currently unmaintained, and that despite my efforts I could not make it compile (it would not compile before this patch either).	2016-05-25 23:27:06 -04:00
Nathan Hjelm	5d322170a0	Merge pull request #1716 from hjelmn/request_fixes Request fixes	2016-05-25 18:14:03 -06:00
Nathan Hjelm	9d439664f0	pml/yalla: update for request changes This commit brings the pml/yalla component up to date with the request rework changes. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 15:42:53 -06:00
Nathan Hjelm	8445c885ce	pml/cm: update for request changes This fixes a hang caused by the request refactor work. The cm pml was not updated and was hanging in most cases. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 15:35:32 -06:00
Nathan Hjelm	dbfab94ede	atomic/mxm: rename symbol that is a duplicate of one in atomic/ucx This fixes an error when building with --enable-static. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 15:34:40 -06:00
Nathan Hjelm	99627319f0	btl/ugni: reduce overhead of progress function This commit reduces the overhead of calling the ugni progress function. It does the following: - Check for new connections once every eight calls. - Do not call remote smsg progress unless we are connected to at least one remote peer. - Do not call rdma progress unless at least one rdma fragment is outstanding. - Check endpoint wait list size before obtaining a lock. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 14:27:34 -06:00
Nathan Hjelm	5caf12cd9b	btl/scif: reduce default exclusivity This commit reduces the default exclusivity so that btl/scif is not used for send/recv over other shared memory transports. Fixes open-mpi/ompi#1712 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 14:25:07 -06:00
Nathan Hjelm	8e1d59aea8	Merge pull request #1708 from hjelmn/c__fix request: fix compilation error	2016-05-25 10:48:02 -06:00
Nathan Hjelm	ef11ba9394	request: fix compilation error The request.h header is unfortunately included by files in the C++ bindings. C++ does not allow assigning from void * to another pointer without a cast. This commit adds the cast. We can clean this up when the C++ bindings are deleted. Fixes open-mpi/ompi#1707 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-25 09:52:23 -06:00
Joshua Ladd	ce783a9ebf	Merge pull request #1706 from vspetrov/coll_hcoll_req_type_bugfix coll/hcoll: bugfix: initialize req_type field	2016-05-25 10:56:33 -04:00
Valentin Petrov	5ff6372886	coll/hcoll: bugfix: initialize req_type field If left uninitialized then segfault is possible in MPI_Waitall in the case the field by chance equals OMPI_REQUEST_GEN.	2016-05-25 15:38:01 +03:00
Rainer Keller	3727cba9bb	Fix compilation for FreePGI on OSX Our checks and the ones of libevent are somewhat flawed. If adding multiple "-framework" to CXXFLAGS or CFLAGS, we strip the keyword from the command-line, not good. libevent however assumes plain gcc without testing properly that the compiler supports -Wno-deprecated-declarations.	2016-05-25 09:12:39 +02:00
George Bosilca	2b868c4952	Fix MPI datatype args. Compensate for the datatype ID that we add to the array.	2016-05-24 23:36:54 -04:00
Jeff Squyres	dd9a819a1c	odls_default: do not opal_output() while creating a process! It is verboten to use opal_output() after the fork() but before the exec()! It results in all manner of undefined behavior. For example, on some OS X systems, if you run a trivial "hello world" MPI program with a high level of ODLS verbosity: ```sh $ mpirun -np 3 --mca odls_base_verbose 100 ./hello_c ``` You will see a bunch of output from the mpirun ODLS base, but then it may hang in odls_default_module.c:do_child() -- after the fork() but before the exec() -- while trying to opal_output() some debugging statements. The solution is to remove these extraneous opal_output() statements. Indeed, the ODLS base is already outputting the same information that these opal_output() statements are trying to emit, anyway. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-24 21:28:57 -04:00
Nathan Hjelm	461ca1203b	Merge pull request #1703 from hjelmn/grdma_cuda_fix rcache/grdma: fix typo in cuda code	2016-05-24 18:51:22 -06:00
bosilca	b90c83840f	Refactor the request completion (#1422) * Remodel the request. Added the wait sync primitive and integrate it into the PML and MTL infrastructure. The multi-threaded requests are now significantly less heavy and less noisy (only the threads associated with completed requests are signaled). * Fix the condition to release the request.	2016-05-24 18:20:51 -05:00
Nathan Hjelm	af52dad8f8	rcache/grdma: fix typo in cuda code Fixes open-mpi/ompi#1702 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-24 15:56:39 -06:00
Nathan Hjelm	1d3110471c	Merge pull request #1697 from hjelmn/acc_order win: add support for accumulate_ordering info key	2016-05-24 14:34:05 -06:00
Nathan Hjelm	5126da5377	win: add support for accumulate_ordering info key This commit adds support for the MPI-3.1 accumulate_ordering info key. The default value is rar,war,raw,waw and is supported using an MCA variable flag enumerator. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-24 11:13:30 -06:00
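
Commit 99627319f0 (btl/ugni: reduce overhead of progress function) lists several ways of skipping work in a hot progress loop. The following is a minimal sketch of three of them, not the actual btl/ugni code: the `progress_state_t` structure, its fields, and the `progress` function are hypothetical names chosen for illustration, and the real polling steps are replaced by counters.

```c
#include <stddef.h>

/* Hypothetical state for illustration; not the real btl/ugni module layout. */
typedef struct {
    unsigned call_count;        /* progress calls made so far */
    size_t   connected_peers;   /* remote peers currently connected */
    size_t   outstanding_rdma;  /* RDMA fragments still in flight */
    /* Counters standing in for the real (expensive) polling steps. */
    unsigned connection_checks;
    unsigned smsg_polls;
    unsigned rdma_polls;
} progress_state_t;

void progress(progress_state_t *s)
{
    /* Check for new connections only once every eight calls. */
    if (0 == (s->call_count++ & 0x7)) {
        s->connection_checks++;
    }

    /* Skip remote message progress unless at least one peer is connected. */
    if (s->connected_peers > 0) {
        s->smsg_polls++;
    }

    /* Skip RDMA progress unless at least one fragment is outstanding. */
    if (s->outstanding_rdma > 0) {
        s->rdma_polls++;
    }
}
```

With a zero-initialized state, sixteen calls to `progress` perform two connection checks and no message or RDMA polling, since the cheap guards short-circuit the rest.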
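
Commit ef11ba9394 (request: fix compilation error) turns on a difference between C and C++: C converts `void *` to another object pointer type implicitly, while C++ requires an explicit cast. The sketch below illustrates the idea for a header shared by both languages. The `request_t` type and the `request_from_ptr` helper are hypothetical and do not come from request.h.

```c
/* Hypothetical request type for illustration; not the real ompi_request_t. */
typedef struct {
    int state;
} request_t;

/*
 * Convert an opaque pointer back to a request.
 * In C, `return ptr;` would compile, but a C++ translation unit that
 * includes this header would reject the implicit conversion from void *.
 * The explicit cast is valid in both languages.
 */
static inline request_t *request_from_ptr(void *ptr)
{
    return (request_t *) ptr;
}
```

The cast is redundant for a C-only build, so it could be dropped once no C++ code includes the header, as the commit message notes.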


25147 commits