openmpi

Author	SHA1	Message	Date
Jeff Squyres	c3adcb05eb	Miscellaneous compiler warnings fixes Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-23 11:45:30 -07:00
Artem Polyakov	714c8c7381	Merge pull request #4957 from open-mpi/host_filtering plm/base: fixed the hosts filtering	2018-03-23 10:27:04 -07:00
Jeff Squyres	f66ac43fbc	opal/util: fix CID 1430381 Fix minor resource leak. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-23 08:48:11 -07:00
Nathan Hjelm	5f7ff5307e	fcoll/two_phase: do not use removed function (MPI_Address) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-23 08:43:24 -06:00
Boris Karasev	6afc7099a0	plm/base: fixed the hosts filtering Resetting the `ORTE_NODE_FLAG_MAPPED` flag after hosts filtering; this flag is used subsequently and can affect the node mapping logic. Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-03-23 09:41:16 +03:00
Jeff Squyres	1e56023ea4	Merge pull request #4951 from jsquyres/pr/contribution-consolidation CONTRIBUTING: Consolidate the 2 files	2018-03-22 10:53:08 -05:00
Jeff Squyres	8ada4e48a5	CONTRIBUTING: Consolidate the 2 files We accidentally had 2 CONTRIBUTING.md files. Consolidate the content of both of them. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-22 08:56:35 -05:00
Jeff Squyres	023a4a82d3	Merge pull request #4942 from jsquyres/pr/tcp-btl-help-message-updates TCP help message updates	2018-03-22 08:53:04 -05:00
Jeff Squyres	0f8077ace6	oob/tcp: add show_help message about version mismatch Be more explicit about version mismatch between ORTE processes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-21 20:18:28 -07:00
Jeff Squyres	a15d8233c9	Merge pull request #3434 from dsharma283/pr-3431 ompi/opal: add support for HDR link speeds	2018-03-21 21:57:20 -05:00
Jeff Squyres	40afd525f8	btl/tcp: make error messages more specific Convert some verbose messages to opal_show_help() messages. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-21 19:34:03 -07:00
Jeff Squyres	e0d86b1c72	opal/util/fd: add opal_fd_get_peer_name() Returns a string name (either a resolved name or IPv4/IPv6 name in a string if unresolvable). The caller is responsible for freeing the string. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-21 19:34:03 -07:00
Devesh Sharma	90e9b22196	ompi/opal: add support for HDR link speeds This patch enables the use of adapters with HDR speeds. Issue ID 3431. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>	2018-03-21 19:15:41 -07:00
Edgar Gabriel	c23dff24bc	Merge pull request #4940 from edgargabriel/topic/ompi-cleanup-march-2018 Topic/ompio cleanup march 2018	2018-03-21 13:47:41 -05:00
Edgar Gabriel	36747cca67	io/ompio: disable the fcoll timing by default The flag indicating whether to gather performance data on collective I/O operations was accidentally changed to 1. It should be 0 (false) by default. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:34:35 -05:00
Edgar Gabriel	aae8c6c6ad	remove addproc sharedfp component never got to move this sharedfp component into anything usable. Can easily be restored if necessary. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:27:01 -05:00
Edgar Gabriel	e703ac2da8	remove plfs components plfs components are at this point not utilized by anybody as far as I know. Easy to bring back if we want to. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:27:01 -05:00
Howard Pritchard	ade280eb7c	Merge pull request #3292 from markalle/pr/ibv_reg_mr__fork IB fork	2018-03-21 09:39:08 -06:00
Boris Karasev	85ce76fa36	Merge pull request #4926 from karasevb/pmix_dstore_enable pmix: dstore returned for direct modex	2018-03-21 14:24:51 +07:00
Boris Karasev	3796307a57	timings: added new timing points Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-03-21 05:16:25 +02:00
Jeff Squyres	5ad796259c	Merge pull request #4931 from jsquyres/pr/contributing-guideliens CONTRIBUTING.md: add Github contribution guidelines	2018-03-20 16:49:25 -05:00
Jeff Squyres	ed3aa63a96	CONTRIBUTING.md: add Github contribution guidelines We have a small number of requirements for contributions (e.g., "Signed-off-by"), so let's make sure that people have an easy way of knowing these things. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-20 07:10:38 -07:00
Jeff Squyres	dd620049cd	Merge pull request #4925 from jsquyres/pr/mpool-memkind-typo-fix mpool/memkind: fix typo in partition page sizes	2018-03-19 22:07:00 -05:00
Boris Karasev	dca3dd2ea4	pmix: dstore returned for direct modex Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-03-20 04:56:48 +02:00
Jeff Squyres	cc4bb433bc	mpool/memkind: fix typo in partition page sizes Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-19 16:33:08 -05:00
Ralph Castain	9eb426e288	Merge pull request #4924 from karasevb/pmix_fix_dmdx Sync to PMIx master PR pmix/pmix#697	2018-03-19 06:24:06 -07:00
Boris Karasev	36a0c6a794	pmix: fixed the direct modex request This commit fixes the case when a local client asks for a key from a process on a remote node. The local server doesn't have the commit count for remote ranks; it is maintained by another PMIx server, so the commit count should be ignored for remote requests. Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-03-19 11:51:03 +02:00
Brian Barrett	a2d1419185	Merge pull request #4921 from bwbarrett/master-NEWS dist: Sync 2.1.3 NEWS items into master	2018-03-16 21:01:17 -07:00
Brian Barrett	ab19602752	dist: Sync 2.1.3 NEWS items into master Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-03-16 12:28:41 -07:00
bosilca	bf3dd8af19	Merge pull request #4884 from bosilca/topic/fix_wtime Improve the range and accuracy of MPI_Wtime.	2018-03-16 14:09:33 +09:00
Aurelien Bouteiller	e08e580e27	Merge pull request #4916 from abouteiller/topic/scaling.pl-m Scaling.pl: Fix Srun options and wait for DVM launch	2018-03-15 22:06:01 -04:00
Nathan Hjelm	7f4872d483	osc/rdma: performance improvements and bug fixes This commit is a large update to the osc/rdma component. Included in this commit: - Add support for using hardware atomics for fetch-and-op and single count accumulate when using the accumulate lock. This will improve the performance of these operations even when not setting the single intrinsic info key. - Rework how large accumulates are done. They now block on the get operation to fix some bugs discovered by an IBM one-sided test. I may roll back some of the changes if the underlying bug in the original design is discovered. There appears to be no real difference (on the hardware this was tested with) in performance so it's probably a non-issue. References #2530. - Add support for an additional lock-all algorithm: on-demand. The on-demand algorithm will attempt to acquire the peer lock when starting an RMA operation. The lock algorithm default has not changed. The algorithm can be selected by setting the osc_rdma_locking_mode MCA variable. The valid values are two_level and on_demand. - Make use of the btl_flush function if available. This can improve performance with some btls. - When using btl_flush do not keep track of the number of put operations. This reduces the number of atomic operations in the critical path. - Make the window buffers more friendly to multi-threaded applications. This was done by dropping support for multiple buffers per MPI window. I intend to re-add that support once the underlying performance bug under the old buffering scheme is fixed. - Fix a bug in request completion in the accumulate, get, and put paths. This also helps with #2530. - General code cleanup and fixes. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-15 14:53:53 -06:00
Aurélien Bouteiller	9e23d24bb4	Scaling.pl: Fix Srun options and wait for DVM launch Flush out the DVM ready notice on stdout Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2018-03-15 00:00:49 -04:00
Jeff Squyres	5f58e7b961	Merge pull request #4910 from jsquyres/pr/reset-opal-cuda-verbose-value opal_datatype_module.c: reset opal_cuda_verbose	2018-03-13 14:01:34 -04:00
Jeff Squyres	2713a24009	opal_datatype_module.c: reset opal_cuda_verbose 999de137ce6 accidentally reset opal_cuda_verbose's default value. This commit puts it back. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-13 10:10:15 -07:00
Jeff Squyres	695b92ec7b	Merge pull request #4906 from blegat/doctypo Fix typo in MPI_Cart_shift doc	2018-03-13 11:29:44 -04:00
Benoît Legat	00600c7cbb	Fix typo in MPI_Cart_shift doc Signed-off-by: Benoît Legat <benoit.legat@gmail.com>	2018-03-13 15:25:42 +01:00
Josh Hursey	ae1d3183f9	Merge pull request #4891 from jjhursey/fix/mpir-symbol-vis Fix MPIR_proctable structure visibility	2018-03-13 08:09:15 -05:00
Edgar Gabriel	50d07e9622	Merge pull request #4900 from edgargabriel/topic/two_phase_data_sieving_fix fcoll/two_phase: data sieving has to occur at offset 0 as well	2018-03-10 12:18:15 -06:00
Edgar Gabriel	da640f98df	fcoll/two_phase: data sieving has to occur at offset 0 as well Data sieving has to occur for any provided offset that is greater than or equal to zero for this implementation to work correctly. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-10 11:23:09 -06:00
Joshua Hursey	ccb4f43c9b	Fix MPIR_proctable structure visibility * The `MPIR_PROCDESC` structure needs to be visible even in optimized builds so that debuggers can attach to `mpirun` and properly read the `MPIR_proctable`. * In the v2.0.x and v2.x series this structure resided in the `orterun` directory and included the `CFLAGS` fix included here. This code moved in the v3.x series and the `CFLAGS` did not move causing this issue. - Instead of applying the debug `CFLAGS` globally to libopen-rte, only apply them to the `orted_submit.c` compile which contains the MPIR symbols. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2018-03-09 21:15:28 -05:00
George Bosilca	0f0c27a184	Allow MPI_PROC_NULL as neighbor. Allowing MPI_PROC_NULL as a neighbor in any topology allows us to add gaps on the send and recv buffers. This does make the traditional neighbor collectives behave similarly to the V versions, but at the same time it allows users to skip the step where they prepare the counts and the displacement arrays. For more info please take a look at issue #4675. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-03-09 12:20:26 +09:00
Edgar Gabriel	0f345c068a	Merge pull request #4888 from edgargabriel/topic/romio_size0_contiguous_flag io/romio314: mark datatypes of size 0 as contiguous	2018-03-08 13:28:13 -06:00
Jeff Squyres	70c59f78b9	Merge pull request #4883 from bosilca/topic/get_element_fix Topic/get element fix	2018-03-08 10:31:47 -05:00
Edgar Gabriel	c83b47c266	io/romio314: mark datatypes of size 0 as contiguous This commit fixes an issue observed with romio314 and the hdf5 1.10.x testsuite. The ADIOI_Datatype_iscontig() routine in romio314/src/io_romio314_module.c will now report a datatype of size 0 as contiguous, even if the extent of the datatype is non-zero. This avoids a segmentation fault observed in the ADIOI_Flatten routine, and fixes this particular issue with the hdf5 1.10.x testsuite in OpenMPI with romio314. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-08 09:10:09 -06:00
George Bosilca	9bced03213	Improve the range and accuracy of MPI_Wtime. As discussed on https://github.com/mpi-forum/mpi-issues/issues/77#issuecomment-369663119 the conversion to double in MPI_Wtime decreases the range and accuracy of the resulting timer. By setting the timer to 0 at the first usage we basically maintain the accuracy for 194 days even for gettimeofday. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-03-08 14:26:02 +09:00
George Bosilca	999de137ce	Fix the datatype debug. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-03-08 03:40:08 +09:00
George Bosilca	7848035195	Update the loop stats. The loop should be updated on each internal iteration. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-03-08 03:18:39 +09:00
Jordan Cherry	2f0e8153a5	Merge pull request #4247 from jocherry/btlTcpLinksBugFix tcp btl: Fix multiple-link connection establishment.	2018-03-07 08:40:37 -08:00
Alex Mikheev	04ec013da9	Merge pull request #4847 from alex-mikheev/topic/oshmem_group_cache_refactor oshmem: refactor group cache	2018-03-04 14:36:32 +02:00
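
Commit 9bced03213 in the table above argues that zeroing the timer at first use keeps MPI_Wtime accurate after the conversion to double. The following Python sketch is only an illustration of that reasoning, not the Open MPI code: the function names `double_spacing` and `elapsed_seconds` are hypothetical, and reading "194 days" as 2^24 seconds (the range over which a double still resolves about 2 ns) is an assumption made here, not something the commit message states.

```python
import math


def double_spacing(seconds: float) -> float:
    """Gap between a positive, normal double and the next larger double."""
    # seconds = m * 2**exponent with 0.5 <= m < 1; a double has a 53-bit significand.
    _, exponent = math.frexp(seconds)
    return math.ldexp(1.0, exponent - 53)


def elapsed_seconds(sec: int, usec: int, first_sec: int, first_usec: int) -> float:
    """Hypothetical helper: subtract the first reading in integer
    microseconds, and only then convert the small difference to a float."""
    delta_usec = (sec - first_sec) * 1_000_000 + (usec - first_usec)
    return delta_usec / 1_000_000


# Assumed magnitude of a gettimeofday-style "seconds since 1970" value in 2018.
epoch_like = 1.5e9
# Assumed reading of the "194 days" figure: 2**24 seconds.
limit = float(2 ** 24)

print(double_spacing(epoch_like))   # about 2.4e-07 s between representable values
print(double_spacing(limit - 1.0))  # about 1.9e-09 s between representable values
print(limit / 86400.0)              # about 194.18 days
```

Under these assumptions, a raw epoch timestamp stored in a double can only be resolved to a few hundred nanoseconds, while a timer that starts at zero stays at roughly nanosecond resolution for the first 2^24 seconds.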

28547 commits