openmpi

Author	SHA1	Message	Date
Austen Lauria	edcd6d8aeb	Merge pull request #7146 from bgoglin/master fix typos hlwoc->hwloc	2019-11-13 15:38:00 -05:00
Ralph Castain	b0a487a3c7	Update IOF redirection options Provide both "--output-directory" and "--output-filename" options but do not allow both to be given at the same time. Output-directory allows specification of a directory, with output redirected into files of form "<directory>/<jobid>/rank.<vpid>/stdout[err]". This option also supports two directives: nojobid (removes the jobid directory layer) and nocopy (do not copy the output to the terminal). Output-filename is the "old" behavior that names the output files as "<filename>.rank" with both stdout and stderr redirected into it. This option only supports one directive: nocopy (do not copy the output to the terminal). Fix both the --help and man documentation. Signed-off-by: Ralph Castain <rhc@pmix.org>	2019-11-13 12:13:06 -08:00
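The directory layout described in the commit above can be sketched as follows. This is only an illustration of the path pattern "<directory>/<jobid>/rank.<vpid>/stdout[err]" and of the `nojobid` directive as worded in the message; the function name and its parameters are hypothetical and are not part of Open MPI.

```python
import posixpath


def output_paths(directory, jobid, vpid, nojobid=False):
    """Return (stdout_path, stderr_path) for one rank.

    Hypothetical helper: mirrors the pattern quoted in the commit message,
    <directory>/<jobid>/rank.<vpid>/stdout[err]. With nojobid=True the
    jobid directory layer is left out.
    """
    parts = [directory]
    if not nojobid:
        parts.append(str(jobid))
    parts.append(f"rank.{vpid}")
    rank_dir = posixpath.join(*parts)
    return posixpath.join(rank_dir, "stdout"), posixpath.join(rank_dir, "stderr")


# Example: rank 0 of job 1, with and without the jobid layer.
print(output_paths("out", 1, 0))
print(output_paths("out", 1, 0, nojobid=True))
```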
Nathan Hjelm	09dd383f8b	Merge pull request #7108 from devreal/btl-ugni-deadlock uGNI: Fix potential deadlock when processing outstanding transfers	2019-11-11 10:56:56 -08:00
George Bosilca	3de636dc6f	Swap the 2 fields to maintain the size of the struct. Thanks @devreal for catching this. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-11-07 11:25:03 -05:00
George Bosilca	59fb02618e	Prevent overflow when dealing with datatype count. This patch fixes #7147 by preventing overflow when multiplying the count and the blocklen. The count reflects MPI count and is therefore bound to the size of an int (it is an uint32_t) while the blocklen can be merged together to represent the largest contiguous memory layout and it is therefore promoted to a size_t. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-11-07 11:25:03 -05:00
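The kind of overflow this commit message describes can be illustrated with a small simulation. It is a sketch under stated assumptions, not the actual patch: Python integers do not overflow, so the 32-bit and 64-bit widths are emulated with masks, `size_t` is assumed to be 64 bits wide, and the function names are hypothetical.

```python
UINT32_MASK = 0xFFFFFFFF
UINT64_MASK = 0xFFFFFFFFFFFFFFFF  # assumption: size_t is 64 bits wide


def product_32bit(count, blocklen):
    """Multiply as a C uint32_t would: the result wraps modulo 2**32."""
    return (count * blocklen) & UINT32_MASK


def product_wide(count, blocklen):
    """Multiply after promoting to a 64-bit, size_t-like width."""
    return (count * blocklen) & UINT64_MASK


count, blocklen = 65536, 65536
print(product_32bit(count, blocklen))  # 0: the true product 2**32 wraps around
print(product_wide(count, blocklen))   # 4294967296
```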
Yossi Itigin	40e2fbb093	Merge pull request #7114 from brminich/topic/mlx_scat_tuning COLL/TUNED: Add linear scatter using isend for mlnx platform	2019-11-07 13:38:21 +02:00
Mikhail Brinskii	f2cbd4806e	COLL/TUNED: Add linear scatter using isend for mlnx platform Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>	2019-11-07 11:04:39 +02:00
Howard Pritchard	6fc5a4e033	Merge pull request #7142 from hppritcha/topic/swat_issue_7128 btl/uct: restrict to using UCT 1.7 or older API for now	2019-11-06 16:07:54 -07:00
Howard Pritchard	9d345d9aa0	btl/uct: add UCT API version check to configury related to #7128 The UCX crew is no longer guaranteeing that the UCT API is going to be frozen, so this is kind of a whack-a-mole problem trying to keep the BTL UCT working with various changing UCT APIs. Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2019-11-06 14:27:58 -07:00
Brice Goglin	5c6bd7ea4e	fix typos hlwoc->hwloc Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>	2019-11-06 10:42:36 +01:00
Austen Lauria	1c33d5aecd	Merge pull request #7141 from devreal/shmem_memheap_alloc_band Shmem: use bitwise and instead of logical and to check for allocator capabilities	2019-11-05 18:21:23 -05:00
Nathan Hjelm	30171cbd27	Merge pull request #7144 from hjelmn/btl_uct_fix_compilation_issue_for_ucx_1_7_because_the_api_break_got_into_this_release btl/uct: fix compilation for UCX 1.7.0	2019-11-05 14:54:30 -08:00
Nathan Hjelm	a3026c016a	btl/uct: fix compilation for UCX 1.7.0 Ref #7128 Signed-off-by: Nathan Hjelm <hjelmn@google.com>	2019-11-05 12:53:26 -08:00
Joseph Schuchart	9f2c6a42c3	Shmem: use bitwise and instead of logical and to check for allocator capabilities Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>	2019-11-05 15:44:59 +01:00
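Why a logical "and" is the wrong operator for a capability check can be shown in a few lines. The flag names and values below are hypothetical and are not taken from the Open MPI sources; the sketch only demonstrates the class of bug named in the commit subject.

```python
# Hypothetical capability bits, for illustration only.
CAP_SMALL = 0x1
CAP_HUGE = 0x2


def has_cap_logical(caps, flag):
    """Buggy check: true whenever both operands are non-zero."""
    return bool(caps and flag)


def has_cap_bitwise(caps, flag):
    """Correct check: true only if the flag's bit is set in caps."""
    return (caps & flag) != 0


caps = CAP_SMALL  # this allocator does not advertise CAP_HUGE
print(has_cap_logical(caps, CAP_HUGE))  # True (wrong)
print(has_cap_bitwise(caps, CAP_HUGE))  # False (correct)
```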
bosilca	6e3ff5e9b4	Merge pull request #7131 from bosilca/fix/tcp_errors Correctly report TCP connect errors.	2019-10-31 22:12:27 -04:00
George Bosilca	476562752f	Correctly report TCP connect errors. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-10-31 18:33:15 -04:00
Mark Allen	6855ebb84b	Adding -mca comm_method to print table of communication methods

This is closely related to Platform-MPI's old -prot feature.

The long format of the tables it prints could look like this:

> Host 0 [myhost001] ranks 0 - 1
> Host 1 [myhost002] ranks 2 - 3
> Host 2 [myhost003] ranks 4
> Host 3 [myhost004] ranks 5
> Host 4 [myhost005] ranks 6
> Host 5 [myhost006] ranks 7
> Host 6 [myhost007] ranks 8
> Host 7 [myhost008] ranks 9
> Host 8 [myhost009] ranks 10
>
>  host | 0    1    2    3    4    5    6    7    8
> ======|==============================================
>     0 : sm   tcp  tcp  tcp  tcp  tcp  tcp  tcp  tcp
>     1 : tcp  sm   tcp  tcp  tcp  tcp  tcp  tcp  tcp
>     2 : tcp  tcp  self tcp  tcp  tcp  tcp  tcp  tcp
>     3 : tcp  tcp  tcp  self tcp  tcp  tcp  tcp  tcp
>     4 : tcp  tcp  tcp  tcp  self tcp  tcp  tcp  tcp
>     5 : tcp  tcp  tcp  tcp  tcp  self tcp  tcp  tcp
>     6 : tcp  tcp  tcp  tcp  tcp  tcp  self tcp  tcp
>     7 : tcp  tcp  tcp  tcp  tcp  tcp  tcp  self tcp
>     8 : tcp  tcp  tcp  tcp  tcp  tcp  tcp  tcp  self
>
> Connection summary:
>   on-host:  all connections are sm or self
>   off-host: all connections are tcp

In this example hosts 0 and 1 had multiple ranks, so "sm" was more meaningful than "self" to identify how the ranks on the host are talking to each other. Hosts 2..8 had one rank per host, so "self" was more meaningful as their btl.

Above a certain number of hosts (12 by default) the above table gets too big, so we shrink to a more abbreviated looking table that has the same data:

>  host | 0 1 2 3 4       8
> ======|====================
>     0 : A C C C C C C C C
>     1 : C A C C C C C C C
>     2 : C C B C C C C C C
>     3 : C C C B C C C C C
>     4 : C C C C B C C C C
>     5 : C C C C C B C C C
>     6 : C C C C C C B C C
>     7 : C C C C C C C B C
>     8 : C C C C C C C C B
> key: A == sm
> key: B == self
> key: C == tcp

Then above 36 hosts we stop printing the 2d table entirely and just print the summary:

> Connection summary:
>   on-host:  all connections are sm or self
>   off-host: all connections are tcp

The options to control it are

  -mca comm_method 1 : print the above table at the end of MPI_Init
  -mca comm_method 2 : print the above table at the beginning of MPI_Finalize
  -mca comm_method_max <n> : number of hosts <n> for which to print a full size 2d table
  -mca comm_method_brief 1 : only print summary output, no 2d table
  -mca comm_method_fakefile <filename> : for debugging only

* printing at init vs finalize:
The most important difference between these two is that when printing the table during MPI_Init(), we send extra messages to make sure all hosts are connected to each other. So the table ends up working against the idea of on-demand connections (although it's only forcing the n^2 connections in the number of hosts, not the total ranks). If printing at MPI_Finalize() we don't create any connections that aren't already connected, so the table is more likely to have "n/a" entries if some hosts never connected to each other.

* how many hosts <n> for which to print a full size 2d table
The option -mca comm_method_max <n> can be used to specify a number of hosts <n> (default 12) that controls at what host-count the unabbreviated / abbreviated 2d tables get printed:

  1 - n      : full size 2d table
  n+1 - 3n   : shortened 2d table
  3n+1 - inf : summary only, no 2d table

* brief
The option -mca comm_method_brief 1 can be used to skip the printing of the 2d table and only show the short summary.

* fakefile
This is a debugging option that allows easier testing of all the printout routines by letting all the detected communication methods between the hosts be overridden by fake data from a file.

The source of the information used in the table is the .mca_component_name. In the case of BTLs, the module always had a .btl_component linking back to the component. The vars mca_pml_base_selected_component and ompi_mtl_base_selected_component offer similar functionality for pml/mtl.

So with the ability to identify the component, we can then access the component name with code like this:

  mca_pml_base_selected_component.pmlm_version.mca_component_name

See the three lookup_{pml,mtl,btl}_name() functions in hook_comm_method_fns.c, and their use in comm_method() to parse the strings and produce an integer to represent the connection type being used.

Signed-off-by: Mark Allen <markalle@us.ibm.com>	2019-10-31 16:23:57 -04:00
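The host-count thresholds described in the commit message above (1 to n full table, n+1 to 3n shortened table, beyond 3n summary only, with n defaulting to 12) can be sketched as a small selection function. This is an illustration of the stated rule only; the function name, parameter names, and return strings are hypothetical and do not come from hook_comm_method_fns.c.

```python
def table_mode(num_hosts, max_full=12, brief=False):
    """Pick which output form the commit message describes for a host count.

    max_full plays the role of <n> from -mca comm_method_max <n>;
    brief plays the role of -mca comm_method_brief 1.
    """
    if brief or num_hosts > 3 * max_full:
        return "summary"      # 3n+1 and above, or brief requested
    if num_hosts > max_full:
        return "abbreviated"  # n+1 .. 3n
    return "full"             # 1 .. n


for hosts in (9, 12, 13, 36, 37):
    print(hosts, table_mode(hosts))
```

With the default of 12, the switch to summary-only output happens above 36 hosts, which matches the figure quoted in the message.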
Joseph Schuchart	5471d59af4	Fix OPAL_ALIGN_MIN to work on 32-bit systems Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>	2019-10-30 10:33:10 +01:00
Gilles Gouaillardet	631a43581f	Merge pull request #7117 from ggouaillardet/topic/f08_bind_c_constants_revamp_misc_fixes fortran/use-mpi-f08: misc fixes	2019-10-30 10:42:33 +09:00
Edgar Gabriel	007b773cd7	Merge pull request #7122 from edgargabriel/pr/simple-aggr-mode-fix common/ompio: fix calculation in simple-grouping option	2019-10-29 13:38:24 -05:00
Josh Hursey	312c55edaa	Merge pull request #7092 from sam6258/smiller_rsh_chdir plm/rsh: Add chdir option to change directory before orted exec	2019-10-29 13:34:25 -05:00
Edgar Gabriel	ad5d0df4e9	common/ompio: fix calculation in simple-grouping option This is based on a bug reported on the mailing list using a netcdf testcase. The problem occurs if processes are using a custom file view, but on some of them it appears as if the default file view is being used. Because of that, the simple-grouping option led to a different number of aggregators being used on different processes, and ultimately to a deadlock. This patch fixes the problem by no longer using the file_view size for the calculation in the simple-grouping option, but the contiguous chunk size (which is identical on all processes). Fixes issue #7109 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-10-29 12:30:41 -05:00
Gilles Gouaillardet	fda4d040da	fortran/use-mpi-f08: misc fixes - fix typos from open-mpi/ompi@b10a60a5a9 - remove remaining references to OMPI_PROTECTED from open-mpi/ompi@df6d763a53 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-29 15:00:51 +09:00
Jeff Squyres	8343a289f2	Merge pull request #7112 from wbailey2/pr/fix-HACKING Changed the final URL to https://github.com/westes/flex	2019-10-28 09:25:21 -04:00
Jeff Squyres	e59e6f714c	Merge pull request #7105 from ggouaillardet/topic/f08_bind_c_constants_revamp fortran/use-mpi-f08: revamp mpi_f08 constants	2019-10-28 09:14:23 -04:00
Joseph Schuchart	02dd877d8a	RDMA osc: perform CAS in shared memory if possible Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>	2019-10-28 11:24:16 +01:00
William Bailey	caf1d9292c	Changed the final URL to https://github.com/westes/flex Signed-off-by: William Bailey <wbailey2@nd.edu>	2019-10-27 22:33:50 -04:00
Gilles Gouaillardet	51e23f8cb6	fortran/use-mpi-f08: remove bind(C) constants. Remove unused bind(C) constants in ompi/mpi/fortran/use-mpi-f08/constants.{c,h} (and break ABI compatibility). Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-28 10:28:17 +09:00
Gilles Gouaillardet	df6d763a53	configury: remove references to unused OMPI_PROTECTED Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-28 10:28:17 +09:00
Gilles Gouaillardet	b10a60a5a9	fortran/use-mpi-f08: revamp constant declarations In order to work around an issue with flang based compilers, avoid declaring bind(C) constants and use plain Fortran parameters instead. For example, type(MPI_Comm), bind(C, name="ompi_f08_mpi_comm_world") OMPI_PROTECTED :: MPI_COMM_WORLD is changed to type(MPI_Comm), parameter :: MPI_COMM_WORLD = MPI_Comm(OMPI_MPI_COMM_WORLD) Note that in order to preserve ABI compatibility, ompi/mpi/fortran/use-mpi-f08/constants.{c,h} have been kept even though their symbols are no longer referenced by Open MPI. Refs. open-mpi/ompi#7091 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-10-28 10:01:17 +09:00
Joseph Schuchart	c09ca039b4	uGNI: Fix potential deadlock when processing outstanding transfers Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>	2019-10-26 12:21:17 +02:00
Austen Lauria	aa8be9c12d	Merge pull request #6284 from devreal/ompi-rdma-memalign Ensure proper alignment of memory provided by MPI	2019-10-25 12:27:58 -04:00
Austen Lauria	ecd990a67c	Merge pull request #6933 from devreal/osc-ucx-excl-lock UCX osc: properly release exclusive lock to avoid lockup	2019-10-25 09:16:51 -04:00
Austen Lauria	96f55b0b32	Merge pull request #7096 from jsquyres/pr/fix-alps-configure-output opal_check_alps: fix configure output	2019-10-25 09:02:25 -04:00
Jeff Squyres	d8f17aea69	Merge pull request #7097 from mcoil1/pr/README-fix2 README: Use "--" notation for CLI options	2019-10-18 16:32:29 -04:00
Maxwell Coil	7e07346524	README: Use "--" notation for CLI options Signed-off-by: Maxwell Coil <mcoil@nd.edu>	2019-10-18 15:44:23 -04:00
Jeff Squyres	26705efad0	opal_check_alps: fix configure output There was a path where OPAL_CHECK_ALPS would exit its testing but still leave `opal_check_cray_alps_happy` blank. Fix that by setting it to "no". Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-10-18 11:30:00 -07:00
Edgar Gabriel	dce203ffc6	Merge pull request #7057 from edgargabriel/topic/romio321-status-set-elements-fix MPIR_Status_set_bytes: fix for large counts	2019-10-18 08:16:36 -05:00
Nathan Hjelm	b1ef5a40fa	Merge pull request #7016 from hjelmn/fix_btl_uct_from_yet_another_unannounced_api_break_in_the_openucx_uct_layer btl/uct: add support for OpenUCX v1.8 API changes	2019-10-17 06:27:18 -07:00
Scott Miller	c1b8599528	plm/rsh: Add chdir option to change directory before orted exec Signed-off-by: Scott Miller <scott.miller1@ibm.com>	2019-10-15 17:19:30 -04:00
Jeff Squyres	b6c4d5c118	Merge pull request #7060 from jsquyres/pr/usnic-mca-updates BTL usnic MCA updates	2019-10-15 10:48:10 -04:00
Jeff Squyres	e1e6d8b85e	Merge pull request #7076 from ftab/pr/my-superlative-fix README: Remove info for plugins that aren't used anymore	2019-10-10 14:52:36 -04:00
Jeff Squyres	65fd12feff	Merge pull request #7081 from msbrowning/pr/fixed-README Removed text block from line 883 of README.	2019-10-10 14:52:23 -04:00
Mark Browning	77b3ff9d38	Remove the stale cr MPI extension Also removed text block from line 883 of README. Signed-off-by: Mark Browning <marksbrowning3@gmail.com>	2019-10-10 13:24:30 -04:00
Jeff Squyres	f7ee4463b3	Merge pull request #7079 from CalebProvost/hacktoberfest Edit README	2019-10-10 13:18:54 -04:00
Jeff Squyres	896ce76b64	Merge pull request #7082 from kizill/master Fix ipv6 improper address copy bug	2019-10-10 12:01:44 -04:00
Jeff Squyres	8f3583d3bd	Merge pull request #7073 from Joe-Downs/pr/fix-README README: edit "dist_graph topologies" to "communicator topologies"	2019-10-10 11:55:43 -04:00
Jeff Squyres	d736253079	Merge pull request #7074 from classicsman/pr/fix-README Deleted paragraph	2019-10-10 11:50:38 -04:00
Jeff Squyres	836a0766ae	Merge pull request #7072 from bfitzgit23/pr/fix-README README-fixed-bfitzgit23	2019-10-10 11:39:06 -04:00
CalebProvost	634054fb37	README: minor grammar fixes Signed-off-by: CalebProvost <DHX664@gmail.com>	2019-10-10 11:23:55 -04:00

30330 Commits