This is closely related to Platform-MPI's old -prot feature.
The long format of the table it prints could look like this:
> Host 0 [myhost001] ranks 0 - 1
> Host 1 [myhost002] ranks 2 - 3
> Host 2 [myhost003] ranks 4
> Host 3 [myhost004] ranks 5
> Host 4 [myhost005] ranks 6
> Host 5 [myhost006] ranks 7
> Host 6 [myhost007] ranks 8
> Host 7 [myhost008] ranks 9
> Host 8 [myhost009] ranks 10
>
> host | 0 1 2 3 4 5 6 7 8
> ======|==============================================
> 0 : sm tcp tcp tcp tcp tcp tcp tcp tcp
> 1 : tcp sm tcp tcp tcp tcp tcp tcp tcp
> 2 : tcp tcp self tcp tcp tcp tcp tcp tcp
> 3 : tcp tcp tcp self tcp tcp tcp tcp tcp
> 4 : tcp tcp tcp tcp self tcp tcp tcp tcp
> 5 : tcp tcp tcp tcp tcp self tcp tcp tcp
> 6 : tcp tcp tcp tcp tcp tcp self tcp tcp
> 7 : tcp tcp tcp tcp tcp tcp tcp self tcp
> 8 : tcp tcp tcp tcp tcp tcp tcp tcp self
>
> Connection summary:
> on-host: all connections are sm or self
> off-host: all connections are tcp
In this example hosts 0 and 1 had multiple ranks, so "sm" was more
meaningful than "self" to identify how the ranks on the same host are
talking to each other, while hosts 2..8 had one rank per host, so
"self" was more meaningful as their btl.
Above a certain number of hosts (12 by default) the above table gets too big,
so we shrink it to a more abbreviated-looking table that contains the same data:
> host | 0 1 2 3 4 8
> ======|====================
> 0 : A C C C C C C C C
> 1 : C A C C C C C C C
> 2 : C C B C C C C C C
> 3 : C C C B C C C C C
> 4 : C C C C B C C C C
> 5 : C C C C C B C C C
> 6 : C C C C C C B C C
> 7 : C C C C C C C B C
> 8 : C C C C C C C C B
> key: A == sm
> key: B == self
> key: C == tcp
Then above 36 hosts we stop printing the 2d table entirely and just print the
summary:
> Connection summary:
> on-host: all connections are sm or self
> off-host: all connections are tcp
The options to control it are:
-mca comm_method 1 : print the above table at the end of MPI_Init
-mca comm_method 2 : print the above table at the beginning of MPI_Finalize
-mca comm_method_max <n> : number of hosts <n> for which to print a full size 2d table
-mca comm_method_brief 1 : only print summary output, no 2d table
-mca comm_method_fakefile <filename> : for debugging only
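For example, to print the full table at MPI_Finalize for up to 20 hosts (an
illustrative command line; only the option names come from the list above):
    mpirun -np 16 -mca comm_method 2 -mca comm_method_max 20 ./a.out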
* printing at init vs finalize:
The most important difference between the two is that when printing the table
during MPI_Init(), we send extra messages to make sure all hosts are connected to
each other. So the table ends up working against the idea of on-demand connections
(although it only forces the n^2 connections in the number of hosts, not in the
total ranks). If printing at MPI_Finalize() we don't create any connections that
aren't already established, so the table is more likely to have "n/a" entries if
some hosts never connected to each other.
* how many hosts <n> for which to print a full size 2d table
The option -mca comm_method_max <n> can be used to specify a number of hosts <n>
(default 12) that controls at what host-count the unabbreviated / abbreviated
2d tables get printed:
1 - n : full size 2d table
n+1 - 3n : shortened 2d table
3n+1 - inf : summary only, no 2d table
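With the default n=12, that gives: 1-12 hosts a full 2d table, 13-36 hosts the
abbreviated table, and 37+ hosts the summary only (which is where the 36-host
cutoff mentioned above comes from).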
* brief
The option -mca comm_method_brief 1 can be used to skip the printing of the 2d
table and only show the short summary.
* fakefile
This is a debugging option that allows easier testing of all the printout
routines by letting all the detected communication methods between the hosts
be overridden by fake data from a file.
The source of the information used in the table is the .mca_component_name field.
In the case of BTLs, the module always has a .btl_component pointer linking back
to the component. The variables mca_pml_base_selected_component and
ompi_mtl_base_selected_component offer similar functionality for pml/mtl.
So with the ability to identify the component, we can then access
the component name with code like this:
    mca_pml_base_selected_component.pmlm_version.mca_component_name
See the three lookup_{pml,mtl,btl}_name() functions in hook_comm_method_fns.c,
and their use in comm_method() to parse the strings and produce an integer
to represent the connection type being used.
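As a sketch of what those lookups boil down to (the global and its fields are
the ones quoted above; the function body shown is a simplification of the real
lookup_pml_name()):
    /* return the active pml component's name string, e.g. "ob1" or "cm" */
    static const char *lookup_pml_name(void)
    {
        return mca_pml_base_selected_component.pmlm_version.mca_component_name;
    }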
Signed-off-by: Mark Allen <markalle@us.ibm.com>
This is based on a bug reported on the mailing list using a netcdf testcase.
The problem occurs if processes are using a custom file view, but on some
of them it appears as if the default file view is being used. Because of that,
the simple-grouping option led to a different number of aggregators being used
on different processes, and ultimately to a deadlock. This patch fixes the
problem by not using the file_view size anymore for the calculation in the
simple-grouping option, but the contiguous chunk size (which is identical on
all processes).
Fixes issue #7109
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Zero-size derived datatypes are now flagged as OPAL_DATATYPE_FLAG_CONTIGUOUS,
so update mca_pml_ucx_init_datatype() to correctly handle them.
Since 'size' is a 'size_t', the assertion can simply be removed.
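A hedged sketch of the resulting logic (variable and field names are taken
from the opal/ompi datatype engine, but this is not the actual pml/ucx code):
    size_t size;
    ompi_datatype_type_size(datatype, &size);
    if (datatype->super.flags & OPAL_DATATYPE_FLAG_CONTIGUOUS) {
        /* size == 0 is now a legal contiguous case; since size is a
         * size_t it cannot be negative, no assertion is required */
        return ucp_dt_make_contig(size);
    }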
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Change the ncounts argument to MPI_Count and use
MPI_Status_set_elements_x for enabling read/write operations beyond
the 2GB limit.
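The pattern is roughly (a sketch; MPI_Status_set_elements_x is the real MPI-3
routine, the surrounding variable names are illustrative):
    MPI_Count nelems = (MPI_Count)(total_bytes / type_size);
    MPI_Status_set_elements_x(status, datatype, nelems);
    /* MPI_Count is at least 64 bits, so counts past INT_MAX survive */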
Thanks to Richard Warren from the HDF5 group for reporting the issue
and providing the suggested fix for romio.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Individual read/write operations exceeding 2GB fail in ompio
due to improper conversions from size_t to int in two different
locations. This commit fixes an issue reported by Richard Warren
from the HDF5 group.
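An illustration of the failure mode (values invented):
    size_t bytes = (size_t)3 << 30;  /* a 3 GiB transfer               */
    int    len   = (int)bytes;       /* truncated: no longer 3 GiB     */
    /* the fix keeps the length in size_t (or MPI_Count) end to end   */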
Fixes Issue #397
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
One-off patch for v4.0.x. For some reason the commit on master
didn't have this problem.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 5f3dbdb5c8a94a4f426ecca1a3a91c83035f956c)
Note that this commit is actually a cherry-pick from the v4.0.x
branch. This is the opposite direction from what we normally do: we
usually commit to master first and then cherry-pick to the release
branches (vs. the other way around).
As is probably evident from the original commit message above, through
a comedy of errors, this commit was actually applied to the v4.0.x
branch first and then cherry-picked back to master (i.e., the problem
*did* exist in the original master commit
3aca4af548a3d781b6b52f89f4d6c7e66d379609, but it was not recognized at
the time).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
INTERNAL: STL-59403
The OFI (libfabric) MTL does not respect the maximum message size
parameter that OFI provides in the fi_info data.
This patch adds the missing max_msg_size field to the mca_ofi_module_t
structure and adds a length check to the low-level send routines.
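A minimal sketch of the guard, assuming the new field is populated from
fi_info->ep_attr->max_msg_size (the exact error handling is an assumption):
    if (length > ompi_mtl_ofi.max_msg_size) {
        /* the provider cannot accept a message this large */
        return OMPI_ERROR;  /* or a more specific error code */
    }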
Change-Id: I05aa71d332f2df897133b30c28bf37d98f061996
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Reviewed-by: Brendan Cunningham <brendan.cunningham@intel.com>
I'm restoring the info function pointers to the IO module
but allowing the function pointers to be NULL (e.g., in ompio).
And letting romio321 set its function pointers for those
routines.
This means the info system uses the new OMPI-level info
system for most things, but skips it and uses the pre-existing
romio info system just for the romio module.
It's possible to convert romio, but I went a ways down that
path and found it kind of convoluted. Having pointers from
the lower level ADIO_File back to the higher level ompi_file_t
wasn't too bad, but I got stuck trying to figure out where/how
to register the infosubscribe_subscribe callbacks vs the way
initial k/v values are scattered around the romio code currently.
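The dispatch idea, sketched with invented field names (only the NULL-able
function pointer concept comes from the change itself):
    /* romio321 installs its own handler; ompio leaves it NULL and the
     * new OMPI-level info system is used instead */
    if (NULL != module->io_file_set_info) {
        ret = module->io_file_set_info(fh, info);
    }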
Signed-off-by: Mark Allen <markalle@us.ibm.com>
open-mpi/ompi@0fe756d416 introduced
a bug in the coll/hcoll component. The ompi_requests allocated by
libhcoll would be treated as coll_base_nbc_request during the
ompi_coll_base_retain_<> calls. Afterwards this would lead to a
segv in the request cleanup.
Fix: since the libhcoll interface does not distinguish between
blocking and non-blocking requests, use coll_base_nbc_request all the
time and initialize it properly in
coll/hcoll/get_coll_handle(). It still fits within 2 cache lines.
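Sketched (the retained-object slots are an assumption about the layout of
coll_base_nbc_request):
    /* in get_coll_handle(): hand out the full-size nbc request and make
     * sure nothing looks retained before the request is first used */
    req->data.objs.objs[0] = NULL;
    req->data.objs.objs[1] = NULL;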
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
Within the shuffle iteration, the aggregators have to set a displacement array needed to receive data from other processes. The array had 1 extra element. We adjust the displacement index to match the number of elements.
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
A non-blocking collective might return ompi_request_null, so we should
not retain anything in that case.
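i.e., roughly (a sketch built around the retain helper mentioned above):
    if (MPI_REQUEST_NULL != *request) {  /* nothing to hang the cleanup on otherwise */
        ompi_coll_base_retain_op(*request, op, datatype);
    }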
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Since ompi_coll_base_nbc_request_t is to be used in an
opal_free_list_t, it must be returned to a "clean" state.
So clean up some data in the completion callback subroutines.
This fixes a regression introduced in open-mpi/ompi@0fe756d416
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Base ompi_coll_libnbc_request_t on top of ompi_coll_base_nbc_request_t
to correctly support the retention of datatypes/operators.
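A sketch of the layout (the real struct carries many more libnbc fields):
    struct ompi_coll_libnbc_request_t {
        ompi_coll_base_nbc_request_t super;  /* base type first, so the
                                              * retain/cleanup code can cast */
        /* ... libnbc-specific fields ... */
    };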
This fixes a regression introduced in open-mpi/ompi@0fe756d416
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
The PSM MTL for Intel's TrueScale InfiniBand HCAs is not being actively
maintained and should be removed from the master branch.
Fixes issue: #6877
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
Use the linear-with-sync alltoall algorithm for certain message/comm size
ranges. This does not affect the default fixed decision, unless HPCX (with
its custom parameters) is used or the corresponding MCA parameter is set.
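For example, a user can force this path explicitly with something like
'-mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_alltoall_algorithm <n>'
(check ompi_info for the exact algorithm number of linear-with-sync).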
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
The MPI standard states that a user MPI_Op and/or user MPI_Datatype can be
freed after a call to a non-blocking collective and before the non-blocking
collective completes.
Retain user (only) MPI_Op and MPI_Datatype objects when the non-blocking call
is invoked, and set a request callback so they are freed when the MPI_Request
completes.
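In other words, the following user pattern must work (a minimal sketch;
my_user_fn is any user reduction function):
    int sbuf[4] = {1, 2, 3, 4}, rbuf[4];
    MPI_Op op;
    MPI_Datatype ddt;
    MPI_Request req;
    MPI_Op_create(my_user_fn, 1, &op);
    MPI_Type_contiguous(4, MPI_INT, &ddt);
    MPI_Type_commit(&ddt);
    MPI_Iallreduce(sbuf, rbuf, 1, ddt, op, MPI_COMM_WORLD, &req);
    MPI_Op_free(&op);      /* legal before the collective completes */
    MPI_Type_free(&ddt);   /* ditto: the library must retain both   */
    MPI_Wait(&req, MPI_STATUS_IGNORE);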
Thanks to Thomas Ponweiser for reporting this.
Fixes open-mpi/ompi#2151
Fixes open-mpi/ompi#1304
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
To avoid fully initializing the osc/ucx component for MPI applications
that are not using one-sided functionality, the initialization happens
at the first MPI window creation.
This commit ensures atomicity of global state modifications.
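A generic sketch of that kind of once-only guard (plain C11 atomics here; the
component itself presumably uses OPAL's atomics and its own state variable):
    #include <stdatomic.h>

    static atomic_int init_state;  /* 0 = not started, 1 = in progress, 2 = done */

    static void do_one_time_setup(void) { /* stand-in for the expensive init */ }

    static void ensure_initialized(void)  /* illustrative name */
    {
        int expected = 0;
        if (atomic_compare_exchange_strong(&init_state, &expected, 1)) {
            do_one_time_setup();          /* exactly one thread gets here */
            atomic_store(&init_state, 2);
        } else {
            while (atomic_load(&init_state) != 2) { }  /* wait for the winner */
        }
    }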
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
In the common_ompi_aggregators calc_cost routine:
do not cast the real-valued division to an int intermediately.
This patch removes the obsolete int variable c and assigns
the result of the P_a/P_x division directly to n_as.
With the intermediate int c variable, n_as became 0 if P_a < P_x,
resulting in a division by zero when computing n_s.
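In effect (a reconstruction; the variable types are assumptions):
    double P_a, P_x, n_as;
    /* before: the intermediate int truncates the ratio, so n_as is 0
     * whenever P_a < P_x, and computing n_s later divides by zero */
    int c = (int)(P_a / P_x);
    n_as  = c;
    /* after: assign the real-valued division directly */
    n_as = P_a / P_x;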
Signed-off-by: Harald Klimach <harald.klimach@uni-siegen.de>