openmpi

Author	SHA1	Message	Date
Mikhail Kurnosov	4cbcff7fcd	coll/base: add recursive doubling algorithm for MPI_Reduce_scatter_block Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-04-23 11:02:31 +07:00
raafatfeki	91e028f7fd	fcoll/dynamic_gen2: Reduce number of realloc calls keep track of the size of the blocklen_per_process and displs_per_process on the aggregator data structure to minimize the number of realloc function calls required in the shuffle_init operation. Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-04-20 10:13:57 -05:00
Nathan Hjelm	84765001aa	io/romio: do not use removed functions This commit attempts to update the romio io component to not use functions removed in MPI-3.0 (2012). This is a first cut and will probably need to be reviewed for correctness. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-04-16 12:06:52 -06:00
Nathan Hjelm	4d876ec6fe	io/romio314: fix minmax datatypes romio assumes that all predefined datatypes are contiguous. Because of the (terribly named) composed datatypes MPI_SHORT_INT, MPI_DOUBLE_INT, MPI_LONG_INT, etc this is an incorrect assumption. The simplest way to fix this is to override the MPI_Type_get_envelope and MPI_Type_get_contents calls with calls that will work on these datatypes. Note that not all calls to these MPI functions are replaced, only the ones used when flattening a non-contiguous datatype. References #5009 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-04-16 10:46:38 -06:00
George Bosilca	6ff11267fb	Remove warnings identified by clang. Plus minor spacing and indentation issues. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-04-14 17:14:12 -04:00
George Bosilca	cd683e3eec	Allow OPAL DDT to receive size_t count argument. Fixes issue #5069, which relates to a BigMPI bug with the use of MPI_Type_vector to construct very large datatypes (>2GB). Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-04-14 15:32:19 -04:00
Todd Kordenbrock	d646a00cd9	Merge pull request #5054 from tkordenbrock/topic/master/mtl-portals4.finalize.fix master: mtl-portals4: don't call progress() in finalize() if Portals4 was not initialized	2018-04-12 12:12:05 -05:00
Todd Kordenbrock	90659671bc	mtl-portals4: don't call progress() in finalize() if Portals4 was not initialized This commit fixes a segfault in mtl-portals4 finalize(). The segfault occurs if finalize() is called without any calls to add_procs(). This commit resolves the segfault by skipping the progress() loop in finalize() if Portals4 was not initialized. Signed-off-by: Todd Kordenbrock (thkgcode@gmail.com)	2018-04-10 14:22:32 -05:00
Mikhail Kurnosov	82a3a5bdb5	Fix dynamic decision for Scan and bug in Allreduce Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-04-06 11:03:17 +07:00
KAWASHIMA Takahiro	5e12e0f2e5	Merge pull request #5001 from ggouaillardet/topic/javah configury: use javac vs javah whenever possible.	2018-04-06 11:30:20 +09:00
Jeff Squyres	fc8ebbb0e0	MPI_Comm_spawn_multiple.3in: update Fortran string array notes Per 0ab6b201fed, note in the MPI_Comm_spawn_multiple.3in man page that the array_of_commands does not need to be terminated -- it just needs to have exactly "count" entries. In the Fortran binding, at least, this is different than in prior released versions of Open MPI (it's not a backwards incompatibility, since prior versions of Open MPI required array_of_commands to be blank-string-terminated in Fortran -- this change makes Open MPI less restrictive, and therefore still backwards compatible). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-04-05 06:52:46 -07:00
Jeff Squyres	0ab6b201fe	mpi/fortran: fix parsing arrays of Fortran strings MPI defines the "argv" param to Fortran MPI_COMM_SPAWN as being terminated by a blank string. While not precisely defined (except through a non-binding example, Example 10.2, MPI-3.1 p382:6-29), one can infer that the "array_of_argv" param to Fortran MPI_COMM_SPAWN_MULTIPLE is also a set of argv, each of which are terminated by a blank line. The "array_of_commands" argument to Fortran MPI_COMM_SPAWN_MULTIPLE is a little less well-defined. It is assumed to be of length "count" (another parameter to MPI_COMM_SPAWN_MULTIPLE) -- and not be terminated by a blank string. This is also given credence by the same example 10.2 in MPI-3.1. The previous code assumed that "array_of_commands" should also be terminated by a blank line -- but per the above, this is incorrect. Instead, we should just parse our "count" number of strings from "array_of_commands" and not look for a blank line termination. This commit separates these two cases: * ompi_fortran_argv_blank_f2c(): parse a Fortran array of strings out and stop when reaching a blank string. * ompi_fortran_argv_count_f2c(): parse a Fortran array of strings out and stop when "count" number of strings have been parsed. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-04-04 18:56:44 -07:00
Gilles Gouaillardet	5370586d98	configury: use javac vs javah whenever possible javah is no longer available as of Java 10, so try javac -h first (available since Java 8) and fall back on javah Refs. open-mpi/ompi#5000 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-05 10:37:35 +09:00
Gilles Gouaillardet	132ea1a6b0	java: cleanup the list of automatically generated header files Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-05 09:27:34 +09:00
Edgar Gabriel	ef28d941d9	Merge pull request #5002 from raafatfeki/pr/coverty-dynamic_gen2-fixes fcoll/dynamic_gen2: fix Coverity warnings	2018-04-04 09:08:42 -05:00
Gilles Gouaillardet	e85fa469f3	coll/tuned: add recursive doubling algo for [ex]scan Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-04 14:56:23 +09:00
Gilles Gouaillardet	393376bbd9	coll/basic: move [ex]scan from coll/basic to coll/base Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-04 13:41:01 +09:00
Gilles Gouaillardet	65fa0b59c3	coll/tuned: add Rabenseifner algo for [all]reduce Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-04 13:25:41 +09:00
Mikhail Kurnosov	177c6ce51f	Move algorithms from coll/spacc to coll/base and remove coll/spacc Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-04-04 10:21:06 +07:00
raafatfeki	5d99af29cd	fcoll/dynamic_gen2: Formatting fixes Adjust Coding Style to match the 4 space tab rule. Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-04-02 17:25:00 -05:00
raafatfeki	92822613ea	fcoll/dynamic_gen2: fix Coverity warnings fix warnings for Coverity CID 1433655 and CID 1433654 Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-04-02 16:18:07 -05:00
Mikhail Kurnosov	1d2d43bdf0	Fix compile error with dtype Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-04-01 08:27:34 +07:00
Edgar Gabriel	c4879ec29f	io/ompio: don't reset amode if MODE_SEQUENTIAL is set The ompio module resets the amode from WRONLY to RDWR in order to accommodate data sieving in the two-phase fcoll component. This leads however to an error if MPI_MODE_SEQUENTIAL has been requested by the user, since MODE_SEQUENTIAL is incompatible with MODE_RDWR. Since the change to the amode was done after opening the file for individual file pointers but before opening the file for shared file pointers, this led to an error message in the sharedfp component. Note that data sieving is never necessary if MODE_SEQUENTIAL is set, so this should not be a problem for any scenario. Fixes #4991 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-30 07:56:47 -05:00
Mikhail Kurnosov	50ec214d42	Add recursive doubling algorithm for MPI_Scan and MPI_Exscan to coll/base Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-03-30 10:12:51 +07:00
raafatfeki	100677721d	fcoll/dynamic_gen2: use hindexed constructor on the sender side instead of using a temporary buffer and copy data into the temp buffer before sending, use a derived datatype to describe the data that needs to be sent during a cycle in the collective I/O operation. Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-03-28 14:37:30 -05:00
Mikhail Kurnosov	bd12e2b1c6	Add recursive doubling algorithm for Scan and Exscan Implements recursive doubling algorithm for MPI_Scan and MPI_Exscan. The algorithm preserves order of operations so it can be used both by commutative and non-commutative operations. Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-03-28 16:27:11 +07:00
Nathan Hjelm	e79debc320	osc/rdma: fix overflow in offset calculation This commit fixes a bug in osc/rdma that can occur if the total size of the shared memory segment gets larger than 4 GiB. The bug was caused by a typo. The type of my_base_offset should have been size_t not int. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-27 09:33:44 -06:00
Nathan Hjelm	f7faacca4e	osc/rdma: fix 32-bit builds Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-27 09:16:04 -06:00
Jeff Squyres	06af6f1c4c	Merge pull request #4962 from jsquyres/pr/cid-fixes A bunch of CID fixes	2018-03-26 22:30:31 -04:00
Ralph Castain	f92acd735b	Merge pull request #4965 from rhc54/topic/rank Fix breakage in ranking system and silence OSC/RDMA warnings	2018-03-26 19:10:36 -05:00
Jeff Squyres	5360035995	topo/treematch: fix CID 1416327 Ensure to free things in the right order so that we don't access memory after it is freed. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:26:17 -07:00
Jeff Squyres	08ceb66a19	osc/pt2pt: fix (effectively false positive) CID 1402113 This will almost certainly never happen, but be defensive and guarantee that we never return an uninitialized variable. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:26:17 -07:00
Jeff Squyres	9de750a280	io/ompio: fix CID 1269889 Free some memory upon error conditions. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	dca66b9775	comm_join: fix CID 1323170 Ensure that the port name is always NULL-terminated. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	6319292170	fcoll/static: fix CID 1413066 local_iov_array is unconditionally allocated, so unconditionally de-allocate it, too. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	2968ffa296	fcoll/static: remove useless/dead code Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	8e925b4f17	fbtl/posix: fix CID 1419954 Ensure that ret_code is initialized. This problem will likely never occur in practice, but we might as well be defensive about it. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	124208198c	osc/rdma: fix CID 1424327 Fix minor memory leak. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Nathan Hjelm	1c75aa82fc	use-mpi-f08: fix rma function signatures The various RMA functions need to have the asynchronous property on all buffers. This property was missing and some buffers were incorrectly marked as intent(in). This commit fixes the function signatures. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-26 15:11:07 -06:00
Ralph Castain	3a93b535ec	Silence the flood of OSC/RDMA warnings Fixes #4950 Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-03-25 16:12:41 -07:00
Artem Polyakov	77ff99e9ee	Merge pull request #4933 from karasevb/timings_update timings: added new timing points	2018-03-25 00:10:49 -07:00
Jeff Squyres	871e5c76bc	Merge pull request #4960 from jsquyres/pr/warnings-fixes Coverity fix + compiler warning fixes	2018-03-23 14:47:56 -05:00
Jeff Squyres	c3adcb05eb	Miscellaneous compiler warnings fixes Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-23 11:45:30 -07:00
Nathan Hjelm	5f7ff5307e	fcoll/two_phase: do not use removed function (MPI_Address) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-23 08:43:24 -06:00
Edgar Gabriel	36747cca67	io/ompio: disable the fcoll timing by default Somehow the flag indicating to gather performance data on collective io operations has changed to 1 accidentally. Should be 0 (false) by default. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:34:35 -05:00
Edgar Gabriel	aae8c6c6ad	remove addproc sharedfp component never got to move this sharedfp component into anything usable. Can easily be restored if necessary. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:27:01 -05:00
Edgar Gabriel	e703ac2da8	remove plfs components plfs components are at this point not utilized by anybody as far as I know. Easy to bring back if we want to. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:27:01 -05:00
Boris Karasev	3796307a57	timings: added new timing points Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-03-21 05:16:25 +02:00
bosilca	bf3dd8af19	Merge pull request #4884 from bosilca/topic/fix_wtime Improve the range and accuracy of MPI_Wtime.	2018-03-16 14:09:33 +09:00
Nathan Hjelm	7f4872d483	osc/rdma: performance improvements and bug fixes This commit is a large update to the osc/rdma component. Included in this commit: - Add support for using hardware atomics for fetch-and-op and single count accumulate when using the accumulate lock. This will improve the performance of these operations even when not setting the single intrinsic info key. - Rework how large accumulates are done. They now block on the get operation to fix some bugs discovered by an IBM one-sided test. I may roll back some of the changes if the underlying bug in the original design is discovered. There appears to be no real difference (on the hardware this was tested with) in performance so it's probably a non-issue. References #2530. - Add support for an additional lock-all algorithm: on-demand. The on-demand algorithm will attempt to acquire the peer lock when starting an RMA operation. The lock algorithm default has not changed. The algorithm can be selected by setting the osc_rdma_locking_mode MCA variable. The valid values are two_level and on_demand. - Make use of the btl_flush function if available. This can improve performance with some btls. - When using btl_flush do not keep track of the number of put operations. This reduces the number of atomic operations in the critical path. - Make the window buffers more friendly to multi-threaded applications. This was done by dropping support for multiple buffers per MPI window. I intend to re-add that support once the underlying performance bug under the old buffering scheme is fixed. - Fix a bug in request completion in the accumulate, get, and put paths. This also helps with #2530. - General code cleanup and fixes. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-15 14:53:53 -06:00
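
Commit bd12e2b1c6 in the log above states that the recursive doubling algorithm for MPI_Scan preserves the order of operations, so it works for non-commutative operations too. The following is a minimal single-process sketch of that idea, not the code from coll/base: the "ranks" are array slots, the exchange is a plain copy, and the operation is string concatenation, chosen only because it is visibly non-commutative. The names scan_recursive_doubling, concat, MAX_RANKS, and BUF_LEN are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_RANKS 16
#define BUF_LEN (MAX_RANKS + 1)

/* Non-commutative example operation: out = left followed by right.
 * A temporary is used so that out may alias left or right. */
static void concat(char *out, const char *left, const char *right)
{
    char tmp[BUF_LEN];
    snprintf(tmp, sizeof(tmp), "%s%s", left, right);
    strcpy(out, tmp);
}

/* Simulate an inclusive scan over p ranks (assumption: 1 <= p <= MAX_RANKS).
 * in[r] is rank r's one-character contribution; on return result[r] holds
 * in[0] in[1] ... in[r] combined in rank order. Returns 0 on success. */
static int scan_recursive_doubling(int p, const char *in,
                                   char result[][BUF_LEN])
{
    char psend[MAX_RANKS][BUF_LEN]; /* partial result a rank sends next */
    char precv[MAX_RANKS][BUF_LEN]; /* partial result received this round */

    if (p < 1 || p > MAX_RANKS) {
        return -1;
    }
    for (int r = 0; r < p; r++) {
        result[r][0] = in[r];
        result[r][1] = '\0';
        strcpy(psend[r], result[r]);
    }
    for (int mask = 1; mask < p; mask <<= 1) {
        /* Exchange phase: each rank swaps psend with rank XOR mask. */
        for (int r = 0; r < p; r++) {
            int remote = r ^ mask;
            if (remote < p) {
                strcpy(precv[r], psend[remote]);
            }
        }
        /* Combine phase: data from a lower rank always goes on the left. */
        for (int r = 0; r < p; r++) {
            int remote = r ^ mask;
            if (remote >= p) {
                continue; /* no partner in this round */
            }
            if (r > remote) {
                concat(result[r], precv[r], result[r]);
                concat(psend[r], precv[r], psend[r]);
            } else {
                concat(psend[r], psend[r], precv[r]);
            }
        }
    }
    return 0;
}
```

Only data received from a lower rank is folded into a rank's result, and it is always placed on the left. That ordering rule is what keeps the scan correct for a non-commutative operation, at the cost of tracking a separate send buffer per rank.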


9977 Commits
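
Commit 0ab6b201fe in the log above separates two ways of parsing a Fortran array of strings: stop at the first blank string, or read exactly "count" strings. The sketch below illustrates the difference and is not the Open MPI implementation. It assumes the Fortran array arrives as one contiguous buffer of fixed-width, blank-padded entries with no NUL terminators, and that in the blank-terminated case the caller guarantees a blank entry exists. The names sketch_argv_count, sketch_argv_blank, sketch_argv_free, and trim_copy are hypothetical stand-ins, not the functions named in the commit.

```c
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated argv produced by the functions below. */
static void sketch_argv_free(char **argv)
{
    if (argv == NULL) {
        return;
    }
    for (char **p = argv; *p != NULL; p++) {
        free(*p);
    }
    free(argv);
}

/* Copy one fixed-width Fortran string into a fresh C string,
 * dropping the trailing blank padding. */
static char *trim_copy(const char *s, size_t len)
{
    while (len > 0 && s[len - 1] == ' ') {
        len--;
    }
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, s, len);
    out[len] = '\0';
    return out;
}

/* Count-based parse: convert exactly `count` entries, blank or not. */
static char **sketch_argv_count(const char *array, size_t string_len,
                                int count)
{
    if (count < 0) {
        return NULL;
    }
    char **argv = calloc((size_t)count + 1, sizeof(*argv));
    if (argv == NULL) {
        return NULL;
    }
    for (int i = 0; i < count; i++) {
        argv[i] = trim_copy(array + (size_t)i * string_len, string_len);
        if (argv[i] == NULL) {
            sketch_argv_free(argv);
            return NULL;
        }
    }
    return argv; /* argv[count] is NULL thanks to calloc */
}

/* Blank-terminated parse: stop at the first entry that is all blanks.
 * Assumption: such an entry exists within the buffer. */
static char **sketch_argv_blank(const char *array, size_t string_len)
{
    int count = 0;
    for (;;) {
        const char *s = array + (size_t)count * string_len;
        size_t k = 0;
        while (k < string_len && s[k] == ' ') {
            k++;
        }
        if (k == string_len) {
            break; /* all-blank entry marks the end */
        }
        count++;
    }
    return sketch_argv_count(array, string_len, count);
}
```

With three 4-character entries "ls  ", "-l  ", and "    ", the blank-terminated parse yields two strings, while a count-based parse with count 3 yields three, the last one empty.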