openmpi

Author	SHA1	Message	Date
raafatfeki	91e028f7fd	fcoll/dynamic_gen2: Reduce number of realloc calls keep track of the sizeof the blocklen_per_process and displs_per_process on the aggregator datastructure to minimze the number of realloc function calls required in the shuffle_init operation. Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-04-20 10:13:57 -05:00
Nathan Hjelm	84765001aa	io/romio: do not use removed functions This commit attempts to update the romio io component to not use functions removed in MPI-3.0 (2012). This is a first cut and will probably need to be reviewed for correctness. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-04-16 12:06:52 -06:00
Nathan Hjelm	4d876ec6fe	io/romio314: fix minmax datatypes romio assumes that all predefined datatypes are contiguous. Because of the (terribly named) composed datatypes MPI_SHORT_INT, MPI_DOUBLE_INT, MPI_LONG_INT, etc this is an incorrect assumption. The simplest way to fix this is to override the MPI_Type_get_envelope and MPI_Type_get_contents calls with calls that will work on these datatypes. Note that not all calls to these MPI functions are replaced, only the ones used when flattening a non-contiguous datatype. References #5009 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-04-16 10:46:38 -06:00
George Bosilca	6ff11267fb	Remove warnings identified by clang. Plus minor spacing and indentation issues. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-04-14 17:14:12 -04:00
Todd Kordenbrock	d646a00cd9	Merge pull request #5054 from tkordenbrock/topic/master/mtl-portals4.finalize.fix master: mtl-portals4: don't call progress() in finalize() if Portals4 was not initialized	2018-04-12 12:12:05 -05:00
Todd Kordenbrock	90659671bc	mtl-portals4: don't call progress() in finalize() if Portals4 was not initialized This commit fixes a segfault in mtl-portals4 finalize(). The segfault occurs if finalize() is called without any calls to add_procs(). This commit resolves the segfault by skipping the progress() loop in finalize() if the Portals was not initialized. Signed-off-by: Todd Kordenbrock (thkgcode@gmail.com)	2018-04-10 14:22:32 -05:00
Mikhail Kurnosov	82a3a5bdb5	Fix dynamic decision for Scan and bug in Allreduce Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-04-06 11:03:17 +07:00
Edgar Gabriel	ef28d941d9	Merge pull request #5002 from raafatfeki/pr/coverty-dynamic_gen2-fixes fcoll/dynamic_gen2: fix coverty warnings	2018-04-04 09:08:42 -05:00
Gilles Gouaillardet	e85fa469f3	coll/tuned: add recursive doubling algo for [ex]scan Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-04 14:56:23 +09:00
Gilles Gouaillardet	393376bbd9	coll/basic: move [ex]scan from coll/basic to coll/base Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-04 13:41:01 +09:00
Gilles Gouaillardet	65fa0b59c3	coll/tuned: add Rabenseifner algo for [all]reduce Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-04-04 13:25:41 +09:00
Mikhail Kurnosov	177c6ce51f	Move algorithms from coll/spacc to coll/base and remove coll/spacc Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-04-04 10:21:06 +07:00
raafatfeki	5d99af29cd	fcoll/dynamic_gen2: Formatting fixes Adjust Coding Style to match the 4 space tab rule. Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-04-02 17:25:00 -05:00
raafatfeki	92822613ea	fcoll/dynamic_gen2: fix coverty warnings fix warnings for coverty CID 1433655 and CID 1433654 Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-04-02 16:18:07 -05:00
Mikhail Kurnosov	1d2d43bdf0	Fix compile error with dtype Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-04-01 08:27:34 +07:00
Edgar Gabriel	c4879ec29f	io/ompio: don't reset amode if MODE_SEQUENTIAL is set the ompio module resets the amode from WRONLY to RDWR in order to accoomodate data sieving in the two-phase fcoll componet. This leads however to an error if MPI_MODE_SEQUENTIAL has been requested by the user, since MODE_SEQUENTIAL is incompatible with MODE_RDWR. SInce the change to the amode was done after opening the file for individual file pointers but before opening the file for shared filepointers, this lead to an error message in the sharedfp component. Note, that data sieving is never necessary if MODE_SEQUENTIAL is set, so this should not be a problem for any scenario. Fixes #4991 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-30 07:56:47 -05:00
Mikhail Kurnosov	50ec214d42	Add recursive doubling algorithm for MPI_Scan and MPI_Exscan to coll/base Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-03-30 10:12:51 +07:00
raafatfeki	100677721d	fcoll/dynamic_gen2: use hindexed constructor on the sender side instead of using a temporary buffer and copy data into the temp buffer before sending, use a derived datatype to describe the data that needs to be sent during a cycle in the collective I/O operation. Signed-off-by: raafatfeki <fekiraafat@gmail.com>	2018-03-28 14:37:30 -05:00
Mikhail Kurnosov	bd12e2b1c6	Add recursive doubling algorithm for Scan and Exscan Implements recursive doubling algorithm for MPI_Scan and MPI_Exscan. The algorithm preserves order of operations so it can be used both by commutative and non-commutative operations. Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-03-28 16:27:11 +07:00
Nathan Hjelm	e79debc320	osc/rdma: fix overflow in offset calculation This commit fixes a bug is osc/rdma that can occur if the total size of the shared memory segment gets larger than 4 GiB. The bug was caused by a typo. The type of my_base_offset should have been size_t not int. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-27 09:33:44 -06:00
Nathan Hjelm	f7faacca4e	osc/rdma: fix 32-bit builds Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-27 09:16:04 -06:00
Jeff Squyres	06af6f1c4c	Merge pull request #4962 from jsquyres/pr/cid-fixes A bunch of CID fixes	2018-03-26 22:30:31 -04:00
Jeff Squyres	5360035995	topo/treematch: fix CID 1416327 Ensure to free things in the right order so that we don't access memory after it is freed. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:26:17 -07:00
Jeff Squyres	08ceb66a19	osc/pt2pt: fix (effectively false positive) CID 1402113 This will almost certainly never happen, but be defensive and guarantee that we never return an uninitialized variable. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:26:17 -07:00
Jeff Squyres	9de750a280	io/ompio: fix CID 1269889 Free some memory upon error conditions. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	6319292170	fcoll/static: fix CID 1413066 local_iov_array is unconditionally allocated, so unconditionally de-allocate it, too. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	2968ffa296	fcoll/static: remove useless/dead code Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	8e925b4f17	fbtl/posix: fix CID 1419954 Ensure to initialized ret_code. This problem will likely never occur in practice, but we might as well be defensive about it. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Jeff Squyres	124208198c	osc/rdma: fix CID 1424327 Fix minor memory leak. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-26 14:21:21 -07:00
Ralph Castain	3a93b535ec	Silence the flood of OSC/RDMA warnings Fixes #4950 Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2018-03-25 16:12:41 -07:00
Jeff Squyres	871e5c76bc	Merge pull request #4960 from jsquyres/pr/warnings-fixes Coverity fix + compiler warning fixes	2018-03-23 14:47:56 -05:00
Jeff Squyres	c3adcb05eb	Miscellaneous compiler warnings fixes Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-03-23 11:45:30 -07:00
Nathan Hjelm	5f7ff5307e	fcoll/two_phase: do not use removed function (MPI_Address) Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-23 08:43:24 -06:00
Edgar Gabriel	36747cca67	io/ompio: disable the fcoll timing by default somehow the flag indicating to gather performance data on collective io operations has changed to 1 accidentally. Should be 0 ( false) by default. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:34:35 -05:00
Edgar Gabriel	aae8c6c6ad	remove addproc sharedfp component never got to move this sharedfp component into anything usable. Can easily be restored if necessary. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:27:01 -05:00
Edgar Gabriel	e703ac2da8	remove plfs components plfs components are at this point not utilized by anybody as far as I know. Easy to bring back if we want to. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-21 11:27:01 -05:00
Nathan Hjelm	7f4872d483	osc/rdma: performance improvments and bug fixes This commit is a large update to the osc/rdma component. Included in this commit: - Add support for using hardware atomics for fetch-and-op and single count accumulate when using the accumulate lock. This will improve the performance of these operations even when not setting the single intrinsic info key. - Rework how large accumulates are done. They now block on the get operation to fix some bugs discovered by an IBM one-sided test. I may roll back some of the changes if the underlying bug in the original design is discovered. There appear to be no real difference (on the hardware this was tested with) in performance so its probably a non-issue. References #2530. - Add support for an additional lock-all algorithm: on-demand. The on-demand algorithm will attempt to acquire the peer lock when starting an RMA operation. The lock algorithm default has not changed. The algorithm can be selected by setting the osc_rdma_locking_mode MCA variable. The valid values are two_level and on_demand. - Make use of the btl_flush function if available. This can improve performance with some btls. - When using btl_flush do not keep track of the number of put operations. This reduces the number of atomic operations in the critical path. - Make the window buffers more friendly to multi-threaded applications. This was done by dropping support for multiple buffers per MPI window. I intend to re-add that support once the underlying performance bug under the old buffering scheme is fixed. - Fix a bug in request completion in the accumulate, get, and put paths. This also helps with #2530. - General code cleanup and fixes. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-03-15 14:53:53 -06:00
Edgar Gabriel	da640f98df	fcoll/two_phase: data sieving has to occur at offset 0 as well data sieving has to occur for any offset provided that is larger or equal zero for this implementation to work correctly. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-10 11:23:09 -06:00
Edgar Gabriel	c83b47c266	io/romio314: mark datatypes of size 0 as contiguous this commit fixes an issue observed with romio314 and the hdf5 1.10.x testsuite. The ADIOI_Datatype_iscontig() routine in romio314/src/io_romio314_module.c will now return for a datatype of size 0 that it is contiguous, even if the extent of the datatype is non-zero. This avoids a segmentation fault observed in the ADIOI_Flatten routine, and fixes this particular with the hdf5 1.10.x testsuite in OpenMPI with romio314. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-03-08 09:10:09 -06:00
bosilca	9944d63de1	Merge pull request #4852 from thananon/pr/ob1_oos_fix pml/ob1: fixed out of sequence bug.	2018-02-28 13:02:03 -05:00
Thananon Patinyasakdikul	09cba8b30b	pml/ob1: fixed out of sequence bug. This commit fixes #4795 - Fixed typo that sometimes causes deadlock in change of protocol. - Redesigned out of sequence ordering and address the overflow case of sequence number from uint16_t. Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>	2018-02-27 13:49:40 -05:00
Valentin Petrov	bf4e694a96	coll/hcoll: Fix return codes Signed-off-by: Valentin Petrov <valentinp@mellanox.com>	2018-02-22 17:48:29 +02:00
Matias Cabral	0a822f8f99	Merge pull request #4821 from nrspruit/OFI_mtl_multi_event_progress MTL OFI: Added support for reading multiple CQ events in ofi progress	2018-02-20 14:59:47 -08:00
Jeff Squyres	9ef0f3d83a	ompi/monitoring: add .sh versionig to common monitoring lib Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-02-20 07:07:23 -08:00
Spruit, Neil R	e7bff501cd	MTL OFI: Added support for reading multiple CQ events in ofi progress -Updated ompi_mtl_ofi_progress to use an array to read CQ events up to a threshold that can be set by the Open MPI User. -Users can adjust the number of events that can be handled in the ompi_mtl_ofi_progress by setting "--mca mtl_ofi_progress_event_cnt #". -The default value for the the number of CQ events that can be read in a single call to ofi progress is 100 which is an average based off workload usecase anaylsis showing 70-128 as the range of multiple events returned during ofi progress. Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>	2018-02-15 09:41:14 -05:00
Nathan Hjelm	0e83568466	coll/libnbc: do not take lock in progress if there are no requests This commit fixes a flaw in the progress function for libnbc. The function was unconditionally taking a lock even if there are no requests to process. This lock was showing up in vtune traces of multi-threaded benchmarks. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-02-13 09:51:01 -07:00
Edgar Gabriel	a3a734b6d2	io/ompio: correctly reset the request after performing the final OBJ_RELEASE on the request, reset the user level variable to MPI_REQUEST_NULL. Otherwise the c_2_f translation step in the fortran interface fails. Fixes issue #4807 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-02-13 09:18:25 -06:00
Jeff Squyres	e7f91f8068	Merge pull request #4527 from clementFoyer/osc-no-includes Remove inter-dependencies between OSC modules.	2018-02-09 15:49:56 -05:00
Nathan Hjelm	da9f833f4a	pml/ob1: ignore the eager limit of RDMA-only btls This commit fixes a flaw in the eager limit check in pml/ob1. The check was incorrectly checking if RDMA-only BTLs (BTLs without the send flag) has a valid eager limit. This commit fixes the check by adding an additional check for the send flag on the BTL module. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-02-07 12:42:44 -07:00
Clement Foyer	f5b4fc05f8	Remove inter-dependencies between OSC modules. The osc monitoring component needed to include other OSC components header in order to be able tu access communicator through the component specific ompi_osc__module_t structures. This commit remove the dependency, and resolve the issue #4523. Extend the common monitoring API. Now it's possible to translate from local rank to world rank from both the communicator and the group. * Remove useless hashtable as we directly use the w_group contained in window structure. Add automatic generation at config time. The templates are expanded at configure time. It creates a new header file that generates all the variables/functions needed. Adding this during the autogen automagicaly generates for each of the available modules the proper functions. Only keep a generated argv-style array. Following Jeff's advice, the configure.m4 file generate a simple array of module variables to be iterated over to find the proper module. Signed-off-by: Clement Foyer <clement.foyer@inria.fr>	2018-02-07 11:52:00 +00:00

6935 commits