osc/rdma uses counters to determine whether all messages have been received
before exiting synchronization calls. The problem is that the active
target counter is monotonically increasing (never zeroed), so once more
than 2^31-1 messages have been sent the counter overflows (which, by
itself, is not an error). The overflow causes test/wait to return before
the communication is complete. There is an additional error in the use of
the fragment flush function: when PSCW synchronization is in use, this
function must not be called until a post message has arrived.
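A wraparound-safe completion test sidesteps the overflow. A minimal sketch
in C, assuming hypothetical counter names (these are not the actual
osc/rdma symbols):

    #include <stdbool.h>
    #include <stdint.h>

    /* Compare two monotonically increasing 32-bit counters in a
     * wraparound-safe way. Unsigned subtraction is well defined modulo
     * 2^32, so the test stays correct even after either counter wraps,
     * provided the two never drift 2^31 or more apart. */
    static inline bool msgs_complete (uint32_t expected, uint32_t received)
    {
        return (int32_t) (received - expected) >= 0;
    }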
Relevant mailing list thread: http://www.open-mpi.org/community/lists/devel/2014/10/16016.php
This commit fixes both issues. Tested against MTT and the issue reproducer.
Closes #224.
Remove configure.params support: configure.params hasn't been used in
years.
Also remove autogen.subdirs support; subdirectories should really be
handled by their respective Makefile.am's.
A problem was found with the libnbc MPI_Iallgather
routine when using intercommunicators. Special
thanks to Takahiro Kawashima (Fujitsu) for the patch
and a test case. Verified that master fails without the
patch and that the test passes with the patch applied.
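A minimal reproducer in the same spirit (an illustrative sketch, not the
actual Fujitsu test case; run with an even number of at least two ranks):

    #include <mpi.h>

    int main (int argc, char **argv)
    {
        int rank, size, color, remote_size;
        MPI_Comm intra, inter;
        MPI_Request req;

        MPI_Init (&argc, &argv);
        MPI_Comm_rank (MPI_COMM_WORLD, &rank);
        MPI_Comm_size (MPI_COMM_WORLD, &size);

        /* Split world into two halves and bridge them with an
         * intercommunicator. */
        color = (rank < size / 2);
        MPI_Comm_split (MPI_COMM_WORLD, color, rank, &intra);
        MPI_Intercomm_create (intra, 0, MPI_COMM_WORLD,
                              color ? size / 2 : 0, 0, &inter);

        MPI_Comm_remote_size (inter, &remote_size);
        int sendval = rank;
        int recvbuf[remote_size];

        /* Each group gathers one int from every rank of the remote
         * group; this is the path exercised by the fix. */
        MPI_Iallgather (&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT,
                        inter, &req);
        MPI_Wait (&req, MPI_STATUS_IGNORE);

        MPI_Comm_free (&inter);
        MPI_Comm_free (&intra);
        MPI_Finalize ();
        return 0;
    }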
Fixes #219
Initialize the blocking_fence flag to false, as the code logic indicates it should only be set when someone provides that flag.
Thanks to Lisandro Dalcin for reporting it.
cmr=v1.8.4:reviewer=hjelmn
This commit was SVN r32812.
Fix the case where the degree of the topology is higher than the communicator size
It is possible to have a topology degree higher than the size of the communicator.
For example, a periodic Cartesian communicator on MPI_COMM_SELF. This leaves
the neighborhood collectives with a request buffer that is too small. This commit
adds a call that dynamically increases the size of the request buffer if it
is too small.
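The idea, in a hedged sketch with illustrative names (not the actual
component symbols):

    #include <mpi.h>
    #include <stdlib.h>

    /* Grow a preallocated request array when the topology degree
     * requires more requests than the communicator size suggested. */
    static int ensure_requests (MPI_Request **reqs, int *capacity, int needed)
    {
        if (needed <= *capacity) {
            return MPI_SUCCESS;
        }
        MPI_Request *tmp = realloc (*reqs,
                                    (size_t) needed * sizeof (MPI_Request));
        if (NULL == tmp) {
            return MPI_ERR_NO_MEM;
        }
        *reqs = tmp;
        *capacity = needed;
        return MPI_SUCCESS;
    }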
A better fix would be to create the topology *before* calling the coll_select
routine on a communicator. This will take some discussion, and the solution
will likely not be ready anytime soon.
Thanks to Lisandro Dalcin for reporting this.
Original thread: http://www.open-mpi.org/community/lists/devel/2014/08/15713.php
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32796.
reviewed by miked
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32753.
The following SVN revision numbers were found above:
r32735 --> open-mpi/ompi@5fecf65daf
reviewed by miked
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32740.
The following SVN revision numbers were found above:
r32735 --> open-mpi/ompi@5fecf65daf
Don't return an error if no sharedfp component is selected unless the
sharedfp functionality is being used. Return an error, however, if no
sharedfp component is selected and the application calls a
file_read/write_shared function.
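For illustration, the call that now fails cleanly (the default error
handler on file handles is MPI_ERRORS_RETURN, so the return code can be
checked directly):

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char **argv)
    {
        MPI_File fh;
        int buf = 42, rc;

        MPI_Init (&argc, &argv);
        MPI_File_open (MPI_COMM_WORLD, "out.dat",
                       MPI_MODE_CREATE | MPI_MODE_WRONLY,
                       MPI_INFO_NULL, &fh);

        /* Uses the shared file pointer: this is the case that returns
         * an error when no sharedfp component was selected. */
        rc = MPI_File_write_shared (fh, &buf, 1, MPI_INT,
                                    MPI_STATUS_IGNORE);
        if (MPI_SUCCESS != rc) {
            fprintf (stderr, "shared file pointer unavailable\n");
        }

        MPI_File_close (&fh);
        MPI_Finalize ();
        return 0;
    }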
This commit was SVN r32718.
should not be stored on the file handle anyway, since it is not a property of
the file.
- protect a realloc for zero-byte scenarios (a minimal sketch of the guard follows).
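A minimal sketch of such a guard (illustrative names; realloc with a size
of zero has implementation-defined behavior and may return NULL):

    #include <stdlib.h>

    static void *grow_buffer (void *buf, size_t new_len)
    {
        if (0 == new_len) {
            /* realloc (buf, 0) may free buf and/or return NULL,
             * depending on the implementation; keep the old buffer. */
            return buf;
        }
        return realloc (buf, new_len);
    }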
This commit was SVN r32678.
Fix the reporting of the number of bytes written and read. The status now
contains the actual number of bytes written for individual operations.
For collective operations, this is unfortunately not possible.
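For example, the count can now be recovered from the status of an
individual operation (an illustrative sketch):

    #include <mpi.h>
    #include <stdio.h>

    int main (int argc, char **argv)
    {
        MPI_File fh;
        MPI_Status status;
        int buf[4] = { 0, 1, 2, 3 }, count;

        MPI_Init (&argc, &argv);
        MPI_File_open (MPI_COMM_SELF, "out.dat",
                       MPI_MODE_CREATE | MPI_MODE_WRONLY,
                       MPI_INFO_NULL, &fh);

        /* The status of an individual write now reflects what was
         * actually written. */
        MPI_File_write (fh, buf, 4, MPI_INT, &status);
        MPI_Get_count (&status, MPI_INT, &count);
        printf ("%d ints written\n", count);

        MPI_File_close (&fh);
        MPI_Finalize ();
        return 0;
    }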
This commit was SVN r32674.
When CHECK_AND_RECYCLE detects an error, a message is displayed.
If the error occurs on an intrinsic communicator, the program is
aborted (instead of trying to free the communicator).
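A hedged sketch of the resulting policy (not the actual CHECK_AND_RECYCLE
macro, just the shape of the logic):

    #include <mpi.h>
    #include <stdio.h>

    /* On error: report, then either abort (intrinsic communicator) or
     * free the communicator (anything else). */
    static void handle_error (MPI_Comm comm, int rc)
    {
        if (MPI_SUCCESS != rc) {
            fprintf (stderr, "operation failed with error %d\n", rc);
            if (MPI_COMM_WORLD == comm || MPI_COMM_SELF == comm) {
                /* Intrinsic communicators can never be freed. */
                MPI_Abort (comm, rc);
            }
            MPI_Comm_free (&comm);
        }
    }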
cmr=v1.8.3:reviewer=hjelmn
This commit was SVN r32659.
WHAT: Merge the PMIx branch into the devel repo, creating a new
OPAL “pmix” framework to abstract PMI support for all RTEs.
Replace the ORTE daemon-level collectives with a new PMIx
server and update the ORTE grpcomm framework to support
server-to-server collectives
WHY: We’ve had problems dealing with variations in PMI implementations,
and need to extend the existing PMI definitions to meet exascale
requirements.
WHEN: Mon, Aug 25
WHERE: https://github.com/rhc54/ompi-svn-mirror.git
Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding.
All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level.
Accordingly, we have:
* created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations.
* replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported.
* replaced the current global collective id with a signature based on the names of the participating procs. This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint (see the sketch after this list).
* removed the prior OMPI/OPAL modex code
* added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, the non-blocking “fence” operation is used, provided the active PMIx component supports it. Otherwise, the default is the full blocking modex exchange we currently perform.
* retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand
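To illustrate the signature idea: a sketch only, using a simple FNV-1a
hash over hypothetical (jobid, vpid) proc names rather than the actual
PMIx server's data structures:

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical proc name: a (jobid, vpid) pair. */
    typedef struct { uint32_t jobid; uint32_t vpid; } proc_name_t;

    /* Two collectives get the same signature (hash collisions aside)
     * exactly when they involve the same combination of procs,
     * assuming every caller sorts the array the same way. */
    static uint64_t collective_signature (const proc_name_t *procs, size_t n)
    {
        uint64_t h = 14695981039346656037ULL;      /* FNV-1a offset basis */
        for (size_t i = 0; i < n; i++) {
            const unsigned char *p = (const unsigned char *) &procs[i];
            for (size_t j = 0; j < sizeof procs[i]; j++) {
                h = (h ^ p[j]) * 1099511628211ULL; /* FNV-1a prime */
            }
        }
        return h;
    }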
This commit was SVN r32570.