openmpi

Author	SHA1	Message	Date
rhc54	8b534e9897	Merge pull request #1668 from rhc54/topic/slurm When direct launching applications, we must allow the MPI layer to pr…	2016-05-16 12:23:19 -07:00
Jeff Squyres	5275e5e2a1	bml_r2: use __func__ to identify function names There were some old/stale function names in some debugging/verbose opal_output calls. Use __func__ instead, so that they won't become stale in the future. Thanks to Durga Choudhury for pointing out the issue. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-16 11:06:47 -04:00
Ralph Castain	01ba861f2a	When direct launching applications, we must allow the MPI layer to progress during RTE-level barriers. Neither SLURM nor Cray provide non-blocking fence functions, so push those calls into a separate event thread (use the OPAL async thread for this purpose so we don't create another one) and let the MPI thread spin in wait_for_completion. This also restores the "lazy" completion during MPI_Finalize to minimize cpu utilization. Update external as well. Revise the change: we still need the MPI_Barrier in MPI_Finalize when we use a blocking fence, but do use the "lazy" wait for completion. Replace the direct logic in MPI_Init with a cleaner macro.	2016-05-14 16:37:00 -07:00
Aurélien Bouteiller	7f65c2b18e	forgot to update copyright in commits 627a89b 4899c89	2016-05-13 11:34:59 -04:00
George Bosilca	37e03e3e5b	Don't update req_bytes_received if no bytes were received.	2016-05-12 23:39:32 -04:00
Matias A Cabral	528abff6ae	Merge remote-tracking branch 'upstream/master'	2016-05-10 15:42:08 -07:00
Matias A Cabral	d28ee62a96	Update in PSM and PSM2 MTLs to detect entries created by drivers for Intel TrueScale and Intel OmniPath, and detect a link in ACTIVE state. This fix addresses the scenario reported in the below OMPI users email, including formerly named Qlogic IB, now Intel TrueScale. Given the nature of the PSM/PSM2 mtls this fix applies to OmniPath: https://www.open-mpi.org/community/lists/users/2016/04/29018.php	2016-05-09 12:08:44 -07:00
Gilles Gouaillardet	0a19337371	coll/base: return MPI_ERR_UNSUPPORTED_OPERATION when coll_base_*_two_procs algo is used on a communicator that has no two tasks Thanks Dave Love for the report	2016-05-09 14:18:40 +09:00
Gilles Gouaillardet	b159587325	io/romio: fix filesystem type check on OpenBSD 5.7 check the existence of the f_type field in struct statfs Thanks Paul Hargrove for the report	2016-05-09 13:54:46 +09:00
Ralph Castain	6b24e2779b	Remove stale component - I'm not going to get to it	2016-05-07 04:13:34 -07:00
Edgar Gabriel	def1b95fd7	Merge pull request #1646 from edgargabriel/getview-preallocate-fixes io/ompio: file_getview and file_preallocate fixes	2016-05-06 11:46:00 -05:00
Edgar Gabriel	e65e189671	io/ompio: fix file size after file_preallocate Thanks to @dalcini for reporting Fixes open-mpi/ompi#1633	2016-05-06 08:20:59 -05:00
Edgar Gabriel	d358965134	io/ompio: fix envelope of datatype returned by getview Thanks to @dalcini for reporting Fixes open-mpi/ompi#1632	2016-05-06 08:19:48 -05:00
Edgar Gabriel	7c92acaa78	Merge pull request #1637 from edgargabriel/pr/netbsd-compilation-problems fs/lustre and fs/pvfs2: fix netbsd compilation problems	2016-05-06 08:05:36 -05:00
Gilles Gouaillardet	6c9d65c0ca	coll/libnbc: fix MPI_Ireduce_scatter_block for one task communicator Thanks Lisandro Dalcin for the report Fixes open-mpi/ompi#248	2016-05-06 09:43:29 +09:00
Ralph Castain	08022d7af1	Some minor cleanups of warnings from gcc 6.0.0. Update s1/s2 pmix to get max_procs as required.	2016-05-05 15:28:13 -07:00
Jeff Squyres	f167be1c91	ompio: always return valid info from FILE_GET_INFO MPI-3.1 says that even if no info keys are set on the file, we need to return a new, empty info. Thanks to Lisandro Dalcin for identifying the issue. Fixes open-mpi/ompi#1630 Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-05 12:03:29 -07:00
Aurélien Bouteiller	4899c89731	Fix a race condition when multiple threads try to create a bml endpoint simultaneously.	2016-05-05 10:49:30 -04:00
Aurélien Bouteiller	627a89bf71	Fix a race condition when multiple threads do the "first send" to an endpoint simultaneously.	2016-05-05 09:04:10 -04:00
Joshua Ladd	4771c9ece6	Merge pull request #1617 from jladd-mlnx/topic/disable-hcoll-barrier-in-finalize-ompi-trunk HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla	2016-05-04 10:12:34 -04:00
Edgar Gabriel	78fa8bb2c4	remove some unused variables that can cause compilation problems on netbsd	2016-05-03 10:25:15 -05:00
Todd Kordenbrock	3498bed650	Merge pull request #1555 from shawone/check_reduce_ret coll-portals4: check return value from reduce kary tree functions	2016-05-03 10:17:23 -05:00
Jeff Squyres	33dd8ca81e	osc_rdma_peer: properly include ompi_config.h Thanks to Paul Hargrove for reporting. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-03 07:39:55 -07:00
Devendar Bureddy	cafd55f18c	HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla The teardown HCOLL barrier may not complete if HCOLL progress is not called periodically, which is the case during HCOLL teardown in finalize. (cherry picked from commit 793244d75dd94d1d5e0243bcccf6d04318750f3f)	2016-05-03 00:49:57 +03:00
Nathan Hjelm	d3d779f6d9	osc/rdma: clear all_sync object when obtaining a lock This commit fixes a bad synchronization detection bug that occurs when mixing MPI_Win_fence() and MPI_Win_lock(). If no communication has occurred in the fence epoch it is safe to just clear the all_sync object (it was set up by fence). Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-02 15:28:47 -06:00
Jeff Squyres	265e5b9795	Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1 ompi/opal/orte/oshmem/test: max hostname length cleanup	2016-05-02 09:44:18 -04:00
Ralph Castain	6ac7929bd0	Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need. Cleanups per @jjhursey review	2016-05-01 11:30:25 -07:00
Nathan Hjelm	7bda3eb2dc	osc/rdma: fix global index array calculation This commit fixes a bug that occurs when ranks are either not mapped evenly or by something other than core. Fixes open-mpi/ompi#1599 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-28 19:11:11 -06:00
Nathan Hjelm	f0f3383006	Merge pull request #1590 from hjelmn/thread_multiple osc/pt2pt: do not drop/reacquire the ompi_request_lock	2016-04-26 16:48:37 -06:00
Nathan Hjelm	34ff6293bd	osc/pt2pt: do not drop/reacquire the ompi_request_lock This lock is now recursive so it is safe to call into the pml without dropping the lock. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-26 14:19:38 -06:00
George Bosilca	bf190671e9	Make the request lock recursive. If during the request completion callback we post another request that completes right away (such as a small send or a match for an unexpected short message) we will try to complete the second request while holding the lock for the completion of the first. For performance reasons (mainly to avoid unlocking and locking the request mutex several times) we have made the request lock recursive.	2016-04-26 16:16:07 -04:00
Nathan Hjelm	c16e639b2f	Merge pull request #1563 from hjelmn/ompi_coverity ompi coverity fixes	2016-04-26 09:17:48 -06:00
Karol Mroz	3322347da9	ompi: fixup hostname max length usage Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-04-25 07:08:23 +02:00
Nathan Hjelm	ae0ffbb67f	Merge pull request #1397 from hjelmn/enable_thread_multiple ompi: always enable MPI_THREAD_MULTIPLE support	2016-04-23 08:40:22 -06:00
Nathan Hjelm	1ff3d3b16b	pml/ob1: fix coverity issue Fix CID 1357978 (1 of 1): Logically dead code (DEADCODE): Remove duplicate check for NULL == endpoint. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-19 14:48:13 -06:00
Nathan Hjelm	70533e6d50	fcoll/static: fix coverity issues Fix CID 72362: Explicit null dereferenced (FORWARD_NULL) From what I can tell the code @ fcoll_static_file_read_all.c:649 should be setting bytes_per_process[i] to 0 not bytes_per_process. Fix CID 72361: Explicit null dereferenced (FORWARD_NULL) Modified check to check for blocklen_per_process non-NULL before trying to free blocklen_per_process[l]. This is sufficient because free (NULL) is safe. Also cleaned up the initialization of this and a couple of other arrays. They were allocated with malloc() then initialized to 0. Changed to use calloc(). Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-19 14:48:13 -06:00
Nathan Hjelm	8871bdb2f8	fcoll/two_phase: fix coverity issues Fix CID 72296: Resource leak (RESOURCE_LEAK): Changed code to goto exit instead of returning to ensure memory is freed. Fix CID 712589: Out-of-bounds read (OVERRUN): In this loop i and j are identical and always less than iov_count. The CID was triggered because i was incremented if i was < iov_count. This meant that if the loop did go on the next iteration would access an invalid index. Fix CID 741363: Uninitialized scalar variable (UNINIT): Allocate tmp_len with calloc to ensure every index is initialized. Fix CID 741364: Uninitialized pointer read (UNINIT): Allocate recv_types with calloc to ensure all indices are always initialized. Also added a check to not loop and destroy if recv_types is NULL. Also added a NULL check on the allocation of decoded iov. This is not the cause of CID 126784 but should be fixed. Fix CID 712588: Out-of-bounds read (OVERRUN): Similar to CID 712589. Should silence the issue. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-19 14:47:41 -06:00
Valentin Petrov	21f1c572c0	Adds mapping to hcoll complex dte	2016-04-19 14:14:28 +03:00
Nicolas Chevalier	c86d4035d2	coll-portals4: check return value from reduce kary tree functions	2016-04-18 12:02:30 +00:00
Nathan Hjelm	3245428e82	Merge pull request #1535 from kawashima-fj/pr/osc-pt2pt-header-fix osc/pt2pt: Fix a struct name typo	2016-04-14 15:55:25 -06:00
Nathan Hjelm	330302c4b4	Merge pull request #1534 from kawashima-fj/pr/parallel-rma-fix osc/pt2pt: Fix tag conflicts on parallel RMA communications	2016-04-14 15:13:32 -06:00
Jeff Squyres	fdf33674b3	Merge pull request #1532 from kmroz/wip-hindexed-cleanup-1 romio,java: cleanup deprecated hindexed call	2016-04-14 17:07:31 -04:00
KAWASHIMA Takahiro	35ea9e5c3c	Add FUJITSU copyright	2016-04-12 13:47:53 +09:00
KAWASHIMA Takahiro	39bcbe439a	osc/pt2pt: Fix a struct name typo Fortunately the sizes of `ompi_osc_pt2pt_header_put_t` and `ompi_osc_pt2pt_header_get_t` are the same. So this doesn't affect the behavior.	2016-04-11 20:55:22 +09:00
KAWASHIMA Takahiro	28a0577364	osc/pt2pt: Insert breaks in long lines	2016-04-11 19:06:01 +09:00
KAWASHIMA Takahiro	5ac95df9dc	osc/pt2pt: use two distinct "namespaces" for tags - revised Before this commit, the same PML tag may be used for distinct communications for long messages. For example, consider a condition where rank A calls ```MPI_PUT``` targeting rank B and rank B calls ```MPI_GET``` targeting rank A simultaneously. A PML tag for the ```MPI_PUT``` is acquired on rank A and is used for the long-message communication from rank A to rank B. A PML tag for the ```MPI_GET``` is acquired on rank B and is used for the long-message communication from rank A to rank B. These two tags may take the same value because they are managed independently on each rank. This will cause data corruption. This commit separates the tag used in a single RMA communication call, one for communication from an origin to a target, and one for communication from a target to an origin. A "base" tag is acquired using the ```get_tag``` function and the PML tag is calculated from the base tag by the ```tag_to_target``` and ```tag_to_origin``` functions.	2016-04-11 19:05:20 +09:00
KAWASHIMA Takahiro	3576ecafa7	Revert "osc/pt2pt: use two distinct "namespaces" for tags" This reverts commit 06ecdb6aa7ee688f51de2b3ca05e9f0605a90099 to reimplement the fix completely.	2016-04-11 19:04:11 +09:00
Karol Mroz	5c54184986	romio: replace deprecated hindexed call Signed-off-by: Karol Mroz <mroz.karol@gmail.com>	2016-04-10 19:56:22 +02:00
Nathan Hjelm	c6b19818be	bml: always enable the bml This commit ensures the bml is always enabled whether or not it will be used. This ensures that any available btls communicate their modex so that they can be used for one-sided communication. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-08 21:14:17 -06:00
George Bosilca	896f857fc4	Thanks @hjelmn for catching the typo.	2016-04-07 13:56:26 -04:00
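
The tag scheme described in commit 5ac95df9dc can be sketched as follows. This is a minimal illustration and not the Open MPI implementation: the function names `get_tag`, `tag_to_target`, and `tag_to_origin` come from the commit message, but the `tag_counter_t` type, the step of 2, and the use of the low bit to separate the two directions are assumptions made for this example.

```c
#include <assert.h>

/* Hypothetical per-rank counter; the real module state is not shown in the log. */
typedef struct {
    int next_base;   /* always even in this sketch */
} tag_counter_t;

/* Acquire a "base" tag. Stepping by 2 leaves the low bit free for the direction. */
static inline int get_tag(tag_counter_t *c)
{
    int base = c->next_base;
    c->next_base += 2;
    return base;
}

/* Tag for data flowing from the origin to the target (low bit 0). */
static inline int tag_to_target(int base)
{
    assert((base & 1) == 0);
    return base;
}

/* Tag for data flowing from the target to the origin (low bit 1). */
static inline int tag_to_origin(int base)
{
    assert((base & 1) == 0);
    return base | 1;
}
```

In the scenario from the commit message, rank A's put and rank B's get may draw the same base tag from their independent counters. Under this sketch the put data (origin A to target B) would travel with `tag_to_target(base)` and the get data (target A to origin B) with `tag_to_origin(base)`, so the two A-to-B transfers no longer share a tag.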

6123 commits