openmpi

Author	SHA1	Message	Date
George Bosilca	16d9f71d01	Correctly compute the space needed for the args. Add checks to bail out if our precomputed value is less than needed (we are already at fault). bot:milestone:v1.10.3 bot:milestone:v2.0 bot:label:bug bot:assign: @ggouaillardet	2016-05-21 16:01:16 -04:00
George Bosilca	0641005dab	Only check the parameters on valid dimensions.	2016-05-21 15:54:04 -04:00
George Bosilca	6aac0d9c22	Remove useless output stream.	2016-05-21 15:54:04 -04:00
Nathan Hjelm	31bfeede82	bml/r2: always add btl progress function This commit changes the behavior of bml/r2 from conditionally registering btl progress functions to always registering progress functions. Any progress function belonging to a btl that is not yet in use is registered as low-priority. As soon as a proc is added that will make use of the btl it is re-registered normally. This works around an issue with some btls. In order to progress a first message from an unknown peer both ugni and openib need to have their progress functions called. If either btl is not in use after the first call to add_procs the callback was never happening. This commit ensures the btl progress function is called at some point but the number of progress callbacks is reduced from normal to ensure lower overhead when a btl is not used. The current ratio is 1 low priority progress callback for every 8 calls to opal_progress(). Fixes open-mpi/ompi#1676 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-21 15:54:04 -04:00
Ralph Castain	a35bb8453a	Unlock the mutex prior to destructing it. Thanks to Nicolas Joly for the report	2016-05-19 10:36:58 -07:00
rhc54	8b534e9897	Merge pull request #1668 from rhc54/topic/slurm When direct launching applications, we must allow the MPI layer to pr…	2016-05-16 12:23:19 -07:00
Jeff Squyres	5275e5e2a1	bml_r2: use __func__ to identify function names There were some old/stale function names in some debugging/verbose opal_output calls. Use __func__ instead, so that they won't become stale in the future. Thanks to Durga Choudhury for pointing out the issue. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-16 11:06:47 -04:00
Ralph Castain	01ba861f2a	When direct launching applications, we must allow the MPI layer to progress during RTE-level barriers. Neither SLURM nor Cray provide non-blocking fence functions, so push those calls into a separate event thread (use the OPAL async thread for this purpose so we don't create another one) and let the MPI thread spin in wait_for_completion. This also restores the "lazy" completion during MPI_Finalize to minimize cpu utilization. Update external as well Revise the change: we still need the MPI_Barrier in MPI_Finalize when we use a blocking fence, but do use the "lazy" wait for completion. Replace the direct logic in MPI_Init with a cleaner macro	2016-05-14 16:37:00 -07:00
Aurélien Bouteiller	7f65c2b18e	forgot to update copyright in commits 627a89b 4899c89	2016-05-13 11:34:59 -04:00
George Bosilca	37e03e3e5b	Don't update req_bytes_received if no bytes were received.	2016-05-12 23:39:32 -04:00
rhc54	4d026e223c	Merge pull request #1661 from matcabral/master PSM and PSM2 MTLs to detect drivers and link	2016-05-11 17:43:17 -07:00
George Bosilca	f8facb177d	atomically update the refcount on the datatype args.	2016-05-11 12:40:18 -04:00
Matias A Cabral	528abff6ae	Merge remote-tracking branch 'upstream/master'	2016-05-10 15:42:08 -07:00
Matias A Cabral	d28ee62a96	Update in PSM and PSM2 MTLs to detect entries created by drivers for Intel TrueScale and Intel OmniPath, and detect a link in ACTIVE state. This fix addresses the scenario reported in the below OMPI users email, including formerly named Qlogic IB, now Intel True scale. Given the nature of the PSM/PSM2 mtls this fix applies to OmniPath: https://www.open-mpi.org/community/lists/users/2016/04/29018.php	2016-05-09 12:08:44 -07:00
Gilles Gouaillardet	0a19337371	coll/base: return MPI_ERR_UNSUPPORTED_OPERATION when coll_base_*_two_procs algo is used on a communicator that has no two tasks Thanks Dave Love for the report	2016-05-09 14:18:40 +09:00
Ralph Castain	6b24e2779b	Remove stale component - I'm not going to get to it	2016-05-07 04:13:34 -07:00
Edgar Gabriel	def1b95fd7	Merge pull request #1646 from edgargabriel/getview-preallocate-fixes io/ompio: file_getview and file_preallocate fixes	2016-05-06 11:46:00 -05:00
Edgar Gabriel	e65e189671	io/ompio: fix file size after file_preallocate Thanks to @dalcini for reporting Fixes open-mpi/ompi#1633	2016-05-06 08:20:59 -05:00
Edgar Gabriel	d358965134	io/ompio: fix envelope of datatype returned by getview Thanks to @dalcini for reporting Fixes open-mpi/ompi#1632	2016-05-06 08:19:48 -05:00
Edgar Gabriel	7c92acaa78	Merge pull request #1637 from edgargabriel/pr/netbsd-compilation-problems fs/lustre and fs/pvfs2: fix netbsd compilation problems	2016-05-06 08:05:36 -05:00
Jeff Squyres	810db734c4	Merge pull request #1640 from jsquyres/pr/mpir-cleanup debuggers: remove some useless code	2016-05-05 21:23:30 -04:00
Gilles Gouaillardet	6c9d65c0ca	coll/libnbc: fix MPI_Ireduce_scatter_block for one task communicator Thanks Lisandro Dalcin for the report Fixes open-mpi/ompi#248	2016-05-06 09:43:29 +09:00
Ralph Castain	08022d7af1	Some minor cleanups of warnings from gcc 6.0.0. Update s1/s2 pmix to get max_procs as required.	2016-05-05 15:28:13 -07:00
Jeff Squyres	83c2d04aa3	debuggers: remove some useless code MPIR-1.0 specifies that the following symbols are only relevant in the starter process: - MPIR_Breakpoint - MPIR_being_debugged - MPIR_debug_state - MPIR_debug_abort_string I.e., the code filling in values in these various symbols was useless / never used. MPIR-1.1 will define that MPIR_being_debugged is relevant in MPI processes. That symbol is currently defined in libopen-rte (which is currently causing a duplicate symbol error for static builds -- this commit fixes that error), and is therefore still available for MPI processes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-05 14:22:55 -07:00
Jeff Squyres	f167be1c91	ompio: always return valid info from FILE_GET_INFO MPI-3.1 says that even if no info keys are set on the file, we need to return a new, empty info. Thanks to Lisandro Dalcin for identifying the issue. Fixes open-mpi/ompi#1630 Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-05 12:03:29 -07:00
Aurélien Bouteiller	4899c89731	Fix a race condition when multiple threads try to create a bml endpoint simultaneously.	2016-05-05 10:49:30 -04:00
Aurélien Bouteiller	627a89bf71	Fix a race condition when multiple threads do the "first send" to an endpoint simultaneously.	2016-05-05 09:04:10 -04:00
Joshua Ladd	4771c9ece6	Merge pull request #1617 from jladd-mlnx/topic/disable-hcoll-barrier-in-finalize-ompi-trunk HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla	2016-05-04 10:12:34 -04:00
Aurélien Bouteiller	8344d00418	use-mpi extensions do not have a .la lib, so the fortran module should not depend on them.	2016-05-03 11:54:35 -04:00
Edgar Gabriel	78fa8bb2c4	remove some unused variables that can cause compilation problems on netbsd	2016-05-03 10:25:15 -05:00
Todd Kordenbrock	3498bed650	Merge pull request #1555 from shawone/check_reduce_ret coll-portals4: check return value from reduce kary tree functions	2016-05-03 10:17:23 -05:00
Jeff Squyres	33dd8ca81e	osc_rdma_peer: properly include ompi_config.h Thanks to Paul Hargrove for reporting. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-03 07:39:55 -07:00
Devendar Bureddy	cafd55f18c	HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla. The teardown HCOLL barrier may not complete if HCOLL progress is not called periodically, which is the case in HCOLL teardown progress in finalize. (cherry picked from commit 793244d75dd94d1d5e0243bcccf6d04318750f3f)	2016-05-03 00:49:57 +03:00
Nathan Hjelm	d3d779f6d9	osc/rdma: clear all_sync object when obtaining a lock This commit fixes a bad synchronization detection bug that occurs when mixing MPI_Win_fence() and MPI_Win_lock(). If no communication has occurred in the fence epoch it is safe to just clear the all_sync object (it was set up by fence). Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-05-02 15:28:47 -06:00
Jeff Squyres	265e5b9795	Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1 ompi/opal/orte/oshmem/test: max hostname length cleanup	2016-05-02 09:44:18 -04:00
Ralph Castain	6ac7929bd0	Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need. Cleanups per @jjhursey review	2016-05-01 11:30:25 -07:00
George Bosilca	6e6ed62a3c	Allow NULL arrays for empty datatypes. When building an empty datatype (aka. size = 0) because the count of included datatypes is 0, be less strict on what the arguments are (allow NULL pointers).	2016-05-01 12:37:02 -04:00
Nathan Hjelm	ec66a6a1f8	Merge pull request #1605 from hjelmn/rdma_fixes osc/rdma: fix global index array calculation	2016-04-28 20:41:36 -06:00
Nathan Hjelm	7bda3eb2dc	osc/rdma: fix global index array calculation This commit fixes a bug that occurs when ranks are either not mapped evenly or by something other than core. Fixes open-mpi/ompi#1599 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-28 19:11:11 -06:00
Nathan Hjelm	1783d94f91	ompi/group: fix sparse group proc reference counting This commit fixes a bug when sparse groups are in use. Since sparse groups do not actually increment the reference counts of any procs (they just retain the parent group) it is wrong to decrement the reference counts of all procs in the group using ompi_group_decrement_proc_count(). This commit makes the call to ompi_group_decrement_proc_count() conditional on the group being dense. Fixes open-mpi/ompi#1593 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-27 15:55:13 -06:00
Gilles Gouaillardet	01c90d4e71	fortran/mpif-h: fix _create_keyval_f correctly handle out parameter _keyval when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT	2016-04-27 13:34:32 +09:00
Gilles Gouaillardet	178dde6a20	fortran/mpif-h: fix MPI_Win_shared_query correctly handle out parameter disp_unit when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT	2016-04-27 11:22:09 +09:00
Gilles Gouaillardet	7f59d2a8c7	fortran/mpif-h: fix MPI_Win_free_keyval initialize inout parameter when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT	2016-04-27 10:46:14 +09:00
Nathan Hjelm	f0f3383006	Merge pull request #1590 from hjelmn/thread_multiple osc/pt2pt: do not drop/reacquire the ompi_request_lock	2016-04-26 16:48:37 -06:00
Nathan Hjelm	34ff6293bd	osc/pt2pt: do not drop/reacquire the ompi_request_lock This lock is now recursive so it is safe to call into the pml without dropping the lock. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-26 14:19:38 -06:00
George Bosilca	bf190671e9	Make the request lock recursive. If during the request completion callback we post another request that completes right away (such a small send or a match for an unexpected short message) we will try to complete the second request while holding the lock for the completion of the first. For performance reasons (mainly to avoid unlocking and locking the request mutex several times) we have made the request lock recursive.	2016-04-26 16:16:07 -04:00
Nathan Hjelm	1e4daa2a0e	mpi_init: move opal_set_using_threads() earlier in MPI_Init() There is a potential race condition in MPI_Init() where an orte event thread could be in a function that uses OPAL_THREAD_LOCK / OPAL_THREAD_UNLOCK when ompi_mpi_init calls opal_set_using_threads(). Closes open-mpi/ompi#1586 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-26 13:02:42 -06:00
Nathan Hjelm	c16e639b2f	Merge pull request #1563 from hjelmn/ompi_coverity ompi coverity fixes	2016-04-26 09:17:48 -06:00
Jeff Squyres	8ab88f2051	ompi_mpi_finalize: add/update comments This is a follow-on to open-mpi/ompi@7373111: add some comments explaining why the code is the way it is. Also update a previous comment. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-04-25 13:42:30 -07:00
Ralph Castain	7373111662	Somehow, the logic for finalize got lost, so restore it here. If pmix.fence_nb is available, then call it and cycle opal_progress until complete. If pmix.fence_nb is not available, then do an MPI_Barrier and call pmix.fence. Needs to go over to 2.x	2016-04-25 08:04:35 -07:00
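
Commit 31bfeede82 above describes running low-priority progress callbacks once for every 8 calls to opal_progress(), and re-registering a callback normally once its btl is in use. The following is a minimal sketch of that kind of ratio scheduling, not Open MPI's actual code: every name in it (`progress_engine`, `engine_progress`, `engine_promote`, `count_low_priority_runs`) is hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical callback type: returns the number of events progressed. */
typedef int (*progress_cb_t)(void);

#define MAX_CALLBACKS 8
#define LOW_PRIORITY_RATIO 8 /* one low-priority pass per 8 progress calls */

/* Hypothetical engine; this is not the opal_progress() data structure. */
struct progress_engine {
    progress_cb_t normal[MAX_CALLBACKS];
    size_t normal_count;
    progress_cb_t low[MAX_CALLBACKS];
    size_t low_count;
    unsigned long calls; /* calls to engine_progress() so far */
};

/* Run every normal callback; run low-priority ones on every 8th call. */
int engine_progress(struct progress_engine *e)
{
    int events = 0;
    for (size_t i = 0; i < e->normal_count; i++) {
        events += e->normal[i]();
    }
    e->calls++;
    if (e->calls % LOW_PRIORITY_RATIO == 0) {
        for (size_t i = 0; i < e->low_count; i++) {
            events += e->low[i]();
        }
    }
    return events;
}

/* Move a callback from the low-priority list to the normal list. */
int engine_promote(struct progress_engine *e, progress_cb_t cb)
{
    for (size_t i = 0; i < e->low_count; i++) {
        if (e->low[i] == cb) {
            if (e->normal_count == MAX_CALLBACKS) {
                return -1; /* no room in the normal list */
            }
            e->low[i] = e->low[--e->low_count]; /* fill the hole */
            e->normal[e->normal_count++] = cb;
            return 0;
        }
    }
    return -1; /* callback was not registered as low priority */
}

/* Demo helper: count how often one callback runs over total_calls. */
static int cb_runs;
static int sample_cb(void) { cb_runs++; return 0; }

int count_low_priority_runs(int total_calls, int promote_first)
{
    struct progress_engine e = { .normal_count = 0, .low_count = 0, .calls = 0 };
    e.low[e.low_count++] = sample_cb;
    if (promote_first) {
        engine_promote(&e, sample_cb);
    }
    cb_runs = 0;
    for (int i = 0; i < total_calls; i++) {
        engine_progress(&e);
    }
    return cb_runs;
}
```

In this sketch an unused callback still runs eventually, which is the property the commit message asks for, while costing one invocation per 8 progress calls instead of one per call.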

8957 commits
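
Commit 1783d94f91 in the list above makes the call to ompi_group_decrement_proc_count() conditional on the group being dense, because sparse groups only retain their parent group. A rough sketch of that reference-counting rule follows. It assumes simplified, hypothetical types and functions (`struct proc`, `struct group`, `group_release`, and the rest) that do not match the real Open MPI structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real proc and group types. */
struct proc {
    int refcount;
};

struct group {
    bool dense;           /* dense groups hold a reference on each proc */
    struct group *parent; /* sparse groups retain their parent instead */
    struct proc **procs;  /* used by dense groups only */
    size_t nprocs;
    int refcount;
};

/* A dense group takes one reference on every proc it contains. */
void dense_group_init(struct group *g, struct proc **procs, size_t n)
{
    g->dense = true;
    g->parent = NULL;
    g->procs = procs;
    g->nprocs = n;
    g->refcount = 1;
    for (size_t i = 0; i < n; i++) {
        procs[i]->refcount++;
    }
}

/* A sparse group takes no proc references; it only retains its parent. */
void sparse_group_init(struct group *g, struct group *parent)
{
    g->dense = false;
    g->parent = parent;
    g->procs = NULL;
    g->nprocs = 0;
    g->refcount = 1;
    parent->refcount++;
}

/* Drop one reference; on the last one, undo exactly what init did. */
void group_release(struct group *g)
{
    if (--g->refcount > 0) {
        return;
    }
    if (g->dense) {
        /* Only dense groups incremented proc counts, so only they decrement. */
        for (size_t i = 0; i < g->nprocs; i++) {
            g->procs[i]->refcount--;
        }
    } else if (g->parent != NULL) {
        group_release(g->parent);
    }
}

/* Demo: releasing a sparse group must leave the proc count untouched. */
int demo_proc_refcount_after_sparse_release(void)
{
    struct proc p = { .refcount = 1 };
    struct proc *procs[] = { &p };
    struct group dense, sparse;

    dense_group_init(&dense, procs, 1); /* p.refcount becomes 2 */
    sparse_group_init(&sparse, &dense); /* dense.refcount becomes 2 */
    group_release(&sparse);             /* dense.refcount back to 1 */
    return p.refcount;
}
```

The point of the sketch is symmetry: release decrements only what the matching init incremented, so a sparse group never touches proc counts it did not raise.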