openmpi

Author	SHA1	Message	Date
Nathan Hjelm	d7264aa613	osc/pt2pt: various threading fixes This commit fixes several bugs identified by a new multi-threaded RMA benchmarking suite. The following bugs have been identified and fixed: - The code that signaled the actual start of an access epoch changed the eager_send_active flag on a synchronization object without holding the object's lock. This could cause another thread waiting on eager sends to block indefinitely because the entirety of ompi_osc_pt2pt_sync_expected could execute between the check of eager_send_active and the condition wait of ompi_osc_pt2pt_sync_wait. - The bookkeeping of fragments could get screwed up when performing long put/accumulate operations from different threads. This was caused by the fragment flush code at the end of both put and accumulate. This code was put in place to avoid sending a large number of unexpected messages to a peer. To fix the bookkeeping issue we now 1) wait for eager sends to be active before starting any large isends, and 2) keep track of the number of large isends associated with a fragment. If the number of large isends reaches 32 the active fragment is flushed. - Use atomics to update the large receive/send tag counters. This prevents duplicate tags from being used. The tag space has also been updated to use the entire 16 bits of the tag space. These changes should also fix open-mpi/ompi#1299. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-02-02 12:33:33 -07:00
Gilles Gouaillardet	728a97c558	use-mpi-f08: remove duplicates from Makefile.am	2016-02-02 13:33:07 +09:00
Jeff Squyres	910eca751f	Merge pull request #1327 from ggouaillardet/poc/mpi_xxx_dup_yyy_no_bind f08: do not BIND(C) to subroutines with LOGICAL parameters	2016-02-01 17:56:27 -05:00
Edgar Gabriel	3f7fff5780	Merge pull request #1331 from edgargabriel/solaris-statfs-fix Solaris statfs fix	2016-01-28 20:16:33 -06:00
Gilles Gouaillardet	69ba2a9b6b	ddt: fix support of MPI_COMBINER_RESIZED in __ompi_datatype_create_from_args Thanks James Ramsey for the report	2016-01-28 11:32:29 +09:00
Nathan Hjelm	a19c265ab5	osc/rdma: fix typo in ompi_osc_rdma_complete_atomic The typo caused SEGVs on systems with only fetching atomic support. Fixes open-mpi/ompi#1329 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-26 15:44:07 -07:00
Edgar Gabriel	b4a725c26a	need to check for the parent dir as well, since the file might not exist yet.	2016-01-26 13:49:21 -06:00
Edgar Gabriel	722aab92e6	- extend opal_path_nfs to retrieve the file system type - use opal_path_nfs in the fs_base function to avoid code duplication.	2016-01-26 13:36:21 -06:00
Gilles Gouaillardet	704f14f91e	f08: do not BIND(C) to subroutines with LOGICAL parameters Thanks Paul Romano for reporting this issue.	2016-01-26 13:56:24 +09:00
Joshua Ladd	69e3c6f289	Merge pull request #1321 from jladd-mlnx/topic/add-allgatherv-reduce Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.	2016-01-25 20:46:52 -05:00
Nathan Hjelm	500e90422d	Merge pull request #1320 from hjelmn/osc_rdma_fix osc/rdma: fix hang when performing large unaligned gets	2016-01-25 09:36:13 -07:00
Nathan Hjelm	45da311473	osc/rdma: fix hang when performing large unaligned gets This commit adds code to handle large unaligned gets. There are two possible code paths for these transactions: 1) The remote region and local region have the same alignment. In this case the get will be broken down into at most three get transactions: 1 transaction to get the unaligned start of the region (buffered), 1 transaction to get the aligned portion of the region, and 1 transaction to get the end of the region. 2) The remote and local regions do not have the same alignment. This should be an uncommon case and is not optimized. In this case a buffer is allocated and registered locally to hold the aligned data from the remote region. There may be cases where this fails (low memory, can't register memory). Those conditions are unlikely and will be handled later. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-22 21:06:46 -07:00
Valentin Petrov	5e2a2c0755	Bugfix for coll/hcoll: coll_request must be set to ACTIVE when allocated If the state of the request is not set to OMPI_REQUEST_ACTIVE then MPI_Test would immediately signal such a request as completed while hcoll may still be working on it. Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>	2016-01-23 03:23:59 +02:00
Joshua Ladd	e398bf6f3a	Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce. Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>	2016-01-23 03:09:29 +02:00
Nathan Hjelm	49d2f44b97	osc/rdma: use correct endpoint for local state If atomics are not globally visible (cpu and nic atomics do not mix) then a btl endpoint must be used to access local ranks. To avoid issues that are caused by having the same region registered with multiple handles osc/rdma was updated to always use the handle for rank 0. There was a bug in the update that caused osc/rdma to continue using the local endpoint for accessing the state even though the pointer/handle are not valid for that endpoint. This commit fixes the bug. Fixes open-mpi/ompi#1241. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-22 10:41:27 -07:00
Nathan Hjelm	243d973cfe	Merge pull request #1316 from hjelmn/datatype_pack_threads ompi/datatype: make datatype pack thread safe	2016-01-21 20:14:10 -07:00
Nathan Hjelm	b921831f2b	ompi/datatype: make datatype pack thread safe This commit makes ompi_datatype_get_pack_description thread safe. The call is used by osc/pt2pt to send the packed description to remote peers. Before this commit if MPI_THREAD_MULTIPLE is enabled and the user uses MPI_Put, MPI_Get, etc we could hit a race where multiple threads attempt to store the packed description on the datatype. Since the code in question is not performance-critical the threading fix uses opal_atomic_* calls instead of bothering with OPAL_THREAD_*. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-21 17:53:28 -07:00
Nathan Hjelm	6180386bea	osc/rdma: disable put aggregation when using threads Optimizing put aggregation in the presence of threads will require a redesign of the code. For now just ensure that put aggregation is turned off when MPI_THREAD_MULTIPLE is enabled. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-21 15:50:35 -07:00
Edgar Gabriel	b253d4e887	fix CID 1349739, CID 1349738, CID 1349736 and (probably) CID 1349740 (not entirely sure about the last one, since I don't understand why block[i] is a problem but max_len[i] allocated and treated exactly the same way 1 line later is not).	2016-01-21 08:32:23 -06:00
Edgar Gabriel	9b8d769e41	will revisit the addproc component later in spring, right now it is constantly in the way of doing my tests.	2016-01-20 15:05:51 -06:00
Edgar Gabriel	1671604dbc	Merge pull request #1307 from edgargabriel/fcoll-dynamic_gen2 Fcoll dynamic gen2	2016-01-20 10:19:56 -06:00
Gilles Gouaillardet	2adbe273d6	mpi: have MPI_Wtick() return the period (and not the frequency) if OPAL_TIMER_CYCLE_NATIVE	2016-01-20 14:14:47 +09:00
Gilles Gouaillardet	c0f8f2ce32	ompi/dpm: correctly handle sentinels in construct peers This fix is similar to open-mpi/ompi@4c1ea4a171 and open-mpi/ompi@213b2abde4	2016-01-18 09:57:38 +09:00
Edgar Gabriel	a9ca37059a	improve the communication abstraction. This commit also allows all aggregators to work simultaneously, instead of the slightly staggered way of the previous version.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	56e11bfc97	initialize the stripe_size variable as well.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	26c57ef374	separate the size of the buffer used for the shuffle step and the size of the buffer used for a pwritev operation.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	39d5c8c281	further bug fixes silencing a compiler warning and fixing a memory overrun	2016-01-17 09:48:49 -06:00
Edgar Gabriel	2bcae84e11	further debugging	2016-01-17 09:48:49 -06:00
Edgar Gabriel	2bdd6ba17a	correctly free some buffers, and ensure that lustre_stripe_size and stripe_count are always read from the file system.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	4bbb22bd0b	add a new field to the ompio data structure (stripe_count) and set it correctly on pvfs2 and lustre.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	d282e94b67	add the new dynamic_gen2 component, designed to coexist for now with the original dynamic component	2016-01-17 09:48:49 -06:00
Jeff Squyres	60ffe713b8	common syms: whitelist bison-generated common symbols Bison generates some common symbols that we can't do anything about, so whitelist them.	2016-01-16 03:53:14 -08:00
Jeff Squyres	96f94f8228	fortran: whitelist deliberate common symbols The Fortran library has a number of common symbols that are deliberate, so whitelist them.	2016-01-16 03:53:14 -08:00
Joshua Ladd	18c5a21562	Fix typo in error handling flow.	2016-01-14 22:28:54 +02:00
Joshua Ladd	afa62d8ca1	Addressing reviewers' comments for https://github.com/open-mpi/ompi-release/pull/891	2016-01-14 19:22:27 +02:00
Tomislav Janjusic	3858bc8e62	Adding support for dynamic endpoint creation Signed-off-by: Tomislav Janjusic <tomislavj@mngx-apl-01.mtl.labs.mlnx> Signed-off-by: Tomislavj Janjusic <tomislavj@mellanox.com> Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>	2016-01-12 22:17:03 +02:00
Nathan Hjelm	dd4d49cbbb	Merge pull request #1278 from ggouaillardet/poc/osc_pt2pt osc/pt2pt: use two distinct "namespaces" for tags	2016-01-12 09:49:31 -07:00
Nathan Hjelm	d26cc3fece	ompi/group: do not decrement parent group proc pointers in destruct Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-11 12:56:11 -07:00
Edgar Gabriel	0a1b735eed	use the actual preadv and pwritev functions if available. That's what the fbtl interfaces have been designed for.	2016-01-07 08:29:17 -06:00
Gilles Gouaillardet	4c1ea4a171	dpm: correctly handle procs_cutoff in ompi_dpm_connect_accept() this commit includes missing bits from open-mpi/ompi@213b2abde4	2016-01-07 09:11:03 +09:00
Gilles Gouaillardet	213b2abde4	dpm: correctly handle procs_cutoff in ompi_dpm_connect_accept()	2016-01-06 16:21:13 +09:00
Edgar Gabriel	1b0b849994	remove the MCA parameter setting the number of hosts in PLFS, since the plfs_setxattr function used is causing linking problems with PLFS 2.5; remove unused variables.	2016-01-05 11:13:23 -06:00
Edgar Gabriel	7861a8c357	revise the logic in the fbtl plfs avoiding the memcpy operation	2016-01-05 10:04:46 -06:00
Edgar Gabriel	da309ac962	- use a unique pid for each process as requested by the API - sync the file before closing it - use plfs_access() instead of access() before closing the file	2016-01-05 10:04:12 -06:00
Gilles Gouaillardet	06ecdb6aa7	osc/pt2pt: use two distinct "namespaces" for tags	2016-01-05 16:57:37 +09:00
Gilles Gouaillardet	14fdf75944	fs/pvfs2: fix typo Thanks Dave Love for reporting this issue. Fixes #1272	2016-01-03 23:28:35 +09:00
Artem Polyakov	2abb2972ac	Fix Mellanox copyrights with respect to the following PRs: * https://github.com/open-mpi/ompi/pull/1184 * https://github.com/open-mpi/ompi/pull/1188 * https://github.com/open-mpi/ompi/pull/1197 * https://github.com/open-mpi/ompi/pull/1202 * https://github.com/open-mpi/ompi/pull/1210 * https://github.com/open-mpi/ompi/pull/1216 * https://github.com/open-mpi/ompi/pull/1236 * https://github.com/open-mpi/ompi/pull/1237 * https://github.com/open-mpi/ompi/pull/1248 * https://github.com/open-mpi/ompi/pull/1260 * https://github.com/open-mpi/ompi/pull/1264	2015-12-30 00:12:19 +06:00
Ralph Castain	810f2446b7	Add pmix120 component, update the error handling functions in the PMIx API. Update the configure logic for the new pmix120 component ckpt Get the pmix120 component to work - still not really registering or handling notifications, but infrastructure now operates Cleanup some of the symbol scopes, and provide a more comprehensive rename.h file. Will pretty it up later - let's see how this works Cleanup the rename files to use the pretty macros	2015-12-28 23:15:44 +09:00
Gilles Gouaillardet	fec973efda	configury: test portability replace test ... -o ... with test ... || test ... and test ... -a ... with test ... && test ...	2015-12-28 13:58:45 +09:00
Gilles Gouaillardet	47ab2fcb89	man: fix MPI_Neighbor_alltoall{v,w} prototypes Thanks Willem Vermin for bringing this to our attention	2015-12-28 09:39:33 +09:00
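
Note on commit 45da311473 (osc/rdma: fix hang when performing large unaligned gets): the message describes breaking a get into at most three transactions when the remote and local regions share the same alignment. The sketch below only illustrates that splitting arithmetic. It is not the osc/rdma code; the function name, the segment labels, and the 8-byte default alignment are assumptions made for the example.

```python
def split_aligned_get(address, length, alignment=8):
    """Split [address, address + length) into at most three pieces.

    Hypothetical helper: returns (label, start, size) tuples for the
    unaligned head, the aligned middle, and the remaining tail.
    """
    if length <= 0:
        return []
    end = address + length
    # First aligned address at or after the start of the region.
    body_start = -(-address // alignment) * alignment
    # Last aligned address at or before the end of the region.
    body_end = (end // alignment) * alignment
    if body_start >= body_end:
        # The region contains no aligned portion: one small transfer.
        return [("unaligned", address, length)]
    segments = []
    if address < body_start:
        segments.append(("head", address, body_start - address))
    segments.append(("aligned", body_start, body_end - body_start))
    if body_end < end:
        segments.append(("tail", body_end, end - body_end))
    return segments


if __name__ == "__main__":
    for label, start, size in split_aligned_get(3, 30):
        print(label, start, size)
```

With an assumed alignment of 8, a 30-byte get starting at address 3 splits into a 5-byte head, a 24-byte aligned middle, and a 1-byte tail. The second case in the commit message (mismatched alignments, handled with a locally registered buffer) is not covered by this sketch.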

8783 commits