openmpi

Author	SHA1	Message	Date
Igor Ivanov	8b05f308f9	opal/memory: Move Memory Allocation Hooks usage from openib These changes fix issue https://github.com/open-mpi/ompi/issues/1336 - improve abstractions: opal/memory/linux component should be single place that operates with Memory Allocation Hooks. - avoid collisions in case dynamic component open/close: it is safe because it is linked statically. - does not change original behaviour.	2016-02-11 14:46:35 +02:00
Gilles Gouaillardet	96310f439b	sentinel: fix 32 bits arch since a sentinel is only made from the current job, only store the first 31 bits of the vpid into the sentinel.	2016-02-10 15:44:07 +09:00
Gilles Gouaillardet	b55b9e6aee	sentinel: fix sentinel to proc_name conversion converting an opal_process_name_t means the loss of one bit, it was decided to restrict the local job id to 15 bits, so the useful information of an opal_process_name_t can fit in 63 bits.	2016-02-10 15:44:07 +09:00
Gilles Gouaillardet	030a5f2054	sentinel: use type uintptr_t for sentinel MSB is now automatically cleared when right shifting Thanks George for pointing this	2016-02-10 11:28:56 +09:00
Jeff Squyres	d537ee9f26	Merge pull request #1340 from jsquyres/pr/decrease-mpi_add_procs_cutoff RFC: ompi_mpi_params.c: set mpi_add_procs_cutoff default to 0	2016-02-09 13:36:43 -05:00
Jeff Squyres	902b477aac	ompi_mpi_params.c: set mpi_add_procs_cutoff default to 0 Decrease the default value of the "mpi_add_procs_cutoff" MCA param from 1024 to 0.	2016-02-09 09:41:36 -08:00
George Bosilca	7c574a3530	Typo.	2016-02-07 07:22:22 +02:00
Nathan Hjelm	5b9c82a964	osc/pt2pt: bug fixes This commit fixes several bugs identified by @ggouaillardet and MTT: - Fix SEGV in long send completion caused by missing update to the request callback data. - Add an MPI_Barrier to the fence short-cut. This fixes potential semantic issues where messages may be received before fence is reached. - Ensure fragments are flushed when using request-based RMA. This allows MPI_Test/MPI_Wait/etc to work as expected. - Restore the tag space back to 16-bits. It was intended that the space be expanded to 32-bits but the required change to the fragment headers was not committed. The tag space may be expanded in a later commit. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-02-04 16:59:39 -07:00
Gilles Gouaillardet	6eac6a8b00	osc/sm: create datafile into the per proc directory in order to make it unique per communicator Thanks Peter Wind for the report	2016-02-03 10:12:37 +09:00
Nathan Hjelm	519fffb65e	osc/pt2pt: eager sends are always active if MPI_MODE_NOCHECK is used This commit fixes open-mpi/ompi#1299. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-02-02 12:44:17 -07:00
Nathan Hjelm	d7264aa613	osc/pt2pt: various threading fixes This commit fixes several bugs identified by a new multi-threaded RMA benchmarking suite. The following bugs have been identified and fixed: - The code that signaled the actual start of an access epoch changed the eager_send_active flag on a synchronization object without holding the object's lock. This could cause another thread waiting on eager sends to block indefinitely because the entirety of ompi_osc_pt2pt_sync_expected could execute between the check of eager_send_active and the condition wait of ompi_osc_pt2pt_sync_wait. - The bookkeeping of fragments could get screwed up when performing long put/accumulate operations from different threads. This was caused by the fragment flush code at the end of both put and accumulate. This code was put in place to avoid sending a large number of unexpected messages to a peer. To fix the bookkeeping issue we now 1) wait for eager sends to be active before starting any large isend's, and 2) keep track of the number of large isends associated with a fragment. If the number of large isends reaches 32 the active fragment is flushed. - Use atomics to update the large receive/send tag counters. This prevents duplicate tags from being used. The tag space has also been updated to use the entire 16-bits of the tag space. These changes should also fix open-mpi/ompi#1299. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-02-02 12:33:33 -07:00
Gilles Gouaillardet	cda094afc7	mpi_f08: correctly implements MPI_{COMM,TYPE,WIN}_{DUP,NULL_{COPY,DELETE}}_FN Fixes open-mpi/ompi#1323	2016-02-02 13:38:01 +09:00
Gilles Gouaillardet	728a97c558	use-mpi-f08: remove duplicates from Makefile.am	2016-02-02 13:33:07 +09:00
Jeff Squyres	910eca751f	Merge pull request #1327 from ggouaillardet/poc/mpi_xxx_dup_yyy_no_bind f08: do not BIND(C) to subroutines with LOGICAL parameters	2016-02-01 17:56:27 -05:00
Edgar Gabriel	3f7fff5780	Merge pull request #1331 from edgargabriel/solaris-statfs-fix Solaris statfs fix	2016-01-28 20:16:33 -06:00
Gilles Gouaillardet	69ba2a9b6b	ddt: fix support of MPI_COMBINER_RESIZED in __ompi_datatype_create_from_args Thanks James Ramsey for the report	2016-01-28 11:32:29 +09:00
Nathan Hjelm	a19c265ab5	osc/rdma: fix typo in ompi_osc_rdma_complete_atomic The typo caused SEGVs on systems with only fetching atomic support. Fixes open-mpi/ompi#1329 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-26 15:44:07 -07:00
Edgar Gabriel	b4a725c26a	need to check for the parent dir as well, since the file might not exist yet.	2016-01-26 13:49:21 -06:00
Edgar Gabriel	722aab92e6	- extend opal_path_nfs to retrieve the file system type - use opal_path_nfs in the fs_base function to avoid code duplication.	2016-01-26 13:36:21 -06:00
Gilles Gouaillardet	704f14f91e	f08: do not BIND(C) to subroutines with LOGICAL parameters Thanks Paul Romano for reporting this issue.	2016-01-26 13:56:24 +09:00
Joshua Ladd	69e3c6f289	Merge pull request #1321 from jladd-mlnx/topic/add-allgatherv-reduce Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce.	2016-01-25 20:46:52 -05:00
Nathan Hjelm	500e90422d	Merge pull request #1320 from hjelmn/osc_rdma_fix osc/rdma: fix hang when performing large unaligned gets	2016-01-25 09:36:13 -07:00
Nathan Hjelm	45da311473	osc/rdma: fix hang when performing large unaligned gets This commit adds code to handle large unaligned gets. There are two possible code paths for these transactions: 1) The remote region and local region have the same alignment. In this case the get will be broken down into at most three get transactions: 1 transaction to get the unaligned start of the region (buffered), 1 transaction to get the aligned portion of the region, and 1 transaction to get the end of the region. 2) The remote and local regions do not have the same alignment. This should be an uncommon case and is not optimized. In this case a buffer is allocated and registered locally to hold the aligned data from the remote region. There may be cases where this fails (low memory, can't register memory). Those conditions are unlikely and will be handled later. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-22 21:06:46 -07:00
Valentin Petrov	5e2a2c0755	BugFix for coll/hcoll: coll_request must be set to ACTIVE when alloced If the state of the request is not set to OMPI_REQUEST_ACTIVE then MPI_Test would immediately signal such request completed while hcoll may still be working on it. Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>	2016-01-23 03:23:59 +02:00
Joshua Ladd	e398bf6f3a	Adding entry points for Allgatherv, iAllgatherv, Reduce, and iReduce. Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>	2016-01-23 03:09:29 +02:00
Nathan Hjelm	49d2f44b97	osc/rdma: use correct endpoint for local state If atomics are not globally visible (cpu and nic atomics do not mix) then a btl endpoint must be used to access local ranks. To avoid issues that are caused by having the same region registered with multiple handles osc/rdma was updated to always use the handle for rank 0. There was a bug in the update that caused osc/rdma to continue using the local endpoint for accessing the state even though the pointer/handle are not valid for that endpoint. This commit fixes the bug. Fixes open-mpi/ompi#1241. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-22 10:41:27 -07:00
Nathan Hjelm	243d973cfe	Merge pull request #1316 from hjelmn/datatype_pack_threads ompi/datatype: make datatype pack thread safe	2016-01-21 20:14:10 -07:00
Nathan Hjelm	b921831f2b	ompi/datatype: make datatype pack thread safe This commit makes ompi_datatype_get_pack_description thread safe. The call is used by osc/pt2pt to send the packed description to remote peers. Before this commit if MPI_THREAD_MULTIPLE is enabled and the user uses MPI_Put, MPI_Get, etc we could hit a race where multiple threads attempt to store the packed description on the datatype. Since the code in question is not performance-critical the threading fix uses opal_atomic_* calls instead of bothering with OPAL_THREAD_*. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-21 17:53:28 -07:00
Nathan Hjelm	6180386bea	osc/rdma: disable put aggregation when using threads Optimizing put aggregation in the presence of threads will require a redesign of the code. For now just ensure that put aggregation is turned off when MPI_THREAD_MULTIPLE is enabled. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-21 15:50:35 -07:00
Edgar Gabriel	b253d4e887	fix CID 1349739, CID 1349738, CID 1349736 and (probably) CID 1349740 (not entirely sure about the last one, since I don't understand why block[i] is a problem but max_len[i] allocated and treated exactly the same way 1 line later is not).	2016-01-21 08:32:23 -06:00
Edgar Gabriel	9b8d769e41	will revisit the addproc component later in spring, right now it is constantly in the way of doing my tests.	2016-01-20 15:05:51 -06:00
Edgar Gabriel	1671604dbc	Merge pull request #1307 from edgargabriel/fcoll-dynamic_gen2 Fcoll dynamic gen2	2016-01-20 10:19:56 -06:00
Francois WELLENREITER	411b7301c3	OSC portals4 : do not generate an EVENT_SEND to avoid to filter it	2016-01-20 11:47:46 +01:00
Gilles Gouaillardet	2adbe273d6	mpi: have MPI_Wtick() return the period (and not the frequency) if OPAL_TIMER_CYCLE_NATIVE	2016-01-20 14:14:47 +09:00
Gilles Gouaillardet	c0f8f2ce32	ompi/dpm: correctly handle sentinels in construct peers This fix is similar to open-mpi/ompi@4c1ea4a171 and open-mpi/ompi@213b2abde4	2016-01-18 09:57:38 +09:00
Edgar Gabriel	a9ca37059a	improve the communication abstraction. This commit also allows all aggregators to work simultaneously, instead of the slightly staggered way of the previous version.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	56e11bfc97	initialize the stripe_size variable as well.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	26c57ef374	separate the size of the buffer used for the shuffle step and the size of the buffer used for a pwritev operation.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	39d5c8c281	further bug fixes silencing a compiler warning and fixing a memory overrun	2016-01-17 09:48:49 -06:00
Edgar Gabriel	2bcae84e11	further debugging	2016-01-17 09:48:49 -06:00
Edgar Gabriel	2bdd6ba17a	correctly free some buffers, and ensure that lustre_stripe_size and stripe_count are always read from the file system.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	4bbb22bd0b	add a new field to the ompio data structure (stripe_count) and set it correctly on pvfs2 and lustre.	2016-01-17 09:48:49 -06:00
Edgar Gabriel	d282e94b67	add the new dynamic_gen2 component, designed to coexist for now with the original dynamic component	2016-01-17 09:48:49 -06:00
Jeff Squyres	60ffe713b8	common syms: whitelist bison-generated common symbols Bison generates some common symbols that we can't do anything about, so whitelist them.	2016-01-16 03:53:14 -08:00
Jeff Squyres	96f94f8228	fortran: whitelist deliberate common symbols The Fortran library has a number of common symbols that are deliberate, so whitelist them.	2016-01-16 03:53:14 -08:00
Joshua Ladd	18c5a21562	Fix typo in error handling flow.	2016-01-14 22:28:54 +02:00
Joshua Ladd	afa62d8ca1	Addressing reviewers' comments for https://github.com/open-mpi/ompi-release/pull/891	2016-01-14 19:22:27 +02:00
Tomislav Janjusic	3858bc8e62	Adding support for dynamic endpoint creation Signed-off-by: Tomislav Janjusic <tomislavj@mngx-apl-01.mtl.labs.mlnx> Signed-off-by: Tomislavj Janjusic <tomislavj@mellanox.com> Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>	2016-01-12 22:17:03 +02:00
Nathan Hjelm	dd4d49cbbb	Merge pull request #1278 from ggouaillardet/poc/osc_pt2pt osc/pt2pt: use two distinct "namespaces" for tags	2016-01-12 09:49:31 -07:00
Nathan Hjelm	d26cc3fece	ompi/group: do not decrement parent group proc pointers in destruct Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-01-11 12:56:11 -07:00
Edgar Gabriel	0a1b735eed	use the actual preadv and pwritev functions if available. That's what the fbtl interfaces have been designed for.	2016-01-07 08:29:17 -06:00
Gilles Gouaillardet	4c1ea4a171	dpm: correctly handle procs_cutoff in ompi_dpm_connect_accept() this commit includes missing bits from open-mpi/ompi@213b2abde4	2016-01-07 09:11:03 +09:00
Gilles Gouaillardet	213b2abde4	dpm: correctly handle procs_cutoff in ompi_dpm_connect_accept()	2016-01-06 16:21:13 +09:00
Edgar Gabriel	1b0b849994	remove the MCA parameter setting the number of hosts in PLFS, since the plfs_setxattr function used is causing linking problems with PLFS 2.5 remove unused variables.	2016-01-05 11:13:23 -06:00
Edgar Gabriel	7861a8c357	revise the logic in the fbtl plfs avoiding the memcpy operation	2016-01-05 10:04:46 -06:00
Edgar Gabriel	da309ac962	- use a unique pid for each process as requested by the API - sync the file before closing it - use plfs_access() instead of access() before closing the file	2016-01-05 10:04:12 -06:00
KAWASHIMA Takahiro	ad26899110	osc/sm: Fix a bus error on MPI_WIN_{POST,START}. A bus error occurs in sm OSC under the following conditions. - sparc64 or any other architecture that needs strict alignment. - `MPI_WIN_POST` or `MPI_WIN_START` is called for a window created by sm OSC. - The communicator size is odd and greater than 3. Lines 283-285 of the current `ompi/mca/osc/sm/osc_sm_component.c` contain the following code. ```c module->global_state = (ompi_osc_sm_global_state_t *) (module->segment_base); module->node_states = (ompi_osc_sm_node_state_t *) (module->global_state + 1); module->posts[0] = (uint64_t *) (module->node_states + comm_size); ``` The size of `ompi_osc_sm_node_state_t` is a multiple of 4 but not a multiple of 8. So if `comm_size` is odd, `module->posts[0]` is not aligned to 8 bytes. This causes a bus error when accessing `module->posts[i][j]`. This patch fixes the alignment of `module->posts[0]` by setting `module->posts[0]` first.	2016-01-05 19:04:53 +09:00
Gilles Gouaillardet	06ecdb6aa7	osc/pt2pt: use two distinct "namespaces" for tags	2016-01-05 16:57:37 +09:00
Gilles Gouaillardet	14fdf75944	fs/pvfs2: fix typo Thanks Dave Love for reporting this issue. Fixes #1272	2016-01-03 23:28:35 +09:00
Artem Polyakov	2abb2972ac	Fix Mellanox copyrights with respect to the following PRs: * https://github.com/open-mpi/ompi/pull/1184 * https://github.com/open-mpi/ompi/pull/1188 * https://github.com/open-mpi/ompi/pull/1197 * https://github.com/open-mpi/ompi/pull/1202 * https://github.com/open-mpi/ompi/pull/1210 * https://github.com/open-mpi/ompi/pull/1216 * https://github.com/open-mpi/ompi/pull/1236 * https://github.com/open-mpi/ompi/pull/1237 * https://github.com/open-mpi/ompi/pull/1248 * https://github.com/open-mpi/ompi/pull/1260 * https://github.com/open-mpi/ompi/pull/1264	2015-12-30 00:12:19 +06:00
Ralph Castain	810f2446b7	Add pmix120 component, update the error handling functions in the PMIx API. Update the configure logic for the new pmix120 component ckpt Get the pmix120 component to work - still not really registering or handling notifications, but infrastructure now operates Cleanup some of the symbol scopes, and provide a more comprehensive rename.h file. Will pretty it up later - let's see how this works Cleanup the rename files to use the pretty macros	2015-12-28 23:15:44 +09:00
Gilles Gouaillardet	fec973efda	configury: test portability replace test ... -o ... with test ... \|\| test ... and test ... -a ... with test ... && test ...	2015-12-28 13:58:45 +09:00
Gilles Gouaillardet	47ab2fcb89	man: fix MPI_Neighbor_alltoall{v,w} prototypes Thanks Willem Vermin for bringing this to our attention	2015-12-28 09:39:33 +09:00
Gilles Gouaillardet	ccc96ad204	fbtl/base: add missing #include "opal/util/output.h" Thanks Marco Atzeri for contributing the original patch	2015-12-24 14:41:26 +09:00
Gilles Gouaillardet	cebde2a753	coll/tuned: add missing #include "opal/util/output.h" Thanks Marco Atzeri for contributing the original patch	2015-12-24 14:41:17 +09:00
Gilles Gouaillardet	ad9693c604	pml/yalla: add missing #include <alloca.h>	2015-12-24 14:33:58 +09:00
Gilles Gouaillardet	b38c17dbcb	pml/cm: add missing #include <alloca.h> Thanks Paul Hargrove for reporting this issue	2015-12-24 14:33:58 +09:00
Gilles Gouaillardet	071ae39a44	osc/rdma: add missing #include <alloca.h>	2015-12-24 14:33:58 +09:00
Gilles Gouaillardet	77f199d1d7	coll/fca: add missing #include <alloca.h>	2015-12-24 14:33:58 +09:00
Todd Kordenbrock	8a3660138e	mtl-portals4: initialize endpoint nid/pid when using logical mapping When mtl-portals4 is configured for logical mapping, coll-portals4 must disqualify because it does not yet support logical mapping. coll-portals4 looks for the endpoint pid to be zero which tells it that mtl-portals4 is configured for logical mapping. This commit initializes the endpoint nid/pid to zero for logical mapping.	2015-12-22 11:20:18 -06:00
Gilles Gouaillardet	e918d75fae	java: try to dlopen libmpi with the full path Since OS X 10.11 (aka El Capitan) DYLD_LIBRARY_PATH is no longer propagated to children, so try to dlopen libmpi with the full path using the directory of libmpi_java Fixes open-mpi/ompi#1220 Thanks Alexander Daryin for reporting this	2015-12-22 11:09:46 +09:00
rhc54	aa17bdf6e8	Merge pull request #1239 from rhc54/topic/cleanup Cleanup the warnings from the ompi layer when compiling optimized under Mac OSX	2015-12-21 07:23:31 -08:00
Edgar Gabriel	46c20a1246	correctly set all variables storing information on the file pointer position to zero when setting the file view	2015-12-21 09:41:39 +09:00
George Bosilca	12dad8b37f	Fix the missing resize of the returned type for the subarray and darray types. Thanks Keith Bennett and Dan Garmann for reporting this issue Fixes open-mpi/ompi#1191	2015-12-21 09:41:30 +09:00
George Bosilca	6e6fd14a19	Fix indentation.	2015-12-20 03:15:19 -05:00
George Bosilca	c895eb7068	Remove extraneous declaration.	2015-12-19 01:34:48 -05:00
Ralph Castain	ac6289dca6	Cleanup the warnings from the ompi layer when compiling optimized under Mac OSX Cleanup per George's comments	2015-12-17 17:39:15 -08:00
Nathan Hjelm	d0b4aa1f9a	Merge pull request #1237 from artpol84/add_proc_deadlck_fix Fix add_proc deadlock.	2015-12-17 12:09:40 -07:00
Artem Polyakov	6a791c3026	Fix add_proc deadlock.	2015-12-17 21:18:33 +06:00
igor.ivanov@itseez.com	041a6a9f53	ompi/pml: Fix warnings in yalla component	2015-12-16 16:22:30 +02:00
igor.ivanov@itseez.com	38c253c74c	ompi/mtl: Fix warnings in mxm component	2015-12-16 16:22:29 +02:00
igor.ivanov@itseez.com	0a9956927a	ompi/coll: Fix warnings in fca components warning: assignment from incompatible pointer type	2015-12-16 16:22:16 +02:00
igor.ivanov@itseez.com	8f45d83d46	ompi/coll: Fix warnings in hcoll component warning: assignment from incompatible pointer type	2015-12-16 14:52:29 +02:00
Ralph Castain	3a56f0d34b	Create the pmix external component. Fix a few places where opal/util/argv.h were required when building with an external pmix (go figure). NOTE: Building with external pmix requires that you also build with external libevent and hwloc libraries. Detect this at configure and error out with large message if this requirement is violated. Closes #1204 (replaces it) Fixes #1064	2015-12-15 15:26:13 -08:00
Nathan Hjelm	4992c22f4a	Merge pull request #1224 from hjelmn/osc_fixes osc/rdma: fix bugs when running more than one process per node	2015-12-15 14:01:01 -08:00
Nathan Hjelm	0de9445fc7	osc/rdma: fix bugs when running more than one process per node A previous commit updated the one-sided code to register the state region only once. This created an issue when using the scratch lock with fetching atomics. In this case on any rank that isn't local rank 0 the module->state_handle is NULL. This commit fixes the issue by removing the scratch lock and using a fragment pointer instead. Fixes open-mpi/ompi#1290 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-12-15 11:25:25 -07:00
Jeff Squyres	a2a5d650f9	Merge pull request #1180 from ggouaillardet/mpi_xxx_dup_fn fortran: add missing MPI_xxx_DUP_FN bindings	2015-12-15 13:15:27 -05:00
Gilles Gouaillardet	f0df2a7b2b	ompi: silence CID 1343322	2015-12-15 13:33:43 +09:00
Nathan Hjelm	139799f3c4	Merge pull request #1202 from artpol84/alltoall_fix Fix MPI_Alltoall to support inter-communicators.	2015-12-14 14:33:23 -08:00
Nathan Hjelm	b7ba301310	Merge pull request #1165 from hjelmn/add_procs_group ompi/group: release ompi_proc_t's at group destruction	2015-12-14 13:53:42 -08:00
Nathan Hjelm	9d659465b7	Merge pull request #1210 from artpol84/icbarrier_fix Fix NBC iBarrier for inter-communicators.	2015-12-14 13:52:38 -08:00
Nathan Hjelm	4b3dac5933	Merge pull request #1216 from artpol84/icgatherv_fix Fix NBC iGatherv for inter-communicators.	2015-12-14 13:51:58 -08:00
Matias Cabral	7cfd7d50b9	Merge pull request #1219 from matcabral/PSM2_tag_hashing Support for PSM2 hashing lookup in message queue.	2015-12-14 12:01:55 -08:00
matcabral	9a1f9be146	A new internal feature in PSM2 will use hash tables to accelerate message queue lookups if the lookups have the proper tag&mask layout. OpenMPI should follow PSM2's preferred tag&mask spec, so that PSM2 can provide a performance benefit.	2015-12-14 10:13:39 -08:00
Artem Polyakov	2d0919dbdc	Fix NBC iGatherv for inter-communicators. We need to use remote size to form a schedule.	2015-12-14 12:19:10 +06:00
Ralph Castain	5e5adebf8e	Port the changes from #782 to the master. Not everything applies here as the code in the 1.10 series is a little different. In addition, we asked for a few changes (e.g., using MPI_ERR_ARG instead of "13") that are incorporated here. Thanks to @jsharpe for the PR	2015-12-12 12:40:34 -08:00
Artem Polyakov	fc17deca43	Fix NBC iBarrier for inter-communicators. Remove send of the extra message. This bug was triggered by the MPICH/coll/nbicbarrier test. In this test a series of communicators are created. This extra message was received after the original communicator was destroyed and queued into non_existing_communicator_pending. When a new, completely unrelated communicator with the same id as the original was created, this message was pushed into the frags_cant_match queue and caused seq numbers skew and hang.	2015-12-12 13:27:31 +06:00
Gilles Gouaillardet	3a3b13ea12	coll/base: fix an integer overflow in ompi_coll_base_reduce_generic Refs open-mpi/ompi#1198	2015-12-11 13:55:59 +09:00
Artem Polyakov	25077fc5d9	Fix MPI_Alltoall to support inter-communicators. Remove excessive parameter check to avoid premature exit from the collective. MPI standard says: The type signature associated with sendcount, sendtype, at a process must be equal to the type signature associated with recvcount, recvtype at any other process. This implies that the amount of data sent must be equal to the amount of data received, pairwise between every pair of processes. In case of inter-communicator we have 2 group of processes and "left" group may call MPI_Alltoall(NULL, 0, MPI_INT, buf, 10, MPI_INT, comm, ...); and the right one: MPI_Alltoall(buf,10,MPI_INT, NULL, 0, MPI_INT, comm, ...); And it would be legal though one of the group will receive 0 bytes from others. This was triggered by MPICH/coll test called icalltoall.	2015-12-11 08:50:34 +06:00
Alina Sklarevich	3ffd8dcd20	PML UCX: fix typo (following `7becc54d`).	2015-12-10 13:51:10 +02:00
Artem Polyakov	ee71e35a90	Fix ompi_comm_create when source communicator is inter-communicator. This bug was triggered by probe-intercom and icm tests from MPICH suite.	2015-12-09 15:44:26 +02:00
Gilles Gouaillardet	3a62341b30	Merge pull request #1189 from ggouaillardet/topic/empty_ddt_fix ddt: duplicate MPI_DATATYPE_NULL when ompi_datatype_create_indexed of…	2015-12-09 15:29:03 +09:00
Nathan Hjelm	f317ba5262	Merge pull request #1163 from hjelmn/ompi_proc_threads ompi/proc: make proc system always thread safe	2015-12-08 10:36:55 -07:00
Nathan Hjelm	b47a64f27d	Merge pull request #1188 from artpol84/intercomm_split_fix Yet one more fix to intercommunicator splitting logic.	2015-12-08 07:09:46 -07:00
Nathan Hjelm	dae3746d2f	Merge pull request #1190 from kawashima-fj/pr/sm-win-test-fix osc/sm: Fix a bug that `MPI_WIN_TEST` does not update `flag` to 0	2015-12-08 06:39:16 -07:00
KAWASHIMA Takahiro	9c7b6a4352	osc/sm: Fix a bug that `MPI_WIN_TEST` does not update `flag` to 0. `MPI_WIN_TEST` must update the `flag` parameter to 0 when not all origin processes called `MPI_WIN_COMPLETE`. But sm OSC doesn't. If the caller initializes the `flag` argument to a non-0 value, the caller will receive the non-0 `flag` value.	2015-12-08 19:23:21 +09:00
Gilles Gouaillardet	59a361b781	ompio: correctly handle zero f_cc_size in mca_io_ompio_simple_grouping	2015-12-08 17:00:11 +09:00
Gilles Gouaillardet	d43ad3fada	ddt: duplicate MPI_DATATYPE_NULL when ompi_datatype_create_indexed of ompi_datatype_create_indexed_block is invoked with a zero count	2015-12-08 16:25:36 +09:00
Artem Polyakov	7690f4027a	Yet one more fix to intercommunicator splitting logic. Previous commit `f2794740` reverts Nathan's changes. However it turns out that I was unable to trace his logic until I started investigation of icsplit hang. Bug was triggered when splitting Intercom was giving a group where one side of the communicator was empty (icsplit, intercom create #2). In this case remote_size == 0 and there is no way to distinguish between inter- and intra-communicator. Conclusion: We do need to distinguish between intra- and inter-communicators. So we should use ompi_mpi_group_null.group.	2015-12-08 08:43:08 +02:00
Nathan Hjelm	63d8feb31c	Merge pull request #1187 from hjelmn/bsend_fix pml/ob1: add missing ompi_request_wait_completion for buffered sends	2015-12-07 23:09:04 -07:00
Nathan Hjelm	f68c315188	pml/ob1: add missing ompi_request_wait_completion for buffered sends This commit adds a call to ompi_request_wait_completion for buffered sends. Without this line it is possible to get into a state where the data is never sent. Fixes open-mpi/ompi#1185 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-12-07 22:28:07 -07:00
Nathan Hjelm	eb830b9501	ompi_proc_pack: correctly handle proc sentinels Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-12-07 17:27:38 -07:00
Artem Polyakov	f2794740b3	Fix intercommunicator split (was triggered by MPICH/icsend test)	2015-12-07 15:41:29 +02:00
Gilles Gouaillardet	bfe8e03d9d	fcoll/two_phase: use ompi_mpi_abort instead of PMPI_Abort Thanks Jeff for the review	2015-12-07 11:34:36 +09:00
Gilles Gouaillardet	ef03bc726c	ompi: fix comment in ompi/mpi/c/Makefile.am Thanks Jeff for the review	2015-12-07 11:34:01 +09:00
Gilles Gouaillardet	37c978f5e9	coll/libnbc: correctly handle changed types. this fixes open-mpi/ompi@d816d1c194 thanks Jeff for the review	2015-12-07 10:13:43 +09:00
Gilles Gouaillardet	26b2ed1069	fortran: add missing MPI_xxx_DUP_FN bindings in use-mpi-tkr - MPI_COMM_DUP_FN - MPI_TYPE_DUP_FN - MPI_WIN_DUP_FN	2015-12-07 09:10:48 +09:00
George Bosilca	3a9664ac9d	Fix Coverity CIDs 1341584-1341589.	2015-12-06 14:06:36 -05:00
Jeff Squyres	ad35a363fa	Merge pull request #1179 from jsquyres/pr/mpi-testsome-man-page-update Pr/mpi testsome man page update	2015-12-04 05:55:33 -05:00
bosilca	8fee96c086	Merge pull request #1091 from bosilca/topic/datatype_span Cleanup the temporary memory allocation in collectives	2015-12-03 19:25:04 -05:00
Jeff Squyres	0adcb5b5cd	MPI_Testsome.3in: wrap some long lines Wrap some long lines; no other text or semantics changes.	2015-12-03 17:06:43 -05:00
Jeff Squyres	11c571b568	MPI_Testsome.3in: add explicit verbiage about return values Instead of solely relying on the out value definitions in MPI_Waitsome.3, explicitly copy this text here. Note that the original text in this man page was copied verbatim from the MPI spec; we've now added a bit more text (copied from MPI_Waitsome.3in) that explains the out values so that users don't have to cross-reference to another man page. Thanks to Eric Schnetter for the suggestion. Fixes open-mpi/ompi#1153	2015-12-03 17:06:22 -05:00
Gilles Gouaillardet	a5440ade5f	topo/treematch: do not invoke hwloc_topology_{init,load} * this is not necessary * this overwrites existing topology, that could be different if hwloc_base_topo_file is used	2015-12-03 11:24:32 +09:00
George Bosilca	688108cf7f	Patch submitted by @ggouaillardet on ticket #1091 .	2015-12-02 20:42:18 -05:00
George Bosilca	4d00c59b2e	Cleanup the memory handling for temporary buffers in some of the collective modules. Added a new function opal_datatype_span, to compute the memory span of count number of datatype, excluding the gaps in the beginning and at the end. If a memory allocation is made using the returned value, the gap (also returned) should be removed from the allocated pointer.	2015-12-02 20:42:18 -05:00
Gilles Gouaillardet	351bd03249	ompi_proc_sentinel_to_name: clear the top left bit	2015-12-02 17:18:56 +09:00
Jeff Squyres	15325c8094	op/x86: change the owner to Ralph Cisco no longer cares about this component, but Intel might. Transferring ownership to Ralph.	2015-12-01 15:08:07 -08:00
igor-ivanov	d8c85738ab	Merge pull request #1151 from igor-ivanov/pr/opal-abort-vars Add new mca variables opal_abort_delay and opal_abort_print_stack	2015-12-01 16:27:11 +04:00
Nathan Hjelm	406b9ff1e6	ompi/group: add helper function for creating plist groups This commit adds a helper function for creating groups from proc lists. The function is used by ompi_comm_fill_rest to create the local and remote groups. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-11-30 23:52:57 -07:00
Nathan Hjelm	5334d22a37	ompi/group: release ompi_proc_t's at group destruction This commit changes the way ompi_proc_t's are retained/released by ompi_group_t's. Before this change ompi_proc_t's were retained once for the group and then once for each retain of a group. This method adds unnecessary overhead (need to traverse the group list each time the group is retained) and causes problems when using an async add_procs. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-11-30 23:03:47 -07:00
Ryan Grant	324534b191	Merge pull request #1161 from tkordenbrock/topic/add.triggered.scatter coll-portals4: add scatter and iscatter implementations that use Portals4 triggered operations	2015-11-30 16:53:47 -07:00
Nathan Hjelm	22af95b266	ompi/proc: make proc system always thread safe This commit changes the OPAL_THREAD_LOCK/OPAL_THREAD_UNLOCK calls in ompi/proc to opal_mutex_lock/opal_mutex_unlock. This will allow multi-threaded BTLs the ability to create ompi_proc_t's without having to set opal_using_threads. There should be no performance hits as none of the lock points are in the critical path. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-11-30 16:37:09 -07:00
Todd Kordenbrock	4721b70dd5	coll-portals4: add scatter and iscatter implementations that use Portals4 triggered operations This commit adds implementations of scatter and iscatter using Portals4 triggered operations. Currently, the only algorithm is linear.	2015-11-30 15:07:18 -06:00
Todd Kordenbrock	f6f525e0d8	coll-portals4: remove unneeded code from gather This commit removes two pieces of unneeded code from gather. First it removes destroy_tree() calls from linear_top(), because the linear algorithm does not create a tree, so there is no need to destroy it. Second it removes unpack_bytes from the gather request because it was calculated but never used.	2015-11-30 10:38:51 -06:00
Gilles Gouaillardet	80f02518ff	topo/base: correctly free the topo object in mca_topo_base_dist_graph_create_adjacent	2015-11-30 15:33:59 +09:00
Gilles Gouaillardet	8227bc6320	ompi_proc_find_and_add: use ompi_proc_allocate in order to update both ompi_proc_list and ompi_proc_hash	2015-11-30 14:00:59 +09:00
igor.ivanov@itseez.com	c15bf147bf	opal: Add opal_abort_print_stack mca variable with aliases for ompi/oshmem This commit allows to control output during abnormal oshmem/ompi application termination. Fixed issue in backtrace output. HAVE_BACKTRACE was never set so user was limited in control of this variable. Two related mca variables are moved to opal layer. Corresponding aliases are added for ompi and oshmem.	2015-11-25 18:18:33 +02:00
Ryan Grant	81d482dca6	Merge pull request #1137 from francois-wellenreiter/trig_mtl_rdv MTL portals4 : improve the rendez-vous protocol using PtlTriggeredGet…	2015-11-24 17:31:31 -07:00
Ryan Grant	219581e87e	Merge pull request #1090 from tkordenbrock/topic/check.for.invalid.handles.in.finalize mtl-portals4: test for valid handle before releasing resources	2015-11-20 07:54:44 -06:00
Mike Dubman	c544620a7c	Merge pull request #1138 from igor-ivanov/pr/yalla-valgrind yalla: fix valgrind error due to uninitialized status field.	2015-11-20 07:19:11 -05:00
Gilles Gouaillardet	002c7b8b3a	fcoll/two_phase: use PMPI_* instead of MPI_*	2015-11-20 13:46:19 +09:00
Gilles Gouaillardet	561e7f6647	vprotocol/pessimist: use internal ompi_* instead of MPI_*	2015-11-20 13:46:19 +09:00
Gilles Gouaillardet	025fd8a9fc	osc: use PMPI_* instead of MPI_*	2015-11-20 13:46:19 +09:00
Gilles Gouaillardet	d816d1c194	coll/libnbc: use PMPI_* and internal ompi_* instead of MPI_*	2015-11-20 13:46:19 +09:00
yosefe	3bb1270715	yalla: fix valgrind error due to uninitialized status field.	2015-11-19 10:59:31 +02:00
Francois WELLENREITER	9126ea5e82	MTL portals4 : improve the rendez-vous protocol using PtlTriggeredGet operation	2015-11-19 09:52:53 +01:00
Edgar Gabriel	9e5ade4e8b	argh, a debugging sleep statement got into the source code.	2015-11-16 13:26:57 -06:00
Edgar Gabriel	dbfbcdecd5	make adjustments for the default settings of grouping parameters and the default contiguous group size option. minor bug fix in the simple grouping strategy.	2015-11-16 08:17:27 -06:00
Edgar Gabriel	27628774c7	add a new option for a simple aggregator selection which has zero communication costs.	2015-11-16 08:17:26 -06:00
Edgar Gabriel	66c1ea5fcb	change the default value of the grouping option. Also add new grouping option which skips the refinement step in the aggregator selection.	2015-11-16 08:17:23 -06:00


8896 Commits
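
The alignment problem described in commit ad26899110 (osc/sm bus error) can be illustrated with a small offset calculation. This is only a sketch under stated assumptions: `node_state_t` is a hypothetical 12-byte stand-in for `ompi_osc_sm_node_state_t` (whose real layout is not shown in the log), and `posts_offset_after_states`, `is_aligned` and `align_up` are illustrative helpers, not Open MPI functions. The commit itself fixes the issue by setting `module->posts[0]` first; rounding the offset up, as shown here, is a different, generic way to get the same alignment guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-node state whose size is a multiple of 4
 * but not of 8 (three 32-bit fields = 12 bytes). Not the real Open MPI type. */
typedef struct {
    int32_t a;
    int32_t b;
    int32_t c;
} node_state_t;

/* Byte offset at which a uint64_t array would start if it were placed
 * directly after comm_size node states (the layout described in the commit). */
static size_t posts_offset_after_states(size_t comm_size)
{
    return comm_size * sizeof(node_state_t);
}

/* Returns 1 if offset is a multiple of align (align must be a power of two). */
static int is_aligned(size_t offset, size_t align)
{
    return (offset & (align - 1)) == 0;
}

/* Round offset up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

With these assumed sizes, an odd `comm_size` such as 5 gives an offset of 60 bytes, which is not 8-byte aligned, while an even `comm_size` such as 4 gives 48 bytes, which is. Placing the 8-byte-aligned array before the 4-byte-aligned structures, as the commit does, avoids the padding that `align_up` would introduce.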