openmpi

Author	SHA1	Message	Date
Nathan Hjelm	59bae1a330	osc/rdma: fix typo in compare-and-swap This commit fixes a typo in compare-and-swap when retrieving the memory region associated with a displacement. It was erroneously 8 bytes instead of the datatype size. This can cause an incorrect RMA range error when the compare-and-swap is less than 4 bytes from the end of the region. Fixed open-mpi/ompi#2080 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-09-14 16:49:42 -06:00
Nathan Hjelm	7c8e7691a7	Merge pull request #2045 from hjelmn/osc_rdma_atomics osc/rdma: add support for network AMOs	2016-09-08 11:21:49 -06:00
Gilles Gouaillardet	d1e1ec51b6	ompio: correctly fix a memory plug as newly reported by Coverity with CID 1372660	2016-09-08 18:50:18 +09:00
Gilles Gouaillardet	213a981041	io/ompio: plug memory leaks as reported by Coverity with CIDs 1369022 and 1369023	2016-09-07 10:08:44 +09:00
Ralph Castain	7f3fac48ab	Fix typo on the COLL_SYNC macro	2016-09-06 12:43:07 -07:00
Todd Kordenbrock	a17dff281d	Merge pull request #1900 from PDeveze/mtl-portals4-short_msg-split_msg Mtl portals4 short msg split msg	2016-09-06 11:14:19 -05:00
Nathan Hjelm	1ce5847e8b	osc/rdma: add support for network AMOs This commit adds support for using network AMOs for MPI_Accumulate, MPI_Fetch_and_op, and MPI_Compare_and_swap. This support is only enabled if the ompi_single_intrinsic info key is specified or the acc_single_interinsic MCA variable is set. This configuration indicates to this implementation that no long accumulates will be performed since these do not currently mix with the AMO implementation. This commit also cleans up the code somewhat. This includes removing unnecessary struct keywords where the type is also typedef'd. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-09-01 15:47:33 -06:00
Nathan Hjelm	cb1cb5ffed	osc/pt2pt: do not use frag send to send lock request This commit cleans up some code in the passive target path. The code used the buffered frag control send path but it is more appropriate to use the unbuffered one. This avoids checking structures that should not be in use in this path. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-09-01 09:57:27 -06:00
Gilles Gouaillardet	75b7ef97a0	coll/libnbc: fix nbc_ireduce when sendbuf == recvbuf if sendbuf is equal to recvbuf, that should not be interpreted as equivalent to MPI_IN_PLACE on the non root rank(s) Thanks Valentin Petrov for the report	2016-09-01 10:19:05 +09:00
Gilles Gouaillardet	2969235324	libnbc: fix NBC_Copy for predefined datatypes predefined datatypes such as MPI_LONG_DOUBLE_INT are not really contiguous, so use span as returned by opal_datatype_span() instead of type extent, otherwise data might be written above allocated memory. Thanks Valentin Petrov for the report	2016-09-01 10:18:57 +09:00
Edgar Gabriel	be183cb3dd	io/ompio: fix the reference count of basic datatypes used as etypes or ftypes.	2016-08-31 14:08:26 -05:00
Nathan Hjelm	99b26644c1	Merge pull request #2011 from hjelmn/osc_pt2pt_fix osc/pt2pt: fix possible race in peer locking	2016-08-29 09:17:36 -06:00
Edgar Gabriel	b5c757e82c	Merge pull request #2014 from edgargabriel/topic/mt-io Topic/mt io	2016-08-26 08:54:45 -05:00
Edgar Gabriel	1ba03d38ec	io/ompio: protect remaining functions in multi-threaded scenarios protect the remaining functions where necessary by a mutex lock to avoid problems in multi-threaded executions. Some functions do not require that in my opinion, and I provided an explanation in those cases.	2016-08-25 13:45:51 -05:00
Nathan Hjelm	e53de7ecbe	osc/rdma: fix bug in dynamic memory window tracking code This commit fixes an ordering bug in the code that keeps track of all attached memory windows. The code is intended to keep the memory regions sorted but was often inserting at the wrong index. Thanks to Christoph Niethammer for reporting the issue. The reproducer will be added to nightly MTT testing. Fixes open-mpi/ompi#2012 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-08-25 12:08:46 -06:00
Nathan Hjelm	7af138f83b	osc/pt2pt: fix possible race in peer locking It is possible for another thread to process a lock ack before the peer is set as locked. In this case either setting the locked or the eager active flag might clobber the other thread. To address this the flags have been made volatile and are set atomically. Since there is no opal_atomic_or or opal_atomic_and function just use cmpset for now. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-08-25 09:28:25 -06:00
Nathan Hjelm	c082068953	Merge pull request #2006 from hjelmn/osc_pt2pt_fix osc/pt2pt: fix several bugs	2016-08-25 09:19:29 -06:00
Edgar Gabriel	1cee83cc1b	use the common/ interfaces in file_preallocate instead of the io_ompio_ interfaces. Necessary for avoiding potential deadlock situations in multi-threaded scenarios.	2016-08-25 08:55:12 -05:00
Nathan Hjelm	70f8a6e792	osc/pt2pt: fix several bugs This commit fixes some bugs uncovered during thread testing of 2.0.1rc1. With these fixes the component is running cleanly with threads. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-08-24 14:35:45 -06:00
Ralph Castain	bcf5ac3971	Set the default value of both barrier counters to zero, thus ensuring the coll/sync component is off by default	2016-08-24 07:51:32 -07:00
Ralph Castain	22844b0dc6	Balance priorities to ensure something is below sync	2016-08-23 17:33:45 -07:00
Ralph Castain	540f23c4dd	Adjust priority of coll/sync downwards	2016-08-23 17:12:48 -07:00
Edgar Gabriel	41ed4a28d2	add the protective lock around read and write operations in ompio	2016-08-23 11:07:58 -05:00
Howard Pritchard	696121cc4a	Merge pull request #1988 from hppritcha/topic/another_ofi_fix mtl/ofi: fix a botched assignment of av_type	2016-08-22 17:59:59 -06:00
Ralph Castain	6549c878a9	Silence the warnings	2016-08-22 15:35:27 -07:00
Ralph Castain	871bedb103	Add missing "const" qualifiers	2016-08-22 12:54:24 -07:00
Edgar Gabriel	a76f4d7c69	Merge pull request #1990 from edgargabriel/topic/mt-io steps towards making file I/O operations thread safe	2016-08-22 08:19:33 -05:00
Joshua Ladd	deae1ab375	Merge pull request #1985 from vspetrov/master coll/hcoll: Fixes predifined types mapping	2016-08-22 09:18:59 -04:00
Edgar Gabriel	bc042259bc	make initialization of the io framework thread safe. Also, remove the lock/unlock in the file_open ompi-interface routines of romio314. The global lock in the romio component probably does not work, it is easy to construct a testcase where two threads perform collective I/O operations on different file handles. With a global lock it is easy to deadlock. The lock has to be at least on the file handle basis. move the mutex to file/file.c to avoid duplicate symbol problem in file_open.c pfile_open.c	2016-08-21 16:09:00 -05:00
George Bosilca	b96ec77e40	This variable belongs to the tuned modules and not to base.	2016-08-20 15:37:55 -04:00
George Bosilca	e8425eb1f5	Rename an OMPI internal variable (ticket #1955 ).	2016-08-20 15:37:55 -04:00
rhc54	102d3afe2c	Merge pull request #1992 from rhc54/topic/sync Restore the coll/sync module and provide a test to verify its operation	2016-08-20 13:33:28 -05:00
George Bosilca	fd57f5bccd	Remove some of the clang warnings.	2016-08-20 14:21:42 -04:00
Ralph Castain	9888615e75	Restore the coll/sync module and provide a test to verify its operation	2016-08-20 10:14:52 -07:00
Howard Pritchard	61d62b6821	mtl/ofi: fix a botched assignment of av_type Well now the av_type is being assigned correctly Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2016-08-19 17:01:02 -05:00
Valentin Petrov	9790373fc6	coll/hcoll: Fixes predifined types mapping	2016-08-19 11:19:12 +03:00
Nathan Hjelm	e5c7512692	Merge pull request #1983 from hjelmn/request_cb ompi/request: change semantics of ompi request callbacks	2016-08-18 08:31:56 -06:00
Nathan Hjelm	6aa658ae33	ompi/request: change semantics of ompi request callbacks This commit changes the semantics of ompi request callbacks. If a request's callback has freed or re-posted (using start) a request the callback must return 1 instead of OMPI_SUCCESS. This indicates to ompi_request_complete that the request should not be modified further. This fixes a race condition in osc/pt2pt that could lead to the req_state being inconsistent if a request is freed between the callback and setting the request as complete. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-08-17 20:14:01 -06:00
Edgar Gabriel	e14c23ba79	Merge pull request #1980 from edgargabriel/topic/coverty-cleanup io/ompio: Topic/coverty cleanup	2016-08-17 17:27:51 -05:00
Edgar Gabriel	2c8437ce62	fs/pvfs2: fix a common symbol	2016-08-17 13:10:32 -05:00
Edgar Gabriel	eba5293586	fix coverty warning CID 1369021	2016-08-17 13:02:45 -05:00
Nathan Hjelm	40b70889e5	osc/pt2pt: make receive count an unsigned int This receive_count MCA variable should never be negative. Change it to an unsigned int. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-08-17 08:14:24 -06:00
Gilles Gouaillardet	8faa1edafa	osc/pt2pt: silence misc warnings	2016-08-17 14:24:14 +09:00
LANL OMPI Bot	96c7762050	Merge pull request #1942 from hppritcha/topic/minor_ofi_fix mtl/ofi: use mca param to set av type	2016-08-16 14:14:12 -06:00
Nathan Hjelm	9444df1eb7	osc/pt2pt: make lock_all locking on-demand The original lock_all algorithm in osc/pt2pt sent a lock message to each peer in the communicator even if the peer is never the target of an operation. Since this scales very poorly the implementation has been replaced by one that locks the remote peer on first communication after a call to MPI_Win_lock_all. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-08-11 15:33:07 -06:00
Nathan Hjelm	7589a25377	osc/pt2pt: do not repost receive from request callback This commit fixes an issue that can occur if a target gets overwhelmed with requests. This can cause osc/pt2pt to go into deep recursion with a stack like req_complete_cb -> ompi_osc_pt2pt_callback -> start -> req_complete_cb -> ... . At small scale this is fine as the recursion depth stays small but at larger scale we can quickly exhaust the stack processing frag requests. To fix the issue the request callback now simply puts the request on a list and returns. The osc/pt2pt progress function then handles the processing and reposting of the request. As part of this change osc/pt2pt can now post multiple fragment receive requests per window. This should help prevent a target from being overwhelmed. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-08-11 15:33:07 -06:00
George Bosilca	8d0baf140f	If the RTE fails to deliver the daemon information, gracefully fallback to a non-reordered communicator. Optimize the loops building the process hierarchy.	2016-08-11 13:04:27 -04:00
Howard Pritchard	e46eee3fcb	mtl/ofi: use mca param to set av type Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2016-08-10 16:10:17 -06:00
Gilles Gouaillardet	dfbf2b7be4	opal/threads: add OPAL_THREAD_SUB_SIZE_T macro -1 is not a valid size_t, so instead of OPAL_THREAD_ADD_SIZE_T(..., -1), simply OPAL_THREAD_SUB_SIZE_T(..., 1) and keep picky compilers happy	2016-08-10 13:37:36 +09:00
Nathan Hjelm	799104f688	Merge pull request #1947 from hjelmn/perf pml/ob1: be more selective when using rdma capable btls	2016-08-09 22:15:09 -06:00

6091 commits
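
Note on commit 7af138f83b: the message says the peer flags are set atomically with cmpset because no atomic OR/AND helper was available. The following is only a minimal sketch of that general technique (emulating an atomic OR with a compare-and-swap loop), written against C11 `<stdatomic.h>` rather than the OPAL atomics. The flag names and the `flags_or` helper are hypothetical and are not taken from the Open MPI source.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only (not Open MPI's names). */
#define PEER_FLAG_LOCKED       0x1u
#define PEER_FLAG_EAGER_ACTIVE 0x2u

/*
 * Atomically OR `bits` into `*flags` using only compare-and-swap.
 * Returns the new flag value. If another thread changes the flags
 * between the load and the swap, the swap fails, `old` is refreshed
 * with the current value, and the loop retries, so neither thread's
 * bit is lost.
 */
static uint32_t flags_or(_Atomic uint32_t *flags, uint32_t bits)
{
    uint32_t old = atomic_load(flags);
    while (!atomic_compare_exchange_weak(flags, &old, old | bits)) {
        /* `old` now holds the latest value; try again. */
    }
    return old | bits;
}
```

With this helper, one thread can call `flags_or(&flags, PEER_FLAG_LOCKED)` while another calls `flags_or(&flags, PEER_FLAG_EAGER_ACTIVE)`; a plain read-modify-write of a shared flags word could drop one of the two updates, which is the kind of clobbering the commit message describes.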