openmpi

Author	SHA1	Message	Date
Ralph Castain	1e2019ce2a	Revert "Update to sync with OMPI master and cleanup to build" This reverts commit cb55c88a8b7817d5891ff06a447ea190b0e77479.	2016-11-22 15:03:20 -08:00
Ralph Castain	cb55c88a8b	Update to sync with OMPI master and cleanup to build Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-11-22 14:24:54 -08:00
Nathan Hjelm	e968ddfe64	start bug fixes (#1729 ) * mpi/start: fix bugs in cm and ob1 start functions There were several problems with the implementation of start in Open MPI: - There are no checks whatsoever on the state of the request(s) provided to MPI_Start/MPI_Start_all. It is erroneous to provide an active request to either of these calls. Since we are already looping over the provided requests there is little overhead in verifying that the request can be started. - Both ob1 and cm were always throwing away the request on the initial call to start and start_all with a particular request. Subsequent calls would see that the request was pml_complete and reuse it. This introduced a leak as the initial request was never freed. Since the only pml request that can be mpi complete but not pml complete is a buffered send the code to reallocate the request has been moved. To detect that a request is indeed mpi complete but not pml complete isend_init in both cm and ob1 now marks the new request as pml complete. - If a new request was needed the callbacks on the original request were not copied over to the new request. This can cause osc/pt2pt to hang as the incoming message callback is never called. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov> * osc/pt2pt: add request for gc after starting a new request Starting a new receive may cause a recursive call into the pt2pt frag receive function. If this happens and the prior request is on the garbage collection list it could cause problems. This commit moves the gc insert until after the new request has been posted. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-06-02 20:22:40 -04:00
Nathan Hjelm	c16e639b2f	Merge pull request #1563 from hjelmn/ompi_coverity ompi coverity fixes	2016-04-26 09:17:48 -06:00
Nathan Hjelm	ae0ffbb67f	Merge pull request #1397 from hjelmn/enable_thread_multiple ompi: always enable MPI_THREAD_MULTIPLE support	2016-04-23 08:40:22 -06:00
Nathan Hjelm	1ff3d3b16b	pml/ob1: fix coverity issue Fix CID 1357978 (1 of 1): Logically dead code (DEADCODE): Remove duplicate check for NULL == endpoint. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-04-19 14:48:13 -06:00
Nathan Hjelm	9d5eeecb8a	pml/ob1: detect unreachable errors This commit adds code to detect when procs are unreachable when using the dynamic add_procs functionality. Fixes #1501 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-03-28 10:52:40 -06:00
Nathan Hjelm	230d04327e	ompi: always enable MPI_THREAD_MULTIPLE support This commit removes the --with-mpi-thread-multiple option and forces MPI_THREAD_MULTIPLE support. This cleans up an abstraction violation in opal where OMPI_ENABLE_THREAD_MULTIPLE determines whether the opal_using_threads is meaningful. To reduce the performance hit on MPI_THREAD_SINGLE programs an OPAL_UNLIKELY is used for the check on opal_using_threads in OPAL_THREAD_* macros. This commit does not clean up the arguments to the various functions that take whether multi-threading support is enabled. That should be done at a later time. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-02-23 10:02:14 -07:00
Nathan Hjelm	f68c315188	pml/ob1: add missing ompi_request_wait_completion for buffered sends This commit adds a call to ompi_request_wait_completion for buffered sends. Without this line it is possible to get into a state where the data is never sent. Fixes open-mpi/ompi#1185 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-12-07 22:28:07 -07:00
George Bosilca	01d8e23ccc	Fix the random errors related to the recursive sends and receives identified by Fujitsu.	2015-09-26 00:44:51 +02:00
Nathan Hjelm	b4a0d40915	pml/ob1: Add support for dynamically calling add_procs This commit contains the following changes: - pml/ob1: use the bml accessor function when requesting a bml endpoint. this will ensure that bml endpoints are only created when needed. for example, a bml endpoint is not requested and not allocated when receiving an eager message from a peer. - pml/ob1: change the pml_procs array in the ob1 communicator to a proc pointer array. at the cost of a single level of extra redirection this will allow us to allocate pml procs on demand. - pml/ob1: add an accessor function to access the pml proc structure for a given peer. this function will allocate the proc if it doesn't already exist. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-10 08:55:54 -06:00
Gilles Gouaillardet	6e6a3e965c	pml: do not cast away the const modifier when this is not necessary update the pml framework and mpi c bindings	2015-09-09 09:18:57 +09:00
Nathan Hjelm	9a8a87611e	pml/ob1: fix bugs in static request objects This commit fixes several bugs in the static request objects used by ob1 for blocking send/receive operations. - Fix memory leak when using MPI_THREAD_MULTIPLE. Requests were allocated off the free list but were destructed and NOT returned. - Fix double-destruct of static objects. There is no reason to CONSTRUCT/DESTRUCT the static object for each send/receive operation. This adds overhead and no benefit. To keep the code clean helper functions have been added to finalize ob1 send/receive requests. - Remove now unnecessary include of alloca.h. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-06-23 11:00:45 -06:00
Nathan Hjelm	284dd6babe	pml/ob1: do not use OPAL_ENABLE_MULTI_THREADS to determine thread multiple support OPAL_ENABLE_MULTI_THREADS is always on. The correct value to check is OMPI_ENABLE_THREAD_MULTIPLE. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-06-22 19:17:23 -06:00
George Bosilca	67b70bb47a	Add multi-threaded support.	2015-06-12 14:22:17 -07:00
George Bosilca	b2cf74cabc	A first cut at a possible solution for the missing requests from the message queues (a debugging feature). With this approach all blocking (single threaded) requests are allocated from the main freelist, so they will be accounted for during the message queues investigation).	2015-06-12 14:22:17 -07:00
Nathan Hjelm	c4a0e02261	pml/ob1: update for BTL 3.0 interface Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-02-13 11:46:37 -07:00
George Bosilca	df0512550e	The extent of the datatype is irrelevant for deciding to do an immediate send as long as we have to pack.	2015-01-19 02:23:12 -05:00
Gilles Gouaillardet	d14daf40d0	ob1: correctly handle types in which size > extent do not send inline if extent*count OR size*count are greater than 256	2015-01-19 14:07:23 +09:00
Nathan Hjelm	1b564f62bd	Revert "Merge pull request #275 from hjelmn/btlmod" This reverts commit ccaecf0fd6c862877e6a1e2643f95fa956c87769, reversing changes made to 6a19bf85dde5306f559f09952cf3919d97f52502.	2014-11-19 23:22:43 -07:00
Nathan Hjelm	271818f887	pml/ob1: bug fixes and adjustments for changes in btl_sendi behavior	2014-11-19 11:33:03 -07:00
Nathan Hjelm	b75bb8aea7	Update pml for btl changes	2014-11-19 11:33:02 -07:00
Jeff Squyres	7a5b2e9b13	ob1: change an OPAL_UNLIKELY to OPAL_LIKELY Per `924d39e415 (commitcomment-8378266)`, this OPAL_UNLIKELY should really be OPAL_LIKELY.	2014-10-31 03:22:55 -07:00
George Bosilca	924d39e415	Always OBJ_DESTRUCT the send request.	2014-10-30 01:28:50 -04:00
George Bosilca	cee2a4e5c8	Missing alloca.h. Thanks Paul for catching this. This commit was SVN r32388.	2014-08-01 03:28:23 +00:00
Ralph Castain	552c9ca5a0	George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.	2014-07-26 00:47:28 +00:00
Jeff Squyres	b0a6e42f45	pml ob1: use the pre-computed size from the free lists Based on a suggestion from George on #31806, use the pre-computed sizes rather than duplicating the computation math (which may change someday in the future). cmr=v1.8.2:ticket=trac:4647 This commit was SVN r31841. The following Trac tickets were found above: Ticket 4647 --> https://svn.open-mpi.org/trac/ompi/ticket/4647	2014-05-20 20:32:25 +00:00
Nathan Hjelm	a1d5ce0893	pml/ob1: as per past RFC bring the inline send optimization to MPI_Isend. I filed an RFC for this optimization some time back. It is a relatively simple optimization. If the data associated with an MPI_Isend can be put on the wire without allocating an MPI_Request then do so. In this case we can legally return ompi_request_empty which will correctly indicate that the request is complete and that it was not cancelled (these are the only requirements on send requests). cmr=v1.8.3:reviewer=bosilca This commit was SVN r31828.	2014-05-19 19:34:59 +00:00
Jeff Squyres	025e4a852b	pml_ob1: ensure to have enough space for send/recvreq on stack r30343 introduced the optimization of putting the OB1 sendreq and recvreq on the stack for blocking sends and receives. However, the requests did not contain enough storage for the data that is normally immediately ''after'' the request (e.g., BTL data). This commit changes these requests to be pointers and to use alloca() to get enough total space for the OB1 request and all the associated data. The change is smaller than it looks; most of it is just changing from "foo.bar" to "foo->bar" notation (etc.). Submitted by Jeff, reviewed by Nathan. But we want George to look at this (and get a little soak time on the trunk) before moving to v1.8. cmr=v1.8.2:reviewer=bosilca This commit was SVN r31806. The following SVN revision numbers were found above: r30343 --> open-mpi/ompi@2b57f4227e	2014-05-17 01:05:59 +00:00
Nathan Hjelm	626b521e9c	pml/ob1: fix heterogeneous support when using the send_inline optimization We will track #4568 from the 1.8 CMR. Closes trac:4568 cmr=v1.8.2:reviewer=jsquyres This commit was SVN r31535. The following Trac tickets were found above: Ticket 4568 --> https://svn.open-mpi.org/trac/ompi/ticket/4568	2014-04-28 17:36:26 +00:00
Nathan Hjelm	a06e491c2c	ob1: large buffered sends were broken by the ob1 optimizations. fix them The problem was caused by the static request optimization. The buffered send case is much like the isend case in that the request structure may be needed after MPI_Bsend completes. Fix this case by calling isend and freeing the resulting request. cmr=v1.7.5:ticket=trac:4149 This commit was SVN r30601. The following Trac tickets were found above: Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149	2014-02-07 00:12:36 +00:00
Nathan Hjelm	3902cf66f1	ob1: OBJ_CONSTRUCT the convertor in the send_inline optimization. This change does not appear to increase the small message latency of ping-pong benchmarks and fixes an issue found by our ibm datatype tests. Fixes trac:4232 cmr=v1.7.5:ticket=trac:4149 This commit was SVN r30598. The following Trac tickets were found above: Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149 Ticket 4232 --> https://svn.open-mpi.org/trac/ompi/ticket/4232	2014-02-06 21:27:42 +00:00
Nathan Hjelm	66b69da394	Fix a bug in the ob1 optimizations that can cause a segfault. btl sendi functions currently can not handle the descriptor being NULL. The send inline optimization was assuming (incorrectly) that NULL was ok. cmr=v1.7.5:ticket=trac:4149 This commit was SVN r30364. The following Trac tickets were found above: Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149	2014-01-22 16:31:58 +00:00
Nathan Hjelm	2b57f4227e	ob1: optimize blocking send and receive paths Per RFC. There are two optimizations in this commit: - Allocate requests for blocking sends and receives on the stack. This bypasses the request free list and saves two atomics on the critical path. This change improves the small message ping-pong by 50-200ns on both AMD and Intel CPUs. - For small messages try to use the btl sendi function before initializing a send request. If the sendi fails or the btl does not have a sendi function silently fall back on the standard send path. cmr=v1.7.5:reviewer=brbarret This commit was SVN r30343.	2014-01-21 15:16:21 +00:00
Jeff Squyres	3a7af4ab40	Fix another clang warning: sendreq is undefined if proc==NULL. cmr=v1.7.4:reviewer=hjelmn:subject=fix ob1 undefined sendreq value This commit was SVN r29774.	2013-12-02 19:44:42 +00:00
George Bosilca	c9e5ab9ed1	Our macros for the OMPI-level free list had one extra argument, a possible return value to signal that the operation of retrieving the element from the free list failed. However in this case the returned pointer was set to NULL as well, so the error code was redundant. Moreover, this was a continuous source of warnings when the picky mode is on. The attached patch removes the rc argument from the OMPI_FREE_LIST_GET and OMPI_FREE_LIST_WAIT macros, and checks whether the item is NULL instead of using the return code. This commit was SVN r28722.	2013-07-04 08:34:37 +00:00
George Bosilca	e361bcb64c	Send optimizations. 1. The send path get shorter. The BTL is allowed to return > 0 to specify that the descriptor was pushed to the networks, and that the memory attached to it is available again for the upper layer. The MCA_BTL_DES_SEND_ALWAYS_CALLBACK flag can be used by the PML to force the BTL to always trigger the callback. Unmodified BTL will continue to work as expected, as they will return OMPI_SUCCESS which force the PML to have exactly the same behavior as before. Some BTLs have been modified: self, sm, tcp, mx. 2. Add send immediate interface to BTL. The idea is to have a mechanism of allowing the BTL to take advantage of send optimizations such as the ability to deliver data "inline". Some network APIs such as Portals allow data to be sent using a "thin" event without packing data into a memory descriptor. This interface change allows the BTL to use such capabilities and allows for other optimizations in the future. All existing BTLs except for Portals and sm have this interface set to NULL. This commit was SVN r18551.	2008-05-30 03:58:39 +00:00
Shiqing Fan	8393fb5d47	Use the new memchecker_call function for memory checking of non-blocking communication. This commit was SVN r18399.	2008-05-07 12:28:51 +00:00
Shiqing Fan	f35a06119c	Use memchecker_convertor_call function instead of the old one. Move the function to the place where we can use the convertor. This commit was SVN r18370.	2008-05-05 13:57:27 +00:00
Gleb Natapov	e0a3a7e53e	Move duplicated code all over the code to a single function ompi_request_wait_completion(). This commit was SVN r16494.	2007-10-18 12:33:21 +00:00
Rainer Keller	b0df55d53b	- For MPI_Probe/MPI_Iprobe, we should not have a PERUSE_COMM_REQ_ACTIVATE event. Therefore move the PERUSE_TRACE_COMM_EVENT for this event from MCA_PML_BASE_SEND_REQUEST_INIT / MCA_PML_BASE_RECV_REQUEST_INIT to the proper places into pml_ob1_isend.c / pml_ob1_irecv.c right after the MCA_PML_OB1_SEND_REQUEST_INIT / MCA_PML_OB1_RECV_REQUEST_INIT. This commit was SVN r15945.	2007-08-23 05:52:33 +00:00
George Bosilca	e19777e910	A more consistent version. As we now share the send and receive queue, we have to construct/destruct only once. Therefore, the construction will happen before digging for a PML, while the destruction happens just before finalizing the component. Add some OPAL_LIKELY/OPAL_UNLIKELY. This commit was SVN r15347.	2007-07-10 23:45:23 +00:00
Brian Barrett	84d1512fba	Add the potential for doing some basic error checking on mutexes during single threaded builds. In its default configuration, all this does is ensure that there's at least a good chance of threads building based on non-threaded development (since the variable names will be checked). There is also code to make sure that a "mutex" is never "double locked" when using the conditional macro mutex operations. This is off by default because there are a number of places in both ORTE and OMPI where this alarm spews mega bytes of errors on a simple test. So we have some work to do on our path towards thread support. Also removed the macro versions of the non-conditional thread locks, as the only places they were used, the author of the code intended to use the conditional thread locks. So now you have upper-case macros for conditional thread locks and lowercase functions for non-conditional locks. Simple, right? :). This commit was SVN r15011.	2007-06-12 16:25:26 +00:00
George Bosilca	688a16ea78	A long-awaited patch. Get rid of the comm->c_pml_procs. It was (and that was long ago) supposed to be used as a cache for accessing the PML procs. But in all of the PMLs the PML proc contains only one field, i.e. a pointer to the ompi_proc. This pointer can be accessed easily using the c_remote_group. Therefore, there is no point in keeping the PML procs around. Slim fast commit ... This commit was SVN r11730.	2006-09-20 22:14:46 +00:00
George Bosilca	58cd591d3b	PERUSE support for OB1. There we go, now the trunk has a partial peruse implementation. We support all the events in the PERUSE specifications, but right now only one event of each type can be attached to a communicator. This will be worked out in the future. The events were places in such a way, that we will be able to measure the overhead for our threading implementation (the cost of the synchronization objects). This commit was SVN r9500.	2006-03-31 17:09:09 +00:00
George Bosilca	612570134f	The request management framework has been redesigned. The main idea is to let the PML (or io, more generally the low level request manager) have its own release function (what was before the req_fini). This function will only be called from the low level while the req_free will be called from the upper level (MPI layer) in order to mark the request as not used by the user anymore. From the request point of view the requests will be marked as inactive every time we read their status (true for persistent as well). As MPI_REQUEST_NULL is already marked as inactive, the test and wait functions are simpler. The drawback is that now we have to change in the ompi_request_{test|wait} the req_status of the request once we get its status. This commit was SVN r9290.	2006-03-15 22:53:41 +00:00
Tim Woodall	8bf6ed7a36	- corrected locking in gm btl - gm api is not thread safe - initial support for gm progress thread - corrected threading issue in pml - added polling progress for a configurable number of cycles to wait for threaded case This commit was SVN r9188.	2006-03-02 00:39:07 +00:00
George Bosilca	83f83e5730	Specialize the MCA_PML_OB1_FREE macro. When we call this macro we already know what kind of request we are playing with (send or receive). Therefore, it's useless to have another switch inside this macro and make the code bigger. Now, we have 2 versions MCA_PML_OB1_SEND_REQUEST_FREE and MCA_PML_OB1_RECV_REQUEST_FREE. This commit was SVN r8945.	2006-02-08 22:42:00 +00:00
Jeff Squyres	42ec26e640	Update the copyright notices for IU and UTK. This commit was SVN r7999.	2005-11-05 19:57:48 +00:00
Galen Shipman	c3c83aa3e1	BML (BTL Management Layer). Allows BTLs to be used outside of the PML. See bml.h and PML-OB1 for usage. This commit was SVN r6815.	2005-08-12 02:41:14 +00:00

53 commits