
42 commits

Author SHA1 Message Date
Aravind Gopalakrishnan
f8a2b7f6bf Use opal_show_help to warn about PSM2_CUDA envvar setting
If Open MPI is configured with CUDA, then the user should also be using a CUDA build of
PSM2 and therefore setting the PSM2_CUDA environment variable to 1 when using
CUDA buffers for transfers. If we detect that this setting is missing, we force-set
it. If the user wants to use this build for regular (host buffer) transfers, we
allow the option of setting PSM2_CUDA=0, but print a warning
message telling the user that this is not a recommended usage scenario.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-09-29 17:04:10 -07:00
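A minimal, standalone sketch of the check this commit describes. It uses plain getenv/setenv and a stderr message in place of opal_show_help; the helper name and the built_with_cuda flag are illustrative, and only the PSM2_CUDA variable itself comes from the commit.

#include <stdio.h>
#include <stdlib.h>

/* Sketch of the check described above: if OMPI was built with CUDA support,
 * PSM2_CUDA should be set to 1; force-set it when missing, and warn (the
 * real code uses opal_show_help) when the user set it to 0. */
static void check_psm2_cuda_env(int built_with_cuda)
{
    if (!built_with_cuda) {
        return;
    }
    const char *val = getenv("PSM2_CUDA");
    if (NULL == val) {
        /* Missing: force-set it so the CUDA-enabled PSM2 path is used. */
        setenv("PSM2_CUDA", "1", 1);
    } else if ('0' == val[0]) {
        /* Allowed for host-buffer-only runs, but not recommended. */
        fprintf(stderr,
                "Warning: PSM2_CUDA=0 with a CUDA build is not a "
                "recommended usage scenario.\n");
    }
}

int main(void)
{
    check_psm2_cuda_env(1);
    printf("PSM2_CUDA=%s\n", getenv("PSM2_CUDA"));
    return 0;
}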
Aravind Gopalakrishnan
2e83cf15ce Add support for GPU buffers for PSM2 MTL
PSM2 supports GPU buffers and CUDA managed memory: it can
directly recognize GPU buffers and handle copies between HFIs and GPUs.
Therefore, OMPI is not required to handle GPU buffers for pt2pt cases.
In this patch, we allow the PSM2 MTL to specify when
it does not require CUDA convertor support. This allows us to skip the CUDA
convertor init phases and lets PSM2 handle the memory transfers.

This translates to improvements in latency.
The patch enables blocking collectives and workloads with GPU contiguous
and GPU non-contiguous memory.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-09-01 16:59:03 -07:00
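A toy model of the convertor-skipping idea above, assuming a hypothetical capability flag on the MTL module; none of the names below are real OMPI symbols.

#include <stdio.h>

/* Illustrative model: the MTL advertises a flag saying it handles GPU
 * buffers itself, and the PML skips the CUDA convertor setup when the
 * flag is present.  The names are hypothetical. */
#define MTL_FLAG_OWNS_CUDA_COPIES 0x1

struct mtl_module {
    unsigned flags;
};

static void prepare_send(const struct mtl_module *mtl)
{
    if (mtl->flags & MTL_FLAG_OWNS_CUDA_COPIES) {
        /* PSM2 recognizes GPU buffers directly: skip the CUDA convertor
         * init phases and hand the pointer straight to the library. */
        printf("skipping CUDA convertor init\n");
    } else {
        /* Other paths still stage GPU data through the CUDA convertor. */
        printf("initializing CUDA convertor\n");
    }
}

int main(void)
{
    struct mtl_module psm2 = { MTL_FLAG_OWNS_CUDA_COPIES };
    prepare_send(&psm2);
    return 0;
}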
George Bosilca
c2cd717f82 Don't refcount the predefined datatypes.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-01-11 16:48:59 -05:00
Nathan Hjelm
8445c885ce pml/cm: update for request changes
This fixes a hang caused by the request refactor work. The cm pml was
not updated and was hanging in most cases.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 15:35:32 -06:00
Gilles Gouaillardet
6e6a3e965c pml: do not cast away the const modifier when it is not necessary
Update the pml framework and MPI C bindings.
2015-09-09 09:18:57 +09:00
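A small illustration of the const-correctness point: once the send path takes a const pointer, the buffer can be handed down without casting the qualifier away. The pack_bytes helper is hypothetical, not the OMPI prototype.

#include <stdio.h>
#include <string.h>

/* With a const-qualified parameter, no cast of the user buffer is needed
 * anywhere along the call chain. */
static int pack_bytes(char *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);   /* no (void *) cast of src required */
    return 0;
}

int main(void)
{
    const char msg[] = "hello";
    char staging[16];
    pack_bytes(staging, msg, sizeof(msg));
    printf("%s\n", staging);
    return 0;
}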
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Jithin Jose
7ccde09a09 Do opal_convertor_copy_and_prepare_for_send for buffered send mode as
MCA_PML_CM_HVY_SEND_REQUEST_BSEND_ALLOC calls opal_convertor_pack
directly.

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-15 17:12:50 -07:00
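A toy model of the ordering this commit enforces: the convertor must be prepared for the send before anything packs through it, which matters in the buffered-send path because the BSEND_ALLOC macro packs directly. The struct and functions below are stand-ins for the opal_convertor API, not the real interface.

#include <stdio.h>
#include <string.h>

/* Stand-in convertor: packing before prepare is the bug this commit avoids. */
struct convertor {
    const void *user_buf;
    size_t      bytes;
    int         prepared;
};

static void prepare_for_send(struct convertor *c, const void *buf, size_t bytes)
{
    c->user_buf = buf;
    c->bytes    = bytes;
    c->prepared = 1;
}

static int pack(struct convertor *c, void *dst)
{
    if (!c->prepared) {
        return -1;              /* packing an unprepared convertor fails */
    }
    memcpy(dst, c->user_buf, c->bytes);
    return 0;
}

int main(void)
{
    struct convertor c = { 0 };
    char src[] = "payload", dst[16];
    prepare_for_send(&c, src, sizeof(src));   /* must come first */
    printf("pack rc = %d\n", pack(&c, dst));
    return 0;
}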
Jithin Jose
5ba5a9ade2 Offset buffer by datatype true_lb to handle resized datatypes.
- Follow up patch for 56869bff38e264ee91ea68ae2fabfafe9456548e

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-27 13:51:05 -07:00
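A sketch of the true_lb adjustment, assuming a stand-in descriptor struct rather than the real OMPI datatype: for a resized datatype the first real byte sits true_lb bytes past the pointer the user passed, so the transfer has to start from buf + true_lb.

#include <stdio.h>
#include <stddef.h>

/* Stand-in for the datatype's lower-bound field. */
struct dtype_desc {
    ptrdiff_t true_lb;
};

static const void *effective_base(const void *user_buf,
                                  const struct dtype_desc *dt)
{
    /* Offset the user buffer by the datatype's true lower bound. */
    return (const char *) user_buf + dt->true_lb;
}

int main(void)
{
    double payload[4] = { 1.0, 2.0, 3.0, 4.0 };
    struct dtype_desc resized = { .true_lb = sizeof(double) };
    const double *base = effective_base(payload, &resized);
    printf("first element actually sent: %g\n", *base);
    return 0;
}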
Jithin Jose
07043894bd Avoid extra lookup for ompi_proc in a homogeneous build
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:42 -07:00
Jithin Jose
56869bff38 Avoid datatype pack/unpack for contiguous data on homogeneous systems.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Nathan Hjelm
5f1254d710 Update code base to use the new opal_free_list_t
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.

This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.

Notes:

OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-24 10:05:45 -07:00
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communication are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purposes. UTK, with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00
Aurelien Bouteiller
e1066143a4 rename ompi_free_list operations to _mt, as per discussions at the last face-to-face meeting
This commit was SVN r28734.
2013-07-08 22:07:52 +00:00
George Bosilca
c9e5ab9ed1 Our macros for the OMPI-level free list had one extra argument, a possible return
value to signal that the operation of retrieving an element from the free list
failed. However, in this case the returned pointer was set to NULL as well, so the
error code was redundant. Moreover, this was a continuous source of warnings when
picky mode is on.

The attached patch removes the rc argument from the OMPI_FREE_LIST_GET and
OMPI_FREE_LIST_WAIT macros, and changes the callers to check whether the item is
NULL instead of using the return code.

This commit was SVN r28722.
2013-07-04 08:34:37 +00:00
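A standalone illustration of the macro change: the get operation returns only the item, and callers test it against NULL instead of also receiving a redundant rc. get_item() is a toy stand-in for OMPI_FREE_LIST_GET, not the real macro.

#include <stdio.h>
#include <stdlib.h>

struct list_item { int payload; };

/* Failure is fully conveyed by a NULL return; no separate error code. */
static struct list_item *get_item(int list_is_empty)
{
    if (list_is_empty) {
        return NULL;
    }
    return calloc(1, sizeof(struct list_item));
}

int main(void)
{
    struct list_item *item = get_item(0);
    if (NULL == item) {         /* the old code also checked a separate rc */
        return 1;
    }
    printf("got item %p\n", (void *) item);
    free(item);
    return 0;
}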
George Bosilca
733d25a8a3 First step toward fixing the MPI_Get_count issues from ticket #2241. The next
step is the configure and Fortran mojo that Jeff will put in. Until then I
guess the Fortran interface is broken (at least all functions using the hidden
count field in the MPI_Status).

This commit was SVN r23467.
2010-07-21 20:07:00 +00:00
Rainer Keller
6c5532072a - Split the datatype engine into two parts: an MPI-specific part in
   OMPI and a language-agnostic part in OPAL. The convertor is completely
   moved into OPAL. This offers several benefits as described in RFC
   http://www.open-mpi.org/community/lists/devel/2009/07/6387.php, namely:
    - Fewer basic types (int* and float* types, boolean and wchar).
    - Fixing the naming scheme to ompi nomenclature.
    - Usability outside of the ompi layer.
 - Due to the fixed nature of simple opal types, their information is
   completely known at compile time and is therefore constified.
 - With fewer datatypes (22), the actual sizes of bit-field types may be
   reduced from 64 to 32 bits, allowing the opal_datatype structure to be
   reorganized, eliminating holes and keeping the data required by the
   convertor (upon send/recv) in one cacheline. This has implications for
   the convertor data structure and other parts of the code.
 - Several performance tests have been run; the netpipe latency does not
   change with this patch on Linux/x86-64 on the smoky cluster.
 - Extensive tests have been done to verify correctness (no new
   regressions) using:
   1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
      a. running both trunk and ompi-ddt resulted in no differences
         (except that MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB now run
         correctly).
      b. with --enable-memchecker and running under valgrind (one buglet
         found in the test suite when run statically, committed).
   2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
      all passed (except for the dynamic/ tests, which failed, as on
      trunk/MTT).
   3. compilation and usage of HDF5 tests on Jaguar using the PGI and
      PathScale compilers.
   4. compilation and usage on SiCortex.
 - Please note that for the heterogeneous case (-m32 compiled
   binaries/ompi), neither the ompi-trunk nor the ompi-ddt branch would
   successfully launch.

This commit was SVN r21641.
2009-07-13 04:56:31 +00:00
George Bosilca
e361bcb64c Send optimizations.
1. The send path gets shorter. The BTL is allowed to return > 0 to specify that the
   descriptor was pushed to the network, and that the memory attached to it is
   available again for the upper layer. The MCA_BTL_DES_SEND_ALWAYS_CALLBACK flag
   can be used by the PML to force the BTL to always trigger the callback.
   Unmodified BTLs will continue to work as expected, as they will return OMPI_SUCCESS,
   which forces the PML to have exactly the same behavior as before. Some BTLs have
   been modified: self, sm, tcp, mx.
2. Add send immediate interface to BTL.
   The idea is to have a mechanism of allowing the BTL to take advantage of
   send optimizations such as the ability to deliver data "inline". Some
   network APIs such as Portals allow data to be sent using a "thin" event
   without packing data into a memory descriptor. This interface change
   allows the BTL to use such capabilities and allows for other optimizations
   in the future. All existing BTLs except for Portals and sm have this interface
   set to NULL.

This commit was SVN r18551.
2008-05-30 03:58:39 +00:00
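A sketch of the completion protocol described in point 1, with illustrative names and values: a return greater than zero means the descriptor was consumed and the buffer is reusable, so the caller can complete immediately; otherwise completion arrives later through the send callback.

#include <stdio.h>

#define BTL_SUCCESS     0   /* completion will come via callback        */
#define BTL_SENT_INLINE 1   /* descriptor consumed, buffer reusable now */

static int btl_send_stub(int can_send_inline)
{
    return can_send_inline ? BTL_SENT_INLINE : BTL_SUCCESS;
}

static void pml_send(int can_send_inline)
{
    int rc = btl_send_stub(can_send_inline);
    if (rc > 0) {
        printf("send completed inline, mark request done now\n");
    } else {
        printf("descriptor queued, wait for the BTL callback\n");
    }
}

int main(void)
{
    pml_send(1);
    pml_send(0);
    return 0;
}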
George Bosilca
1bd31aa3ac Cleanup the OMPI_DECLSPEC/OMPI_MODULE_DECLSPEC in the PMLs.
This commit was SVN r17093.
2008-01-09 20:32:39 +00:00
Rich Graham
67f4b69848 propagate fix for running out of buffered send memory space to dr and ob1 - thanks
George.

This commit was SVN r16593.
2007-10-27 00:17:53 +00:00
Rich Graham
9c0483088a if unable to get buffered space, try to progress communications to
free up resources.

This commit was SVN r16591.
2007-10-26 23:16:31 +00:00
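A toy version of the retry loop this commit implies: when buffered-send space is exhausted, drive progress so in-flight buffered sends can complete and release space, then retry the allocation. try_alloc() and drive_progress() are stand-ins, not OMPI functions.

#include <stdio.h>

static int space_left = 0;

/* Pretend buffered-send space allocator: NULL means no space right now. */
static void *try_alloc(void)
{
    static char buffer[64];
    return space_left > 0 ? buffer : NULL;
}

/* In the real code this is the progress engine completing pending sends. */
static void drive_progress(void)
{
    space_left = 1;
}

int main(void)
{
    void *buf;
    while (NULL == (buf = try_alloc())) {
        drive_progress();       /* free up resources, then retry */
    }
    printf("allocated buffered-send space at %p\n", buf);
    return 0;
}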
Gleb Natapov
52c6160252 The MCA_PML_BASE_REQUEST_MPI_COMPLETE() macro does nothing except call
ompi_request_complete(). Remove the macro and call the function directly.

This commit was SVN r16498.
2007-10-18 14:20:24 +00:00
George Bosilca
9ed3ede73e Correct the thin and heavy request management for the CM PML.
This commit was SVN r15361.
2007-07-11 15:10:01 +00:00
George Bosilca
433f8a7694 This patch brings full support for message queues in Open MPI. Now the send and
receive queues are shared among all PMLs; they are declared in the base PML,
and the selected PML is in charge of initializing and releasing them.

The CM PML is slightly different from OB1 or DR. Internally it uses
two different types of requests: light and heavy. With this patch, however,
both types of requests are stored in the same queue and cast appropriately
in the allocation macro. This means we might use less memory than we allocate,
but in exchange we get full support for most of the parallel debuggers.

Another change in this patch is that for all PMLs (CM included) the basic
PML requests now start with the same fields, declared in the same order
in the request structure. Moreover, the fields have been rearranged so
that (hopefully) only one volatile/atomic exists per cache line.

This commit was SVN r15346.
2007-07-10 22:16:38 +00:00
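A model of the shared-queue scheme: light and heavy requests share a common prefix and one pool whose slots are sized for the heavy variant, so a slot can be cast to either type while debuggers only have to walk a single queue. Struct names are illustrative only.

#include <stdio.h>
#include <stdlib.h>

struct base_req  { int tag; int peer; };          /* common leading fields */
struct thin_req  { struct base_req base; };
struct heavy_req { struct base_req base; char bsend_state[128]; };

int main(void)
{
    /* One pool; every slot is big enough for the heavier request. */
    void *slot = malloc(sizeof(struct heavy_req));

    /* A thin request simply uses less of the slot than was allocated. */
    struct thin_req *thin = slot;
    thin->base.tag  = 42;
    thin->base.peer = 1;
    printf("thin request in a heavy-sized slot: tag=%d\n", thin->base.tag);

    free(slot);
    return 0;
}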
Brian Barrett
a0b40ce45a Fix race condition in setting MPI_ERROR -- with buffered sends, the
request can complete before the operation, meaning that a bogus MPI_ERROR
is read

This commit was SVN r13401.
2007-01-31 21:40:14 +00:00
Brian Barrett
01e8fc5f91 Redo of r12871, without the preconnect code change:
Move the req_mtl structure back to the end of each of the structures in 
the CM PML. The req_mtl structure is cast into a mtl_*_request_structure 
for each MTL, which is larger than the req_mtl itself. The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment). This was 
moved as an optimization at some point, which caused buffered sends to fail...

Refs trac:669

This commit was SVN r12873.

The following SVN revision numbers were found above:
  r12871 --> open-mpi/ompi@597598b712

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:54:14 +00:00
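A small demonstration of why req_mtl has to stay last, using illustrative struct names: the MTL treats the embedded req_mtl as the start of its own larger request and writes past it, which is only safe if no PML fields follow it.

#include <stdio.h>
#include <stddef.h>

struct mtl_base_request  { int ctx; };
struct mtl_large_request { struct mtl_base_request base; char scratch[64]; };

struct cm_heavy_request {
    int    peer;
    int    tag;
    void  *bsend_buffer;
    struct mtl_base_request req_mtl;   /* MUST stay last */
};

int main(void)
{
    /* The MTL's larger request begins at req_mtl and runs past the end of
     * cm_heavy_request (into extra space allocated after it); if req_mtl
     * were not last, that overrun would clobber the PML fields. */
    printf("req_mtl offset %zu of %zu bytes; MTL request needs %zu bytes\n",
           offsetof(struct cm_heavy_request, req_mtl),
           sizeof(struct cm_heavy_request),
           sizeof(struct mtl_large_request));
    return 0;
}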
Brian Barrett
bdf0b231b2 Undo r12871, as it contained some code in ompi/runtime that shouldn't have been
committed

Refs trac:669

This commit was SVN r12872.

The following SVN revision numbers were found above:
  r12871 --> open-mpi/ompi@597598b712

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:52:13 +00:00
Brian Barrett
597598b712 Move the req_mtl structure back to the end of each of the structures in the
CM PML.  The req_mtl structure is cast into a mtl_*_request_structure for
each MTL, which is larger than the req_mtl itself.  The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment).  This was
moved as an optimization at some point, which caused buffered sends to
fail...

Refs trac:669

This commit was SVN r12871.

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:46:53 +00:00
George Bosilca
eb45a5e402 Move things around a little bit, mainly fields from the send and receive
requests into the base request. Rearrange the fields to keep the data
together. Remove some useless tests.

This commit was SVN r12482.
2006-11-08 04:58:23 +00:00
Jeff Squyres
e02114dcf3 Fixes trac:529.
* Create a new request type: NOOP (described below)
 * For all MPI_*_INIT functions, OBJ_NEW an ompi_request_t and set its
   type to NOOP
 * Ensure that the NOOP requests are OBJ_RELEASE'd when they are done
 * MPI_START looks at the request type; if NOOP, just return success. If
   not, call the PML start() function
 * MPI_STARTALL always passes the entire array of requests back to the PML
   (see next point)
 * Make the PMLs only process PML requests (i.e., ignore/skip anything
   that isn't of type PML -- such as the NOOP requests)
 * Add a little more param error checking in STARTALL

This commit was SVN r12338.

The following Trac tickets were found above:
  Ticket 529 --> https://svn.open-mpi.org/trac/ompi/ticket/529
2006-10-27 12:32:36 +00:00
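A sketch of the MPI_START behavior from the list above, with stand-in enum and function names rather than the actual OMPI symbols: NOOP requests just return success, everything else is handed to the PML's start().

#include <stdio.h>

enum req_type { REQ_NOOP, REQ_PML };

struct request { enum req_type type; };

static int pml_start(struct request *req)
{
    printf("PML start() called for request %p\n", (void *) req);
    return 0;
}

static int mpi_start(struct request *req)
{
    if (REQ_NOOP == req->type) {
        return 0;               /* nothing to do, report success */
    }
    return pml_start(req);      /* real persistent communication */
}

int main(void)
{
    struct request noop = { REQ_NOOP }, pml = { REQ_PML };
    mpi_start(&noop);
    mpi_start(&pml);
    return 0;
}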
George Bosilca
126a68dc9a Big datatype commit. Remove all unused features of the datatype engine. As the memory
allocation logic is done completely outside the data-type engine (in the PML), there is
no need for any special case inside the data-type engine. There are fewer arguments for
ompi_convertor_pack and ompi_convertor_unpack as well (the last argument, free_after, is
no longer required as no memory is allocated in the engine itself). This change
affects all components using datatypes. I tested most of them, but it may happen that I
missed some... If that is the case please let me know (don't shoot the pianist!!).

This commit was SVN r12331.
2006-10-26 23:11:26 +00:00
George Bosilca
688a16ea78 A long-awaited patch. Get rid of comm->c_pml_procs. It was (and that was
long ago) supposed to be used as a cache for accessing the PML procs. But in
all of the PMLs the PML proc contains only one field, i.e. a pointer to the ompi_proc.
This pointer can easily be accessed through the c_remote_group. Therefore there is no
point in keeping the PML procs around. Slim-fast commit ...

This commit was SVN r11730.
2006-09-20 22:14:46 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
Brian Barrett
f0afe38293 * Need to retain / release datatype and communicator so that the MPI layer
handles can be freed before communication completes.

This commit was SVN r11248.
2006-08-17 16:30:03 +00:00
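A toy reference-counting model of this fix: the request retains the datatype (and communicator) when the operation starts, so the user freeing the handle only drops a reference and the object survives until the communication completes. retain()/release() mimic OBJ_RETAIN/OBJ_RELEASE in plain C.

#include <stdio.h>

struct handle { int refcount; };

static void retain(struct handle *h)  { h->refcount++; }
static void release(struct handle *h)
{
    if (0 == --h->refcount) {
        printf("handle actually destroyed\n");
    }
}

int main(void)
{
    struct handle datatype = { .refcount = 1 };

    retain(&datatype);          /* communication starts, request holds a ref */
    release(&datatype);         /* user frees the handle right away */
    printf("still alive, refcount=%d\n", datatype.refcount);
    release(&datatype);         /* communication completes, last ref dropped */
    return 0;
}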
Brian Barrett
74e95bc65f * more fixes for ticket #264. We need to keep the original address around,
so use the req_buff field for keeping track of the bsend buffer and the
  req_addr field for the user buffer, the way the comments suggested we
  were doing it

This commit was SVN r11233.
2006-08-16 20:24:28 +00:00
Brian Barrett
0f47949703 * partial fix for #264... We need to return an MPI_ERR_BUFFER if we've run
out of buffer space

This commit was SVN r11229.
2006-08-16 17:32:31 +00:00
Galen Shipman
6ed255f114 Substantial changes to the CM PML, allowing us to have a very thin request for
all but buffered and persistent requests. Unfortunately we were not able to
reuse the pml_base_request_t as it was just too heavy for our needs. Lots of
code for 2/10 of a usec ;-)

This commit was SVN r10810.
2006-07-14 19:32:26 +00:00
Galen Shipman
68ae99123d fix bsend completion.
This commit was SVN r10709.
2006-07-10 22:27:32 +00:00
Brian Barrett
b7b93e48f5 * can definitely be optimized more, but add code for calling send for MTL
components that have a blocking send implementation

This commit was SVN r10679.
2006-07-06 16:37:59 +00:00
Brian Barrett
26eee59032 * turns out that you should only call bsend_request_alloc or
bsend_request_init, but not both.  Otherwise, you don't free
  some buffer space and end up leaking buffers and ending in
  badness
* since you only call alloc() or init(), but not both, need to 
  restore reference counting in init()

This commit was SVN r10674.
2006-07-06 14:02:51 +00:00
George Bosilca
2bdb06b549 Force the request to NULL in order to avoid complaints from the compiler.
This commit was SVN r10647.
2006-07-04 06:20:13 +00:00
George Bosilca
9ac1a6cdb3 Remove the warnings. Now they are ompi_free_list_item, not opal_list_item_t.
This commit was SVN r10645.
2006-07-04 04:21:16 +00:00
Brian Barrett
47725c9b02 * Add new PML (CM) and network drivers (MTL) for high speed
interconnects that provide matching logic in the library.
  Currently includes support for MX and some support for
  Portals
* Fix overuse of proc_pml pointer on the ompi_proc structure,
  splitting into proc_pml for pml data and proc_bml for
  the BML endpoint data
* bug fixes in bsend init code, which wasn't being used by
  the OB1 or DR PMLs...

This commit was SVN r10642.
2006-07-04 01:20:20 +00:00