
16 Commits

Author SHA1 Message Date
Aravind Gopalakrishnan
2e83cf15ce Add support for GPU buffers for PSM2 MTL
PSM2 supports GPU buffers and CUDA managed memory: it can directly
recognize GPU buffers and handle copies between HFIs and GPUs.
Therefore, OMPI does not need to handle GPU buffers for pt2pt cases.
In this patch, we allow the PSM2 MTL to specify that it does not require
CUDA convertor support, which lets us skip the CUDA convertor init
phases and lets PSM2 handle the memory transfers.

This translates to latency improvements. The patch enables blocking
collectives and workloads with GPU contiguous and GPU non-contiguous
memory.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-09-01 16:59:03 -07:00
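A minimal sketch of the mechanism described in this commit, using hypothetical flag and function names (this is not the actual OMPI MTL interface): the MTL advertises that it can move GPU memory itself, and the pt2pt setup path skips the CUDA convertor initialization when that capability is present.

    #include <stdbool.h>
    #include <stdint.h>

    #define MTL_FLAG_OWN_GPU_TRANSFERS 0x1u   /* hypothetical capability bit */

    struct mtl_module {
        uint32_t mtl_flags;                   /* capability bits set by the MTL */
    };

    bool mtl_handles_gpu_buffers(const struct mtl_module *mtl)
    {
        return (mtl->mtl_flags & MTL_FLAG_OWN_GPU_TRANSFERS) != 0;
    }

    /* pt2pt request setup: run the CUDA convertor init phases only when the
     * MTL cannot recognize and copy GPU buffers on its own. */
    void prepare_pt2pt_request(const struct mtl_module *mtl)
    {
        if (!mtl_handles_gpu_buffers(mtl)) {
            /* cuda_convertor_init(...);  -- needed for MTLs without native
             * GPU-buffer support */
        }
        /* otherwise hand the (possibly GPU) buffer straight to the MTL,
         * e.g. PSM2, which copies between HFIs and GPUs itself */
    }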
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b7817d5891ff06a447ea190b0e77479.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
bosilca
b90c83840f Refactor the request completion (#1422)
* Remodel the request.
Added the wait sync primitive and integrated it into the PML and MTL
infrastructure. Multi-threaded requests are now significantly
lighter and less noisy (only the threads associated with completed
requests are signaled).

* Fix the condition to release the request.
2016-05-24 18:20:51 -05:00
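A rough, self-contained sketch of a wait-sync object in the spirit described above (illustrative names, not the actual OMPI wait_sync API): the waiting thread blocks on its own condition variable, and each completing request decrements the counter and signals only that waiter instead of broadcasting to every thread.

    #include <pthread.h>

    typedef struct wait_sync {
        pthread_mutex_t lock;
        pthread_cond_t  cond;
        int             count;   /* requests still outstanding for this waiter */
    } wait_sync_t;

    void wait_sync_init(wait_sync_t *sync, int count)
    {
        pthread_mutex_init(&sync->lock, NULL);
        pthread_cond_init(&sync->cond, NULL);
        sync->count = count;
    }

    /* Called by the waiting thread, e.g. inside MPI_Wait/MPI_Waitall. */
    void wait_sync_wait(wait_sync_t *sync)
    {
        pthread_mutex_lock(&sync->lock);
        while (sync->count > 0) {
            pthread_cond_wait(&sync->cond, &sync->lock);
        }
        pthread_mutex_unlock(&sync->lock);
    }

    /* Called on request completion: wakes only the thread attached to this
     * sync object, so unrelated waiters are never signaled. */
    void wait_sync_update(wait_sync_t *sync)
    {
        pthread_mutex_lock(&sync->lock);
        if (--sync->count == 0) {
            pthread_cond_signal(&sync->cond);
        }
        pthread_mutex_unlock(&sync->lock);
    }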
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Rainer Keller
6c5532072a - Split the datatype engine into two parts: an MPI-specific part in
   OMPI and a language-agnostic part in OPAL. The convertor is completely
   moved into OPAL. This offers several benefits as described in RFC
   http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
   namely:
    - Fewer basic types (int* and float* types, boolean and wchar).
    - Fixing the naming scheme to ompi nomenclature.
    - Usability outside of the ompi layer.
 - Due to the fixed nature of simple opal types, their information is
   completely known at compile time and therefore constified.
 - With fewer datatypes (22), the actual sizes of bit-field types may be
   reduced from 64 to 32 bits, allowing the opal_datatype structure to be
   reorganized, eliminating holes and keeping the data required by the
   convertor (upon send/recv) in one cacheline. This has implications for
   the convertor data structure and other parts of the code.
 - Several performance tests have been run; the netpipe latency does not
   change with this patch on Linux/x86-64 on the smoky cluster.
 - Extensive tests have been done to verify correctness (no new
   regressions) using:
   1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
      a. running both trunk and ompi-ddt resulted in no differences
         (except that MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB now run
         correctly).
      b. with --enable-memchecker and running under valgrind (one buglet
         found in the test suite when run with static, committed).
   2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
      all passed (except for the dynamic/ tests, which failed, as on
      trunk/MTT).
   3. compilation and usage of HDF5 tests on Jaguar using the PGI and
      PathScale compilers.
   4. compilation and usage on SiCortex.
 - Please note that for the heterogeneous case (-m32 compiled
   binaries/ompi), neither the ompi-trunk nor the ompi-ddt branch would
   successfully launch.

This commit was SVN r21641.
2009-07-13 04:56:31 +00:00
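A heavily simplified sketch of the layering this commit describes, with illustrative field names only (not the real opal_datatype/ompi_datatype definitions): a small, compile-time-constant descriptor lives in OPAL, and the MPI-specific descriptor in OMPI wraps it, so the convertor only ever needs the OPAL part.

    #include <stddef.h>
    #include <stdint.h>

    /* OPAL layer: language agnostic, known at compile time, kept compact so
     * the fields the convertor touches on send/recv can share a cache line. */
    typedef struct sketch_opal_datatype {
        uint32_t  flags;    /* 32-bit bit-field instead of the former 64 bits */
        uint32_t  size;     /* packed size in bytes */
        ptrdiff_t lb, ub;   /* extent bounds */
    } sketch_opal_datatype_t;

    /* OMPI layer: MPI-specific part layered on top of the OPAL descriptor. */
    typedef struct sketch_ompi_datatype {
        sketch_opal_datatype_t super;   /* the convertor works on this part only */
        int         id;                 /* MPI-level identifier */
        const char *name;               /* "MPI_INT", "MPI_DOUBLE", ... */
    } sketch_ompi_datatype_t;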
George Bosilca
1bd31aa3ac Cleanup the OMPI_DECLSPEC/OMPI_MODULE_DECLSPEC in the PMLs.
This commit was SVN r17093.
2008-01-09 20:32:39 +00:00
George Bosilca
e41ee17ca5 Add a small comment that hopefully will enforce the correct ordering of
the fields between CM and the other PMLs in the request structure.

This commit was SVN r15760.
2007-08-03 23:59:29 +00:00
George Bosilca
433f8a7694 This patch brings full support for message queues in Open MPI. The send and
receive queues are now shared among all PMLs; they are declared in the base PML,
and the selected PML is in charge of initializing and releasing them.

The CM PML is slightly different compared with OB1 or DR. Internally it uses
two different types of requests: light and heavy. With this patch, however,
both types of requests are stored in the same queue and cast appropriately
by the allocation macro. This means we might use less memory than we allocate,
but in exchange we get full support for most of the parallel debuggers.

Another change with this patch is that, for all PMLs (CM included), the basic
PML requests now start with the same fields, declared in the same order
in the request structure. Moreover, the fields have been moved in such a way
that only one volatile/atomic will exist per cache line (hopefully).

This commit was SVN r15346.
2007-07-10 22:16:38 +00:00
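A toy illustration of the allocation scheme described above, with made-up type names: every element of the shared queue is sized for the largest (heavy) request, and the allocation macro casts the slot to whatever the caller asks for, so a light request simply leaves part of its slot unused while debuggers still see a single, uniform queue.

    #include <stdlib.h>

    typedef struct base_request {
        struct base_request *next;   /* queue linkage shared by all PMLs */
        int tag, peer;               /* common fields, same order everywhere */
    } base_request_t;

    typedef struct light_request {     /* e.g. the CM "light" request */
        base_request_t base;
    } light_request_t;

    typedef struct heavy_request {     /* extra bookkeeping for the heavy path */
        base_request_t base;
        char payload_state[256];
    } heavy_request_t;

    /* Every slot is big enough for a heavy request... */
    #define REQUEST_SLOT_SIZE sizeof(heavy_request_t)

    /* ...and the allocation macro casts it to the type the caller wants,
     * so we may use less memory than we allocate. */
    #define REQUEST_ALLOC(type) ((type *)malloc(REQUEST_SLOT_SIZE))

    int main(void)
    {
        light_request_t *light = REQUEST_ALLOC(light_request_t);
        heavy_request_t *heavy = REQUEST_ALLOC(heavy_request_t);
        free(light);
        free(heavy);
        return 0;
    }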
Brian Barrett
01e8fc5f91 Redo of r12871, without the preconnect code change:
Move the req_mtl structure back to the end of each of the structures in 
the CM PML. The req_mtl structure is cast into a mtl_*_request_structure 
for each MTL, which is larger than the req_mtl itself. The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment). This was 
moved as an optimization at some point, which caused buffer sends to fail...

Refs trac:669

This commit was SVN r12873.

The following SVN revision numbers were found above:
  r12871 --> open-mpi/ompi@597598b712

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:54:14 +00:00
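A minimal sketch (hypothetical type names, not the real CM PML structures) of why req_mtl has to be the last member: the MTL casts the address of req_mtl to its own, larger request type and writes sizeof that type starting there. Placed last, the overflow lands in trailing space the PML deliberately allocates; placed in the middle, it would silently overwrite the heavy request's remaining fields.

    #include <stddef.h>

    typedef struct mtl_base_request {
        int reserved;                    /* small, MTL-agnostic placeholder */
    } mtl_base_request_t;

    typedef struct some_mtl_request {    /* larger per-MTL request */
        mtl_base_request_t super;
        char mtl_private[128];
    } some_mtl_request_t;

    typedef struct cm_heavy_request {
        int   req_tag;
        int   req_peer;
        void *req_buffer;
        mtl_base_request_t req_mtl;      /* MUST stay last: the MTL writes
                                          * sizeof(some_mtl_request_t) bytes
                                          * starting at this address. */
    } cm_heavy_request_t;

    /* The PML allocates enough trailing space for the largest MTL request. */
    #define CM_REQUEST_ALLOC_SIZE \
        (offsetof(cm_heavy_request_t, req_mtl) + sizeof(some_mtl_request_t))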
Brian Barrett
bdf0b231b2 Undo r12871, as it contained some code in ompi/runtime that shouldn't have been
committed

Refs trac:669

This commit was SVN r12872.

The following SVN revision numbers were found above:
  r12871 --> open-mpi/ompi@597598b712

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:52:13 +00:00
Brian Barrett
597598b712 Move the req_mtl structure back to the end of each of the structures in the
CM PML.  The req_mtl structure is cast into a mtl_*_request_structure for
each MTL, which is larger than the req_mtl itself.  The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment).  This was
moved as an optimization at some point, which caused buffer sends to
fail...

Refs trac:669

This commit was SVN r12871.

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:46:53 +00:00
George Bosilca
858ab24e8e The req_mtl field has to be the last in the struct or bad things happen.
This commit was SVN r12548.
2006-11-10 20:53:41 +00:00
George Bosilca
eb45a5e402 Move things around a little bit, mainly fields from the send and receive
requests into the base request. Rearrange the fields to keep the data
together. Remove some useless tests.

This commit was SVN r12482.
2006-11-08 04:58:23 +00:00
Brian Barrett
1cf4d0bd18 * Start of fix for #258 -- implement cancel so that we pass it down to the
  MTL layer. Needed to include more knowledge of which fragment was
  which, since both thin and heavy requests can be canceled.

This commit was SVN r11207.
2006-08-15 21:12:03 +00:00
Galen Shipman
e0ed41f6ef Helps compilation if all files are around.
This commit was SVN r10816.
2006-07-14 20:39:18 +00:00