Commit graph

1128 commits

Author SHA1 Message Date
bosilca
1b8556f926 Merge pull request #653 from hjelmn/moar_ob1_fixes
pml/ob1: fix bugs in static request objects
2015-06-24 14:28:11 -07:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Nathan Hjelm
9a8a87611e pml/ob1: fix bugs in static request objects
This commit fixes several bugs in the static request objects used by
ob1 for blocking send/receive operations.

 - Fix memory leak when using MPI_THREAD_MULTIPLE. Requests were
   allocated off the free list but were destructed and NOT returned.

 - Fix double-destruct of static objects. There is no reason to
   CONSTRUCT/DESTRUCT the static object for each send/receive
   operation. This adds overhead and no benefit. To keep the code
   clean, helper functions have been added to finalize ob1 send/receive
   requests.

 - Remove now unnecessary include of alloca.h.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-23 11:00:45 -06:00
Nathan Hjelm
ac51acb3e1 Merge pull request #651 from hjelmn/fix_thread_multiple_check
pml/ob1: do not use OPAL_ENABLE_MULTI_THREADS to determine thread multiple support
2015-06-22 21:45:43 -06:00
Nathan Hjelm
284dd6babe pml/ob1: do not use OPAL_ENABLE_MULTI_THREADS to determine thread multiple support
OPAL_ENABLE_MULTI_THREADS is always on. The correct value to check is
OMPI_ENABLE_THREAD_MULTIPLE.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-22 19:17:23 -06:00
Andrew Friedley
2c9be59b37 Add new PSM2 MTL.
This new MTL runs over PSM2 for Omni Path.  PSM2 is a descendant of PSM
with changes to support more ranks and some MPI-3 features like mprobe.

PSM2 will only support Omni Path networks; PSM only supports True Scale.
Likewise, the existing PSM MTL will continue to be maintained for True
Scale, while the PSM2 MTL is developed and maintained for Omni Path.
2015-06-22 07:55:46 -07:00
rhc54
9a8bda0b72 Merge pull request #637 from jithinjosepkl/pr/pml-cm-opt
pml-cm bug fixes
2015-06-15 19:25:09 -07:00
Jithin Jose
7ccde09a09 Do opal_convertor_copy_and_prepare_for_send for buffered send mode as
MCA_PML_CM_HVY_SEND_REQUEST_BSEND_ALLOC calls opal_convertor_pack
directly.

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-15 17:12:50 -07:00
Gilles Gouaillardet
ee3a1da28a pml/ob1: mca_pml_ob1_recv_request_put_frag: silence a warning
The local variable proc is used only in heterogeneous mode
2015-06-15 10:00:53 +09:00
George Bosilca
67b70bb47a Add multi-threaded support. 2015-06-12 14:22:17 -07:00
George Bosilca
b2cf74cabc A first cut at a possible solution for the missing requests
from the message queues (a debugging feature). With this approach
all blocking (single threaded) requests are allocated from the main
freelist, so they will be accounted for during the message queues
investigation.
2015-06-12 14:22:17 -07:00
Jithin Jose
7cfbfc4c89 Initialize convertor in pml-cm-send and recv.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-10 09:39:31 -07:00
Jeff Squyres
347290f785 pml/Makefile.am: add missing file to $(headers) 2015-06-02 20:07:54 -07:00
Jithin Jose
5ba5a9ade2 Offset buffer by datatype true_lb to handle resized datatypes.
- Follow up patch for 56869bff38

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-27 13:51:05 -07:00
Jithin Jose
07043894bd Avoid extra lookup for ompi_proc in homogeneous build
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:42 -07:00
Jithin Jose
50089977ac Inline PML-CM
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jithin Jose
56869bff38 Avoid datatype pack/unpack for contiguous data on homogeneous systems.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Gilles Gouaillardet
e980958ad4 pml/ob1: silence a warning 2015-05-26 15:05:44 +09:00
Gilles Gouaillardet
85c45e2275 pml/ob1: fix mca_pml_ob1_recv_request_put_frag(...) in heterogeneous mode 2015-05-22 15:48:45 +09:00
Nathan Hjelm
ce48eabd84 pml/ob1: use c99 flexible array members instead of size 1 arrays
This commit updates several ob1 structures to take advantage of C99's
flexible array member.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 10:31:35 -06:00
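To illustrate the change described in ce48eabd84, here is a minimal sketch of the two idioms side by side. The struct and function names (`old_hdr`, `new_hdr`, `new_hdr_alloc`) are hypothetical and are not taken from the ob1 sources; only the C99 language feature itself is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Pre-C99 idiom: a one-element array stands in for a variable-length tail.
 * (Hypothetical struct, not an ob1 type.) */
struct old_hdr {
    size_t count;
    int    segs[1];
};

/* C99 flexible array member: the tail has no declared size and
 * contributes no elements to sizeof(struct new_hdr). */
struct new_hdr {
    size_t count;
    int    segs[];
};

/* Allocate a header with room for n trailing elements, zero-filled.
 * Returns NULL on allocation failure; the caller releases it with free(). */
static struct new_hdr *new_hdr_alloc(size_t n)
{
    struct new_hdr *h = malloc(sizeof(*h) + n * sizeof(h->segs[0]));
    if (h == NULL) {
        return NULL;
    }
    h->count = n;
    memset(h->segs, 0, n * sizeof(h->segs[0]));
    return h;
}
```

With the flexible array member, the allocation size is simply the header plus `n` elements, so the "minus one element" adjustment that the size-1 idiom often needs goes away.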
Ralph Castain
6e95bcd583 Fix typo in oob_tcp.c when IPV6 enabled. Clean up a few other warnings, including a typo in coll_sm that prevented that component from registering its MCA params! 2015-05-07 21:05:08 -07:00
Gilles Gouaillardet
9d56b85b55 initialize common symbols from ompi 2015-05-08 10:11:58 +09:00
Nathan Hjelm
033894b493 Merge pull request #541 from hjelmn/c99_components
C99 component initialization
2015-04-20 10:45:39 -06:00
Nathan Hjelm
d251fa1525 pml/ob1: fix heterogeneous build
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-20 09:27:00 -06:00
Nathan Hjelm
df75d0382f ompi: use C99 subobject naming for component initialization
This commit helps future-proof ompi components by initializing each
component member by name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-18 10:29:58 -06:00
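The "initialize each member by name" approach from df75d0382f can be sketched as follows. The `demo_component` struct and its members are hypothetical stand-ins, not the real MCA component type; the sketch only assumes C99 designated initializers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical component descriptor; NOT the real MCA component struct. */
struct demo_component {
    int         major_version;
    int         minor_version;
    const char *name;
    int       (*open)(void);
    int       (*close)(void);
};

static int demo_open(void) { return 0; }

/* Positional initialization: values are matched to members by order,
 * so adding or reordering a member silently shifts every value. */
static const struct demo_component positional = {
    2, 1, "demo", demo_open, NULL
};

/* Designated initialization: each member is named, order is free,
 * and members that are not named (here: close) are zero-initialized. */
static const struct demo_component designated = {
    .name          = "demo",
    .major_version = 2,
    .minor_version = 1,
    .open          = demo_open,
};
```

Both objects end up with the same contents, but only the designated form keeps working unchanged if a new member is later inserted into the struct.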
Nathan Hjelm
3436f2917d Merge pull request #449 from hjelmn/mca_base_update
mca/base update
2015-04-16 08:41:48 -06:00
Jithin Jose
c09582a3ff - CM blocking send/recv optimizations
This patch tries to do as little as possible in the PML CM blocking
send/receive routines.  Basically, avoid creating and filling in an
entire request object.  An OMPI-level request is still needed, but we
can create that on the stack instead of going to a free list.

Signed-off-by: Andrew Friedley <andrew.friedley@intel.com>
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-04-03 15:19:08 -07:00
Nathan Hjelm
b68d66bb9b MCA: Add the project/project version to the MCA base component
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.

All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:04 -06:00
adrianreber
714d9aa67e Merge pull request #348 from adrianreber/topic/orte_cr_continue_like_restart
Topic/orte cr continue like restart
2015-03-12 14:54:02 +01:00
Alina Sklarevich
28586caecf MTL_MXM/PML_YALLA: fix coverity issues. 2015-03-12 11:49:22 +02:00
Nathan Hjelm
ce6caab2a7 Merge pull request #463 from hjelmn/cuda_async
btl/openib: cuda: fix CUDA-aware support with async copy
2015-03-11 09:52:48 -06:00
Adrian Reber
c08e234af7 FT: fix compilation using --with-ft (5/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

With the changes introduced in the previous patches in this series
some goto constructs for cleanup are no longer necessary and have been removed.
2015-03-11 14:23:33 +01:00
Adrian Reber
1c5a8df724 FT: fix compilation using --with-ft (2/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

The FT code used barrier mechanisms which have been removed
with aec5cd08bd. This patch replaces all those different barriers
with opal_pmix.fence(NULL, 0). I am not sure this is completely
correct, but it is at least a starting point for a review.
2015-03-11 14:23:33 +01:00
Adrian Reber
f45dd069bd FT: fix compilation using --with-ft (1/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

This first patch moves orte_cr_continue_like_restart from ORTE
to opal_cr_continue_like_restart in OPAL. This only leaves three
calls from OPAL to ORTE in the FT code. As it is not yet 100%
clear how to handle these calls, the orte_sstore.set_attr() code
has been #ifdef'd out for now.
2015-03-11 14:23:33 +01:00
Alina Sklarevich
f9a9b936a1 PML_YALLA: fix compilation warnings. 2015-03-11 10:58:54 +02:00
Nathan Hjelm
3d32dbd793 btl/openib: cuda: fix CUDA-aware support with async copy
This commit should resolve an issue seen with CUDA-aware support. The
problem came in with BTL 3.0. Before 3.0 the size of the copy was
stored in the incoming segment's des_remote_count field. This field
does not exist in BTL 3.0 so I stored the value in the
des_segment_count field. This caused problems with the cuda support
code. To fix the issue, the endpoint pointer is now stored in the
fragment's endpoint pointer, which frees up the segment's des_cbdata
pointer for storing the transfer size.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-10 14:38:12 -06:00
Mike Dubman
6f91a007e1 Merge pull request #458 from yosefe/topic/pml-yalla-fix-segv
keep mxm context alive as long as pml_yalla component is open.
2015-03-10 13:38:14 +02:00
yosefe
976144dca7 keep mxm context alive as long as pml_yalla component is open.
pml_yalla_del_comm may be called after the yalla module is finalized, which
leads to invalid memory access if the mxm context is already destroyed at
this point.
2015-03-10 11:52:44 +02:00
George Bosilca
420ae98dfe Remove all unnecessary whitespace and make sure we close the module
correctly.
2015-03-05 13:00:13 -05:00
Alex Mikheev
168c83ed95 OMPI/MXM: add out of band barrier at the end of del_procs
mxm shutdown requires out of band barrier
2015-03-02 12:56:02 +02:00
Rolf vandeVaart
30e9dd5066 Look in extra rdma array to find bml. This is needed with recent BML changes. Only affects CUDA-aware code. 2015-02-27 09:02:21 -05:00
George Bosilca
3fd8dc099d Revert "This function is now useless."
This reverts commit 0871c5c489.
2015-02-26 17:54:46 -05:00
George Bosilca
7f90cedf23 Revert "Fix the logic for computing the different weights for each BTLs. This"
This reverts commit de118609ec.
2015-02-26 17:54:31 -05:00
George Bosilca
d4c2fc9d41 Merge branch 'master' of github.com:open-mpi/ompi 2015-02-25 12:01:57 -05:00
Mike Dubman
a0afb7d96e Merge pull request #424 from miked-mellanox/topic/master_fix_yalla
fixes issue #414
2015-02-25 19:01:47 +02:00
George Bosilca
f3b58006c8 Merge branch 'master' of github.com:open-mpi/ompi 2015-02-25 12:01:35 -05:00
Jeff Squyres
c3381150de ob1: fix another PERUSE compile error 2015-02-25 05:53:12 -08:00
yosefe
0332ab4d8b Initialize pml_yalla bsend request status. 2015-02-25 15:33:26 +02:00
Nathan Hjelm
0ac2f08460 pml/ob1: fix peruse compile error
Fixes #416
2015-02-24 15:39:46 -07:00
Nathan Hjelm
5ef24000c7 pml/yalla: fix typo in PML_YALLA_FREELIST_INIT 2015-02-24 10:08:54 -07:00