1
1
Граф коммитов

1154 Коммитов

Автор SHA1 Сообщение Дата
yosefe
cc76db8d39 ucx: reduce components priority to 5. 2015-10-21 17:38:25 +03:00
Mike Dubman
4ea13f10f6 Merge pull request #1008 from alex-mikheev/topic/ucx_support
UCX support for ompi and oshmem
2015-10-21 09:33:33 +03:00
yosefe
a313588337 ompi: Add UCX PML. 2015-10-20 19:46:06 +03:00
Nathan Hjelm
53f6b57c0a pml/cm: use the priority of the mtl component
This commit changes the priority of mtl components to be relative to
pml/ob1 and updates the mtl interface to expose this priority. cm now
sets its own priority based on the priority of the selected mtl
component.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-19 12:32:42 -06:00
Nathan Hjelm
bedd80214e pml/ob1: remove priority check
This commit removes code that checks the ob1 priority vs the previous
priority. The previous priority is meaningless here and may only cause
ob1 to disable itself when it shouldn't.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-19 12:32:41 -06:00
Nathan Hjelm
2fd176ac7f cm: fix selection priority
This patch removes a priority check that disables cm if the previous
pml had higher priority. The check was incorrect as coded and is
unnecessary as we finalize all but one pml anyway.

Fixes open-mpi/ompi#1035

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-19 12:32:26 -06:00
Nathan Hjelm
341b60dd57 Merge pull request #1029 from kawashima-fj/pr/ob1-fin-memory-leak
pml/ob1: Fix a memory leak regarding pending FIN control messages.
2015-10-15 07:55:52 -06:00
KAWASHIMA Takahiro
4e56505202 pml/ob1: Fix a memory leak regarding pending FIN control messages.
Once a FIN control message is appended to the pending list,
the ob1 PML attempts to send the FIN again in the                               `mca_pml_ob1_process_pending_packets` function.
But if the PML failed to sent the FIN again, the `mca_pml_ob1_send_fin`
function creates a new `mca_pml_ob1_pckt_pending_t` object and the
old object is not retured to the free list.
2015-10-15 11:15:03 +09:00
Jeff Squyres
889d80a659 mxm/yalla: disable MPI dynamic process functionality
Disable the MPI dynamic process functionality when these components
are selected to be used.
2015-10-14 13:42:56 -07:00
Nathan Hjelm
12bd300c40 Merge pull request #929 from hjelmn/add_procs
Update add_procs support
2015-09-28 17:29:13 -06:00
Nathan Hjelm
6611c000c9 Fix coverity warnings
Fix CID 1315271: Constant expression result

The intent of this conditional is to not produce a peruse event for
probe or mprobe requests. Coverity is correct that the expression is
always true. Changed the || to && to fix. Also moved the conditional
within an OMPI_WANT_PERUSE to ensure the conditional is not evaluated
if peruse is disabled.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-28 15:35:25 -06:00
George Bosilca
01d8e23ccc Fix the random errors related to the recursive sends and receives
identified by Fujitsu.
2015-09-26 00:44:51 +02:00
Nathan Hjelm
54a4061d88 Add support for detecting when dynamic add_procs is not possible
This commit adds support to the pml, mtl, and btl frameworks for
components to indicate at runtime that they do not support the new
dynamic add_procs behavior. At the high end the lack of dynamic
add_procs support is signalled by the pml using the new pml_flags
member to the pml module structure. If the
MCA_PML_BASE_FLAG_REQUIRE_WORLD flag is set MPI_Init will generate the
ompi_proc_t array passed to add_proc from ompi_proc_world () instead
of ompi_proc_get_allocated ().

Both cm and ob1 have been updated to detect if the underlying mtl and
btl components support dynamic add_procs.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-23 16:22:05 -06:00
Gilles Gouaillardet
a611274704 pml: fix commit open-mpi/ompi@6e6a3e965c
do not use the const modifier for allocator nor recv buffers
2015-09-18 09:54:18 +09:00
Nathan Hjelm
b4a0d40915 pml/ob1: Add support for dynamically calling add_procs
This commit contains the following changes:

 - pml/ob1: use the bml accessor function when requesting a bml
   endpoint. this will ensure that bml endpoints are only created when
   needed. for example, a bml endpoint is not requested and not
   allocated when receiving an eager message from a peer.

 - pml/ob1: change the pml_procs array in the ob1 communicator to a
   proc pointer array. at the cost of a single level of extra
   redirection this will allow us to allocate pml procs on demand.

 - pml/ob1: add an accessor function to access the pml proc structure
   for a given peer. this function will allocate the proc if it
   doesn't already exist.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:54 -06:00
Gilles Gouaillardet
6e6a3e965c pml: do not cast way the const modifier when this is not necessary
update the pml framework and mpi c bindings
2015-09-09 09:18:57 +09:00
Ralph Castain
cf6137b530 Integrate PMIx 1.0 with OMPI.
Bring Slurm PMI-1 component online
Bring the s2 component online

Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways.

Bring the OMPI pubsub/pmi component online

Get comm_spawn working again

Ensure we always provide a cpuset, even if it is NULL

pmix/cray: adjust cray pmix component for pmix

Make changes so cray pmix can work within the integrated
ompi/pmix framework.

Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet

Cleanup comm_spawn - procs now starting, error in connect_accept

Complete integration
2015-08-29 16:04:10 -07:00
yosefe
85580ad055 yalla: fix passing on-demand mapping config to mxm. 2015-08-18 15:00:59 +03:00
Gilles Gouaillardet
6b2fe9120e yalla: fix Makefile.am LDFLAGS 2015-08-13 17:33:52 +09:00
Jithin Jose
bc4e8b7e73 Fix warnings in direct (pml-cm,mtl-ofi) build
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-07-29 15:49:37 -07:00
yosefe
103cac5bd9 yalla: fix mxm configuration parsing.
Take configuration from MXM_MPI_xx instead of MXM_PML_xx, same as mtl
mxm.
2015-07-08 19:18:23 +03:00
Rolf vandeVaart
30a872b478 Add the ability to send host buffers through one sized staging buffers and CUDA buffers through different sized buffers. Fixes performance issues 2015-07-02 11:11:15 -04:00
Nathan Hjelm
ee36d813dc Merge pull request #657 from hjelmn/c99
more c99 updates
2015-06-25 11:21:09 -06:00
Nathan Hjelm
4d92c9989e more c99 updates
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-25 10:14:13 -06:00
Howard Pritchard
e49a37c034 ownership: update ownership files
per discussions at OMPI devel workshop

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-06-25 10:04:42 -06:00
George Bosilca
dc1b125b12 There is no destructor for the base requests. 2015-06-24 14:29:45 -07:00
bosilca
1b8556f926 Merge pull request #653 from hjelmn/moar_ob1_fixes
pml/ob1: fix bugs in static request objects
2015-06-24 14:28:11 -07:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Nathan Hjelm
9a8a87611e pml/ob1: fix bugs in static request objects
This commit fixes several bugs in the static request objects used by
ob1 for blocking send/receive operations.

 - Fix memory leak when using MPI_THREAD_MULTIPLE. Requests were
   allocated off the free list but were destructed and NOT returned.

 - Fix double-destruct of static objects. There is no reason to
   CONSTRUCT/DESTUCT the static object for each send/receive
   operation. This adds overhead and no benefit. To keep the code
   clean helper functions have been added to finalize ob1 send/receive
   requests.

 - Remove now unnecessary include of alloca.h.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-23 11:00:45 -06:00
Nathan Hjelm
ac51acb3e1 Merge pull request #651 from hjelmn/fix_thread_multiple_check
pml/ob1: do not use OPAL_ENABLE_MULTI_THREADS to determine thread multiple support
2015-06-22 21:45:43 -06:00
Nathan Hjelm
284dd6babe pml/ob1: do not use OPAL_ENABLE_MULTI_THREADS to determine thread multiple support
OPAL_ENABLE_MULTI_THREADS is always on. The correct value to check is
OMPI_ENABLE_THREAD_MULTIPLE.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-22 19:17:23 -06:00
Andrew Friedley
2c9be59b37 Add new PSM2 MTL.
This new MTL runs over PSM2 for Omni Path.  PSM2 is a descendant of PSM
with changes to support more ranks and some MPI-3 features like mprobe.

PSM2 will only support Omni Path networks; PSM only supports True Scale.
Likewise, the existing PSM MTL will continue to be maintained for True
Scale, while the PSM2 MTL is developed and maintained for Omni Path.
2015-06-22 07:55:46 -07:00
rhc54
9a8bda0b72 Merge pull request #637 from jithinjosepkl/pr/pml-cm-opt
pml-cm bug fixes
2015-06-15 19:25:09 -07:00
Jithin Jose
7ccde09a09 Do opal_convertor_copy_and_prepare_for_send for buffered send mode as
MCA_PML_CM_HVY_SEND_REQUEST_BSEND_ALLOC calls opal_convertor_pack
directly.

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-15 17:12:50 -07:00
Gilles Gouaillardet
ee3a1da28a pml/ob1:mca_pml_ob1_recv_request_put_frag silence a warning
proc local variable is used only in heterogeneous mode
2015-06-15 10:00:53 +09:00
George Bosilca
67b70bb47a Add multi-threaded support. 2015-06-12 14:22:17 -07:00
George Bosilca
b2cf74cabc A first cut at a possible solution for the missing requests
from the message queues (a debugging feature). With this approach
all blocking (single threaded) requests are allocated from the main
freelist, so they will be accounted for during the message queues
investigation).
2015-06-12 14:22:17 -07:00
Jithin Jose
7cfbfc4c89 Initialize convertor in pml-cm-send and recv.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-10 09:39:31 -07:00
Jeff Squyres
347290f785 pml/Makefile.am: add missing file to $(headers) 2015-06-02 20:07:54 -07:00
Jithin Jose
5ba5a9ade2 Offset buffer by datatype true_lb to handle resized datatypes.
- Follow up patch for 56869bff38

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-27 13:51:05 -07:00
Jithin Jose
07043894bd Avoid extra lookup for ompi_proc in homogenous build
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:42 -07:00
Jithin Jose
50089977ac Inline PML-CM
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jithin Jose
56869bff38 Avoid datatype pack/unpack for contiguous data on homogenous systems.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Gilles Gouaillardet
e980958ad4 pml/ob1: silence a warning 2015-05-26 15:05:44 +09:00
Gilles Gouaillardet
85c45e2275 pml/ob1: fix mca_pml_ob1_recv_request_put_frag(...) in heterogeneous mode 2015-05-22 15:48:45 +09:00
Nathan Hjelm
ce48eabd84 pml/ob1: use c99 flexible array members instead of size 1 arrays
This commit updates several ob1 structures to take advantage of C99's
flexible array member.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 10:31:35 -06:00
Ralph Castain
6e95bcd583 Fix typo in oob_tcp.c when IPV6 enabled. Cleanup a few other warnings, including a type in coll_sm that prevented that component from registering its MCA params! 2015-05-07 21:05:08 -07:00
Gilles Gouaillardet
9d56b85b55 initialize common symbols from ompi 2015-05-08 10:11:58 +09:00
Nathan Hjelm
033894b493 Merge pull request #541 from hjelmn/c99_components
C99 component initialization
2015-04-20 10:45:39 -06:00
Nathan Hjelm
d251fa1525 pml/ob1: fix heterogenous build
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-20 09:27:00 -06:00