1
1
Граф коммитов

5445 Коммитов

Автор SHA1 Сообщение Дата
Andrew Friedley
2c9be59b37 Add new PSM2 MTL.
This new MTL runs over PSM2 for Omni Path.  PSM2 is a descendant of PSM
with changes to support more ranks and some MPI-3 features like mprobe.

PSM2 will only support Omni Path networks; PSM only supports True Scale.
Likewise, the existing PSM MTL will continue to be maintained for True
Scale, while the PSM2 MTL is developed and maintained for Omni Path.
2015-06-22 07:55:46 -07:00
Gilles Gouaillardet
0bd765eddd fix NBC_Copy for legitimate zero size messages
this fixes a regression from open-mpi/ompi@9a70765f27
2015-06-22 09:51:25 +09:00
Edgar Gabriel
dedeee9771 finishing the changes for the non-blocking and split cpllective I/O operations. Everything except for the
interface changes to the io framework is done.
2015-06-18 06:22:41 -05:00
Edgar Gabriel
3b11a8b61c making the current work compile. 2015-06-18 05:56:51 -05:00
Edgar Gabriel
cc219281ba checkpoint of the current work, since I need to resync wioth master to fix the compilation problems 2015-06-18 05:20:07 -05:00
Edgar Gabriel
100515e321 remove split collective interfaces from fcoll and their fake implemenations. Not required anymore 2015-06-18 05:20:07 -05:00
Edgar Gabriel
19cac73a9b first part of the changes trequired to support non-blocking colelctive io operations 2015-06-18 05:20:07 -05:00
Gilles Gouaillardet
0f08070a1c ompio: fix misc memory leaks
as identified by Coverity with CIDs 72147-72149, 731275 and 1269872
2015-06-17 11:17:54 +09:00
Gilles Gouaillardet
0f17cdfc57 fcoll: fix misc memory leaks
as reported by Coverity with CIDs 72293,72294 and 1269894
2015-06-17 11:17:52 +09:00
Nathan Hjelm
c33b786dd9 Merge pull request #620 from hjelmn/ompi_coverity
ompi coverity fixes
2015-06-16 06:10:40 -06:00
rhc54
9a8bda0b72 Merge pull request #637 from jithinjosepkl/pr/pml-cm-opt
pml-cm bug fixes
2015-06-15 19:25:09 -07:00
Jithin Jose
7ccde09a09 Do opal_convertor_copy_and_prepare_for_send for buffered send mode as
MCA_PML_CM_HVY_SEND_REQUEST_BSEND_ALLOC calls opal_convertor_pack
directly.

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-15 17:12:50 -07:00
Jeff Squyres
cc66745e7a mtl/ofi: convert to use external libfabric
Use the new OPAL_CHECK_LIBFABRIC macro.
2015-06-15 15:17:06 -07:00
Gilles Gouaillardet
ee3a1da28a pml/ob1:mca_pml_ob1_recv_request_put_frag silence a warning
proc local variable is used only in heterogeneous mode
2015-06-15 10:00:53 +09:00
George Bosilca
67b70bb47a Add multi-threaded support. 2015-06-12 14:22:17 -07:00
George Bosilca
b2cf74cabc A first cut at a possible solution for the missing requests
from the message queues (a debugging feature). With this approach
all blocking (single threaded) requests are allocated from the main
freelist, so they will be accounted for during the message queues
investigation).
2015-06-12 14:22:17 -07:00
Ryan Grant
eec120678c Merge pull request #614 from tkordenbrock/topic/portals4.triggered.collectives
coll-portals4: implement collective operations using Portals4 triggered operations
2015-06-11 08:20:55 -06:00
Jithin Jose
7cfbfc4c89 Initialize convertor in pml-cm-send and recv.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-10 09:39:31 -07:00
Todd Kordenbrock
b725186768 mtl-portals4: Verify the result of PtlPTAlloc()
The Portals4 MTL allocates two Portals IDs requesting specific
well-known IDs and assumes that those IDs are allocated.  If those IDs
are in use, PtlPTAlloc() will allocate a different ID.  This commit
verifies that the requested IDs were allocated.
2015-06-09 14:43:50 -05:00
Jeff Squyres
347290f785 pml/Makefile.am: add missing file to $(headers) 2015-06-02 20:07:54 -07:00
Jeff Squyres
a55eb5e2c6 Merge pull request #602 from jithinjosepkl/pr/pml-cm-opt
Optimizations to PML-CM
2015-06-02 13:47:10 -05:00
Todd Kordenbrock
a274d2795c coll-portals4: implement collective operations using Portals4 triggered operations
This commit implements the reduce, allreduce, barrier and bcast
collective operations using Portals4 triggered operations.
2015-06-02 11:41:19 -05:00
Nathan Hjelm
472e5635c7 topo/base: fix coverity issue
CID 1295340 Unchecked return value (CHECKED_RETURN)

Check the return code of mca_base_framework_open. If the call fails for some reason
the component array will not be properly defined. This will cause issues in
mca_topo_base_find_available.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-02 08:59:15 -06:00
Edgar Gabriel
aa72e5b2ca fix the selection logic to not overwrite on the new aggregator side the list of
aggregators determined by the algorithm.
2015-05-27 22:35:45 -05:00
Jithin Jose
5ba5a9ade2 Offset buffer by datatype true_lb to handle resized datatypes.
- Follow up patch for 56869bff38

Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-27 13:51:05 -07:00
Jithin Jose
c745854d9b Avoid opal_convertor_pack for contigous data types in MXM mtl
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-27 11:09:25 -07:00
Jithin Jose
07043894bd Avoid extra lookup for ompi_proc in homogenous build
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:42 -07:00
Jithin Jose
50089977ac Inline PML-CM
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jithin Jose
56869bff38 Avoid datatype pack/unpack for contiguous data on homogenous systems.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jeff Squyres
fb12572438 OFI: make v1.10 and v2.0 fit in version checking scheme
v1.10 is now in the same compatibility level as v1.7/v1.8 (there
is/will be no v1.9 series).  v2.0 now takes over for what used to be
called v1.9.
2015-05-26 18:58:28 -07:00
Nathan Hjelm
d21bd24126 ompi/crcp: fix logic issue after component selection
CID 70630 Dereference before null check

Cleaned up useless goto statements and deleted NULL check. If
mca_base_select returns success than best_module and best_component
will always be non-NULL.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-26 11:48:40 -06:00
Gilles Gouaillardet
e980958ad4 pml/ob1: silence a warning 2015-05-26 15:05:44 +09:00
Gilles Gouaillardet
85c45e2275 pml/ob1: fix mca_pml_ob1_recv_request_put_frag(...) in heterogeneous mode 2015-05-22 15:48:45 +09:00
Nathan Hjelm
ce48eabd84 pml/ob1: use c99 flexible array members instead of size 1 arrays
This commit updates several ob1 structures to take advantage of C99's
flexible array member.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 10:31:35 -06:00
Gilles Gouaillardet
b6c67e051d io/ompio: fix misc memory leaks
as reported by Coverity with CIDs 72147-72149,72187,72188,731274,731275,741356,
1269889,1269893,1271535 and 1269872
2015-05-20 17:19:39 +09:00
Gilles Gouaillardet
1488e82efd osc/pt2pt: enable heterogeneous support 2015-05-14 16:42:48 +09:00
Todd Kordenbrock
c42e277385 mtl-portals4: thread multiple updates
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.

Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
2015-05-13 17:06:18 -05:00
Yohann Burette
27f1884cf8 mtl/ofi: Reworked header files. Added compat to ease maintenance. 2015-05-12 15:47:50 -07:00
rhc54
b59fa14004 Merge pull request #583 from rhc54/topic/mallocwarnings
Silence malloc(0) warnings reported by Lisandro
2015-05-12 13:37:38 -07:00
Ralph Castain
9a70765f27 Silence malloc(0) warnings reported by Lisandro 2015-05-12 12:38:58 -07:00
Ryan Grant
bbeaf41a52 Merge pull request #580 from tkordenbrock/topic/mtl.add.status.to.short.recv.blocks
mtl-portals4: add status to short recv blocks to coordinate out of or…
2015-05-11 13:44:45 -06:00
Ryan Grant
265682bdb9 Merge pull request #581 from tkordenbrock/topic/remove.overlapping.multiMD.code
portals4: use a single Memory Descriptor to cover all of memory
2015-05-11 13:20:32 -06:00
George Bosilca
78f5f0f8a9 Show the name of the collective that failed to get initialized. 2015-05-11 15:10:37 -04:00
Todd Kordenbrock
9df163f116 portals4: use a single Memory Descriptor to cover all of memory
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created.  Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
2015-05-11 11:49:41 -05:00
Todd Kordenbrock
074583060d mtl-portals4: add status to short recv blocks to coordinate out of order events
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK).  This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
2015-05-11 11:48:25 -05:00
Gilles Gouaillardet
650289bc33 romio314: update one more romio->romio314 name
Also missed this in open-mpi/ompi@db257cdbc0.
2015-05-08 18:26:33 +09:00
Ralph Castain
6e95bcd583 Fix typo in oob_tcp.c when IPV6 enabled. Cleanup a few other warnings, including a type in coll_sm that prevented that component from registering its MCA params! 2015-05-07 21:05:08 -07:00
Gilles Gouaillardet
9d56b85b55 initialize common symbols from ompi 2015-05-08 10:11:58 +09:00
Gilles Gouaillardet
ab148e4e0c romio314: update one more romio->romio314 name
Also missed this in open-mpi/ompi@db257cdbc0.
2015-05-08 09:12:22 +09:00
Jeff Squyres
b3d89cf7b0 romio314: update one more romio->romio314 name
Missed this in db257cdbc0.
2015-05-07 09:40:45 -07:00