Andrew Friedley
2c9be59b37
Add new PSM2 MTL.
...
This new MTL runs over PSM2 for Omni Path. PSM2 is a descendant of PSM
with changes to support more ranks and some MPI-3 features like mprobe.
PSM2 will only support Omni Path networks; PSM only supports True Scale.
Likewise, the existing PSM MTL will continue to be maintained for True
Scale, while the PSM2 MTL is developed and maintained for Omni Path.
2015-06-22 07:55:46 -07:00
Gilles Gouaillardet
0bd765eddd
fix NBC_Copy for legitimate zero size messages
...
this fixes a regression from open-mpi/ompi@9a70765f27
2015-06-22 09:51:25 +09:00
Edgar Gabriel
dedeee9771
finishing the changes for the non-blocking and split collective I/O operations. Everything except for the
...
interface changes to the io framework is done.
2015-06-18 06:22:41 -05:00
Edgar Gabriel
3b11a8b61c
making the current work compile.
2015-06-18 05:56:51 -05:00
Edgar Gabriel
cc219281ba
checkpoint of the current work, since I need to resync with master to fix the compilation problems
2015-06-18 05:20:07 -05:00
Edgar Gabriel
100515e321
remove split collective interfaces from fcoll and their fake implementations. Not required anymore
2015-06-18 05:20:07 -05:00
Edgar Gabriel
19cac73a9b
first part of the changes required to support non-blocking collective io operations
2015-06-18 05:20:07 -05:00
Gilles Gouaillardet
0f08070a1c
ompio: fix misc memory leaks
...
as identified by Coverity with CIDs 72147-72149, 731275 and 1269872
2015-06-17 11:17:54 +09:00
Gilles Gouaillardet
0f17cdfc57
fcoll: fix misc memory leaks
...
as reported by Coverity with CIDs 72293,72294 and 1269894
2015-06-17 11:17:52 +09:00
Nathan Hjelm
c33b786dd9
Merge pull request #620 from hjelmn/ompi_coverity
...
ompi coverity fixes
2015-06-16 06:10:40 -06:00
rhc54
9a8bda0b72
Merge pull request #637 from jithinjosepkl/pr/pml-cm-opt
...
pml-cm bug fixes
2015-06-15 19:25:09 -07:00
Jithin Jose
7ccde09a09
Do opal_convertor_copy_and_prepare_for_send for buffered send mode as
...
MCA_PML_CM_HVY_SEND_REQUEST_BSEND_ALLOC calls opal_convertor_pack
directly.
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-15 17:12:50 -07:00
Jeff Squyres
cc66745e7a
mtl/ofi: convert to use external libfabric
...
Use the new OPAL_CHECK_LIBFABRIC macro.
2015-06-15 15:17:06 -07:00
Gilles Gouaillardet
ee3a1da28a
pml/ob1:mca_pml_ob1_recv_request_put_frag silence a warning
...
proc local variable is used only in heterogeneous mode
2015-06-15 10:00:53 +09:00
George Bosilca
67b70bb47a
Add multi-threaded support.
2015-06-12 14:22:17 -07:00
George Bosilca
b2cf74cabc
A first cut at a possible solution for the missing requests
...
from the message queues (a debugging feature). With this approach
all blocking (single threaded) requests are allocated from the main
freelist, so they will be accounted for during the message queues
investigation.
2015-06-12 14:22:17 -07:00
Ryan Grant
eec120678c
Merge pull request #614 from tkordenbrock/topic/portals4.triggered.collectives
...
coll-portals4: implement collective operations using Portals4 triggered operations
2015-06-11 08:20:55 -06:00
Jithin Jose
7cfbfc4c89
Initialize convertor in pml-cm-send and recv.
...
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-06-10 09:39:31 -07:00
Todd Kordenbrock
b725186768
mtl-portals4: Verify the result of PtlPTAlloc()
...
The Portals4 MTL allocates two Portals IDs requesting specific
well-known IDs and assumes that those IDs are allocated. If those IDs
are in use, PtlPTAlloc() will allocate a different ID. This commit
verifies that the requested IDs were allocated.
2015-06-09 14:43:50 -05:00
Jeff Squyres
347290f785
pml/Makefile.am: add missing file to $(headers)
2015-06-02 20:07:54 -07:00
Jeff Squyres
a55eb5e2c6
Merge pull request #602 from jithinjosepkl/pr/pml-cm-opt
...
Optimizations to PML-CM
2015-06-02 13:47:10 -05:00
Todd Kordenbrock
a274d2795c
coll-portals4: implement collective operations using Portals4 triggered operations
...
This commit implements the reduce, allreduce, barrier and bcast
collective operations using Portals4 triggered operations.
2015-06-02 11:41:19 -05:00
Nathan Hjelm
472e5635c7
topo/base: fix coverity issue
...
CID 1295340 Unchecked return value (CHECKED_RETURN)
Check the return code of mca_base_framework_open. If the call fails for some reason
the component array will not be properly defined. This will cause issues in
mca_topo_base_find_available.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-02 08:59:15 -06:00
Edgar Gabriel
aa72e5b2ca
fix the selection logic to not overwrite on the new aggregator side the list of
...
aggregators determined by the algorithm.
2015-05-27 22:35:45 -05:00
Jithin Jose
5ba5a9ade2
Offset buffer by datatype true_lb to handle resized datatypes.
...
- Follow up patch for 56869bff38
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-27 13:51:05 -07:00
Jithin Jose
c745854d9b
Avoid opal_convertor_pack for contiguous data types in MXM mtl
...
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-27 11:09:25 -07:00
Jithin Jose
07043894bd
Avoid extra lookup for ompi_proc in homogeneous build
...
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:42 -07:00
Jithin Jose
50089977ac
Inline PML-CM
...
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jithin Jose
56869bff38
Avoid datatype pack/unpack for contiguous data on homogeneous systems.
...
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-05-26 21:42:41 -07:00
Jeff Squyres
fb12572438
OFI: make v1.10 and v2.0 fit in version checking scheme
...
v1.10 is now in the same compatibility level as v1.7/v1.8 (there
is/will be no v1.9 series). v2.0 now takes over for what used to be
called v1.9.
2015-05-26 18:58:28 -07:00
Nathan Hjelm
d21bd24126
ompi/crcp: fix logic issue after component selection
...
CID 70630 Dereference before null check
Cleaned up useless goto statements and deleted NULL check. If
mca_base_select returns success then best_module and best_component
will always be non-NULL.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-26 11:48:40 -06:00
Gilles Gouaillardet
e980958ad4
pml/ob1: silence a warning
2015-05-26 15:05:44 +09:00
Gilles Gouaillardet
85c45e2275
pml/ob1: fix mca_pml_ob1_recv_request_put_frag(...) in heterogeneous mode
2015-05-22 15:48:45 +09:00
Nathan Hjelm
ce48eabd84
pml/ob1: use c99 flexible array members instead of size 1 arrays
...
This commit updates several ob1 structures to take advantage of C99's
flexible array member.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 10:31:35 -06:00
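As an illustration of the flexible array member change described above, here is a minimal sketch. The names `example_hdr_t` and `example_hdr_alloc` are hypothetical and are not taken from the ob1 sources; the sketch only shows the general C99 idiom, assuming a header followed by a variable number of trailing entries.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header type for illustration; not an actual ob1 structure. */
typedef struct example_hdr {
    size_t seg_count;      /* number of entries stored in segs[] */
    unsigned char segs[];  /* C99 flexible array member (previously segs[1]) */
} example_hdr_t;

/* Allocate a header with room for `count` trailing entries, zero-filled.
 * Returns NULL if the allocation fails. */
static example_hdr_t *example_hdr_alloc(size_t count)
{
    /* sizeof(*hdr) does not include segs[], so the "- 1" correction that
     * a size-1 array requires is no longer needed here. */
    example_hdr_t *hdr = malloc(sizeof(*hdr) + count * sizeof(hdr->segs[0]));
    if (NULL == hdr) {
        return NULL;
    }
    hdr->seg_count = count;
    memset(hdr->segs, 0, count * sizeof(hdr->segs[0]));
    return hdr;
}
```

With a size-1 array the allocation size would have to subtract one element to avoid over-allocating; the flexible array member removes that bookkeeping.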
Gilles Gouaillardet
b6c67e051d
io/ompio: fix misc memory leaks
...
as reported by Coverity with CIDs 72147-72149,72187,72188,731274,731275,741356,
1269889,1269893,1271535 and 1269872
2015-05-20 17:19:39 +09:00
Gilles Gouaillardet
1488e82efd
osc/pt2pt: enable heterogeneous support
2015-05-14 16:42:48 +09:00
Todd Kordenbrock
c42e277385
mtl-portals4: thread multiple updates
...
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.
Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
2015-05-13 17:06:18 -05:00
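The `#ifdef` versus `#if` point in the commit above can be sketched as follows. `EXAMPLE_ENABLE_FEATURE` is a hypothetical stand-in for a configure-generated boolean, not an actual Open MPI macro; the assumption, taken from the commit message, is that such macros are always defined and carry the value 0 or 1.

```c
/* Hypothetical stand-in for a configure-generated boolean. Following the
 * commit message, such macros are always defined, to either 0 or 1. */
#define EXAMPLE_ENABLE_FEATURE 0

/* Returns 1: the macro is defined, even though its value is 0.
 * This is the mistaken test. */
static int feature_seen_by_ifdef(void)
{
#ifdef EXAMPLE_ENABLE_FEATURE
    return 1;
#else
    return 0;
#endif
}

/* Returns 0: the macro's value is checked, which is the intended test. */
static int feature_seen_by_if(void)
{
#if EXAMPLE_ENABLE_FEATURE
    return 1;
#else
    return 0;
#endif
}
```

With the macro set to 0, the `#ifdef` variant still compiles the enabled branch, which is why the value has to be tested with `#if`.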
Yohann Burette
27f1884cf8
mtl/ofi: Reworked header files. Added compat to ease maintenance.
2015-05-12 15:47:50 -07:00
rhc54
b59fa14004
Merge pull request #583 from rhc54/topic/mallocwarnings
...
Silence malloc(0) warnings reported by Lisandro
2015-05-12 13:37:38 -07:00
Ralph Castain
9a70765f27
Silence malloc(0) warnings reported by Lisandro
2015-05-12 12:38:58 -07:00
Ryan Grant
bbeaf41a52
Merge pull request #580 from tkordenbrock/topic/mtl.add.status.to.short.recv.blocks
...
mtl-portals4: add status to short recv blocks to coordinate out of or…
2015-05-11 13:44:45 -06:00
Ryan Grant
265682bdb9
Merge pull request #581 from tkordenbrock/topic/remove.overlapping.multiMD.code
...
portals4: use a single Memory Descriptor to cover all of memory
2015-05-11 13:20:32 -06:00
George Bosilca
78f5f0f8a9
Show the name of the collective that failed to get initialized.
2015-05-11 15:10:37 -04:00
Todd Kordenbrock
9df163f116
portals4: use a single Memory Descriptor to cover all of memory
...
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created. Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
2015-05-11 11:49:41 -05:00
Todd Kordenbrock
074583060d
mtl-portals4: add status to short recv blocks to coordinate out of order events
...
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK). This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
2015-05-11 11:48:25 -05:00
Gilles Gouaillardet
650289bc33
romio314: update one more romio->romio314 name
...
Also missed this in open-mpi/ompi@db257cdbc0 .
2015-05-08 18:26:33 +09:00
Ralph Castain
6e95bcd583
Fix typo in oob_tcp.c when IPv6 enabled. Cleanup a few other warnings, including a typo in coll_sm that prevented that component from registering its MCA params!
2015-05-07 21:05:08 -07:00
Gilles Gouaillardet
9d56b85b55
initialize common symbols from ompi
2015-05-08 10:11:58 +09:00
Gilles Gouaillardet
ab148e4e0c
romio314: update one more romio->romio314 name
...
Also missed this in open-mpi/ompi@db257cdbc0 .
2015-05-08 09:12:22 +09:00
Jeff Squyres
b3d89cf7b0
romio314: update one more romio->romio314 name
...
Missed this in db257cdbc0.
2015-05-07 09:40:45 -07:00