Commit graph

1154 commits

Author SHA1 Message Date
Nathan Hjelm
df75d0382f ompi: use C99 subobject naming for component initialization
This commit helps future-proof ompi components by initializing each
component member by name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-18 10:29:58 -06:00
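Note on df75d0382f: initializing "each component member by name" refers to C99 designated initializers. A minimal sketch of the idea follows, assuming a hypothetical `example_component_t` struct; the real MCA component structures have different and many more members.

```c
#include <stddef.h>

/* Hypothetical component descriptor, for illustration only. */
typedef struct example_component {
    const char *name;
    int major_version;
    int minor_version;
    int (*open)(void);
    int (*close)(void);
} example_component_t;

static int example_open(void) { return 0; }

/* Positional initializers break silently when members are reordered or
 * inserted. Naming each member keeps the initializer valid, and any
 * member that is not named is zero-initialized. */
static const example_component_t example_component = {
    .name = "example",
    .major_version = 2,
    .minor_version = 1,
    .open = example_open,
    /* .close is not named here, so it is NULL */
};
```

If a new member is later added to the struct, this initializer still compiles and the new member starts out as zero.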
Nathan Hjelm
3436f2917d Merge pull request #449 from hjelmn/mca_base_update
mca/base update
2015-04-16 08:41:48 -06:00
Jithin Jose
c09582a3ff - CM blocking send/recv optimizations
This patch tries to do as little as possible in the PML CM blocking
send/receive routines. Basically, avoid creating and filling in an
entire request object. An OMPI-level request is still needed, but we
can create that on the stack instead of going to a free list.

Signed-off-by: Andrew Friedley <andrew.friedley@intel.com>
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-04-03 15:19:08 -07:00
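Note on c09582a3ff: the core of the optimization is that a blocking call can keep its request on the stack, because the request never has to outlive the call. A rough sketch under that assumption follows; `example_request_t` and `example_transport_send` are hypothetical stand-ins, not the PML CM types.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced request object. */
typedef struct example_request {
    int    tag;
    size_t bytes;
    int    complete;
} example_request_t;

/* Stand-in for the transport: marks the request complete immediately. */
static void example_transport_send(example_request_t *req, const void *buf)
{
    (void) buf;
    req->complete = 1;
}

/* Blocking send: the request lives on the stack for the duration of the
 * call, so no free-list allocation or return is needed. */
static int example_blocking_send(const void *buf, size_t bytes, int tag)
{
    example_request_t req;

    memset(&req, 0, sizeof(req));
    req.tag = tag;
    req.bytes = bytes;
    example_transport_send(&req, buf);
    return req.complete ? 0 : -1;
}
```

A non-blocking send could not use this pattern, since its request must remain valid after the function returns.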
Nathan Hjelm
b68d66bb9b MCA: Add the project/project version to the MCA base component
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.

All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:04 -06:00
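Note on b68d66bb9b: the project_framework_component_* form can be pictured as a prefix-qualified parameter name. The sketch below is an illustration only; the function names are hypothetical, and accepting the shorter framework_component_* form as well is an assumption of this sketch, not something the commit message states.

```c
#include <stdio.h>
#include <string.h>

/* Build "<project>_<framework>_<component>_<param>" into out.
 * Returns 0 on success, -1 if the buffer is too small. */
static int example_full_name(char *out, size_t len, const char *project,
                             const char *framework, const char *component,
                             const char *param)
{
    int n = snprintf(out, len, "%s_%s_%s_%s",
                     project, framework, component, param);
    return (n >= 0 && (size_t) n < len) ? 0 : -1;
}

/* Return 1 if `given` names the parameter, either fully qualified or
 * (assumption) without the leading project name. */
static int example_param_matches(const char *given, const char *project,
                                 const char *framework, const char *component,
                                 const char *param)
{
    char full[256];

    if (example_full_name(full, sizeof(full), project, framework,
                          component, param) != 0) {
        return 0;
    }
    if (strcmp(given, full) == 0) {
        return 1;
    }
    /* Skip "<project>_" and compare against the unqualified form. */
    return strcmp(given, full + strlen(project) + 1) == 0;
}
```

With the project name in the match, two projects can each have a framework called, say, `pml` without their parameters colliding.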
adrianreber
714d9aa67e Merge pull request #348 from adrianreber/topic/orte_cr_continue_like_restart
Topic/orte cr continue like restart
2015-03-12 14:54:02 +01:00
Alina Sklarevich
28586caecf MTL_MXM/PML_YALLA: fix coverity issues. 2015-03-12 11:49:22 +02:00
Nathan Hjelm
ce6caab2a7 Merge pull request #463 from hjelmn/cuda_async
btl/openib: cuda: fix CUDA-aware support with async copy
2015-03-11 09:52:48 -06:00
Adrian Reber
c08e234af7 FT: fix compilation using --with-ft (5/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

With the changes introduced in the previous patches in this series
some goto constructs for cleanup are no longer necessary and removed.
2015-03-11 14:23:33 +01:00
Adrian Reber
1c5a8df724 FT: fix compilation using --with-ft (2/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

The FT code used barrier mechanisms which have been removed
with aec5cd08bd. This patch replaces
all those different barriers with opal_pmix.fence(NULL, 0);
I am not sure this is completely correct but at least a starting
point for a review.
2015-03-11 14:23:33 +01:00
Adrian Reber
f45dd069bd FT: fix compilation using --with-ft (1/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

This first patch moves orte_cr_continue_like_restart from ORTE
to opal_cr_continue_like_restart in OPAL. This only leaves three
calls from OPAL to ORTE in the FT code. As it is not yet 100%
clear how to handle these calls the code orte_sstore.set_attr()
has been #ifdef'd out for now.
2015-03-11 14:23:33 +01:00
Alina Sklarevich
f9a9b936a1 PML_YALLA: fix compilation warnings. 2015-03-11 10:58:54 +02:00
Nathan Hjelm
3d32dbd793 btl/openib: cuda: fix CUDA-aware support with async copy
This commit should resolve an issue seen with CUDA-aware support. The
problem came in with BTL 3.0. Before 3.0 the size of the copy was
stored in the incoming segment's des_remote_count field. This field
does not exist in BTL 3.0 so I stored the value in the
des_segment_count field. This caused problems with the cuda support
code. To fix the issue, the endpoint pointer is now stored in the in
fragment's endpoint pointer, which frees up the segment's des_cbdata
pointer for storing the transfer size.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-10 14:38:12 -06:00
Mike Dubman
6f91a007e1 Merge pull request #458 from yosefe/topic/pml-yalla-fix-segv
keep mxm context alive as long as pml_yalla component is open.
2015-03-10 13:38:14 +02:00
yosefe
976144dca7 keep mxm context alive as long as pml_yalla component is open.
pml_yalla_del_comm may be called after the yalla module is finalized, which
leads to invalid memory access if the mxm context is already destroyed at
this point.
2015-03-10 11:52:44 +02:00
George Bosilca
420ae98dfe Remove all unnecessary whitespace and make sure we close the module
correctly.
2015-03-05 13:00:13 -05:00
Alex Mikheev
168c83ed95 OMPI/MXM: add out of band barrier at the end of del_procs
mxm shutdown requires out of band barrier
2015-03-02 12:56:02 +02:00
Rolf vandeVaart
30e9dd5066 Look in extra rdma array to find bml. This is needed with recent BML changes. Only affects CUDA-aware code. 2015-02-27 09:02:21 -05:00
George Bosilca
3fd8dc099d Revert "This function is now useless."
This reverts commit 0871c5c489.
2015-02-26 17:54:46 -05:00
George Bosilca
7f90cedf23 Revert "Fix the logic for computing the different weights for each BTLs. This"
This reverts commit de118609ec.
2015-02-26 17:54:31 -05:00
George Bosilca
d4c2fc9d41 Merge branch 'master' of github.com:open-mpi/ompi 2015-02-25 12:01:57 -05:00
Mike Dubman
a0afb7d96e Merge pull request #424 from miked-mellanox/topic/master_fix_yalla
fixes issue #414
2015-02-25 19:01:47 +02:00
George Bosilca
f3b58006c8 Merge branch 'master' of github.com:open-mpi/ompi 2015-02-25 12:01:35 -05:00
Jeff Squyres
c3381150de ob1: fix another PERUSE compile error 2015-02-25 05:53:12 -08:00
yosefe
0332ab4d8b Initialize pml_yalla bsend request status. 2015-02-25 15:33:26 +02:00
Nathan Hjelm
0ac2f08460 pml/ob1: fix peruse compile error
Fixes #416
2015-02-24 15:39:46 -07:00
Nathan Hjelm
5ef24000c7 pml/yalla: fix typo in PML_YALLA_FREELIST_INIT 2015-02-24 10:08:54 -07:00
Nathan Hjelm
5f1254d710 Update code base to use the new opal_free_list_t
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.

This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.

Notes:

OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-24 10:05:45 -07:00
Nathan Hjelm
ed78553512 Update opal_free_list_t usage to reflect new class interface.
Please verify your components have been updated correctly. Keep in
mind that in terms of threading:

OPAL_FREE_LIST_GET -> opal_free_list_get_st
OPAL_FREE_LIST_RETURN -> opal_free_list_return_st

I used the opal_using_threads() variant anytime it appeared multiple
threads could be operating on the free list. If this is not the case
update to _st. If multiple threads are always in use change to _mt.
2015-02-24 10:05:44 -07:00
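Note on ed78553512 and 5f1254d710: the three flavors mentioned (_st, _mt, and the opal_using_threads() variant) differ only in when they synchronize. The sketch below illustrates that split with a mutex-protected list; all names are hypothetical, and the mutex is a simplification chosen for this sketch, not a description of how opal_free_list_t is implemented.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct example_item {
    struct example_item *next;
} example_item_t;

typedef struct example_free_list {
    example_item_t *head;
    pthread_mutex_t lock;
} example_free_list_t;

/* Hypothetical stand-in for a "are threads in use?" runtime flag. */
static bool example_using_threads = false;

static int example_free_list_init(example_free_list_t *fl)
{
    fl->head = NULL;
    return pthread_mutex_init(&fl->lock, NULL);
}

/* _st: single-threaded, no synchronization at all. */
static example_item_t *example_get_st(example_free_list_t *fl)
{
    example_item_t *item = fl->head;
    if (item != NULL) {
        fl->head = item->next;
    }
    return item;
}

static void example_return_st(example_free_list_t *fl, example_item_t *item)
{
    item->next = fl->head;
    fl->head = item;
}

/* _mt: always synchronizes. */
static example_item_t *example_get_mt(example_free_list_t *fl)
{
    pthread_mutex_lock(&fl->lock);
    example_item_t *item = example_get_st(fl);
    pthread_mutex_unlock(&fl->lock);
    return item;
}

/* Default variant: decides at run time based on the flag. */
static example_item_t *example_get(example_free_list_t *fl)
{
    return example_using_threads ? example_get_mt(fl) : example_get_st(fl);
}
```

The run-time variant costs one extra branch, which is why the commit suggests switching to _st or _mt where the threading situation is known.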
Howard Pritchard
c9e81b54fb Merge pull request #412 from hppritcha/topic/owner_files
add owner files to opal/ompi/orte mca directories
2015-02-23 09:48:20 -07:00
Howard Pritchard
bf89131f9e add owner files to opal/ompi/orte mca directories
This commit adds an owner file in each of the component directories
for each framework.  This allows for a simple script to parse
the contents of the files and generate, among other things, tables
to be used on the project's wiki page.  Currently there are two
"fields" in the file, an owner and a status.  A tool to parse
the files and generate tables for the wiki page will be added
in a subsequent commit.
2015-02-22 15:10:23 -07:00
Mike Dubman
00d416ba9d yalla: fix coverity errors
dead code fix
2015-02-22 13:57:45 +02:00
George Bosilca
0871c5c489 This function is now useless. 2015-02-21 16:38:17 -05:00
George Bosilca
de118609ec Fix the logic for computing the different weights for each BTLs. This
removes the call to qsort, as the BTLs are already sorted based on
their respective bandwidth.
2015-02-21 16:37:18 -05:00
Rolf vandeVaart
dbd0064713 Fix bug in CUDA-aware and GDR introduced by refactoring 2015-02-18 17:44:28 -05:00
Nathan Hjelm
3847025540 pml/ob1: when using btl_get try to register the entire region before attempting to break the get into multiple rdma fragments
A little background. Historically ob1 always registered the entire memory
region when the RGET protocol was in use. This changed when Mellanox
added support to fragment RGET using the btl_prepare_dst function. Now
that the BTL layer has changed to split out the limits of get/put there
is explicit fragmentation code in ob1. Before this commit the registration
was still done per RGET fragment.

This commit will attempt to register the entire region before creating
RGET fragments. If the registration is successful then all RGET
fragments will use this registration; otherwise they will each attempt
to register their own segment of the receive buffer. If that fails
enough times each fragment will give up and fall back on send/recv.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:37 -07:00
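Note on 3847025540: the registration strategy is a three-step fallback (whole region, then per fragment, then send/recv). A simplified sketch follows. The callback type, enum, and helper policies are hypothetical, and unlike the commit, which lets each fragment retry and give up on its own, this sketch abandons RDMA for the whole transfer on the first per-fragment failure.

```c
#include <stddef.h>

/* Hypothetical registration callback: returns 0 on success. */
typedef int (*example_register_fn)(const char *base, size_t len);

typedef enum {
    EXAMPLE_WHOLE_REGION,  /* one registration shared by all fragments */
    EXAMPLE_PER_FRAGMENT,  /* each fragment registered its own segment */
    EXAMPLE_SEND_RECV      /* registration failed, use send/recv */
} example_plan_t;

static example_plan_t example_plan_rget(const char *base, size_t len,
                                        size_t frag_len,
                                        example_register_fn reg)
{
    if (frag_len == 0) {
        return EXAMPLE_SEND_RECV;
    }
    /* First try to register the entire receive buffer once. */
    if (reg(base, len) == 0) {
        return EXAMPLE_WHOLE_REGION;
    }
    /* Otherwise register one segment per fragment. */
    for (size_t off = 0; off < len; off += frag_len) {
        size_t n = (len - off < frag_len) ? len - off : frag_len;
        if (reg(base + off, n) != 0) {
            return EXAMPLE_SEND_RECV;
        }
    }
    return EXAMPLE_PER_FRAGMENT;
}

/* Stand-in registration policies used only to exercise the sketch. */
static int example_reg_any(const char *base, size_t len)
{
    (void) base; (void) len;
    return 0;
}

static int example_reg_small(const char *base, size_t len)
{
    (void) base;
    return (len <= 4096) ? 0 : -1;
}

static int example_reg_none(const char *base, size_t len)
{
    (void) base; (void) len;
    return -1;
}
```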
Nathan Hjelm
868e10caf2 pml/bfo: ompi ignore until updated for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:37 -07:00
Nathan Hjelm
c4a0e02261 pml/ob1: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:37 -07:00
Jeff Squyres
f38f2a159b pml_base: whitespace cleanup; no code changes 2015-02-06 11:27:50 -08:00
Jeff Squyres
46a1722dfc pml_base: fix errant show_help message 2015-02-06 11:27:50 -08:00
Yohann Burette
1ad188206b Add OFI MTL to CM PML.
This allows the CM PML to be picked when the OFI MTL is selected.
2015-01-20 10:50:14 -08:00
George Bosilca
df0512550e The extent of the datatype is irrelevant for deciding to do an immediate
send as long as we have to pack.
2015-01-19 02:23:12 -05:00
Gilles Gouaillardet
d14daf40d0 ob1: correctly handle types in which size > extent
do not send inline if extent*count *OR* size*count are greater than 256
2015-01-19 14:07:23 +09:00
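Note on d14daf40d0: the rule stated above (no inline send when either extent*count or size*count exceeds 256) can be written as a small predicate. This is a sketch with hypothetical names and plain `size_t` arguments; it ignores multiplication overflow and does not reflect the later change in df0512550e.

```c
#include <stdbool.h>
#include <stddef.h>

/* Threshold quoted in the commit message. */
#define EXAMPLE_INLINE_LIMIT 256

/* Return true only if both the packed size and the memory span of the
 * message fit under the limit. Checking both matters for datatypes
 * whose size is greater than their extent. */
static bool example_can_send_inline(size_t size, size_t extent, size_t count)
{
    return size * count <= EXAMPLE_INLINE_LIMIT &&
           extent * count <= EXAMPLE_INLINE_LIMIT;
}
```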
Howard Pritchard
3fc7b389ff initial async progress changes for gni 2014-12-24 11:50:23 -07:00
yosefe
3f152733bf Add yalla to the list of default PMLs 2014-12-01 13:11:28 +02:00
Nathan Hjelm
1b564f62bd Revert "Merge pull request #275 from hjelmn/btlmod"
This reverts commit ccaecf0fd6, reversing
changes made to 6a19bf85dd.
2014-11-19 23:22:43 -07:00
Nathan Hjelm
1a5349ec79 ompi ignore bfo until it is updated for new btl interface 2014-11-19 11:33:04 -07:00
Nathan Hjelm
0110603782 ob1 warning fix 2014-11-19 11:33:04 -07:00
Nathan Hjelm
24427639b6 Fix ob1 warnings 2014-11-19 11:33:03 -07:00
Nathan Hjelm
271818f887 pml/ob1: bug fixes and adjustments for changes in btl_sendi behavior 2014-11-19 11:33:03 -07:00
Nathan Hjelm
ee2b111011 Update PML for latest BTL update 2014-11-19 11:33:02 -07:00
Nathan Hjelm
c61e017177 pml: updates to reflect member changes in mca_btl_base_descriptor_t
and mca_btl_base_module_t structures
2014-11-19 11:33:02 -07:00
Nathan Hjelm
5936411a07 pml/ob1: when using btl_get try to register the entire region before
attempting to break the get into multiple rdma fragments

A little background. Historically ob1 always registered the entire memory
region when the RGET protocol was in use. This changed when Mellanox
added support to fragment RGET using the btl_prepare_dst function. Now
that the BTL layer has changed to split out the limits of get/put there
is explicit fragmentation code in ob1. Before this commit the registration
was still done per RGET fragment.

This commit will attempt to register the entire region before creating
RGET fragments. If the registration is successful then all RGET
fragments will use this registration; otherwise they will each attempt
to register their own segment of the receive buffer. If that fails
enough times each fragment will give up and fall back on send/recv.
2014-11-19 11:33:02 -07:00
Nathan Hjelm
b75bb8aea7 Update pml for btl changes 2014-11-19 11:33:02 -07:00
Jeff Squyres
7a5b2e9b13 ob1: change an OPAL_UNLIKELY to OPAL_LIKELY
Per
924d39e415 (commitcomment-8378266),
this OPAL_UNLIKELY should really be OPAL_LIKELY.
2014-10-31 03:22:55 -07:00
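Note on 7a5b2e9b13: likely/unlikely macros of this kind are commonly built on the GCC/Clang `__builtin_expect` hint, which affects code layout but never the result of the condition. The sketch below uses hypothetical `EXAMPLE_*` macros and makes no claim about how OPAL_LIKELY and OPAL_UNLIKELY are actually defined.

```c
/* Branch hints: fall back to the plain condition on other compilers. */
#if defined(__GNUC__)
#define EXAMPLE_LIKELY(x)   __builtin_expect(!!(x), 1)
#define EXAMPLE_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define EXAMPLE_LIKELY(x)   (x)
#define EXAMPLE_UNLIKELY(x) (x)
#endif

/* The error path is marked unlikely so the compiler can keep the
 * common path as straight-line code. */
static int example_checked_div(int a, int b, int *out)
{
    if (EXAMPLE_UNLIKELY(b == 0)) {
        return -1;
    }
    *out = a / b;
    return 0;
}
```

Using the wrong hint, as the commit corrects, leaves the program correct and only penalizes the path that is actually common.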
George Bosilca
924d39e415 Always OBJ_DESTRUCT the send request. 2014-10-30 01:28:50 -04:00
Gilles Gouaillardet
ed93c8787d ob1: add a destructor to mca_pml_ob1_recv_request_t
opal_mutex_t must be OBJ_DESTRUCTed in order to avoid
a memory leak (pthread_mutex_init allocates memory under
Cygwin, so pthread_mutex_destroy is mandatory)

Thanks to Marco Atzeri for reporting this issue
2014-10-29 13:30:29 +09:00
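Note on ed93c8787d: the underlying rule is that every `pthread_mutex_init` needs a matching `pthread_mutex_destroy`, because some platforms allocate memory in the former. A minimal sketch with a hypothetical request type and constructor/destructor pair (not the OBJ_CONSTRUCT/OBJ_DESTRUCT machinery itself):

```c
#include <pthread.h>

/* Hypothetical receive request that owns a mutex. */
typedef struct example_recv_request {
    pthread_mutex_t lock;
    int pending;
} example_recv_request_t;

/* Constructor: initializes the mutex. Returns 0 on success. */
static int example_recv_request_construct(example_recv_request_t *req)
{
    req->pending = 0;
    return pthread_mutex_init(&req->lock, NULL);
}

/* Destructor: releases whatever pthread_mutex_init acquired. Without
 * this, platforms that allocate in pthread_mutex_init leak memory. */
static int example_recv_request_destruct(example_recv_request_t *req)
{
    return pthread_mutex_destroy(&req->lock);
}
```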
Jeff Squyres
c22e1ae33b configury: new OPAL_SET_LIB_PREFIX/ORTE_SET_LIB_PREFIX macros
These two macros set the prefix for the OPAL and ORTE libraries,
respectively.  Specifically, the OPAL library will be named
libPREFIXopen-pal.la and the ORTE library will be named
libPREFIXopen-rte.la.

These macros must be called, even if the prefix argument is empty.

The intent is that Open MPI will call these macros with an empty
prefix, but other projects (such as ORCM) will call these macros with
a non-empty prefix.  For example, ORCM libraries can be named
liborcm-open-pal.la and liborcm-open-rte.la.

This scheme is necessary to allow running Open MPI applications under
systems that use their own versions of ORTE and OPAL.  For example,
when running MPI applications under ORTE, if the ORTE and OPAL
libraries between OMPI and ORCM are not identical (which, because they
are released at different times, are likely to be different), we need
to ensure that the OMPI applications link against their ORTE and OPAL
libraries, but the ORCM executables link against their ORTE and OPAL
libraries.
2014-10-22 10:32:19 -07:00
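Note on c22e1ae33b: the naming scheme libPREFIXopen-pal.la amounts to inserting an optional prefix between "lib" and the base library name. The helper below is a hypothetical illustration of the resulting names, not part of the configure macros.

```c
#include <stdio.h>

/* Build "lib<prefix><base>.la". An empty prefix yields the default
 * Open MPI name. Returns 0 on success, -1 if out is too small. */
static int example_lib_name(char *out, size_t len,
                            const char *prefix, const char *base)
{
    int n = snprintf(out, len, "lib%s%s.la", prefix, base);
    return (n >= 0 && (size_t) n < len) ? 0 : -1;
}
```

Calling it with an empty prefix and "open-pal" gives libopen-pal.la, while the prefix "orcm-" gives liborcm-open-pal.la, matching the examples in the commit message.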
yosefe
b4f569b4d4 yalla: address comments on #246 by @jsquyres 2014-10-22 10:42:56 +03:00
yosefe
ce7c748e51 Add new PML yalla, which uses mxm directly to reduce overhead.
http://starwars.wikia.com/wiki/Ubed_Yalla
2014-10-21 16:08:24 +03:00
Ralph Castain
aec5cd08bd Per the PMIx RFC:
WHAT:  Merge the PMIx branch into the devel repo, creating a new
       OPAL “pmix” framework to abstract PMI support for all RTEs.
       Replace the ORTE daemon-level collectives with a new PMIx
       server and update the ORTE grpcomm framework to support
       server-to-server collectives.

WHY:   We’ve had problems dealing with variations in PMI implementations,
       and need to extend the existing PMI definitions to meet exascale
       requirements.

WHEN:  Mon, Aug 25

WHERE: https://github.com/rhc54/ompi-svn-mirror.git

Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding.

All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level.

Accordingly, we have:

* created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations.

* Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported.

* Replaced the current global collective id with a signature based on the names of the participating procs. This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint.

* removed the prior OMPI/OPAL modex code

* added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform.

* retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand

This commit was SVN r32570.
2014-08-21 18:56:47 +00:00
Gilles Gouaillardet
f24699623f check-help-strings cleanup
This commit was SVN r32495.
2014-08-11 03:25:22 +00:00
Gilles Gouaillardet
f7b13d1126 Fix missing ampersand.
Also replace the OMPI_CAST_RTE_NAME macro with
an inline function if OPAL_ENABLE_DEBUG, so we can
get warnings from the compiler if an ampersand is missing.

Thanks to Paul Hargrove for reporting the bugs

This commit was SVN r32408.
2014-08-04 02:52:56 +00:00
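Note on f7b13d1126: a cast macro accepts whatever it is given, while an inline function checks its argument type, which is why a missing ampersand only shows up with the function form. The sketch below assumes, for illustration, that the OPAL-level name is an integer type; every identifier is hypothetical.

```c
#include <stdint.h>

/* Set to 0 to get the unchecked macro form. */
#define EXAMPLE_ENABLE_DEBUG 1

/* Assumption of this sketch: the lower-level name is a plain integer. */
typedef uint64_t example_opal_name_t;

typedef struct example_rte_name {
    uint32_t jobid;
    uint32_t vpid;
} example_rte_name_t;

#if EXAMPLE_ENABLE_DEBUG
/* The parameter type forces callers to pass a pointer. Passing the
 * integer itself (a forgotten '&') draws a compiler diagnostic. */
static inline example_rte_name_t *
example_cast_rte_name(example_opal_name_t *name)
{
    return (example_rte_name_t *) name;
}
#define EXAMPLE_CAST_RTE_NAME(a) example_cast_rte_name(a)
#else
/* The bare cast compiles silently even when an integer is passed,
 * turning the integer value into a bogus pointer. */
#define EXAMPLE_CAST_RTE_NAME(a) ((example_rte_name_t *) (a))
#endif
```

Restricting the checked form to debug builds keeps release builds identical to the old macro while still catching mistakes during development.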
Ralph Castain
daeb9b6c4f Some more cleanups. Remove direct references to ORTE by changing OMPI_CAST_ORTE_NAME -> OMPI_CAST_RTE_NAME. Ensure that ORTE tools (mpirun, orted, tools) set the OPAL proc structure fields so OPAL knows what is going on and uses the correct print functions (still need to fix the problem for non-MPI apps). Properly return uint32_t from the opal utilities instead of int32_t as that is what the ORTE process name fields contain.
Thanks to Gilles for pointing out some of the discrepancies.

This commit was SVN r32398.
2014-08-01 14:44:11 +00:00
George Bosilca
cee2a4e5c8 Missing alloca.h. Thanks Paul for catching this.
This commit was SVN r32388.
2014-08-01 03:28:23 +00:00
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK, with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00
Nathan Hjelm
f960e4273e Fix typo in r32196
The wrong descriptor field was used when calculating the size received when
using the RDMA rendezvous protocol.

This commit was SVN r32232.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-14 21:00:53 +00:00
Gilles Gouaillardet
77184b5c4c Fix a corner case with MPI_PROC_NULL persistent requests
Handle OMPI_REQUEST_NOOP in MPI_Startall rather than PML

cmr=v1.8.2:reviewer=bosilca:ticket=4764

This commit was SVN r32213.

The following Trac tickets were found above:
  Ticket 4764 --> https://svn.open-mpi.org/trac/ompi/ticket/4764
2014-07-11 04:37:01 +00:00
Nathan Hjelm
1b9621eeb0 Fix typo in r32196
This commit was SVN r32202.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-10 18:43:49 +00:00
Nathan Hjelm
a14e0f10d4 Per RFC: Remove des_src and des_dst members from the
mca_btl_base_segment_t and replace them with des_local and des_remote

This change also updates the BTL version to 3.0.0. This commit does
not represent the final version of BTL 3.0.0. More changes are coming.

In making this change I updated all of the BTLs as well as BTL users
to use the new structure members. Please evaluate your component to
ensure the changes are correct.

RFC text:

This is the first of several BTL interface changes I am proposing for
the 1.9/2.0 release series.

What: Change naming of btl descriptor members. I propose we change
des_src and des_dst (and their associated counts) to be des_local and
des_remote. For receive callbacks the des_local member will be used to
communicate the segment information to the callback. The proposed change
will include updating all of the doxygen in btl.h as well as updating
all BTLs and BTL users to use the new naming scheme.

Why: My btl usage makes use of both put and get operations on the same
descriptor. With the current naming scheme I need to ensure that there
is consistency between the segments described in des_src and des_dst
depending on whether a put or get operation is executed. Additionally,
the current naming prevents BTLs that do not require prepare/RMA matched
operations (do not set MCA_BTL_FLAGS_RDMA_MATCHED) from executing
multiple simultaneous put AND get operations. At the moment the
descriptor can only be used with one or the other. The naming change
makes it easier for BTL users to setup/modify descriptors for RMA
operations as the local segment and remote segment are always in the
same member field. The only issue I foresee with this change is that it
will require a little more work to move BTL fixes to the 1.8 release
series.

This commit was SVN r32196.
2014-07-10 16:31:15 +00:00
Gilles Gouaillardet
8d3bea2771 Fix the corner case with MPI_PROC_NULL persistent requests.
This corner case is now handled in the pml so the same code
is invoked for both MPI_Start and MPI_Startall.
This also correctly reports an error if MPI_Startall is invoked twice
on an MPI_PROC_NULL persistent request.

This commit was SVN r32139.
2014-07-04 04:58:52 +00:00
George Bosilca
843ef1fcb0 ompi_mpi_abort had one extra argument that was never used. Clean it up.
This commit was SVN r32124.
2014-07-03 00:34:44 +00:00
Ralph Castain
f3cb124e50 Revert r32082 and r32070 - the developer's conference has decided to go a different direction on the threaded progress effort. This will involve some degree of prototyping to understand the tradeoffs prior to making a final design decision, and so we'll hold off on the final change until that is completed.
This commit was SVN r32089.

The following SVN revision numbers were found above:
  r32070 --> open-mpi/ompi@12d92d0c22
  r32082 --> open-mpi/ompi@aa6438ef7a
2014-06-25 20:43:28 +00:00
Ralph Castain
12d92d0c22 Per the OMPI developer conference, remove the last vestiges of OMPI_USE_PROGRESS_THREADS
This commit was SVN r32070.
2014-06-24 17:05:11 +00:00
George Bosilca
fd0e1b7261 If we detect an error on a request that has been already released
at the MPI level, we should call abort on MPI_COMM_WORLD.

Fixes ticket #1943.
cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31982.
2014-06-10 16:24:13 +00:00
Jeff Squyres
b0a6e42f45 pml ob1: use the pre-computed size from the free lists
Based on a suggestion from George on #31806, use the pre-computed
sizes rather than duplicating the computation math (which may change
someday in the future).

cmr=v1.8.2:ticket=trac:4647

This commit was SVN r31841.

The following Trac tickets were found above:
  Ticket 4647 --> https://svn.open-mpi.org/trac/ompi/ticket/4647
2014-05-20 20:32:25 +00:00
George Bosilca
db9660264e Update the error message to pinpoint the right location.
Thanks Tim.

This commit was SVN r31839.
2014-05-20 20:08:42 +00:00
George Bosilca
685f051557 Move the allocator initialization from open to init. This cleans up
a memory leak. Similar changes should be applied to all the
other PMLs that are copies of OB1. This patch is related to
#4653.

This commit was SVN r31838.
2014-05-20 19:34:18 +00:00
Nathan Hjelm
a1d5ce0893 pml/ob1: as per past RFC bring the inline send optimization to
MPI_Isend.

I filed an RFC for this optimization some time back. It is a
relatively simple optimization. If the data associated with an
MPI_Isend can be put on the wire without allocating an MPI_Request
then do so. In this case we can legally return ompi_request_empty
which will correctly indicate that the request is complete and that it
was not cancelled (these are the only requirements on send requests).

cmr=v1.8.3:reviewer=bosilca

This commit was SVN r31828.
2014-05-19 19:34:59 +00:00
Gilles Gouaillardet
2b89aac15b Fix a typo in MCA_PML_OB1_RECV_REQUEST_UNPACK
cmr=v1.8.2:reviewer=rhc

This commit was SVN r31817.
2014-05-19 11:00:13 +00:00
Gilles Gouaillardet
c82a6f5063 Fix a memory leak in mca/pml/bfo
Allocate the allocator in init rather than open

cmr=v1.8.2:reviewer=rhc

This commit was SVN r31816.
2014-05-19 10:40:18 +00:00
Jeff Squyres
025e4a852b pml_ob1: ensure to have enough space for send/recvreq on stack
r30343 introduced the optimization of putting the OB1 sendreq and
recvreq on the stack for blocking sends and receives.  However, the
requests did not contain enough storage for the data that is normally
immediately ''after'' the request (e.g., BTL data).

This commit changes these requests to be pointers and to use alloca()
to get enough total space for the OB1 request and all the associated
data.

The change is smaller than it looks; most of it is just changing from
"foo.bar" to "foo->bar" notation (etc.).

Submitted by Jeff, reviewed by Nathan.  But we want George to look at
this (and get a little soak time on the trunk) before moving to v1.8.

cmr=v1.8.2:reviewer=bosilca

This commit was SVN r31806.

The following SVN revision numbers were found above:
  r30343 --> open-mpi/ompi@2b57f4227e
2014-05-17 01:05:59 +00:00
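Note on 025e4a852b: the fix replaces a fixed-size stack object with an `alloca()` block sized for the request plus the data that follows it. A reduced sketch of that layout follows, with a hypothetical request type and a caller-supplied trailing size; `<alloca.h>` is a non-standard header, assumed here to be available as on glibc systems.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request header; extra data is stored right after it. */
typedef struct example_sendreq {
    int    tag;
    size_t extra_len;
} example_sendreq_t;

/* Returns the last trailing byte written (255), or 0 if there is no
 * trailing data. The alloca'd memory is only used inside this call. */
static int example_send_with_trailing_data(int tag, size_t extra_len)
{
    /* A plain "example_sendreq_t req;" would have no room for the
     * trailing bytes, so reserve header plus extra space in one block. */
    example_sendreq_t *req = alloca(sizeof(*req) + extra_len);
    unsigned char *extra = (unsigned char *) (req + 1);

    memset(req, 0, sizeof(*req) + extra_len);
    req->tag = tag;
    req->extra_len = extra_len;
    if (extra_len > 0) {
        extra[extra_len - 1] = 0xff;  /* touch the final trailing byte */
        return extra[extra_len - 1];
    }
    return 0;
}
```

This is also why the commit changes "foo.bar" to "foo->bar" throughout: the request becomes a pointer into the alloca'd block instead of a named stack variable.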
Nathan Hjelm
4113cfa03a pml/ob1: add missing OBJ_DESTRUCT
An OBJ_DESTRUCT was missing for mca_pml_ob1.send_ranges causing a
memory leak. Identified by valgrind.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31768.
2014-05-14 21:15:45 +00:00
Ralph Castain
a8e2d6c3a6 The bulk of the remaining renaming changes, in one final glorious "blob". Thanks to Jeff for some help chasing down a few spots. Per chat with Jeff, we decided to cleanup a few things that were historical in nature:
top_ompi_srcdir  ->  OMPI_TOP_SRCDIR
top_ompi_builddir -> OMPI_TOP_BUILDDIR

We also split the srcdir/builddir flags according to their local tree (e.g., OPAL_TOP_SRCDIR), and tied them all together in configure.ac. Renamed ompi_ignore and ompi_unignore to be opal_<foo> as these are agnostic markers.

Only thing left is ompilibdir being treated similarly to what we did for srcdir/builddir. Coming soon.

This commit was SVN r31678.
2014-05-07 21:48:53 +00:00
Nathan Hjelm
626b521e9c pml/ob1: fix heterogeneous support when using the send_inline optimization
We will track #4568 from the 1.8 CMR.

Closes trac:4568

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31535.

The following Trac tickets were found above:
  Ticket 4568 --> https://svn.open-mpi.org/trac/ompi/ticket/4568
2014-04-28 17:36:26 +00:00
Adrian Reber
e5bef82ee1 OPAL_ENABLE_FT_CR: remove compiler warnings
When compiling --with-ft there are a few compiler warnings about
unused variables. This patch fixes those compiler warnings.

This commit was SVN r30927.
2014-03-04 15:28:07 +00:00
Jeff Squyres
f8dbba78a7 Send the BTL-passed message to ompi_rte_abort.
cmr=v1.8:reviewer=rolfv

This commit was SVN r30889.
2014-02-28 16:20:54 +00:00
Ralph Castain
49d938de29 Merge one-sided updates to the trunk - written by Brian Barrett and Nathan Hjelm
cmr=v1.7.5:reviewer=hjelmn:subject=Update one-sided to MPI-3

This commit was SVN r30816.
2014-02-25 17:36:43 +00:00
Nathan Hjelm
a06e491c2c ob1: large buffered sends were broken by the ob1 optimizations. fix them
The problem was caused by the static request optimization. The buffered send case
is much like the isend case in that the request structure may be needed after
MPI_Bsend completes. Fix this case by calling isend and freeing the resulting
request.

cmr=v1.7.5:ticket=trac:4149

This commit was SVN r30601.

The following Trac tickets were found above:
  Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
2014-02-07 00:12:36 +00:00
Nathan Hjelm
3902cf66f1 ob1: OBJ_CONSTRUCT the convertor in the send_inline optimization.
This change does not appear to increase the small message latency of ping-pong
benchmarks and fixes an issue found by our ibm datatype tests.

Fixes trac:4232

cmr=v1.7.5:ticket=trac:4149

This commit was SVN r30598.

The following Trac tickets were found above:
  Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
  Ticket 4232 --> https://svn.open-mpi.org/trac/ompi/ticket/4232
2014-02-06 21:27:42 +00:00
George Bosilca
bde9619386 Various minor cleanups.
This commit was SVN r30431.
2014-01-26 17:27:12 +00:00
Ralph Castain
06e6a06f3e Cleanup a couple of abstraction breaks found by Thomas Naughton
This commit was SVN r30371.
2014-01-22 21:36:24 +00:00
Nathan Hjelm
66b69da394 Fix a bug in the ob1 optimizations that can cause a segfault.
btl sendi functions currently can not handle the descriptor being NULL. The
send inline optimization was assuming (incorrectly) that NULL was ok.

cmr=v1.7.5:ticket=trac:4149

This commit was SVN r30364.

The following Trac tickets were found above:
  Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
2014-01-22 16:31:58 +00:00
Nathan Hjelm
2b57f4227e ob1: optimize blocking send and receive paths
Per RFC. There are two optimizations in this commit:

 - Allocate requests for blocking sends and receives on the stack. This
   bypasses the request free list and saves two atomics on the critical path.
   This change improves the small message ping-pong by 50-200ns on both AMD
   and Intel CPUs.

 - For small messages try to use the btl sendi function before initializing a
   send request. If the sendi fails or the btl does not have a sendi function
   silently fall back on the standard send path.

cmr=v1.7.5:reviewer=brbarret

This commit was SVN r30343.
2014-01-21 15:16:21 +00:00
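Note on 2b57f4227e: the second optimization is a try-fast-path-then-fall-back pattern. The sketch below shows that control flow only. The `example_btl_t` struct, its `eager_limit` field, and the helper functions are hypothetical and are not the BTL interface.

```c
#include <stddef.h>

/* Hypothetical immediate-send hook: returns 0 if the data went out. */
typedef int (*example_sendi_fn)(const void *buf, size_t len);

typedef struct example_btl {
    example_sendi_fn sendi;       /* may be NULL if unsupported */
    size_t           eager_limit; /* largest message worth trying inline */
} example_btl_t;

enum { EXAMPLE_PATH_INLINE = 1, EXAMPLE_PATH_STANDARD = 2 };

/* Stand-in for the regular path that builds a full send request. */
static int example_standard_send(const void *buf, size_t len)
{
    (void) buf; (void) len;
    return 0;
}

/* Try the inline send first. If the hook is missing, the message is
 * too large, or the hook fails, silently use the standard path. */
static int example_send(const example_btl_t *btl, const void *buf, size_t len)
{
    if (btl->sendi != NULL && len <= btl->eager_limit &&
        btl->sendi(buf, len) == 0) {
        return EXAMPLE_PATH_INLINE;
    }
    return (example_standard_send(buf, len) == 0) ? EXAMPLE_PATH_STANDARD : -1;
}

/* Stand-in hooks used only to exercise the sketch. */
static int example_sendi_ok(const void *buf, size_t len)
{
    (void) buf; (void) len;
    return 0;
}

static int example_sendi_busy(const void *buf, size_t len)
{
    (void) buf; (void) len;
    return -1;
}
```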
Brian Barrett
8b778903d8 Fix longstanding issue with our multi-project support. Rather than using
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi.  This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.

This commit was SVN r30140.
2014-01-07 22:11:15 +00:00
Rolf vandeVaart
e7f430d9ac Add empty line that was inadvertently removed in message.
This commit was SVN r30099.
2013-12-30 18:38:07 +00:00
Rolf vandeVaart
b955dbd6d9 Fix various items discovered by review of ticket #3951.
This commit was SVN r29900.
2013-12-13 21:25:07 +00:00
Rolf vandeVaart
d556b60b21 Change some CUDA configure code and macro names per review request by jsquyres in ticket #3880.
Functionally, nothing changes.

This commit was SVN r29815.
2013-12-06 14:35:10 +00:00
Jeff Squyres
3a7af4ab40 Fix another clang warning: sendreq is undefined if proc==NULL.
cmr=v1.7.4:reviewer=hjelmn:subject=fix ob1 undefined sendreq value

This commit was SVN r29774.
2013-12-02 19:44:42 +00:00
Ralph Castain
ac9820c46f Link against common cuda library
Thanks to Jorg Bornschein for pointing it out

cmr=v1.7.4:reviewer=rolfv

This commit was SVN r29750.
2013-11-24 17:06:51 +00:00
Rolf vandeVaart
4964a5e98b Per this RFC from October 8, 2013 and as discussed in the telecon.
http://www.open-mpi.org/community/lists/devel/2013/10/13072.php

Add support for pinning GPU Direct RDMA in openib BTL for better small message latency of GPU buffers. 
Note that none of this is compiled in unless CUDA-aware support is requested.

This commit was SVN r29680.
2013-11-13 13:22:39 +00:00