1
1

134 Коммитов

Автор SHA1 Сообщение Дата
Mike Dubman
d6ead2a3a5 Add support for routable ROCE where different subnet_id is a valid to proceed with MPI routing.
(can happen in the same LAN)
developed by vasily, reviewed by miked
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r29479.
2013-10-23 06:08:54 +00:00
Rolf vandeVaart
c15b2a26b8 Fix some formatting. Move some CUDA-aware mca parameter initialization earlier.
This commit was SVN r29162.
2013-09-13 17:43:41 +00:00
Rolf vandeVaart
ba9ec1b8bc For debug builds, add the ability to view memory registrations and deregistrations in the openib BTL.
This commit was SVN r29159.
2013-09-13 14:28:26 +00:00
Alex Mikheev
9e2fdc7d56 - correction of r28440
This commit was SVN r28441.

The following SVN revision numbers were found above:
  r28440 --> open-mpi/ompi@93ce233530
2013-05-02 12:52:58 +00:00
Alex Mikheev
93ce233530 - btl_openib: changed default SRQ settings:
- increase number of wqe to minimize number of RNRs
    - it is better to have high watermark and post relatively small number of wqes
    - increased TX queue size

This commit was SVN r28440.
2013-05-02 12:46:35 +00:00
Pavel Shamis
fed6e60131 Fixing OpenIB BTL compilation failure for a cases when
BTL_OPENIB_MALLOC_HOOKS_ENABLED is disabled.

This commit was SVN r28290.
2013-04-04 20:17:18 +00:00
Nathan Hjelm
cf377db823 MCA/base: Add new MCA variable system
Features:
 - Support for an override parameter file (openmpi-mca-param-override.conf).
   Variable values in this file can not be overridden by any file or environment
   value.
 - Support for boolean, unsigned, and unsigned long long variables.
 - Support for true/false values.
 - Support for enumerations on integer variables.
 - Support for MPIT scope, verbosity, and binding.
 - Support for command line source.
 - Support for setting variable source via the environment using
   OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
 - Cleaner API.
 - Support for variable groups (equivalent to MPIT categories).

Notes:
 - Variables must be created with a backing store (char **, int *, or bool *)
   that must live at least as long as the variable.
 - Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of
   mca_base_var_set_value() to change the value.
 - String values are duplicated when the variable is registered. It is up to
   the caller to free the original value if necessary. The new value will be
   freed by the mca_base_var system and must not be freed by the user.
 - Variables with constant scope may not be settable.
 - Variable groups (and all associated variables) are deregistered when the
   component is closed or the component repository item is freed. This
   prevents a segmentation fault from accessing a variable after its component
   is unloaded.
 - After some discussion we decided we should remove the automatic registration
   of component priority variables. Few component actually made use of this
   feature.
 - The enumerator interface was updated to be general enough to handle
   future uses of the interface.
 - The code to generate ompi_info output has been moved into the MCA variable
   system. See mca_base_var_dump().

opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system

This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.

This commit was SVN r28236.
2013-03-27 21:09:41 +00:00
Jeff Squyres
bbddd6ea03 Add header file for opal_show_help().
This commit was SVN r28056.
2013-02-13 16:31:59 +00:00
Brian Barrett
312f37706e In talking about this with Jeff and Ralph, we don't actually need
ompi_show_help, because opal_show_help is replaced with an 
aggregating version when using ORTE, so there's no reason to
directly call orte_show_help.

This commit was SVN r28051.
2013-02-12 21:10:11 +00:00
Brian Barrett
f42783ae1a Move the RTE framework change into the trunk. With this change, all non-CR
runtime code goes through one of the rte, dpm, or pubsub frameworks.

This commit was SVN r27934.
2013-01-27 23:25:10 +00:00
Rolf vandeVaart
f63c88701f Improve CUDA GPU transfers over openib BTL. Use aynchronous copies.
This is RFC that was submitted in July and December of 2012.

This commit was SVN r27862.
2013-01-17 22:34:43 +00:00
Nathan Hjelm
f3ce12e71a Per RFC fix several leaks in opal and ompi. Details below.
pml/v:
  - If vprotocol is not being used vprotocol_include_list is leaked. Assume vprotocol never takes ownership (see below) and always free the string.

coll/ml:
  - (patch verified) calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
  - Need to free mca_coll_ml_component.config_file_name on component close.

btl/openib:
  - calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.

vprotocol/base:
  - There was no way for pml/v to determine if vprotocol took ownership of vprotocol_include_list. Fix by always never ownership (use strdup).

mca/base:
  - param_lookup will result in storage->stringval to be a newly allocated string if the mca parameter has a string value. ensure this string is always freed.

cmr:v1.7

This commit was SVN r27569.
2012-11-06 18:57:46 +00:00
Mike Dubman
d47d550dfc performance optimization: process completions in the batch manner
This commit was SVN r27559.
2012-11-05 14:02:37 +00:00
Jeff Squyres
fb2e543a57 Refs trac:3275.
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't).  If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!".  Kaboom.

The only problem is that it didn't give you any indication of where
this value was being set.  Quite maddening, from a user perspective.

So we changed the ompi_info handles this case.  If any framework open
function return OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received and error, and then dump out the parameters
that it has received so far in the framework that had a problem.

At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).  

We updated ompi_info to check for O???_ERR_BAD_PARAM from each from
the framework opens.  Also updated the doxygen docs in mca.h for this
O???_BAD_PARAM behavior.  And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes.  Why?  I think we used them in exactly one
place in the code base (mca_base_components_open.c).  So we deleted
those and just used the normal OPAL_* codes instead.

While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).

This commit was SVN r27306.

The following Trac tickets were found above:
  Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
2012-09-11 20:47:24 +00:00
Jeff Squyres
e5babf830a Fixes trac:3258: add btl_openib_abort_not_enough_reg_mem MCA parameter
that causes MPI jobs to abort if there is not enough registered memory
available (vs. just warning).

This commit was SVN r27140.

The following Trac tickets were found above:
  Ticket 3258 --> https://svn.open-mpi.org/trac/ompi/ticket/3258
2012-08-25 11:39:06 +00:00
Jeff Squyres
02e2c88224 Back out r26869 (i.e., put back a single per-peer QP in the default
receive queues value) so that we don't break the use of RDMA CM, and
therefore break RoCE.

This commit was SVN r27017.

The following SVN revision numbers were found above:
  r26869 --> open-mpi/ompi@fe0e7f81df
2012-08-13 15:57:21 +00:00
Yael Dayan
7895cd1114 adding a fragmentation mechanism to the Get flow in function mca_pml_ob1_recv_request_progress_rget
This commit was SVN r26956.
2012-08-07 07:15:21 +00:00
Yevgeny Kliteynik
a6458da4ba Using 8K as a minimal CQ length
- For now we'll use 8192 as a base value
 - We leave the adjust_cq() as is
 - For the long term we can work on an appropriate setting to expose through the INI file.

8K CQEs are 512K per process, which is 8MB for ppn=16

This commit was SVN r26877.
2012-07-26 21:06:18 +00:00
Nathan Hjelm
fe0e7f81df btl/openib: as discussed remove the per-peer queue pair from the default configuration
This commit was SVN r26869.
2012-07-25 22:53:58 +00:00
Jeff Squyres
bb13e21538 Roll back r26730, but bump the default CQ length base up to 1500, not
1000.  Refs trac:3154.

IB/iWarp vendors need to get together to figure out a real fix.

This commit was SVN r26777.

The following SVN revision numbers were found above:
  r26730 --> open-mpi/ompi@5315c91baf

The following Trac tickets were found above:
  Ticket 3154 --> https://svn.open-mpi.org/trac/ompi/ticket/3154
2012-07-10 16:53:27 +00:00
Jeff Squyres
5315c91baf Fixes trac:3152: slightly more advanced than the patch on the ticket:
* If the MCA param btl_openib_cq_size is set to 0 (which is the
   default), use the device CQ max size. Otherwise, use the MCA param
   value (and never adjust it again).
 * Remove the CQ size adjustment code. Since we default to max CQ
   size, there really isn't much point in having it any more. I think
   people setting an absolute CQ size is going to be rare, so let's
   not do anything fancy with it.
 * If the MCA param value is larger than what the device supports,
   print a warning (only once per process) and default to using the
   device max
 * Add a BTL_VERBOSE displaying which CQ size we used

This commit was SVN r26730.

The following Trac tickets were found above:
  Ticket 3152 --> https://svn.open-mpi.org/trac/ompi/ticket/3152
2012-07-03 16:49:59 +00:00
Josh Hursey
28681deffa Backout the ORCA commit. :(
There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk.

This commit was SVN r26676.
2012-06-27 01:28:28 +00:00
Josh Hursey
542330e3a7 Commit of ORCA: Open MPI Runtime Collaborative Abstraction
This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI.

The project is described on the wiki:
  https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition

And on this email thread:
  http://www.open-mpi.org/community/lists/devel/2012/06/11109.php

This commit was SVN r26670.
2012-06-26 21:42:16 +00:00
Nathan Hjelm
37c624ee43 prepare to delete mpool/rdma
This commit was SVN r26664.
2012-06-26 15:55:23 +00:00
Mike Dubman
98c2c749fb fix define name to BTL_OPENIB_MALLOC_HOOKS_ENABLED
Thanks to Ludovic.Hablot@ext.bull.net  for pointing this out

This commit was SVN r26432.
2012-05-11 18:30:45 +00:00
Mike Dubman
cd17fee9a8 performance fix: openib use memalign for malloc
This commit was SVN r26409.
2012-05-08 20:42:09 +00:00
Ralph Castain
e7f6be5385 Unused variable
This commit was SVN r25301.
2011-10-17 18:59:22 +00:00
Rainer Keller
4e6a6fc146 - Check, whether the compiler supports __builtin_clz (count leading
zeroes);
   if so, use it for bit-operations like opal_cube_dim and opal_hibit.
   Implement two versions of power-of-two.
   In case of opal_next_poweroftwo, this reduces the average execution
   time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
   measured rdtsc, with loop over 2^27 values).
   Numbers for other functions are similar (but of course heavily depend
   on the usage, e.g. opal_hibit() with a start of 4 does not save
   much).  The bsr instruction on AMD Opteron is also not as fast.

 - Replace various places where the next power-of-two is computed.
   
   Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
   Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.

This commit was SVN r25270.
2011-10-11 22:49:01 +00:00
Yevgeny Kliteynik
4fbe68dd86 Removing trailing white spaces in all the openib btl code.
This commit was SVN r24855.
2011-07-04 14:00:41 +00:00
Yevgeny Kliteynik
b05211148d Supporting dynamic SL (#2674)
- Added enable/disable configuration parameter for dynamic SL
 - All the dynamic SL code is conditionalized
 - Removed libibmad dependency
 - Using only one include - ib_types.h (part of opensm-devel package)
 - Removed all the macro and data types definitions, using the
   existing definitions from ib_types.h instead
 - general cleaning here and there

The async mode is not implemented yet - stay tuned...

This commit was SVN r24830.
2011-06-28 14:28:29 +00:00
Jeff Squyres
d1d2cd0a87 Make the description of mca_btl_openib_cq_size be more accurate of
what it really is/does.

cmr:v1.5.4:kliteyn cmr:v1.4.4:reviewer=kliteyn

This commit was SVN r24684.
2011-05-05 13:10:11 +00:00
Jeff Squyres
25a8944e09 Fixes trac:2776. Let the openib BTL auto-detect its bandwidth.
cmr:v1.5.4

This commit was SVN r24621.

The following Trac tickets were found above:
  Ticket 2776 --> https://svn.open-mpi.org/trac/ompi/ticket/2776
2011-04-19 16:31:36 +00:00
Jeff Squyres
82f9474fec Revert r24533 and r24507 until the compile errors can be fixed.
This commit was SVN r24541.

The following SVN revision numbers were found above:
  r24507 --> open-mpi/ompi@4ce1936fed
  r24533 --> open-mpi/ompi@3204af2d36
2011-03-18 13:33:02 +00:00
Doron Shoham
4ce1936fed Fix the following for dynamic SL patch:
* rename ib_path_rec_service_level -> ib_path_record_service_level
* use mad.h and ib_types.h
* free all resources
* move ibv_post_recv to be just before ibv_post_send
* cleanup and beatify code

This commit was SVN r24507.
2011-03-10 16:19:00 +00:00
Jeff Squyres
4cb8a42e7b Add btl_openib_gid_index MCA param to allow selecting which GID to use
from an openfabrics port's GID table.

This commit was SVN r24456.
2011-02-24 14:09:22 +00:00
Doron Shoham
47a0752856 max_hw_msg_size should be 0 (default) or greater
This commit was SVN r24455.
2011-02-24 09:17:18 +00:00
Doron Shoham
e41e15c8db cosmetic fixes in openib btl:
* replace tabs with ws
* remove unnecessary casting
* use proper escape codes for printf() like functions

This commit was SVN r24445.
2011-02-23 15:50:37 +00:00
Rolf vandeVaart
acd38ff746 Final changes from jsquyres review. Moved configure
code from upper level into btl configure.m4.  Changed
prefix from "OMPI" to "BTL" in preprocessor macro.  Add
an mca param that shows it has been configured in.

This commit was SVN r24270.
2011-01-19 20:58:22 +00:00
Doron Shoham
834625cc51 Currently the service lever is passed as static parameter (ib_service_level), but the service level is possibly dynamic and if so the only way to get a proper value is to ask the SA.
New mca parameter is added (ib_path_rec_service_level) - positive value means that we should get the SL from the SA.

This is usable for torus topologies where different SL value is used for different endpoints.

A cache is kept of ib queue pairs used to communicate with the SA for a particular device and port and path record SL values retrieved from that SA.

The interaction with the cache assumes that there are no recursive calls to these routines. This must be solved either by code flow, by using higher level locks, or by adding a locking mechanism to these routines along with some method for avoiding deadlock.

This code use a UD queue pair to talk to the SA, and not need to chmod /dev/infiniband/umad* for use by normal users.  

The request to the SA is a SubnAdmGet(), not a SubnAdmGetTable().
In the future we might add a support of a SubnAdmGetTable(), but it will require implementing RMPP (Reliable Multi-Packet Transaction Protocol) and I'm not sure we want to do that.

This patched is based on the work of David McMillen <davem@systemfabricworks.com>.

This commit was SVN r24195.
2010-12-30 08:20:24 +00:00
Doron Shoham
bfe611d3bd This patch fixes bugs #2627 (1.5.2) and #2623 (1.4.2) - Sending large messages over RDMA fails.
The patch includes the following:
 *  Add new mca parameter - btl_openib_max_hw_msg_size - Maximum size (in bytes) of a single fragment of a long message when using the RDMA protocols (must be > 0 and <= hw capabilities).
 *  If btl_openib_max_hw_msg_size is larger than the maximum hw limitation print error message.
 *  Change the default openib flags to include only PUT and not GET.
 *  Print error message if user choose manually GET flag in openib btl.
 *  In prepare_dst: limit the message size to be the minimum of both endpoint's hw_limitation and the user limitation (if requested).

This commit was SVN r24191.
2010-12-23 11:48:43 +00:00
Rainer Keller
7c85144ac6 - Hmm, these mca parameters indeeed are registered twice ,-]
Thanks, Jeff!

   This should be added to CMR:v1.5:#2527

This commit was SVN r23589.
2010-08-10 21:11:59 +00:00
Rainer Keller
2ee01042c9 - Spelling fixes and line breaks in the parameter descriptions.
Please cmr:v1.5

This commit was SVN r23578.
2010-08-09 16:10:31 +00:00
Rolf vandeVaart
b7a27ab36a Add support for openib BTL failover to be used with bfo PML.
By default, feature is configured out so no effect on 
normal operation.

This commit was SVN r23412.
2010-07-14 10:08:19 +00:00
Shiqing Fan
5b37e2922c Use semicolon as the separator for Windows, as colon is normally part of the windows path.
This commit was SVN r23411.
2010-07-14 09:12:10 +00:00
Pavel Shamis
a124f6b10b Adding a hash table for management dependences between SRQs and their BTL modules.
This commit was SVN r22653.
2010-02-18 09:48:16 +00:00
Vasily Filipov
c036c6ef95 Adding support for on-demand SRQ pre-post (receive wqe allocation)
This commit was SVN r22313.
2009-12-15 15:52:10 +00:00
Vasily Filipov
354bfe527f Improving support for non homogeneous OpenFabrics network configurations
This commit was SVN r22312.
2009-12-15 14:25:07 +00:00
Rolf vandeVaart
c82e468ede Undo revision r21767 - sorry folks
This commit was SVN r21769.

The following SVN revision numbers were found above:
  r21767 --> open-mpi/ompi@41f38110ff
2009-08-05 22:23:26 +00:00
Rolf vandeVaart
41f38110ff HCA failover support in openib BTL
This commit was SVN r21767.
2009-08-05 21:53:02 +00:00
Lenny Verkhovsky
7f8dc7c8b8 fix for r21524, mispell fix HAVE_IBV_FORK_INIT
This commit was SVN r21533.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r21524
2009-06-25 17:45:38 +00:00