1
1
Граф коммитов

1174 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
d2d06008a0 Change the default value of mpi_leave_pinned to -1, meaning that we'll
figure it out at runtime (really meaning: we'll still default to "0"
unless something explicitly overrides to 1, such as the openib BTL).
This way, ompi_info doesn't confusingly report mpi_leave_pinned==0 for
mpi_leave_pinned, but we end up running with mpi_leave_pinned==1.

Fixes trac:1502.

This commit was SVN r19571.

The following Trac tickets were found above:
  Ticket 1502 --> https://svn.open-mpi.org/trac/ompi/ticket/1502
2008-09-16 22:06:14 +00:00
Shiqing Fan
04ee20a880 - Mainly type casts. Microsoft VC++ compiler is too strict.
This commit was SVN r19517.
2008-09-08 15:39:30 +00:00
George Bosilca
579d70edad We should use #ifdef and not #if
This commit was SVN r19504.
2008-09-05 12:44:19 +00:00
George Bosilca
10b4664e9c Fix the problem with the unexpected MX handler. The Solution so far is to
mostly don't use this mechanism, as we have to be thread safe in order to
really take full advantage of it. The unexpected handler is called by one
of th MX threads, and we do not have control on the moment. In a non threaded
case, this will completely destroy our recv requests queues, so the safest
approach is to don't the unexpected handler.

This commit was SVN r19496.
2008-09-04 15:46:29 +00:00
George Bosilca
9e0d64b9e5 Add a send immediate for MX.
This commit was SVN r19495.
2008-09-04 15:43:56 +00:00
Jeff Squyres
7b05a14d9a Back up r19489, which was the result of a "svn ci -m ..." instead of
an "hg ci -m ...".  Oops.

This commit was SVN r19490.

The following SVN revision numbers were found above:
  r19489 --> open-mpi/ompi@ea866f9e26
2008-09-03 08:45:33 +00:00
Jeff Squyres
ea866f9e26 Up to SVN r19488
This commit was SVN r19489.

The following SVN revision numbers were found above:
  r19488 --> open-mpi/ompi@c0b8a4a9b5
2008-09-03 08:35:59 +00:00
Ralph Castain
f722f134f7 Check the return code from opal_paffinity before continuing to set maffinity as it can only be done if the map_to_socket_core function succeeds.
This commit was SVN r19396.
2008-08-23 03:13:29 +00:00
Ralph Castain
4ef9d15d97 Revamp the opal mca paffinity interface. We ran into a problem when we encountered machines that had "holes" in their physical processor layout - e.g., machines that supported "hotplugging", or that had unpopulated sockets. To solve that problem, we had to clarify at the API level where we were describing physical vs logical processor info, and then translate accordingly in the underlying implementation.
See opal/mca/paffinity/paffinity.h for explanation as to the physical vs logical nature of the params used in the API.

Fixes trac:1435

This commit was SVN r19391.

The following Trac tickets were found above:
  Ticket 1435 --> https://svn.open-mpi.org/trac/ompi/ticket/1435
2008-08-21 19:21:28 +00:00
George Bosilca
a41dcbbc44 Play nicely with the PML. If the data is in the pending send queue, then
tell the PML we're still in charge of sending it.

This commit was SVN r19311.
2008-08-17 20:07:53 +00:00
George Bosilca
f3568a271a Replace tabs with spaces.
This commit was SVN r19310.
2008-08-17 20:06:48 +00:00
George Bosilca
10612bef8a Bring back the SM pending queue, to avoid deadlocks.
This commit fixes trac:1378.

This commit was SVN r19309.

The following Trac tickets were found above:
  Ticket 1378 --> https://svn.open-mpi.org/trac/ompi/ticket/1378
2008-08-17 19:00:50 +00:00
Jeff Squyres
c9f3f2c682 From Ralph Campbell at QLogic:
Roland noticed that the QLogic HCA driver was using the PCIe vendor ID
for the ibv_query_device so the IEEE OUI value is now used. This means
the config file should recognize the vendor ID value 0x1175 too.

This commit was SVN r19277.
2008-08-13 18:35:37 +00:00
Jeff Squyres
735c1d8bee Add a missing OMPI_DECLSPEC
This commit was SVN r19221.
2008-08-07 20:36:11 +00:00
Jeff Squyres
4aff8038f9 A compromise -- move the shutting down of the async thread to the
component_close, when we know all the btl modules are gone and there
will be no more of them created.

This commit was SVN r19214.
2008-08-07 13:48:38 +00:00
Jeff Squyres
e835c89242 Revert r19205.
This commit was SVN r19212.

The following SVN revision numbers were found above:
  r19205 --> open-mpi/ompi@58b3ea2f8b
2008-08-07 11:51:38 +00:00
Donald Kerr
7a88e2c6cb catch when mpool not created and prevent the subsequent segv
This commit was SVN r19211.
2008-08-07 11:33:28 +00:00
Jeff Squyres
6d9f047238 Somehow this sign ended up pointing the wrong way.
This commit was SVN r19207.
2008-08-06 20:35:42 +00:00
Jeff Squyres
58b3ea2f8b The async thread should not be shut down by the module since it's a
per-device entity.  Also, we should shut down the CPCs after we have
destroyed both PP and SRQ QPs.

This commit was SVN r19205.
2008-08-06 18:26:53 +00:00
Ralph Castain
277e4ac292 Provide a warning message if a user's app executes a "fork" operation while using subsystems that may not cleanly support it - e.g., the openib btl. The provided warning is a generic one indicating that use of fork in current conditions is not recommended.
This is setup so that it only is issued once (as opposed to every time they do it), and goes through orte_show_help so the user doesn't get hammered by #procs copies of the warning. In addition, there is a new MCA param (can't have too many!) to shut the warning off altogether.

This closes ticket #1244

This commit was SVN r19196.
2008-08-06 14:22:03 +00:00
Rainer Keller
69a16b14bc - Fix variable set but not used
Coverity CID1057

This commit was SVN r19189.
2008-08-06 14:02:12 +00:00
Rainer Keller
0d08866786 - Declare functions in lex-files as extern "C" {} to get
rid of warnings.

This commit was SVN r19132.
2008-08-04 11:49:01 +00:00
Jeff Squyres
b45d59ea2e Adjust API usage for new parameter in mca_base_param_lookup_source()
This commit was SVN r19115.
2008-07-31 21:56:20 +00:00
George Bosilca
7f01be2830 Remove useless defines. The MCA was bumped to 2.0, there is no reason to keep
these defines around.

This commit was SVN r19091.
2008-07-30 10:40:18 +00:00
Donald Kerr
2899f64146 moving vendor_id info to top of file
This commit was SVN r19087.
2008-07-29 22:33:17 +00:00
Donald Kerr
513225c9f3 add Sun Microsystems, Inc. vendor_id
This commit was SVN r19085.
2008-07-29 19:31:41 +00:00
Jeff Squyres
0af7ac53f2 Fixes trac:1392, #1400
* add "register" function to mca_base_component_t
   * converted coll:basic and paffinity:linux and paffinity:solaris to
     use this function
   * we'll convert the rest over time (I'll file a ticket once all
     this is committed)
 * add 32 bytes of "reserved" space to the end of mca_base_component_t
   and mca_base_component_data_2_0_0_t to make future upgrades
   [slightly] easier
   * new mca_base_component_t size: 196 bytes
   * new mca_base_component_data_2_0_0_t size: 36 bytes
 * MCA base version bumped to v2.0
   * '''We now refuse to load components that are not MCA v2.0.x'''
 * all MCA frameworks versions bumped to v2.0
 * be a little more explicit about version numbers in the MCA base
   * add big comment in mca.h about versioning philosophy

This commit was SVN r19073.

The following Trac tickets were found above:
  Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392
2008-07-28 22:40:57 +00:00
Lenny Verkhovsky
f278609d77 Funny warning message about rd_win size fix
This commit was SVN r19063.
2008-07-28 14:30:57 +00:00
Adrian Knoth
5096512c3a Cosmetics, only typos.
This commit was SVN r19061.
2008-07-28 13:33:08 +00:00
Jeff Squyres
e3e79c0881 Fixes trac:1379:
* Use synonym/deprecated MCA param API for some mca base params
 * In openib BTL, if we have appropriate memory hooks support, and if
   mpi_leave_pinned and mpi_leave_pinned_pipeline were not set by the
   user, set mpi_leave_pinned to 1.
 * Defer checking mpi_leave_pinned_* until as late as possible (i.e.,
   until after the btl's have had a chance to set mpi_leave_pinned to
   1):
   * in ob1 pml
   * in rdma mpool

This commit was SVN r19022.

The following Trac tickets were found above:
  Ticket 1379 --> https://svn.open-mpi.org/trac/ompi/ticket/1379
2008-07-24 22:51:26 +00:00
Jeff Squyres
5b9219565c Remove the use of __cpu_to_be64() and replace it with hton64().
This commit was SVN r18995.
2008-07-23 12:08:55 +00:00
Jeff Squyres
2f208f885c Fixes trac:1295: change language in openib BTL from IB-specific to be
"!OpenFabrics" / neutral (i.e., refer to IB and/or iWARP).

 * Mostly just type, variable/field, and funcion name changes, such as
   s/hca/device/g, etc.  
 * Changed the INI file for the hardware-specific parameters to be
   mca-btl-openib-device-params.ini.
 * Updated a lot of help messages in the help-*.txt files, not just to
   update it to be !OpenFabrics/neutral language, but also for some
   consistency of tone, indenting, etc.
 * Deprecated a bunch of MCA params in favor of language-neutral new
   ones:
   * btl_openib_warn_no_hca_params_found (s/hca/device/)
   * btl_openib_hca_param_files
   * btl_openib_ib_cq_size (s/_ib_/_of_/)
   * btl_openib_ib_max_inline_data
   * btl_openib_ib_psn
   * btl_openib_ib_mtu
   * btl_openib_ib_pkey_ix
   * btl_openib_ib_pkey_val

This commit was SVN r18985.

The following Trac tickets were found above:
  Ticket 1295 --> https://svn.open-mpi.org/trac/ompi/ticket/1295
2008-07-23 00:28:59 +00:00
Jon Mason
f80404d991 Add openib error handling during wireup for rdmacm
The rdmacm event handler has no way of reporting fatal errors to the upper
layers.  By calling mca_btl_openib_endpoint_invoke_error in the rdmacm event
handler for the errors encountered, these errors can now be handled
appropriately.

Closes out Ticket #1283

This commit was SVN r18980.
2008-07-22 19:03:13 +00:00
Pavel Shamis
849a8f86a7 Bug fox for #1388 - fixing ib_cm_listen() random failures.
This commit was SVN r18952.
2008-07-20 06:21:32 +00:00
Andrew Friedley
d479d981a7 Missed this on my last commit regarding if include/exclude -- should now iterate from 0, not 1.
This commit was SVN r18924.
2008-07-16 14:15:25 +00:00
Andrew Friedley
dabe6defb3 Add support for btl_ofud_{in,ex}clude MCA parameters.
This commit was SVN r18916.
2008-07-15 17:57:52 +00:00
Pavel Shamis
e233c11df0 Disabling IBCM cpc by default. In order to enable it, you need to define it explicitly in command line.
This commit was SVN r18911.
2008-07-15 08:02:00 +00:00
Pavel Shamis
807c53c7b1 "failed to ib_cm_listen 10 times" - is know bug in IBCM and we should print it
only if IBCM was explicitly requested.

This commit was SVN r18897.
2008-07-13 16:36:41 +00:00
Pavel Shamis
12379e7f3e Fixing race condition between main thread and async event thread
during openib finalization.

This commit was SVN r18895.
2008-07-13 16:21:49 +00:00
Pavel Shamis
a34bb98f8a Bug fix for #1376.
If IBCM was explicitly specified with exclude/include parameter,
OpenIB BTL will enable verbose report for "/dev/infiniband/ucm" error,
other way the error will not be reported.

This commit was SVN r18868.
2008-07-10 15:08:49 +00:00
Brad Benton
9f0280bd55 arghh...I inadvertantly checked this in to the 1.3 branch rather than
first to the trunk.  So, here is the trunk checkin:

The call to orte_show_help() to notify truncation of the max_inline value
was missing the want_error_header boolean, which eventually results in
a SEGV.  This change corrects the call with the bool set to true.

This commit was SVN r18839.
2008-07-08 15:28:53 +00:00
Pavel Shamis
452141bfb8 Bugfix for #1375.
- Adding configure options that allow to disable IB/RDMA-CM support.
- Code cleanup in openib section of configure

This commit was SVN r18830.
2008-07-08 06:32:54 +00:00
Jeff Squyres
160ba5fe11 Set max_inline_data for the iWARP adapters to be 64
This commit was SVN r18782.
2008-06-30 14:25:32 +00:00
Pavel Shamis
eaa7676c57 Changing default maximum inline data size
from Maximum_Supported_By_Device to 128 (our original value).

This commit was SVN r18774.
2008-06-30 07:47:09 +00:00
Jeff Squyres
f42f55f84b Several improvements and fixes to the openib IBCM CPC:
* Move the passive side QP move to RTS to before we send the reply
   (vs. sending it after we get the RTU).  A lengthy comment explains
   the need for this.
 * Add some timers to the code for analyzing where time is spent.
 * Clarify a few error messages.
 * Currently have a loop around ib_cm_listen() because sometimes it
   fails for seeminly no reason.  Have pending e-mails in to Sean
   Hefty to see if we can figure out why this is happening.  Note that
   the more MPI processes you add, the more likely this error is to
   occur (e.g., ran 720 processes and it happens at least 50% of the
   time).  This makes IBCM somewhat unattractive for general use;
   hopefully we can get a fix...

This commit was SVN r18766.
2008-06-28 10:53:58 +00:00
Jeff Squyres
67933e743c Move some of the things I did in r18762 to the openib BTL proper (in
endpoint.c) because it's almost identical in all the CPC's.  OOB,
XOOB, and IBCM now all invoke the btl error handler properly if
there's an error during wireup.  RDMACM still needs to be done.

This commit was SVN r18764.

The following SVN revision numbers were found above:
  r18762 --> open-mpi/ompi@3eda04578f
2008-06-27 22:48:45 +00:00
Jeff Squyres
3eda04578f Clean up a lot of error handling in IBCM CPC; properly pass error to
upper level btl (and therefore the PML) when something goes wrong
during wireup.

Refs trac:1283.

This commit was SVN r18762.

The following Trac tickets were found above:
  Ticket 1283 --> https://svn.open-mpi.org/trac/ompi/ticket/1283
2008-06-27 21:02:05 +00:00
Jeff Squyres
ad16d8f335 Fix some issues in the ibcm CPC:
* Properly handle non-symmetric subnet ID's
 * Be a bit more stringent when checking for the GID
 * Add lots of BTL_VERBOSE's for diagnostics

This commit was SVN r18754.
2008-06-26 20:23:56 +00:00
Jeff Squyres
dd563f9297 Add more output to btl_base_verbose
This commit was SVN r18743.
2008-06-25 20:16:34 +00:00
Jeff Squyres
b8b1aded4e Hook all the ibcm diagnostics up to "--mca btl_base_verbose 1":
* Use BTL_VERBOSE
 * Remove tailing \n's
 * Remove redundant prefixes (BTL_VERBOSE outputs __FILE__, function,
   and __LINE__)

This commit was SVN r18742.
2008-06-25 19:19:03 +00:00