Commit graph

3708 commits

Author SHA1 Message Date
Jeff Squyres
8c267d50a3 Fixes trac:1121.
We already show_help when we fail to create queues, so I just made the
message a little more verbose: it now notes that OMPI may be trying
to use a feature that is not supported on the hardware.

This commit was SVN r18553.

The following Trac tickets were found above:
  Ticket 1121 --> https://svn.open-mpi.org/trac/ompi/ticket/1121
2008-05-30 19:03:58 +00:00
George Bosilca
e361bcb64c Send optimizations.
1. The send path gets shorter. The BTL is allowed to return > 0 to specify that the
   descriptor was pushed to the network, and that the memory attached to it is
   available again for the upper layer. The MCA_BTL_DES_SEND_ALWAYS_CALLBACK flag
   can be used by the PML to force the BTL to always trigger the callback.
   Unmodified BTLs will continue to work as expected, as they will return OMPI_SUCCESS,
   which forces the PML to behave exactly as before. Some BTLs have
   been modified: self, sm, tcp, mx.
2. Add send immediate interface to BTL.
   The idea is to have a mechanism of allowing the BTL to take advantage of
   send optimizations such as the ability to deliver data "inline". Some
   network APIs such as Portals allow data to be sent using a "thin" event
   without packing data into a memory descriptor. This interface change
   allows the BTL to use such capabilities and allows for other optimizations
   in the future. All existing BTLs except for Portals and sm have this interface
   set to NULL.

This commit was SVN r18551.
2008-05-30 03:58:39 +00:00
Galen Shipman
4da4c44210 Receive-side changes: use multiple active message callbacks rather
than a single receive callback followed by a switch on the header.
Also fast-pathed the matching for small fragments.

This commit was SVN r18549.
2008-05-30 01:29:09 +00:00
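The receive-side change in 4da4c44210 can be pictured with a minimal Python sketch (it is not Open MPI's C code). Instead of one receive callback that switches on the header type, each header type registers its own callback and incoming fragments are dispatched through a table. The header tags, `register_callback`, and `deliver` are hypothetical names chosen only for illustration.

```python
# Illustrative sketch only: tags and function names are hypothetical,
# not the Open MPI BTL/PML API.
from typing import Callable, Dict

HDR_MATCH, HDR_ACK = 1, 2  # hypothetical header types

# One callback per header type, looked up directly by tag.
callbacks: Dict[int, Callable[[bytes], str]] = {}


def register_callback(tag: int, cb: Callable[[bytes], str]) -> None:
    """Associate a callback with one header type."""
    callbacks[tag] = cb


def deliver(tag: int, payload: bytes) -> str:
    """Dispatch a fragment: the tag selects the callback, no switch needed."""
    return callbacks[tag](payload)


register_callback(HDR_MATCH, lambda p: f"match:{len(p)}")
register_callback(HDR_ACK, lambda p: "ack")
```

With this layout, adding a new header type means registering one more callback rather than extending a shared switch statement.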
Jeff Squyres
728ee47be4 Just check for the presence of $sysfsdir/class/infiniband and check
that it's a directory.  That's good enough to know that the
OpenFabrics kernel drivers have been loaded.  If you have no RDMA
devices and don't want to see the OMPI warning about not finding any
devices, then don't start the OpenFabrics kernel drivers.

This commit was SVN r18540.
2008-05-29 14:19:51 +00:00
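The check described in 728ee47be4 amounts to a single directory test. A minimal Python sketch follows. The function name is hypothetical, and the `/sys` default is an assumption about where sysfs is mounted; the commit itself only refers to `$sysfsdir`.

```python
import os
import tempfile


def openfabrics_drivers_loaded(sysfsdir: str = "/sys") -> bool:
    """Return True if <sysfsdir>/class/infiniband exists and is a directory.

    The "/sys" default is an assumption; the commit only names $sysfsdir.
    """
    return os.path.isdir(os.path.join(sysfsdir, "class", "infiniband"))


# Usage against a throwaway directory tree standing in for sysfs.
with tempfile.TemporaryDirectory() as fake_sysfs:
    before = openfabrics_drivers_loaded(fake_sysfs)  # directory not there yet
    os.makedirs(os.path.join(fake_sysfs, "class", "infiniband"))
    after = openfabrics_drivers_loaded(fake_sysfs)   # directory now present
```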
Nysal Jan
25ac3629e9 eHCA does not have SRQ. Adding receive_queues value so that it works out of the box
This commit was SVN r18537.
2008-05-29 13:55:39 +00:00
Jeff Squyres
d5bf8fe005 Remove unused variables.
This commit was SVN r18532.
2008-05-29 11:58:16 +00:00
Jeff Squyres
e5ea9d08ca Fixes trac:1305: check to see if $sysfsdir/class/infiniband exists and is
non-empty.  If not, then exit the openib btl silently.  This addresses
the case where libibverbs is installed (which is getting more common)
and therefore the openib BTL was built/installed, but the kernel
drivers are not loaded (assumedly because there is no RDMA hardware
present).  In this case, "mpirun a.out" will not issue a warning.

There appears to be no good way to definitively tell if there are no
RDMA hardware devices present.  For example, if libibverbs/the openib
BTL is installed, there are no RDMA devices present, but the RDMA
hardware kernel drivers ''are'' loaded, OMPI will warn that it was
unable to find suitable devices.  This warning is easily eliminated by
unloading the kernel drivers.

This commit was SVN r18530.

The following Trac tickets were found above:
  Ticket 1305 --> https://svn.open-mpi.org/trac/ompi/ticket/1305
2008-05-28 22:05:47 +00:00
Pavel Shamis
28c763f751 Fixing the error flow when somebody tries to use XRC without XOOB.
This commit was SVN r18527.
2008-05-28 15:56:04 +00:00
Pavel Shamis
2c81b0ab9a Fixing compilation warning in btl_openib_connect_ibcm.c
This commit was SVN r18526.
2008-05-28 15:20:48 +00:00
Ralph Castain
828ae26d90 ORTE-level MCA params are defined in several places. Ompi_info cannot call orte_init due to an issue with the memory allocator, thus making it impossible for ompi_info to display all of the ORTE-level MCA params.
By consolidating them all into one function, ompi_info can call that function and register the desired variables. This also requires, however, that ompi_info call orte_output_init to avoid generating tons of error messages, so make that adjustment too. 

Fixes ticket #1314

In addition, orte_output has a race condition whereby calls to orte_output/verbose can occur before either the RML is defined/set up or the HNP is defined. The latter occurs during the initialization of the orte_process_info structure. In both cases, there is no way orte_output can send the output to the HNP. Hence, the message must simply be output locally.

Fixes ticket #1315

This commit was SVN r18524.
2008-05-28 13:29:58 +00:00
Pavel Shamis
879a9fe45c setup_qps() may exit with error.
This commit was SVN r18523.
2008-05-28 11:36:38 +00:00
Pavel Shamis
e657a03143 Fixing broken XRC initialization flow.
This commit was SVN r18522.
2008-05-28 11:31:38 +00:00
Rolf vandeVaart
18879285c7 Fix the selection logic to prevent memory leaks. More work may be done in the priority logic but for now we just fix the leaks and preserve current behavior.
This commit fixes trac:1307.

This commit was SVN r18504.

The following Trac tickets were found above:
  Ticket 1307 --> https://svn.open-mpi.org/trac/ompi/ticket/1307
2008-05-27 14:16:39 +00:00
Pavel Shamis
6596d19c90 Adding new ConnectX vendor_part_id. Fix for ticket #1310.
This commit was SVN r18495.
2008-05-26 12:25:49 +00:00
Gleb Natapov
5fabade090 Use payload_buffer_alignment value for payload alignment.
This commit was SVN r18493.
2008-05-26 08:29:02 +00:00
Jeff Squyres
e1f118d0e6 Remove unused variable
This commit was SVN r18491.
2008-05-24 13:05:04 +00:00
Rolf vandeVaart
5baa733ad5 Fix another warning (using a variable before it was initialized.)
Thanks Jeff for pointing this out.

This commit was SVN r18489.
2008-05-23 13:57:55 +00:00
Rolf vandeVaart
0d8faf7559 Fix the fix for ticket #1298. Thanks George for pointing it out.
This commit was SVN r18488.
2008-05-23 13:33:38 +00:00
Jeff Squyres
1b50e5f6a5 Use the right variable in the output
This commit was SVN r18487.
2008-05-23 13:11:12 +00:00
Rich Graham
b08839f9f5 change reduce-scatter/gather for non-power of 2. Spreading out the
load for the non-power of 2 phase of the reduction.

This commit was SVN r18486.
2008-05-22 21:42:42 +00:00
Rich Graham
f2a4b67809 automate the allreduce selection logic.
This commit was SVN r18484.
2008-05-22 20:53:35 +00:00
Jeff Squyres
8faeeab81a Style cleanup only: s/struct foo/foo_t/g to conform to rest of code
base

This commit was SVN r18483.
2008-05-22 19:26:00 +00:00
Jeff Squyres
1f7f0e1f96 Fixes trac:1281
 * s/port/tcp_port/g where relevant to disambiguate TCP port from
   device port
 * Rework ipaddrcheck to make it work in the LMC>0 case

This commit was SVN r18482.

The following Trac tickets were found above:
  Ticket 1281 --> https://svn.open-mpi.org/trac/ompi/ticket/1281
2008-05-22 19:18:15 +00:00
Rolf vandeVaart
8c3b31b181 Need to properly handle zero-length scatters and gathers on intercommunicators. Add a check for the MPI_ROOT and MPI_PROC_NULL processes so they do not enter collective module when count=0.
This commit was SVN r18481.
2008-05-22 19:09:43 +00:00
Rich Graham
5900415a25 for non-powers of 2, distribute the work on the first step among all
the procs doing the work.

This commit was SVN r18480.
2008-05-22 18:50:53 +00:00
Jon Mason
d0e26b1cf6 Add pretty comments for *_iwarp.*
This commit was SVN r18478.
2008-05-22 18:02:20 +00:00
Jeff Squyres
62ac6533e0 * Add proper copyrights
 * Ensure _iwarp.h is always included, or you'll get warnings on
   platforms that don't have the RDMACM
 * Add skeleton for function descriptions in comments in iwarp.h

This commit was SVN r18477.
2008-05-22 17:41:43 +00:00
Jeff Squyres
28b56c389a Only check if the opal_ifindex is >= 0 (opal_ifbegin() and
opal_ifnext() return -1 upon completion); don't check it against
opal_ifcount() -- the interface indexes aren't necessarily related to
how many interfaces were found.

This commit was SVN r18476.
2008-05-22 02:10:23 +00:00
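The point of 28b56c389a is that an interface index is not a position in a list of length `opal_ifcount()`. A minimal Python sketch of the loop condition follows. `if_begin`, `if_next`, and `if_count` are hypothetical stand-ins for the OPAL functions, and the index values are made up to show non-contiguous indexes.

```python
# Hypothetical stand-ins for opal_ifbegin()/opal_ifnext()/opal_ifcount();
# this is not the real OPAL API.
INTERFACE_INDEXES = [1, 3, 7]  # assumed: indexes need not be 0..count-1


def if_begin() -> int:
    """Return the first interface index, or -1 if there are none."""
    return INTERFACE_INDEXES[0] if INTERFACE_INDEXES else -1


def if_next(index: int) -> int:
    """Return the next interface index after `index`, or -1 upon completion."""
    later = [i for i in INTERFACE_INDEXES if i > index]
    return later[0] if later else -1


def if_count() -> int:
    return len(INTERFACE_INDEXES)


def walk_interfaces() -> list:
    """Correct loop: stop only on the -1 sentinel."""
    seen = []
    index = if_begin()
    while index >= 0:
        seen.append(index)
        index = if_next(index)
    return seen


def walk_interfaces_buggy() -> list:
    """Incorrect loop: compares an index against a count and stops early."""
    seen = []
    index = if_begin()
    while 0 <= index < if_count():
        seen.append(index)
        index = if_next(index)
    return seen
```

In this made-up configuration the buggy variant stops as soon as it reaches index 3, because 3 is not less than the interface count of 3.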
Jeff Squyres
27978b29f8 Fixes trac:1302: ensure that the LID is also used for identifying an incoming
IBCM request (not just the port number).

This commit was SVN r18475.

The following Trac tickets were found above:
  Ticket 1302 --> https://svn.open-mpi.org/trac/ompi/ticket/1302
2008-05-22 01:28:34 +00:00
George Bosilca
21b940887a Tricky stuff !!! If we post a receive for ZERO bytes and we match it
with something with a different size ... well we segfault. The reason was
that the logic in the PML OB1 calls the convertor based on the length
of the data on the wire and not the length of the data that the receiver
expects.

In other words, this is only half a patch :) It fixes the problem, but we
still have to make sure the unpack is not called at all when the receiver
expects ZERO bytes.

This commit was SVN r18474.
2008-05-21 23:31:34 +00:00
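One plausible reading of 21b940887a, sketched in Python rather than taken from the OB1 source: the number of bytes to unpack is bounded by what the receiver expects, not only by what arrived on the wire, and the unpack step is skipped entirely for a zero-byte receive. `bytes_to_unpack` and `receive` are hypothetical names, and a `bytearray` stands in for the receive buffer.

```python
def bytes_to_unpack(wire_length: int, expected_length: int) -> int:
    """Never unpack more than the receiver's buffer can hold."""
    return min(wire_length, expected_length)


def receive(wire_data: bytes, buffer: bytearray) -> int:
    """Copy at most len(buffer) bytes into buffer; return the count copied."""
    count = bytes_to_unpack(len(wire_data), len(buffer))
    if count == 0:
        # Zero-byte receive: do not call the unpack step at all.
        return 0
    buffer[:count] = wire_data[:count]
    return count
```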
George Bosilca
c31cc5b270 Remove a warning about line being unused.
This commit was SVN r18472.
2008-05-21 20:46:22 +00:00
George Bosilca
df2156568d The Elan BTL is now thread safe, and can be built in all conditions.
This commit was SVN r18471.
2008-05-21 20:44:37 +00:00
Pak Lui
1585789e8b Fix the undeclared variable.
This commit was SVN r18470.
2008-05-21 04:09:54 +00:00
Rich Graham
afd71abde6 remove some useless qualifiers.
This commit was SVN r18469.
2008-05-21 01:11:49 +00:00
Jon Mason
b9c25efbd2 Modify to comply with the "prefix rule" and remove "static inline" for
the non-rdmacm enabled case.   This should fix Ticket #1294.

This commit was SVN r18468.
2008-05-20 23:28:59 +00:00
Jeff Squyres
64f61ebd07 Fixes trac:1285. Really.
This commit has the same commit message as r18450, but without the
extra bonus memory corruption that was introduced.

This commit was SVN r18467.

The following SVN revision numbers were found above:
  r18450 --> open-mpi/ompi@5295902ebe

The following Trac tickets were found above:
  Ticket 1285 --> https://svn.open-mpi.org/trac/ompi/ticket/1285
2008-05-20 21:53:42 +00:00
Edgar Gabriel
0500420bec fixing a bug in the inter-communicator scatter operation, where we
accidentally used rcount instead of scounts.

This commit was SVN r18466.
2008-05-20 21:17:19 +00:00
Rolf vandeVaart
74d0259480 Add new implementation of barrier. This shows better performance on some clusters.
However, no decision logic is changed by this commit so default behavior has not changed.  This
is only selectable by runtime parameters.

This commit was SVN r18464.
2008-05-20 17:37:41 +00:00
Rolf vandeVaart
71091a19c3 Fix bug in spacing of code per https://svn.open-mpi.org/trac/ompi/wiki/CodingStyle.
This commit was SVN r18463.
2008-05-20 14:11:10 +00:00
Jeff Squyres
a9e26c33e0 Ensure that we don't try to call orte_show_help() before orte_init()
succeeds.

This commit was SVN r18458.
2008-05-19 21:57:54 +00:00
Rolf vandeVaart
763f5259a8 Fix memory leak of 88 bytes that occurred on each call to MPI_Comm_dup.
Need to release the items and the item list after selecting the collective
modules that are being used.  Reviewed by Jeff Squyres.

This commit was SVN r18457.
2008-05-19 21:34:01 +00:00
Jeff Squyres
c8c01572d0 ompi_info was erroneously not showing all the paths that it supports
(via compiled-in defaults/configure, or via env variables).

This commit was SVN r18456.
2008-05-19 17:44:56 +00:00
Jeff Squyres
01a7f7eeb6 Switch orte_output* -> OPAL_OUTPUT* for two reasons:
 1. We can't use orte_output in the CPC service thread because orte is
    not thread safe
 2. Use the macro version so that they're compiled out of production
    builds

This commit was SVN r18455.
2008-05-19 17:42:51 +00:00
Jeff Squyres
7154776465 Removed unused variable / compiler warning.
This commit was SVN r18454.
2008-05-19 13:41:45 +00:00
Jeff Squyres
76fc8dd188 Revert r18450 -- there is some memory badness in there somewhere...
This commit was SVN r18451.

The following SVN revision numbers were found above:
  r18450 --> open-mpi/ompi@5295902ebe
2008-05-18 19:11:45 +00:00
Jeff Squyres
5295902ebe Fixes trac:1285:
 * allow receive_queues to be specified in the INI file
 * detect when multiple different receive_queues are specified and 
   gracefully abort 

However, accomplishing these goals ran into multiple difficulties. By 
putting receive_queues in the INI file: 

 1. we may not find the value until we've already traversed multiple HCAs
 2. we may find multiple different receive_queues values

But since the openib btl initializes as it discovers each HCA/port/LID
(including the BSRQ data), if we find a new receive_queues value late
in the discovery process, then all the BSRQ data that was previously
initialized will likely be invalid. So I had to pull all the BSRQ
initialization out until after the rest of the discovery /
initialization process.

Additionally, note that if the user specifies the MCA parameter
btl_openib_receive_queues, it trumps whatever was in the INI file. So
in this case, there can never be a receive_queues conflict.  This
commit does the following (Jon wrote part of this, too):

 * adapt _ini.c to accept the "receive_queues" field in the file 
 * move 90% of _setup_qps() from _ini.c to _component.c 
 * move what was left of _setup_qps() into the main 
   _register_mca_params() function 
 * adapt init_one_hca() to detect conflicting receive_queues values 
   from the INI file 
 * after the _component.c loop calling init_one_hca(): 
   * call setup_qps() to parse the final receive_queues string value 
   * traverse all resulting btls and initialize their HCAs (if they
     weren't already): setup some lists and call prepare_hca_for_use()

I tested this code on a dual-HCA system where I artificially put in 
differing receive_queues values in the INI file for the two different 
types of HCAs that I have and it all seemed to work.

This commit was SVN r18450.

The following Trac tickets were found above:
  Ticket 1285 --> https://svn.open-mpi.org/trac/ompi/ticket/1285
2008-05-18 18:50:56 +00:00
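The precedence rules described in 5295902ebe can be summarized in a short Python sketch. It is an illustration, not the openib BTL code: the function and exception names are hypothetical, and the `default` fallback for the case where neither source supplies a value is an assumption not spelled out in the commit message.

```python
from typing import Iterable, Optional


class ReceiveQueuesConflict(Exception):
    """Raised when INI entries disagree on receive_queues."""


def resolve_receive_queues(mca_value: Optional[str],
                           ini_values: Iterable[Optional[str]],
                           default: str) -> str:
    """Pick the receive_queues string to parse after HCA discovery."""
    # A user-specified MCA parameter trumps whatever the INI file says,
    # so in that case there can never be a conflict.
    if mca_value is not None:
        return mca_value
    # Collect the distinct values found in the INI file across all HCAs.
    distinct = {v for v in ini_values if v is not None}
    if len(distinct) > 1:
        # Multiple different values: abort rather than guess.
        raise ReceiveQueuesConflict(sorted(distinct))
    # Assumed fallback when the INI file gives no value at all.
    return distinct.pop() if distinct else default
```

Resolving the value only after every HCA has been examined mirrors the commit's decision to defer BSRQ initialization until the discovery loop has finished.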
Jeff Squyres
87d4201bdf From our faithful Debian package maintainers: remove some lint-quality
lines from the man pages.

This commit was SVN r18449.
2008-05-16 14:58:52 +00:00
Jeff Squyres
1cc663ebf6 Change this back to use opal_init_util() -- using orte_init() mucks
with the C++ memory allocator.  Let's not go there.

This commit was SVN r18447.
2008-05-16 14:18:56 +00:00
Jeff Squyres
caacaadb0a Minor shuffling of code: no need to query the GID in the iWARP case.
This commit was SVN r18446.
2008-05-16 03:36:48 +00:00
Jeff Squyres
9f1b5237fe Ensure to return an error rather than continue
This commit was SVN r18445.
2008-05-16 03:36:11 +00:00