66 commits

Author SHA1 Message Date
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
  * Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
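
The ompi_ptr_t fix above concerns a union that overlays a pointer (pval) on a 64-bit integer (lval). The following is a minimal C sketch of the idea, not the commit's actual macros. The names demo_ptr_t, DEMO_PTR_SET and DEMO_PTR_GET are hypothetical. The sketch routes every access through the 64-bit member, which is one way to get the effect the commit describes: all 8 bytes are written on every assignment, and the pointer value lands in the same numeric position of lval regardless of pointer width or byte order.

```c
#include <stdint.h>

/* Hypothetical stand-in for ompi_ptr_t. */
typedef union {
    uint64_t lval;  /* always 8 bytes */
    void    *pval;  /* 4 bytes on a 32-bit build, 8 bytes on a 64-bit build */
} demo_ptr_t;

/* Store a pointer by writing the whole 64-bit member. On a 32-bit build
 * the upper half becomes zero instead of keeping stale bytes. */
#define DEMO_PTR_SET(ptr, value) \
    ((ptr).lval = (uint64_t)(uintptr_t)(value))

/* Read the pointer back through the same 64-bit member, so the result
 * does not depend on where pval overlaps lval in memory. */
#define DEMO_PTR_GET(ptr) \
    ((void *)(uintptr_t)(ptr).lval)
```

Keeping the casts inside the two macros means the calling code never has to cast, which matches the motivation given in the commit message.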
George Bosilca
d8dee3a740 If the MX driver was unable to load correctly, or if the endpoint was not
created then don't try to call the MX endpoint close function.

This commit was SVN r12950.
2007-01-02 00:01:50 +00:00
George Bosilca
e223b27268 A fragment is marked completed by the PML when the peer signals the
completion of the RDMA operation associated with the fragment. The
PML will call the BML free which in turn will call the BTL free. The MX
BTL will not release the fragment if it is not tagged with 0xff.

This commit was SVN r12947.
2006-12-31 03:17:47 +00:00
George Bosilca
47601e315e Allow the MX BTL to select at runtime if the unexpected handler will
be activated or not.

This commit was SVN r12944.
2006-12-30 20:57:50 +00:00
George Bosilca
d401a65975 Minor cleanups. Don't set the fields that will never be used.
This commit was SVN r12941.
2006-12-29 07:55:17 +00:00
George Bosilca
416e5b5f6a Enable the MX extensions if and only if the mx_extensions.h header
is installed on the system.

This commit was SVN r12937.
2006-12-29 00:31:32 +00:00
George Bosilca
d7bc180a90 The max allocated tag is not 16. Use the define instead.
This commit was SVN r12936.
2006-12-28 22:48:58 +00:00
George Bosilca
3eeecc3838 Add support for faster small messages. While sending a message, we check if
the data was buffered by the MX library. If it's the case then we declare
the send as completed and disable the completion event for the mx request.

This commit was SVN r12935.
2006-12-28 22:34:24 +00:00
George Bosilca
b996c00d1a Set the limits for the MX fragments to 4K. Add code to dump the state of the MX
hardware (not activated).

This commit was SVN r12931.
2006-12-28 08:40:37 +00:00
George Bosilca
3903009b8b Add a check for the unexpected handler. If enabled, allow the zero-copy
protocol over the MX BTL. Now, we have only one matching, the one in Open
MPI.

The problem is that when the unexpected handler is triggered, not all the
message is on the host memory. In the best case we get one MX fragment (internal
MX fragment), in the worst we get NULL. The only way to fit this with the
design of the PML is to force the eager protocol at the MX internal fragment
size, and to limit the send/receive protocol at the same size. Tests show
the outcome is not far from optimal (if the pipeline depth is increased
a little bit).

Set MX_PIPELINE_LOG in order to allow MX to use internal fragments of 4K.

This commit was SVN r12930.
2006-12-28 03:35:41 +00:00
George Bosilca
ff2319dcb7 Complete the OUT protocol. Small latency improvements. Some minor cleanups.
Create some macros, reorder some functions. Make sure all fragments are
correctly released at the end.

This commit was SVN r12926.
2006-12-26 18:15:24 +00:00
George Bosilca
75a35ed7ee Implement the PUT protocol over MX. The send/receive approach gives the best
performance on a 2G Myrinet card, as it looks like pipelining the messages
by 1M is faster than a simple send/receive. However, when using a 10G card
the send/receive will limit the maximum bandwidth to 2.5Gbs. The reason is
the scarce bus resources that have to be shared between the Myrinet hardware
and the memcpy operation. The PUT protocol removes the memcpy, so we now have a
true zero-copy mechanism. But there is no pipelining yet, as it looks like the
RDMA pipeline somehow disappeared from the OB1 PML ...

This commit was SVN r12925.
2006-12-24 22:52:46 +00:00
George Bosilca
e8bd985870 Add more output when calls to the MX library fail.
Move the connection status from the proc into the endpoint.

This commit was SVN r12924.
2006-12-24 22:34:48 +00:00
George Bosilca
14dc72f595 Allow the user to change the MX flags.
This commit was SVN r12923.
2006-12-24 22:21:00 +00:00
George Bosilca
dbe2798638 Allow MX to handle shared memory and self communications. By default these features
are disabled (btl_mx_shared_mem and btl_mx_self, respectively, have to be set in order
to activate them).

This commit was SVN r12922.
2006-12-24 22:18:41 +00:00
Brian Barrett
7880353fcc Need to close every endpoint we open, or the MX progress thread doesn't die,
which can cause segfaults on shutdown.  Calling mx_finalize() isn't enough
to shutdown the thread, so must close endpoints as well.

Refs trac:513

This commit was SVN r12908.

The following Trac tickets were found above:
  Ticket 513 --> https://svn.open-mpi.org/trac/ompi/ticket/513
2006-12-21 18:13:22 +00:00
George Bosilca
80bc0c8868 Allow the MX to survive if we are unable to connect to a peer. The PML will
try to find another route.

This commit was SVN r12837.
2006-12-13 01:12:07 +00:00
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
George Bosilca
59cfee0cd2 Use the MX infinite timeout by default. The user can modify it using an MCA
parameter.

This commit was SVN r12670.
2006-11-27 20:18:58 +00:00
George Bosilca
139f9cf3d0 Make sure we disable the MX shared memory when we use the MX BTL.
This commit was SVN r12587.
2006-11-13 22:17:06 +00:00
George Bosilca
3d0df2cf29 Allow the MX BTL to finish the small sends quicker. Once the mx_isend is posted, if
the message size is less than 4K, check for message completion and, if the message has
completed, call the callback.

This commit was SVN r12453.
2006-11-06 23:12:01 +00:00
George Bosilca
126a68dc9a Big datatype commit. Remove all unused features of the datatype engine. As the memory
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There are fewer arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affects all components using datatypes. I tested most of them, but it might happen that I
missed some ... If it's the case please let me know (don't shoot the pianist!!).

This commit was SVN r12331.
2006-10-26 23:11:26 +00:00
George Bosilca
a3ad4a7fc8 The visibility flags (and/or Windows friendly export) are now on for all BTLs.
This commit was SVN r11662.
2006-09-14 22:19:39 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
Brian Barrett
943e7dcfba * use a temporary to avoid passing pointers to size_t-sized structures into
the mca param functions, which expect pointers to integers

This commit was SVN r11262.
2006-08-18 21:36:07 +00:00
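
The temporary-variable fix above can be illustrated with a short C sketch. It assumes a hypothetical lookup function, demo_param_lookup_int, standing in for the MCA parameter functions (whose real signatures are not shown in this log), and a hypothetical demo_module_t holding a size_t field. Passing `(int *)&module->eager_limit` directly would write only part of the size_t on platforms where size_t is wider than int. Reading into an int temporary and then assigning avoids that.

```c
#include <stddef.h>

/* Hypothetical stand-in for an MCA parameter lookup that writes an int. */
static void demo_param_lookup_int(int default_value, int *out)
{
    *out = default_value;
}

/* Hypothetical module structure with a size_t-typed limit. */
typedef struct {
    size_t eager_limit;
} demo_module_t;

static void demo_read_eager_limit(demo_module_t *module, int default_value)
{
    int tmp = 0;                       /* matches the type the lookup expects */

    demo_param_lookup_int(default_value, &tmp);
    if (tmp < 0) {
        tmp = 0;                       /* a negative size makes no sense here */
    }
    module->eager_limit = (size_t)tmp; /* widen explicitly, all bytes written */
}
```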
Galen Shipman
e5c594c211 More updates for the async error handler for BTLs
In order to provide backwards compatibility the framework versions are bumped
and the handler registration function is at the end of the btl struct.
Testing done on sm, openib, and gm.

This commit was SVN r11256.
2006-08-17 22:02:01 +00:00
Galen Shipman
3b49953ce2 Add error callback to the btl interface. This allows errors to be delivered to
the upper layer asynchronously, although there are some issues with this, such
as there being multiple consumers of the BTLs: who gets the

This commit was SVN r11232.
2006-08-16 20:21:38 +00:00
George Bosilca
14b3f141db Nothing relevant !!!
This commit was SVN r10711.
2006-07-11 00:30:26 +00:00
Brian Barrett
05046e8ad2 if MX isn't running on some hosts, but is on others, we were blocking in the modex receive
waiting for the non-running procs to publish their contact information.  Publish their
(lack of) contact information.

This commit was SVN r10355.
2006-06-14 19:07:38 +00:00
Galen Shipman
218a438509 finished the ompi_free_list_t class nightmare..
This commit was SVN r10314.
2006-06-12 22:09:03 +00:00
Brian Barrett
5163f2b296 Fix for bug #36. The MX, MVAPI, and OpenIB components don't have
support for progress threads, so we shouldn't build them or try to use
them when support for progress threads has been requested.  The TCP, GM,
SELF, and SM BTLs should have progress thread support, so they aren't
disabled.  The Portals BTL isn't compiled on platforms with threads,
so it doesn't need to be updated.

This commit was SVN r10156.
2006-06-01 01:30:16 +00:00
George Bosilca
bdecdc8d41 Cleanup the MX BTL. Remove all mpool related code as there will never be a MX mpool.
This commit was SVN r9808.
2006-05-04 06:55:45 +00:00
George Bosilca
3e968d4f63 There is no length on the free list.
This commit was SVN r9704.
2006-04-24 23:13:51 +00:00
George Bosilca
61bea41350 The same in MX (missing copyright).
This commit was SVN r9661.
2006-04-19 21:37:30 +00:00
Tim Woodall
712468dbef add diagnostic interface
This commit was SVN r9328.
2006-03-17 17:39:41 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
  - move files out of toplevel include/ and etc/, moving them into the
    sub-projects
  - rather than including config headers with <project>/include,
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h,
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Galen Shipman
c8045bf397 Fixup for ORTE datatype checkin,
- use appropriate header files 
- change calls from orte_dps to orte_dss 

This commit was SVN r8920.
2006-02-07 15:20:44 +00:00
George Bosilca
20fd358327 A BTL cannot depend on the orte data-types.
This commit was SVN r8915.
2006-02-07 06:05:36 +00:00
George Bosilca
9d990af4a5 Remove 2 useless functions. They were replaced by the mca_base version a few commits ago.
This commit was SVN r8287.
2005-11-28 20:14:23 +00:00
George Bosilca
bfa4a40983 Cast it to a well-known type to remove a warning.
This commit was SVN r8261.
2005-11-26 21:17:15 +00:00
George Bosilca
00c10a6372 Make the MX BTL startup scalable. The previous connection code broke down as the
number of processes in the MPI application increased: it could take as much as 60 seconds to connect
64 processes. Now we do not create the connections when we add the procs but only when we send
them the first message. It now takes only 1.6 seconds to set up a 64-proc MPI job over MX (doing a 2-step barrier in order to ensure that we create all the connections).

This commit was SVN r8252.
2005-11-23 23:48:56 +00:00
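
The connect-on-first-send approach described above can be sketched in C as follows. All names (demo_endpoint_t, demo_add_proc, demo_connect, demo_send) are hypothetical and the connection itself is simulated with a counter; this shows only the control flow, not the MX BTL code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-peer endpoint state. */
typedef struct {
    bool connected;      /* has the connection been established yet? */
    int  connect_calls;  /* counts simulated connection attempts */
} demo_endpoint_t;

/* Adding a proc only records the peer; no connection is made here. */
static void demo_add_proc(demo_endpoint_t *ep)
{
    ep->connected = false;
    ep->connect_calls = 0;
}

/* Simulated (potentially expensive) connection setup. */
static int demo_connect(demo_endpoint_t *ep)
{
    ep->connect_calls++;
    ep->connected = true;
    return 0;
}

/* The first send to a peer triggers the connection; later sends reuse it. */
static int demo_send(demo_endpoint_t *ep, const void *buf, size_t len)
{
    if (!ep->connected) {
        int rc = demo_connect(ep);
        if (rc != 0) {
            return rc;   /* report the failure instead of sending */
        }
    }
    (void)buf;           /* the actual transfer is omitted in this sketch */
    (void)len;
    return 0;
}
```

With this structure, startup cost no longer grows with the number of peers that are added but never contacted.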
George Bosilca
bba42f5e49 We are allowed to call mx_set_error_handler before any other MX functions, even before mx_init.
With the errors set to return, mx_init will not force the application to exit if there is no MX kernel
module loaded.

This commit was SVN r8184.
2005-11-17 18:47:27 +00:00
George Bosilca
7ad6b2b70e Add an MCA param to enable/disable the MX shared memory capabilities. Right now this param
is labeled as internal so the users will not see it but it is not read-only so we can still
play with it (that's for our internal tests). This is supposed to disappear later after the
next (or next next) release of the MX library, but we need it now as a quick fix before the
release.

This commit was SVN r8161.
2005-11-15 20:54:45 +00:00
George Bosilca
e297b58fbd Add more MCA arguments.
Make some of them system (not seen by the user) and read-only.
Small cleanups.

This commit was SVN r8126.
2005-11-12 00:31:59 +00:00
George Bosilca
8119c970db Improve the connection algorithm for MX. There are 2 problems here:
- first, we set up the connections with all the peers at the beginning
- MX does not handle well the case where several peers make connections to the same
  destination simultaneously.

So I changed the order in which we connect. First we compute our rank in the array,
then in a round-robin fashion we set up connections starting with our left neighbor.

This commit was SVN r8075.
2005-11-10 01:15:49 +00:00
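
The rank-relative connection order described above can be sketched in C. The function name demo_connection_order is hypothetical. The commit message only says the order starts with the left neighbor, so the choice to keep walking leftward after that is an assumption of this sketch.

```c
#include <stddef.h>

/* Fill order[0 .. nprocs-2] with the peers that process `rank` should
 * connect to, starting with its left neighbor (rank - 1, wrapping around)
 * and continuing leftward. Each process starts at a different peer, so
 * they do not all connect to the same destination at the same time. */
static void demo_connection_order(size_t rank, size_t nprocs, size_t *order)
{
    for (size_t i = 0; i + 1 < nprocs; i++) {
        /* Adding nprocs keeps the value non-negative before the modulo. */
        order[i] = (rank + nprocs - 1 - i) % nprocs;
    }
}
```

For example, with 5 processes, rank 2 would connect to peers 1, 0, 4, 3 in that order, while rank 3 would start with peer 2.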
George Bosilca
dc1ad885d1 Move the output message outside the loop. We print an error message only once when we fail to
connect to a peer. As a bonus, we print some additional information such as its MAC address or name
if it's in our tables.

This commit was SVN r8074.
2005-11-10 01:13:18 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
George Bosilca
b0def3f6bf MX has 2 limitations regarding the iovecs. First they do not support iovec with a total size
larger than 32K for inter-nodes transfer ... and then they do not support iovecs larger than
16K for inter-node transfer. Therefore we have to set the size of our first fragment to
16K to match both cases.

This commit was SVN r7926.
2005-10-28 20:37:43 +00:00
George Bosilca
1fe18814da Decrease the default length for the first fragment.
This commit was SVN r7643.
2005-10-06 00:05:01 +00:00
George Bosilca
0f04132b13 mx_connect in the MX documentation is supposed to take a timeout in seconds. However, in real life it seems that the timeout should be in microseconds.
This commit was SVN r7642.
2005-10-06 00:04:27 +00:00