1
1

71 Коммитов

Автор SHA1 Сообщение Дата
Gleb Natapov
3ebaff8dfe Implement new BTL parameters:
We eagerly send data up to btl_*_eager_limit with the match
Upon ACK of the MATCH we start using send/receives of size
btl_*_max_send_size up to the btl_*_rdma_pipeline_offset
After the btl_*_rdma_pipeline_offset we begin using RDMA writes of
size btl_*_rdma_pipeline_frag_size.

Now, on a per message basis we only use the above protocol if the
message is larger than btl_*_min_rdma_pipeline_size

btl_*_eager_limit - > same
btl_*_max_send_size -> same
btl_*_rdma_pipeline_offset -> btl_*_min_rdma_size
btl_*_rdma_pipeline_frag_size -> btl_*_max_rdma_size


btl_*_min_rdma_pipeline_size is new..

This patch also moves all BTL common parameters initialisation into
btl_base_mca.c file.

This commit was SVN r14681.
2007-05-17 07:54:27 +00:00
George Bosilca
1cb26e3b9c Finally the convertor export a convenience function to allow a consistent
computation of the current location on the pack/unpack process. This can
be used both for retrieving the pointer to the first byte (in the special
case of the cached RDMA protocol) and for getting the current
position (for the pipelined protocol).

I modified all BTLs, but most of them are still untested.

This commit was SVN r14180.
2007-03-30 22:02:45 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Jeff Squyres
f820e44112 Remove a gcc-ism from the code (defining an anonymous union in the
middle of a struct).  Now we properly define and name the union
outside the struct and simply create an instance of it inside the
struct. 

This commit was SVN r13709.
2007-02-19 18:21:57 +00:00
George Bosilca
020b8ade70 A slightly better fix for the data mismatch compiler complaints.
This commit was SVN r13695.
2007-02-17 05:23:57 +00:00
George Bosilca
04138c23af No more warnings.
This commit was SVN r13683.
2007-02-16 16:25:58 +00:00
Brian Barrett
8900d3ae43 Second take at fixing the issues with using ompi_ptr_t. Add helper functions for converting from .pval to .lval and vice-versa. Users of ompi_ptr_t types should only use one of the fields in the union unless using the helper conversion functions. For the BTLs, local pointers will always be stored in the .pval field and remote pointers always stored in the .lval field.
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.

Refs trac:587

This commit was SVN r13023.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-07 01:48:57 +00:00
Brian Barrett
48ec0b2071 Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix
for now...

This commit was SVN r12997.

The following SVN revision numbers were found above:
  r12974 --> open-mpi/ompi@27cea44a9c
2007-01-04 22:07:37 +00:00
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
* Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
Brian Barrett
b448b4e47e More heterogeneous fixes. Don't set reachability bit on a remote proc
if the remote architecture differs from the local architecture and the
btl doesn't support heterogeneous transport.

Refs trac:587

This commit was SVN r12879.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2006-12-17 17:27:08 +00:00
Gleb Natapov
190e7a27cd Merge with gleb-mpool branch. All RDMA components use same mpool now (rdma).
udapl/openib/vapi/gm mpools a deprecated. rdma mpool has parameter that allows
to limit its size mpool_rdma_rcache_size_limit (default is 0 - unlimited).

This commit was SVN r12878.
2006-12-17 12:26:41 +00:00
George Bosilca
126a68dc9a Big datatype commit. Remove all unused features of the datatype engine. As the memory
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There is less arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affect all components using datatypes. I test most of them, but it might happens that I
miss some ... If it's the case please let me know (don't shoot the pianist!!).

This commit was SVN r12331.
2006-10-26 23:11:26 +00:00
George Bosilca
640178c4b3 Grepping through the source files I found these calls to the data-type engine
with the wrong type of arguments.

This commit was SVN r12148.
2006-10-17 21:05:04 +00:00
Galen Shipman
e5c594c211 More updates for the async error handler for btl's
In order to provide backwards compatability the framework versions are bumped
and the handler registeration function is at the end of the btl struct.
Testing done on sm, openib, and gm.. 

This commit was SVN r11256.
2006-08-17 22:02:01 +00:00
Brian Barrett
d367dc5d56 * Fix for bug #115 -- we need to decrement the use count on a pinned buffer
so that memory is actually deregistered.  Reviewed by Galen.

This commit was SVN r10349.
2006-06-14 13:38:24 +00:00
Galen Shipman
2667c52a5d Track fragments by list, not by size..
-- reviewed by Brian, needs to hit all the branches.. 

This commit was SVN r10078.
2006-05-25 18:07:26 +00:00
George Bosilca
f7a5a582c5 Diagnostic function for mvapi. It print all the credits used for the flow control.
This commit was SVN r9355.
2006-03-21 17:02:14 +00:00
Tim Woodall
bd870519fd - modified convertor copy_and_prepare routines to accept an addition
flag, new flags to be included when convertor is initialized
- modified pml/btl module defs and added stub functions for diagnostic
  output routines to dump state of queues / endpoints
- updates to data reliability pml

This commit was SVN r9329.
2006-03-17 18:46:48 +00:00
Tim Woodall
712468dbef add diagnostic interface
This commit was SVN r9328.
2006-03-17 17:39:41 +00:00
Galen Shipman
e58b758031 standardize behavior of btl_alloc, if the size is larger than the max send
size, btl_alloc returns NULL. 

This commit was SVN r9114.
2006-02-22 17:37:59 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
George Bosilca
0f1c6d79e8 Make the MVAPI BTL thread safe again. The problem was a double locking on the endpoint mutex.
It's still not very clean as we still lock the mvapi_btl mutex inside a critical section
protected by the endpoint mutex ...

This commit was SVN r8810.
2006-01-25 23:14:06 +00:00
Galen Shipman
ddc22d8c7e Use endpoint_lock, not ib_lock (copy paste error from openib btl)
This commit was SVN r8806.
2006-01-25 15:04:37 +00:00
Tim Woodall
51ec050647 port of revised flow control from openib
This commit was SVN r8799.
2006-01-24 23:44:30 +00:00
Tim Woodall
63d0438991 merge in changes from release branch
This commit was SVN r8637.
2006-01-04 16:34:45 +00:00
Tim Woodall
4ff5316b2d correct copy/paste error
This commit was SVN r8601.
2005-12-22 18:07:46 +00:00
Tim Woodall
5d91c492d6 improve diagnostic error messages
This commit was SVN r8600.
2005-12-22 18:01:47 +00:00
George Bosilca
e5158142b9 The lb should be extracted from the datatype not from the convertor.
This commit was SVN r8446.
2005-12-10 23:27:20 +00:00
Tim Woodall
1929a97d2f corrections for MPI_BOTTOM
This commit was SVN r8429.
2005-12-09 23:27:55 +00:00
Brian Barrett
38391e3406 disable shared receive queue support at compile time if the mvapi implementation
does not support shared receive queues (such as the one shipped by SilverStorm / 
Infinicon for OS X).  Reviewed by Galen.

This commit was SVN r8389.
2005-12-06 15:46:30 +00:00
Galen Shipman
55a9fbefd8 fix misc compiler warnings..
This commit was SVN r8263.
2005-11-27 22:53:30 +00:00
George Bosilca
7c73095440 Update the correct sended size.
This commit was SVN r8237.
2005-11-22 21:51:04 +00:00
Tim Woodall
2013104d1a SRQ cleanup
This commit was SVN r8104.
2005-11-10 20:51:56 +00:00
Tim Woodall
b5ed723ea4 - check for null return
- disable debug

This commit was SVN r8070.
2005-11-10 00:02:18 +00:00
Galen Shipman
3079fc2da1 use correct lock for threaded build..
This commit was SVN r8055.
2005-11-09 16:09:05 +00:00
Tim Woodall
b4ca28da4b removed debug
This commit was SVN r8046.
2005-11-08 21:41:02 +00:00
Tim Woodall
2d9c509add flow control
This commit was SVN r8039.
2005-11-08 16:50:07 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Galen Shipman
4a15761732 add support for srq limit reached async event, even though it doesn't appear
to  be supported by mellanox vapi.. perhaps this will be supported in the near 
future, for now it doesn't hurt to have it in the trunk


Also cleanup the receive descriptor posting macro's.. 

This commit was SVN r7903.
2005-10-27 22:47:19 +00:00
Galen Shipman
4d2d39b0a6 intial checking of SRQ flow control support for mvapi
This commit was SVN r7796.
2005-10-18 14:55:11 +00:00
Brian Barrett
7b20370306 * pretty-print an error message if a btl component loads but can't find
any NICs to use
* Make mvapi, gm, and mx components all publish information, even if there
  are no NICs available so that modex_recv doesn't hang.  If there are no
  NICs available, don't set the reachable bit, but don't do anything
  to fail.  This unfortunately doesn't cover the hangs that will result if
  different procs load different sets of components, but it's a start

This commit was SVN r7550.
2005-09-30 04:39:44 +00:00
Galen Shipman
9fe5844071 decrement ref count on removal of registration from mru and tree.
add misc asserts to check for proper reference counting. 

ugly hack 1 -- use mallopt to never release memory ala sbrk - this is
commented out in mca_btl_mvapi_component_init

ugly hack 2 -- test registrations comming out of the tree via rcache_find, for
an unknown reason the tree is returning registrations where the address is not
within the base or bound of the registration. If this happens, we return
NULL. 

comment out code to enable mem hooks if leave_pinned is set, note we can do
this via an mca param and will default it to leave_pinned with mem_hooks when
we iron out these issues. 

I am adding a unit test for the rcache. Note that we have a unit test for the
rb tree but the compare function is significantly different than that used for
registrations. After we have tracked down the issues with rcache_rb we will
remove the above hacks. 

This commit was SVN r7499.
2005-09-24 00:24:49 +00:00
Galen Shipman
96ab5a6bd3 we can be in WAITING_ACK state without a race if the OOB ack is "slower" than
the scheduling of queued IB send operations. 

This commit was SVN r7452.
2005-09-21 16:47:08 +00:00
Tim Woodall
0ee34051f8 debug asserts
This commit was SVN r7449.
2005-09-21 15:30:17 +00:00
Galen Shipman
f0b1ea52bc if all else fails in prepare_src,, pack
init the rdma_pending list in ob1

This commit was SVN r7366.
2005-09-14 04:41:33 +00:00
Galen Shipman
d932cfd342 merge of rcache work into the trunk.. lotsa fun ;-)..
I regression tested before the merge, I will regression test tonight and
correct issues that might have crept in. 

This commit was SVN r7329.
2005-09-12 22:28:23 +00:00
Galen Shipman
e5ea1b55ef fix for threaded build
This commit was SVN r7194.
2005-09-06 15:21:31 +00:00
Tim Woodall
dfe52fceef minor changes to thread locking
This commit was SVN r7154.
2005-09-02 16:27:01 +00:00
Galen Shipman
589b1b8b5a Additional changes to add_proc and tokens
This commit was SVN r7152.
2005-09-02 15:18:36 +00:00
Galen Shipman
a7a4da4502 Scale the SRQ based on the log base 2 of the number of peers,
this assumes that the peers have all been added via add_procs up front. 
Bad things will happen if add_procs is called again later on a new set of
 procs to fix this we need to modify the srq which may wreck things.. looking
 into this deeper.. 

This commit was SVN r7142.
2005-09-02 04:06:51 +00:00