openmpi

Author	SHA1	Message	Date
Tim Prins	889f6c79fe	Properly initialize the freelist in the constructor. This commit was SVN r17175.	2008-01-22 18:17:06 +00:00
George Bosilca	841ae01bc9	Add OPAL_LIKELY to the freelist. This commit was SVN r17142.	2008-01-15 05:44:28 +00:00
George Bosilca	bd72057364	Cleanup the comments. This commit was SVN r17094.	2008-01-09 20:34:37 +00:00
George Bosilca	906e8bf1d1	Replace the ompi_pointer_array with opal_pointer_array. In the next step (sometime after the merge with the ORTE branch), the opal_pointer_array will become the only pointer_array implementation (the orte_pointer_array will be removed). This commit was SVN r17007.	2007-12-21 06:02:00 +00:00
Rich Graham	e4646a4dd5	going through the ompi_free_list_init_ex, fl_payload_buffer_size and fl_payload_buffer_alignment were not being set. This commit was SVN r16641.	2007-11-02 17:51:32 +00:00
Rich Graham	27a748e7eb	change all instances of ompi_free_list_init to ompi_free_list_init_new. Header and payload data are specified separately at this stage. This commit was SVN r16633.	2007-11-01 23:38:50 +00:00
Rich Graham	aa82acd34c	continuing the incremental changes. fl_elem_class renamed fl_frag_class, and ompi_free_list_init_new() and ompi_free_list_init_ex_new() were added. Next step will be to start converting from ompi_free_list_init() to ompi_free_list_init_new(), and then remove ompi_free_list_init(), and rename ompi_free_list_init_new() back to ompi_free_list_init(). The merge of the branch with the trunk was so substantial, it is far easier to re-implement the changes in the trunk, rather than trying to fix the bugs the merge brought in ... This commit was SVN r16630.	2007-11-01 17:25:12 +00:00
Rich Graham	52fb318950	starting to put in the changes for ompi_free_list_t. fl_elem_size is renamed to fl_frag_size, fl_alignment is renamed to fl_frag_alignment, and fl_payload_buffer_size and fl_payload_buffer_alignment are added. This commit was SVN r16629.	2007-11-01 16:47:44 +00:00
George Bosilca	e724ca0a1f	Remodel the ompi_free_list a little. The free_list_memory is in fact a free_list_item so instead of having a struct, use typedef to make them equivalent. Modify the parallel debuggers support in order to allow them access to the internal types even when we have an optimized build. This commit was SVN r16567.	2007-10-25 16:47:54 +00:00
Shiqing Fan	a0660f4deb	- Just some type casts. This commit was SVN r16100.	2007-09-12 15:29:58 +00:00
Jeff Squyres	8ace07efed	This commit brings in two major things: 1. Galen's fine-grain control of queue pair resources in the openib BTL. 2. Pasha's new implementation of asynchronous HCA event handling. Pasha's new implementation doesn't take much explanation, but the new "multifrag" stuff does. Note that "svn merge" was not used to bring this new code from the /tmp/ib_multifrag branch -- something Bad happened in the periodic trunk pulls on that branch making an actual merge back to the trunk effectively impossible (i.e., lots and lots of arbitrary conflicts and artificial changes). :-( == Fine-grain control of queue pair resources == Galen's fine-grain control of queue pair resources to the OpenIB BTL (thanks to Gleb for fixing broken code and providing additional functionality, Pasha for finding broken code, and Jeff for doing all the svn work and regression testing). Prior to this commit, the OpenIB BTL created two queue pairs: one for eager size fragments and one for max send size fragments. When the use of the shared receive queue (SRQ) was specified (via "-mca btl_openib_use_srq 1"), these QPs would use a shared receive queue for receive buffers instead of the default per-peer (PP) receive queues and buffers. One consequence of this design is that receive buffer utilization (the size of the data received as a percentage of the receive buffer used for the data) was quite poor for a number of applications. The new design allows multiple QPs to be specified at runtime. Each QP can be set up to use PP or SRQ receive buffers as well as giving fine-grained control over receive buffer size, number of receive buffers to post, when to replenish the receive queue (low water mark) and for SRQ QPs, the number of outstanding sends can also be specified. 
The following is an example of the syntax to describe QPs to the OpenIB BTL using the new MCA parameter btl_openib_receive_queues: {{{ -mca btl_openib_receive_queues \ "P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32" }}} Each QP description is delimited by ";" (semicolon) with individual fields of the QP description delimited by "," (comma). The above example therefore describes 4 QPs. The first QP is: P,128,16,4 Meaning: per-peer receive buffer QPs are indicated by a starting field of "P"; the first QP (shown above) is therefore a per-peer based QP. The second field indicates the size of the receive buffer in bytes (128 bytes). The third field indicates the number of receive buffers to allocate to the QP (16). The fourth field indicates the low watermark for receive buffers at which time the BTL will repost receive buffers to the QP (4). The second QP is: S,1024,256,128,32 Shared receive queue based QPs are indicated by a starting field of "S"; the second QP (shown above) is therefore a shared receive queue based QP. The second, third and fourth fields are the same as in the per-peer based QP. The fifth field is the number of outstanding sends that are allowed at a given time on the QP (32). This provides a "good enough" mechanism of flow control for some regular communication patterns. QPs MUST be specified in ascending receive buffer size order. This requirement may be removed prior to 1.3 release. This commit was SVN r15474.	2007-07-18 01:15:59 +00:00
George Bosilca	6e8d25fdaf	rearrange the code a bit. This commit was SVN r15297.	2007-07-05 22:47:31 +00:00
Rainer Keller	6f9251ed39	- Small fixes by PGI -Minform=inform This commit was SVN r14524.	2007-04-26 08:16:07 +00:00
Gleb Natapov	e5450613b5	Add new SM BTL parameter btl_sm_cb_max_num. If set to a value greater than zero, it limits the number of circular buffers allocated between each pair of peers. This allows for tighter memory usage control. This commit was SVN r14120.	2007-03-22 12:21:42 +00:00
Brian Barrett	f6be04ff37	be a bit more careful with parens than the r13992 fix This commit was SVN r13996. The following SVN revision numbers were found above: r13992 --> open-mpi/ompi@3cbac958eb	2007-03-09 16:39:23 +00:00
Brian Barrett	3cbac958eb	fix warning about types This commit was SVN r13992.	2007-03-09 02:32:22 +00:00
Jeff Squyres	b94a39236b	Submitted by Gleb, reviewed by Rich: Queue_empty is determined by the reader, and is its local view. However, the writer may continue writing to this queue. The decision to go on to the next cb_fifo is done in an atomic region, checking the writer's view. The writer also "changes its view" in an atomic region protected by the same lock. This commit was SVN r13968.	2007-03-08 16:51:59 +00:00
Gleb Natapov	be018944d2	Clean up circular buffer implementation. Get rid of _same_base_address() functions by pre-calculating everything in advance. This commit was SVN r13923.	2007-03-05 14:27:26 +00:00
Gleb Natapov	90fb58de4f	When frags are allocated from mpool by free_list, the frag structure is also allocated from mpool memory (which is registered memory for RDMA transports). This is not a problem for small jobs, but for a large number of ranks the amount of wasted memory is large. This commit was SVN r13921.	2007-03-05 14:17:50 +00:00
Rainer Keller	0889ebd59f	- Eliminate warnings, that PGI-6.2.5 issues with -Minform=inform This commit was SVN r13840.	2007-02-28 08:36:34 +00:00
Pavel Shamis	edeab0e912	Adding Mellanox Technologies copyright to files touched by Mellanox. This commit was SVN r13669.	2007-02-15 18:03:20 +00:00
Gleb Natapov	4d4b0a022a	Add error callback to sm BTL. Call it when allocation of the initial circular buffer fails. If cb is already allocated, but it is full and allocation of additional cb fails, we spin waiting for receiver to free space in existing cb. This commit was SVN r13635.	2007-02-13 12:01:36 +00:00
Jeff Squyres	33619d6b43	Minor fixes for the ompi_bitmap class that were found while investigating #817: * Remove use of legal_numbits member and always just use the full size of the array. There was a corner case where legal_numbits was not an even multiple of the number of bits in the array where bits would not get freed properly, usually causing wasted fortran MPI handles, or, as in the case of #817, wasted attribute keyvals (i.e., the user freed them, but the bitmap didn't reflect the free). * Re-order some error checks to ensure that we don't segv (we don't currently trigger this problem anywhere; I just noticed it while doing the other attribute keyval and legal_numbits work). Since this change affects all Fortran MPI handles, I ran all the intel and ibm tests and all still pass with this change. This commit was SVN r13561.	2007-02-08 18:20:36 +00:00
Gleb Natapov	4e5deec496	Fix previous patch. In case of different sm base use pointer to tail after recalculating it. This commit was SVN r13557.	2007-02-08 14:59:18 +00:00
Gleb Natapov	c56497cf46	Fix race condition, that happens in circular buffer if consumer drained full cb before producer set cb_overflow flag. This commit was SVN r13552.	2007-02-08 07:16:14 +00:00
George Bosilca	575075ea77	typo. This commit was SVN r13510.	2007-02-06 16:53:44 +00:00
Jeff Squyres	c91fcd7fbd	Fix a bunch of minor typos submitted by Bernhard Fischer. This commit was SVN r13505.	2007-02-06 12:00:30 +00:00
Rainer Keller	235f87fd14	- Small cleanups, getting rid of variables, using index early, etc. This commit was SVN r13222.	2007-01-19 23:28:04 +00:00
Rainer Keller	125ba1acfa	- Reduce the amount of warnings with -Wshadow -- mainly due to usage of index and abs in inline-fcts in header files. This commit was SVN r13217.	2007-01-19 19:48:06 +00:00
Gleb Natapov	190e7a27cd	Merge with gleb-mpool branch. All RDMA components use the same mpool now (rdma). udapl/openib/vapi/gm mpools are deprecated. rdma mpool has a parameter that allows limiting its size: mpool_rdma_rcache_size_limit (default is 0 - unlimited). This commit was SVN r12878.	2006-12-17 12:26:41 +00:00
George Bosilca	b51b87a4aa	The correct way to compute the difference between the actual size and the expected size, based on the comment few lines before. This commit was SVN r12235.	2006-10-20 19:33:55 +00:00
George Bosilca	d7268557a8	Complete the SM BTL changes. Now all displacements are ptrdiff_t and there are no warnings about any issue with signed/unsigned. This commit was SVN r12234.	2006-10-20 19:28:12 +00:00
George Bosilca	06563b5dec	Last set of explicit conversions. We are now close to the zero warnings on all platforms. The only exceptions (and I will not deal with them anytime soon) are on Windows: - the write functions which require the length to be an int when it's a size_t on all UNIX variants. - all iovec manipulation functions where the iov_len is again an int when it's a size_t on most of the UNIXes. As these only happen on Windows, I think we're set for now :) This commit was SVN r12215.	2006-10-20 03:57:44 +00:00
George Bosilca	e81d38f322	Remove a function that was just a proof of concept. The same approach is not used by the TotalView support. This commit was SVN r12214.	2006-10-20 03:34:16 +00:00
George Bosilca	33f300f636	I don't know what it was supposed to do but I'm quite sure it didn't do it correctly. Just follow inc_num and you will understand. Now _resize will grow the list to match the required number of elements as described in the comment in the .h file. This commit was SVN r12074.	2006-10-10 14:47:51 +00:00
Pavel Shamis	e400da01bc	Fix of typo error introduced in revision 12053. Reviewed by: Jeff This commit was SVN r12071.	2006-10-10 11:58:29 +00:00
Brian Barrett	51b2a0fd3f	A couple of changes to improve shared memory behavior when resources get constrained: * Make sure we always have a number of eager fragments available that scales with the number of processes communicating with a given proc over shared memory * Use FREE_LIST_GET instead of FREE_LIST_WAIT to return an error to the PML when resource exhaustion occurs * Don't dereference the frag during alloc unless we're sure it's not NULL Reviewed by: Galen Refs trac:413 This commit was SVN r12053. The following Trac tickets were found above: Ticket 413 --> https://svn.open-mpi.org/trac/ompi/ticket/413	2006-10-06 21:13:49 +00:00
George Bosilca	2411ad74e4	Fix the bug #315 . If there are multiple threads waiting for a free_list item then use broadcast in order to wake them up. If there is only one then use signal (which is supposed to be faster) and of course if there are no threads waiting then just continue. This commit was SVN r12049.	2006-10-06 16:17:50 +00:00
George Bosilca	2029284820	Typo. This commit was SVN r11695.	2006-09-18 17:57:55 +00:00
George Bosilca	3f0a7cad9e	The last patch for Windows support. Mostly casting and conversion to C++ friendly headers. This commit was SVN r11400.	2006-08-24 16:38:08 +00:00
David Daniel	59f2d86c36	* Move Gleb's rcache work from the gleb-rcache branch to the trunk This commit was SVN r11198.	2006-08-15 18:40:08 +00:00
Gleb Natapov	383694c68d	Add support to get alignment buffers from free_list_t. Convert openib BTL to new interface. This commit was SVN r10899.	2006-07-20 14:39:05 +00:00
George Bosilca	b2a9d15db6	Broadcast the condition (not signal it) as we add multiple elements to the free list. This commit was SVN r10850.	2006-07-17 17:07:20 +00:00
George Bosilca	6b7467ea4d	NULL is not an option ... This commit was SVN r10779.	2006-07-13 07:38:35 +00:00
George Bosilca	7602066c4d	The next and prev items cannot be NULL. The limit is the sentinel item. This commit was SVN r10778.	2006-07-13 07:32:13 +00:00
George Bosilca	a43eb4b43e	It's not about how much memory we use, but about how we use it. Keeping the cache misses as low as possible is always a good approach. The opal_list_t is widely used, it should be a highly optimized class. The same functionality can be reached with one sentinel instead of the 2 currently used. I don't have anything against the STL version, but so far nothing can compare with the Knuth algorithm. I replace the current implementation with a modified version of the Knuth algorithm (the one described in The Art of Computer Programming). As expected, the latency went down. This commit was SVN r10776.	2006-07-13 04:56:15 +00:00
Gleb Natapov	012d95d195	If ompi_free_list_grow fails wait until resources are available instead of spinning without progress. This commit was SVN r10520.	2006-06-27 09:23:51 +00:00
Galen Shipman	8855e5b73a	Fixes for DR as well as better diagnostic.. Successfully passing the intel test suite with/without induced errors/drops. This commit was SVN r10518.	2006-06-26 22:29:29 +00:00
George Bosilca	9eb023a5c2	OK my last commit was ... kind of wrong. It only worked if the element_size was smaller than the CACHE_LINE_SIZE. Here is the version that works. In fact this works on 2 steps. First we set the element size to something multiple of the desired alignment. Then when we allocate memory, we compute the total size, and we will align each of the elements (we allocate multiple of them every time) to the CACHE_LINE_SIZE. This commit was SVN r10479.	2006-06-22 14:47:07 +00:00
George Bosilca	c71f6c9765	All elements will be aligned to the CACHE_LINE_SIZE define (currently 128 bytes). The simplest way to make sure they are aligned is to update the size of the basic element to a multiple of the desired alignment. It will use a little bit more memory, but the improvements on the SM BTL seems quite interesting. This commit was SVN r10478.	2006-06-22 14:07:14 +00:00
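
The btl_openib_receive_queues syntax described in r15474 above can be illustrated with a small parser. This is a minimal sketch in Python, not the OpenIB BTL's actual parsing code: the names `QueuePairSpec` and `parse_receive_queues` are hypothetical, and the sketch assumes exactly four fields for a "P" entry and five for an "S" entry, as stated in that commit message.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuePairSpec:
    """One QP description (hypothetical type, for illustration only)."""
    kind: str                                # "P" = per-peer, "S" = shared receive queue
    buffer_size: int                         # receive buffer size in bytes
    num_buffers: int                         # number of receive buffers to allocate
    low_watermark: int                       # repost receive buffers at this level
    max_pending_sends: Optional[int] = None  # "S" entries only


def parse_receive_queues(value: str) -> List[QueuePairSpec]:
    """Parse a string such as "P,128,16,4;S,1024,256,128,32"."""
    expected_fields = {"P": 4, "S": 5}  # field counts taken from the r15474 text
    specs = []
    for chunk in value.split(";"):       # QP descriptions are ';'-delimited
        fields = chunk.strip().split(",")  # fields are ','-delimited
        kind = fields[0]
        if kind not in expected_fields:
            raise ValueError(f"unknown QP type {kind!r} in {chunk!r}")
        if len(fields) != expected_fields[kind]:
            raise ValueError(f"wrong number of fields in {chunk!r}")
        specs.append(QueuePairSpec(kind, *[int(f) for f in fields[1:]]))

    # The commit message requires ascending receive buffer size order.
    sizes = [spec.buffer_size for spec in specs]
    if sizes != sorted(sizes):
        raise ValueError("QPs must be in ascending receive buffer size order")
    return specs
```

Calling `parse_receive_queues` on the example string from the commit message yields four entries, the first a per-peer QP with 128-byte buffers and the remaining three SRQ-based QPs that each allow 32 outstanding sends.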

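The two-step alignment described in r10478 and r10479 above (round the element size up to a multiple of the alignment, then align the start of each allocated chunk) comes down to simple integer arithmetic. The sketch below shows only that arithmetic; `round_up` and `element_addresses` are hypothetical helper names, not functions from the ompi_free_list code, and the value 128 is the CACHE_LINE_SIZE quoted in r10478.

```python
CACHE_LINE_SIZE = 128  # value quoted in the r10478 commit message


def round_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def element_addresses(base_address: int, element_size: int, count: int,
                      alignment: int = CACHE_LINE_SIZE) -> list:
    """Addresses of `count` elements carved from a chunk at base_address.

    Step 1: pad the element size to a multiple of the alignment.
    Step 2: align the first element; every later one stays aligned
            because the stride is itself a multiple of the alignment.
    """
    stride = round_up(element_size, alignment)
    first = round_up(base_address, alignment)
    return [first + i * stride for i in range(count)]
```

For instance, a 200-byte element is padded to a 256-byte stride, so three elements carved from a chunk starting at address 1000 land at 1024, 1280 and 1536, all multiples of 128. The padding costs some memory per element, which is the trade-off the r10478 message mentions.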

96 commits