openmpi

Author	SHA1	Message	Date
Jeff Squyres	d944d5ec52	Just in case something goes drastically wrong, don't segv. This commit was SVN r18049.	2008-03-31 21:55:07 +00:00
George Bosilca	efa89bfa3f	Revert r17857. The context should be set in one case ... when we call prepare_{src|dst} without calling a get or put. So, just keep it here until a better solution is found. This commit was SVN r17872. The following SVN revision numbers were found above: r17857 --> open-mpi/ompi@d460ccfbf9	2008-03-18 19:01:27 +00:00
George Bosilca	d460ccfbf9	No need to check for NULL there. The bml_btl is set correctly on the upper level. This commit was SVN r17857.	2008-03-18 03:02:31 +00:00
George Bosilca	39353ebb44	Cleanup. This commit was SVN r17855.	2008-03-18 02:56:50 +00:00
George Bosilca	76deec135e	The .h file is not used anymore (it contained the descriptor cache). Update the Makefile.am file as well. This commit was SVN r17854.	2008-03-18 02:50:24 +00:00
Jeff Squyres	61290c0e51	Remove a useless file. This commit was SVN r17852.	2008-03-18 01:50:47 +00:00
Gleb Natapov	3a9652ffc4	Endpoint array may not exist if in add_proc() we failed to find suitable btl for communication with a proc. Don't segfault in this case. This commit was SVN r17804.	2008-03-11 08:13:37 +00:00
Tim Prins	84b2099fe8	Remove the now-unused orte_value_array. As this is the last 'class' split between orte and ompi, remove the big comment about the split in ompi_bitmap. Also, update some properties (source files should not be executable...), and remove a couple of unneeded inclusions of orte_proc_table.h. This commit was SVN r17655.	2008-02-28 21:39:42 +00:00
Ralph Castain	d70e2e8c2b	Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer. This commit was SVN r17632.	2008-02-28 01:57:57 +00:00
George Bosilca	6310ce955c	The first patch related to the Active Message stuff. So far, here is what we have: - the registration array is now global instead of one per BTL. - each framework has to declare the entries it reserves in the registration array. Then it has to define the internal way of sharing (or not) these entries between all components. As an example, the PML will not share as there is only one active PML at any moment, while the BTLs will have to. The tag is 8 bits long, the first 3 are reserved for the framework while the remaining 5 are used internally by each framework. - The registration function is optional. If a BTL does not provide such a function, nothing happens. However, in the case where such a function is provided in the BTL structure, it will be called by the BML when a tag is registered. Now, it's time for the second step... Converting OB1 from a switch based PML to an active message one. This commit was SVN r17140.	2008-01-15 05:32:53 +00:00
Jon Mason	a0d4122606	The new cpc selection framework is now in place. The patch below allows for dynamic selection of cpc methods based on what is available. It also allows for inclusion/exclusion of methods. It even further allows for modifying the priorities of certain cpc methods to better determine the optimal cpc method. This patch also contains XRC compile time disablement (per Jeff's patch). At a high level, the cpc selection works by walking through each cpc and allowing it to test to see if it is permissible to run on this mpirun. It returns a priority if it is permissible or a -1 if not. All of the cpc names and priorities are rolled into a string. This string is then encapsulated in a message and passed around all the ompi processes. Once received and unpacked, the list received is compared to a local copy of the list. The connection method is chosen by comparing the lists passed around to all nodes via modex with the list generated locally. Any non-negative number is a potentially valid connection method. The method below of determining the optimal connection method is to take the cross-section of the two lists. The highest single value (and the other side being non-negative) is selected as the cpc method. svn merge -r 16948:17128 https://svn.open-mpi.org/svn/ompi/tmp-public/openib-cpc/ . This commit was SVN r17138.	2008-01-14 23:22:03 +00:00
Gleb Natapov	8b511b969d	Introduce a new BTL parameter btl_rndv_eager_limit which determines size of a first fragment of rendezvous protocol. Remove no longer used btl_min_send_size parameter. This commit was SVN r16969.	2007-12-16 08:35:17 +00:00
Jeff Squyres	213b5d5c6e	Per long threads on the mailing list and much confusion and discussion about linkers, have all OPAL, ORTE, and OMPI components '''not''' link against the OPAL, ORTE, or OMPI libraries. See ttp://www.open-mpi.org/community/lists/users/2007/10/4220.php for details (or https://svn.open-mpi.org/trac/ompi/wiki/Linkers for a better-formatted version of the same info). This commit was SVN r16968.	2007-12-15 13:32:02 +00:00
Gleb Natapov	666b282e7e	Add mca_bml_base_send_status function. It returns ORTE_ERR_RESOURCE_BUSY if packet was queued inside BTL. BTL should return this error if packet was queued internally. This commit was SVN r16904.	2007-12-09 14:12:38 +00:00
Gleb Natapov	e2e211f23b	Add flags parameter to btl_alloc() and btl_prepare_src() functions. If BTL knows at the time of allocation priority of a descriptor it may do some optimizations. This commit was SVN r16901.	2007-12-09 14:08:01 +00:00
Gleb Natapov	7364b7cf47	Add endpoint parameter to btl_alloc() function. Enables various optimizations inside BTL. This commit was SVN r16898.	2007-12-09 14:00:42 +00:00
Gleb Natapov	2d784752dd	Remove descriptor caching from BML. With descriptor caching some optimizations are impossible. This commit was SVN r16897.	2007-12-09 13:58:17 +00:00
Brian Barrett	59b22533f2	Enable RDMA for heterogeneous situations. Currently done by overloading the ompi_convertor_need_buffers function to only return 0 if the convertor is homogeneous (which it never does on the trunk, but does on v1.2, but that's a different issue). Only enable the heterogeneous rdma code for a btl if it supports it (via a flag), as some btls need some work for this to work properly. Currently only TCP and OpenIB are extensively tested. This commit was SVN r15990.	2007-08-28 21:23:44 +00:00
Sven Stork	fd778a5539	- Put the label in the right place. This commit was SVN r15699.	2007-07-31 09:34:41 +00:00
Sven Stork	a13d2dcb96	- Fix possible memory leak found by Coverity. This commit was SVN r15698.	2007-07-31 09:32:49 +00:00
Galen Shipman	514811c50b	Clean up btl.h comments; document the btl interface a bit better. This commit was SVN r15618.	2007-07-25 17:26:23 +00:00
Jeff Squyres	3bc940ac27	Fix three things from r15474 (thanks to Brian for noticing): * bml.h had a change that introduced a variable named "_order" to avoid a conflict with a local variable. The namespace starting with _ belongs to the os/compiler/kernel/not us. So we can't start symbols with _. So I replaced it with arg_order, and also updated the threaded equivalent of the macro that was modified. * in btl_openib_proc.c, one opal_output accidentally had its string reverted from "ompi_modex_recv..." to "mca_pml_base_modex_recv....". This was fixed. * The change to ompi/runtime/ompi_preconnect.c was entirely reverted; it was an artifact of debugging. This commit was SVN r15475. The following SVN revision numbers were found above: r15474 --> open-mpi/ompi@8ace07efed	2007-07-18 11:38:06 +00:00
Jeff Squyres	8ace07efed	This commit brings in two major things: 1. Galen's fine-grain control of queue pair resources in the openib BTL. 2. Pasha's new implementation of asynchronous HCA event handling. Pasha's new implementation doesn't take much explanation, but the new "multifrag" stuff does. Note that "svn merge" was not used to bring this new code from the /tmp/ib_multifrag branch -- something Bad happened in the periodic trunk pulls on that branch making an actual merge back to the trunk effectively impossible (i.e., lots and lots of arbitrary conflicts and artificial changes). :-( == Fine-grain control of queue pair resources == Galen's fine-grain control of queue pair resources to the OpenIB BTL (thanks to Gleb for fixing broken code and providing additional functionality, Pasha for finding broken code, and Jeff for doing all the svn work and regression testing). Prior to this commit, the OpenIB BTL created two queue pairs: one for eager size fragments and one for max send size fragments. When the use of the shared receive queue (SRQ) was specified (via "-mca btl_openib_use_srq 1"), these QPs would use a shared receive queue for receive buffers instead of the default per-peer (PP) receive queues and buffers. One consequence of this design is that receive buffer utilization (the size of the data received as a percentage of the receive buffer used for the data) was quite poor for a number of applications. The new design allows multiple QPs to be specified at runtime. Each QP can be set up to use PP or SRQ receive buffers as well as giving fine-grained control over receive buffer size, number of receive buffers to post, when to replenish the receive queue (low water mark) and for SRQ QPs, the number of outstanding sends can also be specified. The following is an example of the syntax to describe QPs to the OpenIB BTL using the new MCA parameter btl_openib_receive_queues: {{{ -mca btl_openib_receive_queues \ "P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32" }}} Each QP description is delimited by ";" (semicolon) with individual fields of the QP description delimited by "," (comma). The above example therefore describes 4 QPs. The first QP is: P,128,16,4 Meaning: per-peer receive buffer QPs are indicated by a starting field of "P"; the first QP (shown above) is therefore a per-peer based QP. The second field indicates the size of the receive buffer in bytes (128 bytes). The third field indicates the number of receive buffers to allocate to the QP (16). The fourth field indicates the low watermark for receive buffers at which time the BTL will repost receive buffers to the QP (4). The second QP is: S,1024,256,128,32 Shared receive queue based QPs are indicated by a starting field of "S"; the second QP (shown above) is therefore a shared receive queue based QP. The second, third and fourth fields are the same as in the per-peer based QP. The fifth field is the number of outstanding sends that are allowed at a given time on the QP (32). This provides a "good enough" mechanism of flow control for some regular communication patterns. QPs MUST be specified in ascending receive buffer size order. This requirement may be removed prior to 1.3 release. This commit was SVN r15474.	2007-07-18 01:15:59 +00:00
George Bosilca	752909c628	These are supposed to have a high probability of success. This commit was SVN r15377.	2007-07-11 23:02:47 +00:00
George Bosilca	8643f38adf	Don't allow the BTL to be closed before the end of the process. Count the number of times the BTLs are opened, and then don't remove them until close was called the same number of times. This commit was SVN r15376.	2007-07-11 22:21:04 +00:00
Brian Barrett	8b9e8054fd	Move modex from pml base to general ompi runtime, since it's used by more than just the PML/BTLs these days. Also clean up the code so that it handles the situation where not all nodes register information for a given node (rather than just spinning until that node sends information, like we do today). Includes r15234 and r15265 from the /tmp/bwb-modex branch. This commit was SVN r15310. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r15234 r15265	2007-07-09 17:16:34 +00:00
Gleb Natapov	54b40aef91	Schedule SEND traffic of pipeline protocol between BTLs in accordance with relative bandwidths of each BTL. Precalculate what part of a message should be sent via each BTL in advance instead of doing it during scheduling. This commit was SVN r15248.	2007-07-01 11:34:23 +00:00
Gleb Natapov	b88b7dedfe	Rename btl_rdma_offset to btl_pipeline_send_length. This commit was SVN r15153.	2007-06-21 07:12:40 +00:00
Josh Hursey	7fd1805e97	Fix a couple of compile warnings that Tim P brought to my attention. This commit was SVN r15132.	2007-06-19 00:46:16 +00:00
Galen Shipman	3401bd2b07	Add optional ordering to the BTL interface. This is required to tighten up the BTL semantics. Ordering is not guaranteed, but, if the BTL returns an order tag in a descriptor (other than MCA_BTL_NO_ORDER) then we may request another descriptor that will obey ordering w.r.t. the other descriptor. This will allow sane behavior for RDMA networks, where local completion of an RDMA operation on the active side does not imply remote completion on the passive side. If we send a FIN message after local completion and the FIN is not ordered w.r.t. the RDMA operation then badness may occur as the passive side may now try to deregister the memory and the RDMA operation may still be pending on the passive side. Note that this has no impact on networks that don't suffer from this limitation as the ORDER tag can simply always be specified as MCA_BTL_NO_ORDER. This commit was SVN r14768.	2007-05-24 19:51:26 +00:00
Gleb Natapov	be71b78f6a	Initialize btl_send_limit before use. This commit was SVN r14745.	2007-05-24 08:40:26 +00:00
Gleb Natapov	3ebaff8dfe	Implement new BTL parameters: We eagerly send data up to btl_*_eager_limit with the match. Upon ACK of the MATCH we start using send/receives of size btl_*_max_send_size up to the btl_*_rdma_pipeline_offset. After the btl_*_rdma_pipeline_offset we begin using RDMA writes of size btl_*_rdma_pipeline_frag_size. Now, on a per message basis we only use the above protocol if the message is larger than btl_*_min_rdma_pipeline_size. Parameter mapping: btl_*_eager_limit -> same; btl_*_max_send_size -> same; btl_*_rdma_pipeline_offset -> btl_*_min_rdma_size; btl_*_rdma_pipeline_frag_size -> btl_*_max_rdma_size; btl_*_min_rdma_pipeline_size is new. This patch also moves all BTL common parameters initialisation into btl_base_mca.c file. This commit was SVN r14681.	2007-05-17 07:54:27 +00:00
Josh Hursey	4c453caab6	Make the check a bit better. This commit was SVN r14542.	2007-04-27 17:38:36 +00:00
Rich Graham	ce35761683	Make sure not to go out of bounds. Element i+1 of bml_btls is referenced, which for i = arr_size-1 is beyond the array dimensions. This commit was SVN r14464.	2007-04-22 21:43:34 +00:00
Josh Hursey	12e5d0e817	ft_event Commit: - Move the PML Modex stuff out of the BML -- Abstraction violation. - Also fix the location of the add_procs with respect to the stage gates. This commit was SVN r14422.	2007-04-19 03:05:12 +00:00
Josh Hursey	d12ddcdb7a	Protect the free since if we never send any messages this could be NULL. This commit was SVN r14421.	2007-04-19 02:17:50 +00:00
Josh Hursey	8f119d9063	Closes trac:977. Fix for memory corruption in the restarted process stack. This stemmed from the brute force method we were previously using. This commit fixes this by using a lighter weight solution focused in the r2 BML instead of above the PML. This is a more efficient and flexible solution, and it solves the original problem. In the process I pulled out the ft_event function in the tcp BTL and r2 BML into a set of *_ft.[c|h] files just to keep any updates to these code paths as isolated as possible to make merging easier on everyone. This commit was SVN r14371. The following SVN revision numbers were found above: r2 --> open-mpi/ompi@58fdc18855 The following Trac tickets were found above: Ticket 977 --> https://svn.open-mpi.org/trac/ompi/ticket/977	2007-04-14 02:06:05 +00:00
Jeff Squyres	51f286d737	Just like r14289 on the ORTE trunk: Per discussions with Brian and Ralph, make a slight correction in where components are installed. Use $pkglibdir, not $libdir/openmpi, so that when compiled in the orte trunk, components are installed to the right directory (because the component search path is checking $pkglibdir). This commit was SVN r14345. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r14289	2007-04-12 11:19:42 +00:00
George Bosilca	88365518aa	Small cleanup. This commit was SVN r14319.	2007-04-12 04:34:53 +00:00
Josh Hursey	38547459ae	Improve the cleanup process in ob1. Remove a redundant statement in the r2 BML. This commit was SVN r14228. The following SVN revision numbers were found above: r2 --> open-mpi/ompi@58fdc18855	2007-04-05 17:37:29 +00:00
Brian Barrett	e283e6f9d9	Retry of r14142, without the one-sided code... Back out r14073 - it speeds up TCP latency / bandwidth but at the same time it kills ROMIO and one-sided performance when using only TCP. The problem is that it only allows those two to be progressed every couple of seconds, leading to what looks like hangs in the one-sided tests (and the ROMIO stuff, although people seem to not notice that at this point). This commit was SVN r14144. The following SVN revision numbers were found above: r14073 --> open-mpi/ompi@64fbbc20b8 r14142 --> open-mpi/ompi@241545a098	2007-03-26 16:01:27 +00:00
Brian Barrett	62e5e81e99	revert r14142, as the onesided change should not have come over This commit was SVN r14143. The following SVN revision numbers were found above: r14142 --> open-mpi/ompi@241545a098	2007-03-26 15:58:41 +00:00
Brian Barrett	241545a098	Back out r14073 - it speeds up TCP latency / bandwidth but at the same time it kills ROMIO and one-sided performance when using only TCP. The problem is that it only allows those two to be progressed every couple of seconds, leading to what looks like hangs in the one-sided tests (and the ROMIO stuff, although people seem to not notice that at this point). This commit was SVN r14142. The following SVN revision numbers were found above: r14073 --> open-mpi/ompi@64fbbc20b8	2007-03-26 15:56:23 +00:00
George Bosilca	64fbbc20b8	Switch the event engine to a blocking mode if there are no high-performance networks available. This commit was SVN r14073.	2007-03-20 11:15:08 +00:00
Josh Hursey	dadca7da88	Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD). This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158. More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912 The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158	2007-03-16 23:11:45 +00:00
Sven Stork	d8a369936e	- Fix more symbols that should be exported. This commit was SVN r13824.	2007-02-27 15:17:17 +00:00
Josh Hursey	c573171b7d	Mostly a cleanup commit. - Implement the BML/r2 finalize function - Cleanup the btl close routine - Wire up a pml_base_verbose MCA parameter so you can actually watch the PML selection logic if you really want to. - Fix a potential segfault in the selection logic. ompi_pointer_array_get_item() may return NULL, so we have to check for it. This commit was SVN r13734. The following SVN revision numbers were found above: r2 --> open-mpi/ompi@58fdc18855	2007-02-21 16:18:43 +00:00
George Bosilca	a02d1c7c8d	No more warnings. This commit was SVN r13382.	2007-01-31 04:27:41 +00:00
Rainer Keller	3669e8921e	- Fix further compiler warnings regarding initialization and shadowing variables. This commit was SVN r13358.	2007-01-30 06:34:38 +00:00
Rainer Keller	125ba1acfa	- Reduce the amount of warnings with -Wshadow -- mainly due to usage of index and abs in inline-fcts in header files. This commit was SVN r13217.	2007-01-19 19:48:06 +00:00
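
The btl_openib_receive_queues syntax described in commit 8ace07efed above can be illustrated with a small parser. This is only a sketch of the format as the commit message states it, written in Python for brevity; it is not the openib BTL's own parsing code. The names QueuePairSpec and parse_receive_queues are hypothetical, and treating "ascending" as "non-decreasing" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuePairSpec:
    """One QP description (hypothetical type, for illustration only)."""
    kind: str                                # "P" = per-peer, "S" = shared receive queue
    buffer_size: int                         # receive buffer size in bytes
    num_buffers: int                         # receive buffers allocated to the QP
    low_watermark: int                       # repost receive buffers at this level
    max_pending_sends: Optional[int] = None  # fifth field, "S" QPs only


def parse_receive_queues(spec: str) -> List[QueuePairSpec]:
    """Parse a string such as "P,128,16,4;S,1024,256,128,32"."""
    qps = []
    for chunk in spec.split(";"):            # QP descriptions are ';'-delimited
        fields = chunk.strip().split(",")    # fields are ','-delimited
        kind = fields[0]
        if kind == "P" and len(fields) == 4:
            size, count, low = (int(f) for f in fields[1:])
            qps.append(QueuePairSpec(kind, size, count, low))
        elif kind == "S" and len(fields) == 5:
            size, count, low, sends = (int(f) for f in fields[1:])
            qps.append(QueuePairSpec(kind, size, count, low, sends))
        else:
            raise ValueError(f"malformed QP description: {chunk!r}")

    # The commit message requires ascending receive buffer size order.
    # Assumption: equal sizes are tolerated here (non-decreasing order).
    sizes = [qp.buffer_size for qp in qps]
    if sizes != sorted(sizes):
        raise ValueError("QPs must be listed in ascending receive buffer size order")
    return qps


if __name__ == "__main__":
    example = "P,128,16,4;S,1024,256,128,32;S,4096,256,128,32;S,65536,256,128,32"
    for qp in parse_receive_queues(example):
        print(qp)
```

Run on the example string from the commit message, the parser yields four QP descriptions: one per-peer QP followed by three shared-receive-queue QPs.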

129 Commits
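
The cpc selection rule described in commit a0d4122606 above (take the methods present in both lists, require both priorities to be non-negative, pick the highest single value) can be sketched as follows. This is a loose illustration in Python, not the openib BTL code. The function name choose_cpc, the dictionary representation, the method names in the usage example, and the alphabetical tie-break are all assumptions; the commit message does not say how ties are resolved.

```python
from typing import Dict, Optional


def choose_cpc(local: Dict[str, int], remote: Dict[str, int]) -> Optional[str]:
    """Pick a connection method from two name -> priority maps.

    A priority of -1 means the method is not usable on that side.
    Returns None when the two sides share no usable method.
    """
    best_name = None
    best_priority = -1
    # Iterate in sorted order so ties resolve deterministically
    # (assumption: the first name alphabetically wins a tie).
    for name in sorted(local):
        if name not in remote:
            continue                      # only the intersection is eligible
        lp, rp = local[name], remote[name]
        if lp < 0 or rp < 0:
            continue                      # both sides must be non-negative
        top = max(lp, rp)                 # "highest single value"
        if top > best_priority:
            best_name, best_priority = name, top
    return best_name


if __name__ == "__main__":
    # Hypothetical method names and priorities.
    local = {"cpc_a": 50, "cpc_b": 60, "cpc_c": -1}
    remote = {"cpc_a": 50, "cpc_b": -1, "cpc_c": 70}
    print(choose_cpc(local, remote))
```

In the usage example, cpc_b and cpc_c are each rejected because one side reports -1, which leaves cpc_a as the only method both sides can use.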