openmpi

Author	SHA1	Message	Date
George Bosilca	4f91b7806c	Remove unused variable (Coverity fix 169). This commit was SVN r19185.	2008-08-06 13:57:43 +00:00
Rainer Keller	23c2292478	- Fix variable set but not used Coverity CID1058 This commit was SVN r19184.	2008-08-06 13:57:38 +00:00
Jeff Squyres	0af7ac53f2	Fixes trac:1392, #1400 * add "register" function to mca_base_component_t * converted coll:basic and paffinity:linux and paffinity:solaris to use this function * we'll convert the rest over time (I'll file a ticket once all this is committed) * add 32 bytes of "reserved" space to the end of mca_base_component_t and mca_base_component_data_2_0_0_t to make future upgrades [slightly] easier * new mca_base_component_t size: 196 bytes * new mca_base_component_data_2_0_0_t size: 36 bytes * MCA base version bumped to v2.0 * '''We now refuse to load components that are not MCA v2.0.x''' * all MCA frameworks versions bumped to v2.0 * be a little more explicit about version numbers in the MCA base * add big comment in mca.h about versioning philosophy This commit was SVN r19073. The following Trac tickets were found above: Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392	2008-07-28 22:40:57 +00:00
Jeff Squyres	d37a25a2d0	Remove per http://www.open-mpi.org/community/lists/devel/2008/07/4386.php This commit was SVN r18972.	2008-07-22 00:57:23 +00:00
Edgar Gabriel	798f47b430	Fixes ticket #1334 hierarch disables itself now if the pml module used is not ob1. The reason is, that the multi-level hierarchy detection algorithm checks the names of the btl modules used. In case there are no btl's, we would segfault. Furthermore, three minor changes: - the 2-level hierarchy detection is now the default (sm vs. everything else in the world). - add udapl to the list of protocols checked for by the multi-level hierarch detection - some of the verbose statements of hierarch were inaccurate. Fixed those comments/messages. This commit was SVN r18817.	2008-07-07 18:44:48 +00:00
Ralph Castain	9613b3176c	Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach. I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive. This commit was SVN r18619.	2008-06-09 14:53:58 +00:00
Ralph Castain	c992e99035	Remove the tags from orte_output_open and the filtering operation from orte_output - this will be handled differently to improve the XML output interface This commit was SVN r18557.	2008-06-03 14:24:01 +00:00
Rolf vandeVaart	18879285c7	Fix the selection logic to prevent memory leaks. More work may be done in the priority logic but for now we just fix the leaks and preserve current behavior. This commit fixes trac:1307. This commit was SVN r18504. The following Trac tickets were found above: Ticket 1307 --> https://svn.open-mpi.org/trac/ompi/ticket/1307	2008-05-27 14:16:39 +00:00
Rolf vandeVaart	5baa733ad5	Fix another warning (using a variable before it was initialized.) Thanks Jeff for pointing this out. This commit was SVN r18489.	2008-05-23 13:57:55 +00:00
Rich Graham	b08839f9f5	change reduce-scatter/gather for non-power of 2. Spreading out the load for the non-power of 2 phase of the reduction. This commit was SVN r18486.	2008-05-22 21:42:42 +00:00
Rich Graham	f2a4b67809	automate the allreduce selection logic. This commit was SVN r18484.	2008-05-22 20:53:35 +00:00
Rich Graham	5900415a25	for non-powers of 2, distribute the work on the first step among all the procs doing the work. This commit was SVN r18480.	2008-05-22 18:50:53 +00:00
George Bosilca	c31cc5b270	Remove a warning about line being unused. This commit was SVN r18472.	2008-05-21 20:46:22 +00:00
Edgar Gabriel	0500420bec	fixing a bug in the inter-communicator scatter operation, where we used accidentally rcount instead of scounts. This commit was SVN r18466.	2008-05-20 21:17:19 +00:00
Rolf vandeVaart	74d0259480	Add new implementation of barrier. This shows better performance on some clusters. However, no decision logic is changed by this commit so default behavior has not changed. This is only selectable by runtime parameters. This commit was SVN r18464.	2008-05-20 17:37:41 +00:00
Rolf vandeVaart	71091a19c3	Fix bug in spacing of code per https://svn.open-mpi.org/trac/ompi/wiki/CodingStyle . This commit was SVN r18463.	2008-05-20 14:11:10 +00:00
Rolf vandeVaart	763f5259a8	Fix memory leak of 88 bytes that occurred on each call to MPI_Comm_dup. Need to release the items and the item list after selecting the collective modules that are being used. Reviewed by Jeff Squyres. This commit was SVN r18457.	2008-05-19 21:34:01 +00:00
Jeff Squyres	7154776465	Removed unused variable / compiler warning. This commit was SVN r18454.	2008-05-19 13:41:45 +00:00
Rolf vandeVaart	375406e1fa	Remove the ignore files as decided at Tuesday's developers conference call. Now, hierarchical collectives will be compiled in but the priority is still at 0 requiring a user to set mca parameters to enable them. This commit was SVN r18440.	2008-05-15 01:26:52 +00:00
Jeff Squyres	671f0c379d	Remove a whole pile of orte/util/show_help.h's that I missed. :-( This commit was SVN r18437.	2008-05-14 11:32:33 +00:00
Jeff Squyres	e7ecd56bd2	This commit represents a bunch of work on a Mercurial side branch. As such, the commit message back to the master SVN repository is fairly long. = ORTE Job-Level Output Messages = Add two new interfaces that should be used for all new code throughout the ORTE and OMPI layers (we already make the search-and-replace on the existing ORTE / OMPI layers): * orte_output(): (and corresponding friends ORTE_OUTPUT, orte_output_verbose, etc.) This function sends the output directly to the HNP for processing as part of a job-specific output channel. It supports all the same outputs as opal_output() (syslog, file, stdout, stderr), but for stdout/stderr, the output is sent to the HNP for processing and output. More on this below. * orte_show_help(): This function is a drop-in-replacement for opal_show_help(), with two differences in functionality: 1. the rendered text help message output is sent to the HNP for display (rather than outputting directly into the process' stderr stream) 1. the HNP detects duplicate help messages and does not display them (so that you don't see the same error message N times, once from each of your N MPI processes); instead, it counts "new" instances of the help message and displays a message every ~5 seconds when there are new ones ("I got X new copies of the help message...") opal_show_help and opal_output still exist, but they only output in the current process. The intent for the new orte_* functions is that they can apply job-level intelligence to the output. As such, we recommend that all new ORTE and OMPI code use the new orte_* functions, not the opal_* functions. === New code === For ORTE and OMPI programmers, here's what you need to do differently in new code: * Do not include opal/util/show_help.h or opal/util/output.h. Instead, include orte/util/output.h (this one header file has declarations for both the orte_output() series of functions and orte_show_help()). 
* Effectively s/opal_output/orte_output/gi throughout your code. Note that orte_output_open() takes a slightly different argument list (as a way to pass data to the filtering stream -- see below), so if you explicitly call opal_output_open(), you'll need to slightly adapt to the new signature of orte_output_open(). * Literally s/opal_show_help/orte_show_help/. The function signature is identical. === Notes === * orte_output'ing to stream 0 will do similar to what opal_output'ing did, so leaving a hard-coded "0" as the first argument is safe. * For systems that do not use ORTE's RML or the HNP, the effect of orte_output_* and orte_show_help will be identical to their opal counterparts (the additional information passed to orte_output_open() will be lost!). Indeed, the orte_* functions simply become trivial wrappers to their opal_* counterparts. Note that we have not tested this; the code is simple but it is quite possible that we mucked something up. = Filter Framework = Messages sent via the new orte_* functions described above and messages output via the IOF on the HNP will now optionally be passed through a new "filter" framework before being output to stdout/stderr. The "filter" OPAL MCA framework is intended to allow preprocessing of messages before they are sent to their final destinations. The first component that was written in the filter framework was to create an XML stream, segregating all the messages into different XML tags, etc. This will allow 3rd party tools to read the stdout/stderr from the HNP and be able to know exactly what each text message is (e.g., a help message, another OMPI infrastructure message, stdout from the user process, stderr from the user process, etc.). Filtering is not active by default. Filter components must be specifically requested, such as: {{{ $ mpirun --mca filter xml ... }}} There can only be one filter component active. 
= New MCA Parameters = The new functionality described above introduces two new MCA parameters: * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that help messages will be aggregated, as described above. If set to 0, all help messages will be displayed, even if they are duplicates (i.e., the original behavior). * '''orte_base_show_output_recursions''': An MCA parameter to help debug one of the known issues, described below. It is likely that this MCA parameter will disappear before v1.3 final. = Known Issues = * The XML filter component is not complete. The current output from this component is preliminary and not real XML. A bit more work needs to be done to configure.m4 search for an appropriate XML library/link it in/use it at run time. * There are possible recursion loops in the orte_output() and orte_show_help() functions -- e.g., if RML send calls orte_output() or orte_show_help(). We have some ideas how to fix these, but figured that it was ok to commit before feature freeze with known issues. The code currently contains sub-optimal workarounds so that this will not be a problem, but it would be good to actually solve the problem rather than have hackish workarounds before v1.3 final. This commit was SVN r18434.	2008-05-13 20:00:55 +00:00
Rolf vandeVaart	0e32dd1022	Add MPI_Alltoallv to tuned collectives and add a pairwise implementation of MPI_Alltoallv. However, do not change the default behavior for now. The only way to use new pairwise implementation is via mca parameters. This commit was SVN r18394.	2008-05-07 02:31:24 +00:00
Rich Graham	4d1ae7b05f	accidentally made a change in the wrong place. This commit was SVN r18262.	2008-04-23 17:32:05 +00:00
Rich Graham	293dd6ad4e	add myself to list of people building this module. This commit was SVN r18261.	2008-04-23 17:25:36 +00:00
Rich Graham	7658cc79e4	Pass in the correct module to the reduction call. This commit was SVN r18260.	2008-04-23 17:23:30 +00:00
Tim Mattox	0215474cb8	Fix two bugs in coll_sm_module.c from bit-rot: Fixed a selection bug, and removed a bogus "free(proc)" call which ultimately caused MPI_Finalize to crash. This commit was SVN r18235.	2008-04-22 18:41:21 +00:00
Rich Graham	df35223603	add selection logic for barrier and reduce. This commit was SVN r18215.	2008-04-19 22:40:04 +00:00
Rich Graham	bee8b42f29	remove debug code that would not let people run. Add infrastructure for blocking-barrier. This commit was SVN r18214.	2008-04-19 01:34:04 +00:00
Rich Graham	6c77fa4921	add a blocking shared memory algorithm. This commit was SVN r18185.	2008-04-16 22:10:23 +00:00
Rich Graham	249445d61f	added reduce-scatter followed by gather to root. This commit was SVN r18133.	2008-04-11 13:49:08 +00:00
Rich Graham	a6bdbfab97	implement allreduce as reduce-scatter, followed by an allgather. This commit was SVN r18132.	2008-04-11 04:06:29 +00:00
Rich Graham	70f3aab5f2	remove some code that is not needed. This commit was SVN r18128.	2008-04-10 17:32:04 +00:00
Rich Graham	5c7db1e315	remove 2 race conditions in the buffer recycling logic. This commit was SVN r18127.	2008-04-10 17:20:52 +00:00
Edgar Gabriel	4964434205	reverting commit 18122, since the commit was executed accidentally in the wrong directory. The UH copyrights do belong into this file (i.e. because of the fix which is in the 1.2 branch, the UH copyright notes are in the header there already), but I want to have the proper log for that. This commit was SVN r18124.	2008-04-10 15:09:31 +00:00
Edgar Gabriel	f87830767a	the verification of recvcount==0 and rank = root was breaking inter-communicator scatter, since the root (root==MPI_ROOT) might very well have recvcount=0. The same fix has been applied to gather.c just the other way round. Fixes the bug reported on the mailing list by Martin Audet. If there is a 1.2.7 this fix might be worthwhile porting it over. Please note, that while the test works now for basic and for inter, we get a 0byte malloc warning from the inter module, which we still have to fix in a separate patch. This commit was SVN r18122.	2008-04-10 14:58:51 +00:00
Rich Graham	c6783549ef	getting old This commit was SVN r18110.	2008-04-09 16:55:16 +00:00
Rich Graham	1a20c3ce51	more debug. This commit was SVN r18109.	2008-04-09 16:19:52 +00:00
Rich Graham	e7e18303f6	more debug. This commit was SVN r18108.	2008-04-09 15:10:58 +00:00
Rich Graham	b14c6b17d5	adding debug output. This commit was SVN r18107.	2008-04-09 13:32:01 +00:00
Rich Graham	10434fb2f1	add barrier synchronization at the end of the module init, to avoid initializing shared memory variables in use. This commit was SVN r18105.	2008-04-09 03:44:40 +00:00
Rich Graham	19bb1a2e86	fix initialization bug. This commit was SVN r18104.	2008-04-08 23:34:06 +00:00
Rich Graham	a69a8d9626	initialize the flags. This commit was SVN r18102.	2008-04-08 22:16:39 +00:00
Rich Graham	8765a2bbdd	more debug code. This commit was SVN r18101.	2008-04-08 20:38:20 +00:00
Rich Graham	08becf33b5	add more debugging. This commit was SVN r18100.	2008-04-08 18:44:50 +00:00
Rich Graham	aa1b7dd406	more debug This commit was SVN r18099.	2008-04-08 03:56:47 +00:00
Rich Graham	0c18bdeff7	more debug code. This commit was SVN r18098.	2008-04-08 03:04:20 +00:00
Rich Graham	9d5a7238df	Add some debugging code. This commit was SVN r18097.	2008-04-07 23:20:15 +00:00
Rich Graham	fa696734d5	add some debug code. This commit was SVN r18096.	2008-04-07 21:03:23 +00:00
Rich Graham	1b54e8b76e	fix buffer management for nb-barrier. This commit was SVN r18081.	2008-04-05 21:59:04 +00:00
Rich Graham	94f8fd365c	a few reduction optimizations. Add bcast. This commit was SVN r18075.	2008-04-02 19:02:33 +00:00
George Bosilca	a00ca20446	More cleanups. This commit was SVN r18069.	2008-04-02 06:38:33 +00:00
Rich Graham	eb5d6096f1	add reduction routine - fix buffer recycling logic which was totally broken. This commit was SVN r18065.	2008-04-01 22:56:18 +00:00
Rich Graham	90e53ca9ee	debug the pipeline algorithm. This commit was SVN r18008.	2008-03-28 15:10:07 +00:00
Rich Graham	e2ad9c4be2	adjust to change in orte_process_info. This commit was SVN r17986.	2008-03-27 01:25:28 +00:00
Rich Graham	441fb9fb9e	checkpoint. This commit was SVN r17985.	2008-03-27 01:16:32 +00:00
Ralph Castain	cca449e379	Move an OMPI RML tag to the OMPI layer This commit was SVN r17950.	2008-03-25 13:30:48 +00:00
Ralph Castain	dc7f45dafd	Remove the obsolete and largely unused orte_system_info structure. The only fields that were used in that struct were nodeid and nodename - these have been transferred to the orte_process_info structure. Only one place used the user name field - session_dir, when formulating the name of the top-level directory. Accordingly, the code for getting the user's id has been moved to the session_dir code. This commit was SVN r17926.	2008-03-23 23:10:15 +00:00
Rich Graham	a7c836a2b0	fix location of the restrict key word. Make the tag in the fan-in/fan-out algorithm be fragment based. This commit was SVN r17903.	2008-03-21 01:40:36 +00:00
Rich Graham	2c66d396b7	take care of some bit-rot with the fanin-fanout method. This commit was SVN r17902.	2008-03-21 01:08:49 +00:00
Rich Graham	b9520e61dc	get the sm optimized allreduce working for all but user defined operations. Added to the reduction operations a set of reduction functions that take 2 input buffers and one output buffer to avoid some extra memory copies. These can't be used with user defined operations. The intel c collective suite passes both original, and new (new, not the user defined operations). This commit was SVN r17901.	2008-03-20 23:51:16 +00:00
Edgar Gabriel	570bbea5e0	fixing the allgather problem reported on the mailing list. The problem was that at one location we had the local-size instead of the remote size as a receive argument. This commit was SVN r17849.	2008-03-17 19:42:18 +00:00
Rich Graham	27182afb67	get the timers in correctly. This commit was SVN r17832.	2008-03-16 03:25:16 +00:00
Rich Graham	afcd1016fd	move temp buffer allocation out of the iteration loop - i.e. always use the same temp loop. The algorithm is rather synchronous already... This commit was SVN r17831.	2008-03-16 03:20:46 +00:00
Rich Graham	a1766b29f6	fix some barrier addressing errors. This commit was SVN r17830.	2008-03-15 22:46:19 +00:00
Rich Graham	0453e7d2f4	bug in management memory allocation - too much memory allocated. This commit was SVN r17829.	2008-03-15 18:12:20 +00:00
Rich Graham	3c2f1eb8bf	reduce the number of temp buffers used. This commit was SVN r17828.	2008-03-15 17:23:04 +00:00
Rich Graham	0f9d642d51	temp buffer pointers are computed when they are set up. A bit more efficient, but more important, it is much easier to play around with memory layout now. This commit was SVN r17827.	2008-03-15 16:36:35 +00:00
Rich Graham	e3e336b5ab	check point This commit was SVN r17826.	2008-03-15 13:31:21 +00:00
Rich Graham	ebcf928c24	add some diagnostics. This commit was SVN r17789.	2008-03-07 22:27:41 +00:00
Rich Graham	9131461511	move some test code to another machine. This commit was SVN r17785.	2008-03-07 19:18:02 +00:00
Rich Graham	c230b65543	fix a couple of bugs. Recursive doubling seems to be working. This commit was SVN r17777.	2008-03-07 02:51:38 +00:00
Rich Graham	70157166f9	checkpoint - compiles, now need to debug. This commit was SVN r17775.	2008-03-07 00:39:59 +00:00
Rich Graham	4eace9d020	starting to implement recursive doubling algorithm. This commit was SVN r17765.	2008-03-06 18:38:58 +00:00
Rich Graham	67ad9b6d6b	increase max data segments size. This commit was SVN r17677.	2008-03-02 19:11:09 +00:00
Rich Graham	53126fa7bd	add calls to opal_progress() This commit was SVN r17673.	2008-02-29 23:25:09 +00:00
Rich Graham	d37db14901	get the shared memory collectives working again with the new version of orte. This commit was SVN r17672.	2008-02-29 22:28:57 +00:00
Rich Graham	c253a7bda1	simplify the code a bit. This commit was SVN r17664.	2008-02-29 03:55:12 +00:00
Rich Graham	1632d8b299	revert to an older (not previously checked in) version to get around a regression. This commit was SVN r17663.	2008-02-29 03:12:12 +00:00
Rich Graham	827e8d877e	fix bug in node type, and some memory copy optimizations. This commit was SVN r17661.	2008-02-29 01:20:11 +00:00
Rich Graham	940d6732c9	remove compiler warnings. This commit was SVN r17656.	2008-02-28 22:01:19 +00:00
Rich Graham	2b5fab9d51	avoid 0 byte malloc. This commit was SVN r17653.	2008-02-28 21:11:42 +00:00
Rich Graham	4b26adef00	remove some debug output. This commit was SVN r17650.	2008-02-28 20:54:35 +00:00
Rich Graham	5df6c6d043	fix several race conditions. This commit was SVN r17645.	2008-02-28 19:40:19 +00:00
Ralph Castain	d70e2e8c2b	Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer This commit was SVN r17632.	2008-02-28 01:57:57 +00:00
Rich Graham	68aa691171	checkpoint work. This commit was SVN r17620.	2008-02-27 14:56:36 +00:00
Rich Graham	b4bbb70bb7	got it all, but for the mem copies. Also, need to make sure volatile declarations are all inplace, as well as memory barriers. This commit was SVN r17572.	2008-02-25 00:16:21 +00:00
Rich Graham	2d8c2420e8	checkpoint. This commit was SVN r17571.	2008-02-24 20:54:16 +00:00
Rich Graham	771584bff5	generate reduction tree. This commit was SVN r17569.	2008-02-24 03:25:40 +00:00
Rich Graham	b9bb78484d	a bit of optimization. This commit was SVN r17528.	2008-02-20 16:19:49 +00:00
Rich Graham	09afc36f5f	correct addressing. This commit was SVN r17519.	2008-02-20 01:12:43 +00:00
Rich Graham	b87b15580c	fix memory allocation error. Initialize pointer. This commit was SVN r17514.	2008-02-19 20:01:42 +00:00
Rich Graham	1cd8a2e578	checkpoint - works for 2 procs, but not more. This commit was SVN r17477.	2008-02-17 05:21:58 +00:00
Rich Graham	8006927ae8	free buffer, rather than ask for another one, when done with the memory. This commit was SVN r17468.	2008-02-15 04:21:58 +00:00
Rich Graham	2277b47ab9	register mca_coll_sm2_allreduce_intra - function still does not do any reduction operations. This commit was SVN r17467.	2008-02-15 04:13:00 +00:00
Rich Graham	9b0687e6df	add buffer allocation and deallocation calls to the allreduce routine, so I can start debugging the memory management code. The allreduce function does nothing at this stage. This commit was SVN r17466.	2008-02-15 03:59:14 +00:00
Rich Graham	41943dbd76	adding missing files. This commit was SVN r17462.	2008-02-15 00:59:28 +00:00
Rich Graham	41f4b06b39	buffer allocate/release code is fully written, and compiles. Now need to debug. This commit was SVN r17461.	2008-02-15 00:57:44 +00:00
Rich Graham	7cc58768cd	checkpoint something that compiles This commit was SVN r17460.	2008-02-15 00:33:14 +00:00
Rich Graham	292d930eea	check point. This commit was SVN r17457.	2008-02-14 20:00:26 +00:00
Edgar Gabriel	77057a50a3	- adding the two-level hierarchy detection algorithm - minor fix in the temporary collectives - removing the symmetric parameter, since it didn't really make sense. This commit was SVN r17359.	2008-02-01 17:11:36 +00:00
Rich Graham	fda485ff9c	backing file is allocated and deallocated. This commit was SVN r17358.	2008-02-01 15:26:20 +00:00
Rich Graham	165fc3f8cc	memory allocation implemented and debugged. Still need to finish file allocation/deallocation and control information initialization. This commit was SVN r17291.	2008-01-29 03:09:12 +00:00
Rich Graham	e24c2ebbc0	have a working skeleton for the SM-V2 component. It does nothing at this stage. This commit was SVN r17241.	2008-01-25 21:16:36 +00:00
Rich Graham	1d0334f4f2	skeleton for new shared memory collective component. This commit was SVN r17235.	2008-01-25 19:35:26 +00:00
Rich Graham	432ba0cecd	add comments about the life-cycle of a collective module. This commit was SVN r17223.	2008-01-25 03:46:31 +00:00
George Bosilca	31390c0074	We should take in account the extent of the datatype when we compute the initial displacement in bytes. Thanks to Daniel G. Hyams for the fix. This commit was SVN r17165.	2008-01-19 05:34:53 +00:00
George Bosilca	3fca3973d3	The PTLs are now long gone !!! This commit was SVN r17104.	2008-01-10 00:18:45 +00:00
George Bosilca	906e8bf1d1	Replace the ompi_pointer_array with opal_pointer_array. In the next step (sometime after the merge with the ORTE branch), the opal_pointer_array will become the only pointer_array implementation (the orte_pointer_array will be removed). This commit was SVN r17007.	2007-12-21 06:02:00 +00:00
Jeff Squyres	213b5d5c6e	Per long threads on the mailing list and much confused discussion about linkers, have all OPAL, ORTE, and OMPI components '''not''' link against the OPAL, ORTE, or OMPI libraries. See http://www.open-mpi.org/community/lists/users/2007/10/4220.php for details (or https://svn.open-mpi.org/trac/ompi/wiki/Linkers for a better-formatted version of the same info). This commit was SVN r16968.	2007-12-15 13:32:02 +00:00
Andrew Friedley	c15047b264	Add LLNL copyright to the file i modified yesterday This commit was SVN r16404.	2007-10-09 15:18:23 +00:00
Andrew Friedley	fd51d9cf28	The call to opal_list_insert() had an off by one error (I think), causing selected components to get lost with certain load orderings. I went ahead and rewrote the code to use opal_list_insert_pos() instead, which gives a cleaner flow and more speed. This commit was SVN r16392.	2007-10-08 23:01:36 +00:00
Jeff Squyres	f92d9097d8	Some more changes to update to coll v1.1.0 that were missed yesterday. This actually exposed a very, very long-standing bug where part of the coll base was incorrectly checking the coll API version against the MCA API version. When coll went to v1.1 (yesterday) and was no longer the same as the MCA v1.0, the test started failing. This commit fixes to check for v1.1 everywhere in the coll base, and to ensure to check coll framework/API version numbers against coll framework/API version numbers (vs. against the MCA API version number). This commit was SVN r16373.	2007-10-07 12:20:22 +00:00
Jeff Squyres	3d34bff596	No technical/functional changes: simply change the name of the "data" parameter to "module" everywhere, just to be a little more clear what the purpose of that parameter is. This commit was SVN r16372.	2007-10-07 08:36:45 +00:00
Jeff Squyres	fc2b4376e9	Update forgotten macro. This commit was SVN r16368.	2007-10-06 14:11:35 +00:00
Jelena Pjesivac-Grbovic	ada43fef9e	This fixes bug #1157 in coll/self module. All vector functions had incorrect handling of the offset. This commit was SVN r16360.	2007-10-05 17:40:16 +00:00
Andrew Friedley	2e66590993	Fix mistakes in the basic component.. can't call collectives on the communicator and always pass the basic module.. have to give them the module off the communicator. This commit was SVN r16329.	2007-10-04 16:29:24 +00:00
George Bosilca	1e7a791349	Remove some of the problems identified by Coverity. This commit was SVN r16112.	2007-09-12 20:13:26 +00:00
George Bosilca	c755938eb0	Coverity: release the temporary buffer on error. This commit was SVN r16104.	2007-09-12 17:45:12 +00:00
Shiqing Fan	a0660f4deb	- Just some type casts. This commit was SVN r16100.	2007-09-12 15:29:58 +00:00
Jeff Squyres	c4a38f47f6	Resolve Coverity CID 467: remove unused variable / dead code. This commit was SVN r15997.	2007-08-29 01:23:18 +00:00
Edgar Gabriel	a2f5cada1a	convert the hierarch component to the new structure. More testing required before we remove the .ompi_ignore flag again. This commit was SVN r15954.	2007-08-23 20:41:29 +00:00
Shiqing Fan	a497a3fcad	- Fix some small bugs, copy-paste mistakes. This commit was SVN r15941.	2007-08-21 19:57:28 +00:00
Sven Stork	3985a35c35	- export required symbol This commit was SVN r15939.	2007-08-21 18:46:11 +00:00
Brian Barrett	af4e86c25f	Update collectives selection logic to allow for multiple components to be used at once (up to one unique collective module per collective function). Matches r15795:15921 of the tmp/bwb-coll-select branch This commit was SVN r15924. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r15795 r15921	2007-08-19 03:37:49 +00:00
Jelena Pjesivac-Grbovic	9bd9c92dbd	Making sure that the decision function for scatter and gather correctly computes everything for MPI_IN_PLACE case. This commit was SVN r15841.	2007-08-13 17:35:50 +00:00
Jelena Pjesivac-Grbovic	b558e820cb	removing compiler warning This commit was SVN r15803.	2007-08-08 15:22:01 +00:00
Jelena Pjesivac-Grbovic	daa10b277e	modifying scatter decision function to use binomial algorithm for small message sizes. This commit was SVN r15798.	2007-08-07 22:16:13 +00:00
Mohamad Chaarawi	59a7bf8a9f	Merging in the Sparse Groups.. This commit includes config changes.. This commit was SVN r15764.	2007-08-04 00:41:26 +00:00
Sven Stork	855434de59	- fixes several Coverity issues - add missing initialisation for variables - use strncpy instead of strcpy This commit was SVN r15683.	2007-07-30 14:44:37 +00:00
Jelena Pjesivac-Grbovic	1b66a52c50	Modifying type of binomial tree used for binomial reduce: switching: 0 0 / \ \ / \ \ 1 \ \ --> 4 \ \ / \ \ / \ \ 3 2 \ 3 2 \ 4 1 (duh). The first form is the bmtree suitable for bcast, but the latter is better for reduce. Updating default decision function accordingly. This commit was SVN r15422.	2007-07-13 21:07:51 +00:00
Jelena Pjesivac-Grbovic	d677db9b5f	cleaning up alltoall implementation: - removing MPI_* calls from bruck implementation - simplifying 2 process case - indentation, etc. This commit was SVN r15301.	2007-07-07 01:06:19 +00:00
Jelena Pjesivac-Grbovic	483222085e	Fixing compiler warnings. In gather, the ptmp += incr is irrelevant, since ptmp is set within the loop. This commit was SVN r15293.	2007-07-05 20:40:50 +00:00
Jelena Pjesivac-Grbovic	3b0a52a104	adding tuned allgatherv implementation using bruck, ring, and neighbor-exchange algorithms. The implementations passed intel and imb tests up to 40 processes. This commit was SVN r15280.	2007-07-03 23:33:12 +00:00
Jelena Pjesivac-Grbovic	d55b415bb0	fixing typo This commit was SVN r15240.	2007-06-28 20:56:55 +00:00
Jelena Pjesivac-Grbovic	8fc8b44d11	Modifying reduce decision function for large, single element reduces (again). Binary algorithm without segmentation tends to outperform binomial algorithm in this case. This commit was SVN r15226.	2007-06-27 22:01:56 +00:00
Jelena Pjesivac-Grbovic	0ecef1750d	Modifying the default reduce decision function to use binomial algorithm for single-element reduce (segmented algorithms make no sense in this case and can cause performance degradation). This commit was SVN r15209.	2007-06-26 20:14:03 +00:00
Jelena Pjesivac-Grbovic	567b40b9a9	Modifying the default broadcast decision function to use binomial algorithm for single-element broadcasts (segmented algorithms make no sense in this case and can cause performance degradation). This commit was SVN r15208.	2007-06-26 20:08:31 +00:00
Jelena Pjesivac-Grbovic	3740640711	Modifying MPI_Gather in tuned module: - adding linear algorithm with synchronization for gather. This algorithm prevents congestion at root process, but introduces synchronization (serializes non-root processes, but allows messages to arrive from two processes at the same time). It performed better than binomial and linear algorithms for large message, and intermediate and large communicator sizes. - Updating MPI_Gather decision function to reflect performance results from MX. I will perform more measurements though - so this one can change. This commit was SVN r15165.	2007-06-21 20:00:36 +00:00
Sven Stork	22af6d38e6	- UNexport symbols that shouldn't be needed outside the libraries - replace #if/#endif with BEGIN/END_C_DECLS - reformating This commit was SVN r14669.	2007-05-16 15:46:52 +00:00
Brian Barrett	21e00f6f0c	Clean up a couple of configure things: * Require Autoconf 2.60 or higher and remove some cruft required for AC 2.59 or the AC 2.59 / AC 2.60 mix * Remove a bunch of now unnecessary AC_SUBST calls * Use the libtool-provided variables for the -I and library to use when compiling against ltdl Fixes trac:1000 This commit was SVN r14652. The following Trac tickets were found above: Ticket 1000 --> https://svn.open-mpi.org/trac/ompi/ticket/1000	2007-05-15 04:23:48 +00:00
Jelena Pjesivac-Grbovic	625c6739ab	Removing warning about unused variable This commit was SVN r14579.	2007-05-03 20:26:41 +00:00
Jelena Pjesivac-Grbovic	9eff74ad4d	Modifying generalized reduce "synchronized" behavior: - Removing "small" message size limit because it really does not relate to the eager size across the board. Now, the leaf nodes in generalized reduce will use blocking send (DEFAULT/ORIGINAL BEHAVIOR) either when the maximum number of outstanding requests is 0 or when the total number of segments is less than the maximum number of outstanding requests. Otherwise, they will send messages using a non-blocking synchronized send operation. This commit was SVN r14572.	2007-05-02 21:42:45 +00:00
George Bosilca	69642a9cd4	Remove 2 warnings about ptrdiff_t to unsigned long implicit conversion. This commit was SVN r14565.	2007-05-01 19:47:33 +00:00
Jelena Pjesivac-Grbovic	3eac49aa59	Adding flow control for leaf nodes in generalized reduce structure. This "feature" is disabled by default and it should not affect the current performance. When the message size is large and the segment size is smaller than the eager size for a particular interface, the leaf nodes in the generalized reduce function can flood parent nodes by sending all segments without any synchronization. This can cause the parent to have a HIGH number of unexpected messages (think 16MB message with 1KB segments for example). In case of binomial algorithm root node always has at least one child which is leaf, so this can potentially affect the root's performance significantly [Especially in large communicators where root may have quite a few children (binomial tree for example)]. When the segment size is bigger than the eager size, rendezvous protocol ensures that this does not happen so it is not necessary. Originally, the problem was exposed in "infinite" bucket allocator clean up time for "small" segment sizes (which may explain some "deadlocks" on Thunderbird tests). To prevent this, we allow the user to specify the mca parameter "--mca coll_tuned_reduce_algorithm_max_requests NUM", which limits the number of outstanding messages from a leaf node in generalized reduce to the parent to NUM. Messages are sent as non-blocking synchronous messages, so synchronization happens at "wait" time. The synchronization actually improved performance of pipeline and binomial algorithm for large message sizes with 1KB segments over MX, but I need to test it some more to make sure it is consistent. Since there is no easy way to find out what is "the eager" size for particular btl, I set the limit to 4000B. If message/individual segment size is greater than 4000B - we will not use this feature. This variable may or may not be exposed as mca parameter later... I did not have any problems running it and both "default" and "synchronous" tests passed Intel Reduce* tests up to 80 processes (over MX). This commit was SVN r14518.	2007-04-25 20:39:53 +00:00
Jelena Pjesivac-Grbovic	53cbec7a09	Make coll/tuned dynamic rules more verbose (when prompted with --mca coll_base_verbose 1) This commit was SVN r14469.	2007-04-23 16:34:52 +00:00
Jeff Squyres	51f286d737	Just like r14289 on the ORTE trunk: Per discussions with Brian and Ralph, make a slight correction in where components are installed. Use $pkglibdir, not $libdir/openmpi, so that when compiled in the orte trunk, components are installed to the right directory (because the component search path is checking $pkglibdir). This commit was SVN r14345. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r14289	2007-04-12 11:19:42 +00:00
George Bosilca	120cf76ad8	Remove some warnings. This commit was SVN r14196.	2007-04-02 19:11:06 +00:00
George Bosilca	cc65814969	And set the message size before the first use too. This commit was SVN r14159.	2007-03-28 18:01:13 +00:00
George Bosilca	b540545fa7	Set the communicator size before using it. This commit was SVN r14158.	2007-03-28 17:59:21 +00:00
Mohamad Chaarawi	bfaf9d4a12	Added new module for intercomm collectives. This will require an autogen. This commit was SVN r14149.	2007-03-27 02:06:42 +00:00
Jelena Pjesivac-Grbovic	d6402b6898	Adding in-order binary tree algorithm for non-commutative reduce operations. I tested algorithm with intel and ibm tests and it passed again - so it should work. This commit was SVN r14068.	2007-03-19 21:03:57 +00:00
Josh Hursey	dadca7da88	Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD). This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158 More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912 The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158	2007-03-16 23:11:45 +00:00
Rolf vandeVaart	42168575fd	Fix for the special case where np=2 and the sendbuf is set to MPI_IN_PLACE. In that case, sendcount and sendtype are not valid and we need to use recvcount and recvtype. This commit fixes trac:943. Reviewed by Jelena Pjesivac-Grbovic. This commit was SVN r14022. The following Trac tickets were found above: Ticket 943 --> https://svn.open-mpi.org/trac/ompi/ticket/943	2007-03-13 19:01:20 +00:00
Jelena Pjesivac-Grbovic	9780a000ba	Cleanup of generic reduce function and possible (low probability) bug fix. - fixing line lengths and some of the comments - possible bug fix (but I do not think we exposed it in any tests so far) temporary buffers were allocated as multiples of extent instead of true_extent + (count -1) * extent. Everything is still passing Intel tests over tcp and btl mx up to 64 nodes. This commit was SVN r13956.	2007-03-08 00:54:52 +00:00
Jelena Pjesivac-Grbovic	57cbafafd5	Clean up of generic broadcast function: removing unnecessary statements and improving comments. This commit was SVN r13955.	2007-03-07 21:59:53 +00:00
Jelena Pjesivac-Grbovic	0c07654c30	Updating reduce_scatter decision function based on MX results up to 64 nodes and both 1ppn and 2ppn configurations. This commit was SVN r13945.	2007-03-07 00:38:33 +00:00
Jelena Pjesivac-Grbovic	e5ed167a6e	Adding tuned version of reduce_scatter implementation. Currently 3 algorithms are available: - non-overlapping, reduce + scatterv, (works for non-commutative operations) - recursive halving algorithm (copied from basic module) - ring algorithm (similar to allreduce ring, for large messages) This commit was SVN r13929.	2007-03-05 20:40:39 +00:00
Li-Ta Lo	196e2a86bb	added binomial tree based scatter, passed IBM and Intel tests This commit was SVN r13906.	2007-03-02 23:19:02 +00:00
Li-Ta Lo	11c94cbe76	eliminated the use of MPI_Get_count This commit was SVN r13904.	2007-03-02 22:57:50 +00:00
Li-Ta Lo	3765e19d15	added ASCII graph for the topologies This commit was SVN r13892.	2007-03-02 17:17:14 +00:00
Li-Ta Lo	bd75f2f162	change ALLGATHER to GATHER This commit was SVN r13891.	2007-03-02 17:02:29 +00:00
Li-Ta Lo	c5d8c221b0	added binomial tree based Gather algorithm, passed IBM and Intel tests This commit was SVN r13835.	2007-02-28 01:11:01 +00:00
Jelena Pjesivac-Grbovic	627533fe4a	Adding segmented ring algorithm for Allreduce for commutative operations. Algorithm allows user to specify the segment size to be used for computation/communication overlap. The additional memory requirement for the algorithm is 2 x segment size. It performed well for (really) large message sizes over MX and it passed intel Allreduce_c and Allreduce_loc_c tests. This commit was SVN r13832.	2007-02-27 20:32:30 +00:00
George Bosilca	bec20422ee	Remove the warnings about printf data-type mismatch. This commit was SVN r13804.	2007-02-26 22:20:35 +00:00
Li-Ta Lo	c860bd1be5	fixed a typo in the comment This commit was SVN r13802.	2007-02-26 19:20:46 +00:00
Li-Ta Lo	73a73b1c78	added ASCII graph on reduce_log_intra This commit was SVN r13801.	2007-02-26 19:15:37 +00:00
Bill D'Amico	db1c2a58c4	Removed cruft - unused variables causing warnings during OMPI build. This commit was SVN r13772.	2007-02-23 18:55:41 +00:00
Tim Prins	f35f67ed1c	(very) minor correction to helpfile This commit was SVN r13758.	2007-02-22 16:02:12 +00:00
Li-Ta Lo	049921a5ec	the temporary buffer is not needed for the MPI_IN_PLACE cases if the underlying Gather is implemented correctly This commit was SVN r13740.	2007-02-21 20:39:56 +00:00
Jelena Pjesivac-Grbovic	36156f39c2	Modification to allreduce ring algorithm: - the block sizes are computed in a more uniform way. The first k blocks may be 1 element larger than the remaining blocks. The algorithm passed Intel Allreduce_c and Allreduce_loc_c tests, and IMB-3.2 Allreduce, over TCP and both btl and mtl MX (up to 128 processes). The algorithm still only supports commutative operations. This commit was SVN r13738.	2007-02-21 19:30:08 +00:00
Jelena Pjesivac-Grbovic	b608887466	Adding variant of linear alltoall algorithm where the number of outstanding requests can be limited using mca parameters. The implementation passed Intel, IMB-3.2, and mpi_test_suite tests over TCP and MX up to 128 processes (64 nodes), on both 32-bit and 64-bit machines. It is not activated by default, but it should be useful for really large communicator sizes. This commit was SVN r13720.	2007-02-20 04:25:00 +00:00
Jelena Pjesivac-Grbovic	d2d02642ca	Removing compilation warnings about the output format. This commit was SVN r13693.	2007-02-16 23:32:47 +00:00
Jelena Pjesivac-Grbovic	e532b928af	Adding segmented binary reduce algorithm which works with non-commutative operations. Implementation passed intel: MPI_Reduce_c , MPI_Reduce_loc_c, and MPI_Reduce_user_c tests over TCP, BTL MX, and MTL MX, as well as, mpi_test_suite Reduce tests (up to 64 nodes). The algorithm is still not activated by decision function (will be in the near future). This commit was SVN r13657.	2007-02-14 22:38:38 +00:00
Jelena Pjesivac-Grbovic	b52dc9e427	Modifying fixed decision function for reduce to utilize linear algorithm only for really small communicator sizes. This commit was SVN r13597.	2007-02-10 00:31:10 +00:00
Jelena Pjesivac-Grbovic	6efca498ec	Fixes trac:692 in trunk: receive buffer in MPI_Reduce operation is no longer overwritten on non-root nodes. This commit was SVN r13538. The following Trac tickets were found above: Ticket 692 --> https://svn.open-mpi.org/trac/ompi/ticket/692	2007-02-07 18:57:03 +00:00
Jeff Squyres	c91fcd7fbd	Fix a bunch of minor typos submitted by Bernhard Fischer. This commit was SVN r13505.	2007-02-06 12:00:30 +00:00
Jelena Pjesivac-Grbovic	e193d625bc	Bugfix for ring allreduce algorithm. The step used to iterate through buffer was function of true_extent instead of extent. This may or may not solve ticket #689 because I am still getting failures over btl mx, but I cannot reproduce failures over mtl mx nor tcp. This commit was SVN r13459.	2007-02-02 02:44:16 +00:00
Brian Barrett	93a2f31932	Use a recursive halving communication algorithm similar to the one used by MPICH2 for "small" commutative operations in the reduce_scatter basic implementation. "small" is currently pretty big, as it doesn't take much to beat reduce/scatterv. Need to do much more than this for better all around performance of MPI_Reduce_scatter, but this was enough to solve the problems I was having. This commit was SVN r13348.	2007-01-29 19:29:35 +00:00
Jelena Pjesivac-Grbovic	33dcb4f810	Minor change to linear alltoall algorithm: - post isends in reverse order of posting irecvs. if the messages arrive approximately in order, this should minimize the time spent in matching the requests. I did not see any performance difference over MX up to 64 nodes, but the change makes sense and may have some impact when we have (many) more nodes. This commit was SVN r13337.	2007-01-26 21:59:31 +00:00
George Bosilca	6f720f0d26	Add all required explicit conversions in order to be able to build on Windows. This commit was SVN r13264.	2007-01-24 00:48:16 +00:00
Jelena Pjesivac-Grbovic	5cbcf42dc3	Removing yet another unused variable (missed it in previous submit). This commit was SVN r13259.	2007-01-23 21:30:57 +00:00
Jelena Pjesivac-Grbovic	afbd032ff9	Removing compiler warnings about comparison of unsigned values to signed ones, and unused variables. This commit was SVN r13258.	2007-01-23 21:10:07 +00:00
Jelena Pjesivac-Grbovic	568477ade8	Adding new Allreduce algorithms, updating allreduce decision function, and cleaning up util. - Allreduce algorithms: - Recursive doubling is used for small messages (up to 10KB) and can be used for both commutative and non-commutative operations. Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes. - The ring algorithm performs well for larger messages but cannot be used for non-commutative operations. It passed the same tests as recursive doubling, except some of the non-commutative tests in Intel benchmarks Allreduce_loc_c and Allreduce_user_c (which was expected). - MPI_Allreduce with new decision function passed all of the tests mentioned above. - Cleaning up coll_tuned_util. Moving isendrecv to static inline just like sendrecv. This commit was SVN r13252.	2007-01-23 01:19:11 +00:00
George Bosilca	242292673a	sendrecv is a static inline. This commit was SVN r13237.	2007-01-22 05:50:23 +00:00
Sven Stork	862dcb1a34	- fix compiler warning in ia64 This commit was SVN r13212.	2007-01-19 14:48:47 +00:00
Jelena Pjesivac-Grbovic	85192c01b0	Modifying util functionality: - removing static qualification on ompi_coll_tuned_sendrecv - adding ompi_coll_tuned_isendrecv function which posts isend and irecv requests These changes are separate from but necessary for new algorithms I am working on. This commit was SVN r13161.	2007-01-17 21:29:13 +00:00
Jelena Pjesivac-Grbovic	d2921a9d42	Cleanup of Barrier implementation: - utilizing coll_tuned_util functions - setting line length to 80. This implementation uses standard send messages (instead of synchronous ones). The change improved our performance over MX multiple number of times, however, there exists a small potential that last message to be sent can be delayed (until next mpi call, which means potentially infinitely). If this shows to be a problem, I will modify the algorithms to use synchronous send as last operation (which will incur performance penalty again). This commit was SVN r13071.	2007-01-10 22:49:43 +00:00
Jelena Pjesivac-Grbovic	ccc3ee0b6b	Minor changes to allgather implementation with some clean-up of util code. - in allgather algorithms I replaced the irecv-isend-waitall sequence with a call to ompi_coll_tuned_sendrecv - most of the functions in util code and allgather decision function conform to 80 character line width. This commit was SVN r13069.	2007-01-10 21:56:59 +00:00
Brian Barrett	a34e67d743	Remove unneeded PARAM_INIT_FILE variable in configure.params files used by components that use configure.m4 for configuration or are always built. The macro has not been needed since moving to configure types other than configure.stub Fixes trac:590 This commit was SVN r13031. The following Trac tickets were found above: Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590	2007-01-08 03:44:22 +00:00
Jelena Pjesivac-Grbovic	eae3df4904	Updated broadcast decision function based on MX results up to 64 nodes. (The previous decision function did not consider binomial algorithm (since we did not have it at the time)). This commit was SVN r13007.	2007-01-06 00:37:40 +00:00
Brian Barrett	936fdd2ae1	remove some code that accidentally came in with r12974. Refs trac:587 This commit was SVN r12991. The following SVN revision numbers were found above: r12974 --> open-mpi/ompi@27cea44a9c The following Trac tickets were found above: Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587	2007-01-04 20:17:07 +00:00
Brian Barrett	27cea44a9c	Fix a number of issues with the ompi_ptr_t: * Make sure that the pval always writes to the correct portion of the lval. This only matters on 32 bit big endian machines. * On 32 bit machines when assigning to pval, the other 4 bytes of lval weren't being written, which could lead to bogus data We use macros so that there aren't casts all over the code and the pval assignment can occur to the correct 4 bytes. Refs trac:587 This commit was SVN r12974. The following Trac tickets were found above: Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587	2007-01-03 19:47:48 +00:00
Jelena Pjesivac-Grbovic	3494e1bb05	- Updated decision function for Alltoall collective. Fixes "jump" for intermediate message sizes on 24+ nodes (at least on Grig cluster). This commit was SVN r12920.	2006-12-22 19:59:17 +00:00
George Bosilca	b1725e02d4	No more warnings plus some code reordering. This commit was SVN r12919.	2006-12-21 22:42:15 +00:00
Jelena Pjesivac-Grbovic	f1aec23507	Adding tuned allgather implementation. It contains four algorithms: Bruck (ceil(logP) steps), Recursive Doubling (log(P) for power-of-2 processes), Ring (P-1 steps), and Neighbor Exchange (P/2 steps for even number of processes). All algorithms passed occ, IMB-2.3, and intel verification tests from ompi-tests/ for up to 56 processes. The fixed decision function is based on results collected over MX on the Grig cluster at the University of Tennessee at Knoxville. I have also added (and commented out) copy of MPICH2 decision function for allgather (from their IJHPCA 2005 paper). This commit was SVN r12910.	2006-12-21 18:40:02 +00:00
Brian Barrett	6f8b366acb	Rename liborte to libopen-rte and libopal to libopen-pal per telecon today and bug #632. Refs trac:632 This commit was SVN r12762. The following Trac tickets were found above: Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632	2006-12-05 18:27:24 +00:00
Ralph Castain	6d6cebb4a7	Bring over the update to terminate orteds that are generated by a dynamic spawn such as comm_spawn. This introduces the concept of a job "family" - i.e., jobs that have a parent/child relationship. Comm_spawn'ed jobs have a parent (the one that spawned them). We track that relationship throughout the lineage - i.e., if a comm_spawned job in turn calls comm_spawn, then it has a parent (the one that spawned it) and a "root" job (the original job that started things). Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it. I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn). This commit was SVN r12597.	2006-11-14 19:34:59 +00:00
George Bosilca	ec410644ce	Implement the send receive as 2 non blocking operations. That will help us avoiding too many calls to opal_progress. This commit was SVN r12553.	2006-11-10 23:06:19 +00:00
George Bosilca	c2c6a1b37e	Correctly compute the number of elements in a segment. For broadcast send the correct size for all intermediary nodes. This commit was SVN r12552.	2006-11-10 23:04:50 +00:00
George Bosilca	7102147b9f	Correctly detect when the specified algorithm is out of range. In this case we reset it to zero. This commit was SVN r12551.	2006-11-10 21:47:07 +00:00
George Bosilca	af68171253	Use the macro to compute the number of elements in a segment in both bcast and reduce and update the default values for the variables as required by the comment in the coll_tuned.h file. This commit was SVN r12546.	2006-11-10 20:04:08 +00:00
George Bosilca	476b922074	Updates & upgrades: - consistent arguments checking (not allowing to select an algorithm which is not available) - consistent way of computing the segcount (number of datatypes by segment). - small cleanups. - more informative debugging messages. This commit was SVN r12545.	2006-11-10 19:54:09 +00:00
George Bosilca	77ef979457	New architecture for broadcast. A generic broadcast working on a tree description. Most of the bcast algorithms can be completed using this generic function once we create the tree structure. Add all kinds of trees. There are 2 versions of the generic bcast function. One using overlapping between receives (for intermediary nodes) and then blocking sends to all children and another where all sends are non blocking. I still have to figure out which one gives the smallest overhead. This commit was SVN r12530.	2006-11-10 05:53:50 +00:00
George Bosilca	1d80f685b5	Remove one compiler warning. This commit was SVN r12520.	2006-11-09 20:08:43 +00:00
George Bosilca	73eec4bfef	Show the MCA parameter coll_base_verbose only if Open MPI is compiled in debug mode. Otherwise there is no debug anyway ... This commit was SVN r12516.	2006-11-09 19:02:32 +00:00
George Bosilca	a82ce427e4	Update the number of reduce algorithms available. This commit was SVN r12503.	2006-11-08 22:20:34 +00:00
George Bosilca	0914892044	Small cleanups, some explicit casts. This commit was SVN r12494.	2006-11-08 16:54:03 +00:00
George Bosilca	74d3946342	Remove the call to set_args. This is only required for the MPI level, because there we have to be able to return to the user the description of the data. This commit was SVN r12493.	2006-11-08 16:52:48 +00:00
Jeff Squyres	427c20af0d	Use a new algorithm for allgatherv. The old algorithm essentially did N gatherv's: for (i = 0 ... size) MPI_Gatherv(..., root = i, ...) The new algorithm simply does (effectively): MPI_Gatherv(..., root = 0, ...) MPI_Bcast(..., root = 0, ...) This commit was SVN r12469.	2006-11-07 18:07:55 +00:00
George Bosilca	8529238d93	Add 2 more algorithms to the dynamic list. This commit was SVN r12415.	2006-11-02 19:19:08 +00:00
George Bosilca	393657ee26	Initialize the sndbuf in all cases. Do not forget to initialize the tree used in each of the broadcast functions. This commit was SVN r12332.	2006-10-27 00:13:33 +00:00
George Bosilca	126a68dc9a	Big datatype commit. Remove all unused features of the datatype engine. As the memory allocation logic is completely done outside the data-type engine (in the PML) there is no need for any special case inside the data-type engine. There are fewer arguments for ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is not required anymore as there is no memory allocated in the engine itself). This change affects all components using datatypes. I tested most of them, but it might happen that I missed some ... If that's the case please let me know (don't shoot the pianist!!). This commit was SVN r12331.	2006-10-26 23:11:26 +00:00
George Bosilca	ba3c247f2a	Big collective commit. I lightly tested it, but I think it should be quite stable. Anyway, the default decision functions (for broadcast, reduce and barrier) are based on a high performance network (not TCP). It should give good performance (really good) for any network having the following characteristics: small latency (5 microseconds) and good bandwidth (more than 1Gb/s). + Cleanup of the reduce algorithms, plus 2 new algorithms (binary and binomial). Now most of the reduce algorithms use a generic tree based function for completing the reduce. + Added macros for computing the trees (they are used for bcast and reduce right now). + Allow the usage of all 5 topologies. + Jelena's implementation of a binary tree that can be used for non commutative operations. Right now only the tree building function is there, it will get activated soon. + Some other minor cleanups. This commit was SVN r12326.	2006-10-26 22:53:05 +00:00
George Bosilca	99631ccf66	Cleanups. This commit was SVN r12272.	2006-10-23 22:29:17 +00:00
George Bosilca	d7d3f9e486	Tuned collectives works only for at least 2 processes. We have the self module for the other cases. This commit was SVN r12271.	2006-10-23 22:28:56 +00:00
George Bosilca	b848a5ad06	Remove all ompi_coll_chain_t references. This commit was SVN r12269.	2006-10-23 21:47:50 +00:00
George Bosilca	39cd8d3d17	One to rule them all. We only need one topology information: a tree. How we build it is what makes the difference. This commit was SVN r12268.	2006-10-23 21:46:30 +00:00
George Bosilca	9cf3040e5f	Allocate enough memory for the reduce operation when MPI_IN_PLACE is specified. This commit was SVN r12260.	2006-10-23 17:51:36 +00:00
George Bosilca	6b697ad3dd	If the operation is not commutative then force the basic reduce algorithm. The others cannot be used for non commutative operations ... yet ... This commit was SVN r12241.	2006-10-20 22:11:44 +00:00
George Bosilca	a7b6078b73	No more segfault. Still some wrong data around ... This commit was SVN r12238.	2006-10-20 20:17:34 +00:00
George Bosilca	02759cf515	Update the reduce chain collective. This commit was SVN r12237.	2006-10-20 19:47:52 +00:00
George Bosilca	06563b5dec	Last set of explicit conversions. We are now close to zero warnings on all platforms. The only exceptions (and I will not deal with them anytime soon) are on Windows: - the write functions which require the length to be an int when it's a size_t on all UNIX variants. - all iovec manipulation functions where the iov_len is again an int when it's a size_t on most of the UNIXes. As these only happen on Windows, I think we're set for now :) This commit was SVN r12215.	2006-10-20 03:57:44 +00:00
George Bosilca	527bb7a197	Remove a double ; This commit was SVN r12213.	2006-10-20 03:28:51 +00:00
George Bosilca	caefd6d0ee	Do not leak memory. Allocate the intermediary buffer only when we really need it (not on leaves) and release it the same way. This commit was SVN r12200.	2006-10-19 22:20:33 +00:00
George Bosilca	26b33ec2d7	If there is just one node, we don't need a decision function, just do the copy and return. This commit was SVN r12199.	2006-10-19 22:19:36 +00:00
George Bosilca	3eb2f90ceb	For the recursive doubling correctly compute the closest power of 2 number of nodes. This commit was SVN r12191.	2006-10-19 17:14:57 +00:00
George Bosilca	041fcb8d18	Update the barrier decision function. This commit was SVN r12190.	2006-10-19 17:14:01 +00:00
George Bosilca	c9da782804	Keep only one function to get the size of a datatype. This commit was SVN r12170.	2006-10-18 17:33:01 +00:00
George Bosilca	21ade43b96	Remove an unreachable statement. This commit was SVN r12166.	2006-10-18 16:43:55 +00:00
George Bosilca	be27ee6fa0	Correct the bcast problem where we always did a bcast with segsize of 0. Activate the reduce decision function. Other small updates (mostly TAB to spaces). This commit was SVN r12161.	2006-10-18 02:00:46 +00:00
George Bosilca	8852c00c36	Looks like a big commit but in fact it addresses only one issue: the way we're working with the size and displacement of data-types. After this patch all data can contain size_t bytes and the displacements are defined as ptrdiff_t. All of the files I was able to compile have been modified to match this requirement. This commit was SVN r12146.	2006-10-17 20:20:58 +00:00
Jeff Squyres	a8e9fa09da	Fix some compiler warnings introduced in r11619. I checked with George: ompi_ddt_type_size() returns a signed int only because of the MPI spec; it will never return a negative value. So casting the return value out of it to a (uint32_t) is safe, and makes the comparisons be between two unsigned values. This commit was SVN r11639. The following SVN revision numbers were found above: r11619 --> open-mpi/ompi@8667648a1b	2006-09-13 16:42:31 +00:00
Graham Fagg	8667648a1b	Simple fix (for ticket 363). We push segment size to type size. In other algorithms we switch off segmenting altogether. But really the DDT can probably handle partial types so we could really keep the segsize constant (for all but reduce ops) and treat it just as byte arrays.. todos: macroize it as we do it 10 different ways, add mca params to control handling (push up size, no change, switch off segmenting) This commit was SVN r11619.	2006-09-12 00:01:27 +00:00
Jeff Squyres	fb4d7ab268	* Fix svn:ignore * Remove files that should not be in SVN This commit was SVN r11565.	2006-09-08 10:35:45 +00:00
George Bosilca	3b39df8ae1	More protection around what we really want to get exported. This commit was SVN r11437.	2006-08-27 04:49:02 +00:00
Sami Ayyorgun	aa8cd63418	changed some barrier variables for shared-memory to volatile This commit was SVN r11403.	2006-08-24 16:53:10 +00:00
Torsten Hoefler	6b22641669	added LibNBC (http://www.unixer.de/NBC ) as collv1 (blocking) component. I know it does not make much sense but one can play around with the performance. Numbers are available at http://www.unixer.de/research/nbcoll/perf/. This is the first step towards collv2. Next step includes the addition of non-blocking functions to the MPI-Layer and the collv1 interface. It implements all MPI-1 collective algorithms in a non-blocking manner. However, the collv1 interface does not allow non-blocking collectives so that all collectives are used blocking by the ompi-glue layer. I wanted to add LibNBC as a separate subdirectory, but I could not convince the buildsystem (and did not have the time). So the component looks pretty messy. It would be great if somebody could explain to me how to move all nbc{c,h}, and {hb,dict}{c,h} to a separate subdirectory. It's .ompi_ignored because I did not test it exhaustively yet. This commit was SVN r11401.	2006-08-24 16:47:18 +00:00
George Bosilca	3f0a7cad9e	The last patch for Windows support. Mostly casting and conversion to C++ friendly headers. This commit was SVN r11400.	2006-08-24 16:38:08 +00:00
George Bosilca	6afa4c6c64	Windows friendly version. We have to split the OMPI_DECLSPEC in at least 3 different macros, one for each project. Therefore, now we have OPAL_DECLSPEC, ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project. This commit was SVN r11270.	2006-08-20 15:54:04 +00:00
Brian Barrett	4c101c6394	* rename the collectives sm bootstrap area to be consistent with other shared memory segments * make sure to properly unlink the collectives sm bootstrap area at shutdown * Add missing / in the path for the mpool shared memory segment * make sure to release the common_mmap structure in the SM btl after unlinking the file during shutdown This commit was SVN r10886.	2006-07-19 20:55:29 +00:00
George Bosilca	ee6fab783d	SwitchToThread is not defined by any library. Not even by the kernel32.lib as noted in the MSDN documentation. At least not on my WinXP Pro box. This commit was SVN r10719.	2006-07-11 05:36:04 +00:00
Graham Fagg	f10c21b746	corrected mca param description and algorithm count (now to find out why I have disallowed direct calling of the bm tree) This commit was SVN r10603.	2006-06-30 23:22:49 +00:00
Graham Fagg	f64cbbe8f2	Oops. Some decisions used extent rather than size for decision making. Yes, this means it WAS possible for two nodes to choose two different algorithms (discovered by Doug Gregor and figured out by George). Also changed some names like size to comsize so we know which sizes we are using where. This should be updated in all versions This commit was SVN r10601.	2006-06-30 21:49:04 +00:00
George Bosilca	29219ee57d	Thanks to Gleb now we are able to call the scheduler on Windows. Instead of using sched_yield, we use our friend SwitchToThread. This commit was SVN r9671.	2006-04-20 19:56:50 +00:00
Graham Fagg	c31a5ad4b3	A few small changes that just expanded in the name of neatness... (1) As pointed out by Torsten after Jeff's comment yesterday that there are 15 collectives.. nope.. I have 16 but miscounted them in my ifdefs (I had two #11s). Replaced with an enum... (2) Added a readonly MCA param for how many backend algorithms are available per collective (used by benchmarker/STS) This allowed me to remove the tuned query internal functions and replace them with ompi_coll_tuned_forced_max_algorithms[COLL]. (3) I was reading the user forced MCA params for the collectives on each comm create (module init) but I then put the values into a global set of variables (like ompi_coll_tuned_reduce_forced_algorithm). To fix this and make the code neater: (a) The component looks up the MCA param indices on Open if dynamic_rules is set via the ompi_coll_tuned_COLLECTIVE_intra_check_forced_init () call. (b) Replaced the ompi_coll_ompi_coll_tuned_COLLECTIVE_forced_algorithm/segmentsize/etc globals with a struct that is now cached on the module data hung off the communicator. i.e. done right. (c) On module init if dynamic rules enabled we call a general getvalues routine (in coll_tuned_forced.c) to get the CURRENT values using the MCA param indices and then put them on the modules data segment. A shorter version of getvalues exists for barrier which only needs the algorithm choice This commit was SVN r9663.	2006-04-19 23:42:06 +00:00
Tim Woodall	bd870519fd	- modified convertor copy_and_prepare routines to accept an addition flag, new flags to be included when convertor is initialized - modified pml/btl module defs and added stub functions for diagnostic output routines to dump state of queues / endpoints - updates to data reliability pml This commit was SVN r9329.	2006-03-17 18:46:48 +00:00
Jeff Squyres	8a9e76dfa3	Thanks to Sven for noticing that the increment in scatter should be per the send datatype, not the receive datatype (MPI-1:105). This commit was SVN r9312.	2006-03-16 18:18:28 +00:00
Graham Fagg	95b060c741	output the right name and stop confusing george This commit was SVN r9215.	2006-03-08 00:40:14 +00:00
George Bosilca	39252b764f	Correctly compute the size of the datatype. This commit was SVN r9127.	2006-02-23 04:30:52 +00:00
George Bosilca	805c45de29	Don't let a division by zero happen ... This commit was SVN r9109.	2006-02-22 06:34:05 +00:00
Brian Barrett	566a050c23	Next step in the project split, mainly source code re-arranging - move files out of toplevel include/ and etc/, moving it into the sub-projects - rather than including config headers with <project>/include, have them as <project> - require all headers to be included with a project prefix, with the exception of the config headers ({opal,orte,ompi}_config.h mpi.h, and mpif.h) This commit was SVN r8985.	2006-02-12 01:33:29 +00:00
Graham Fagg	232bb9534a	Start moving stuff out of modules that should be in the component. This commit was SVN r8874.	2006-02-01 20:50:14 +00:00
Graham Fagg	5f2d82347f	a couple of changes to make barrier synchronous.. means last communication to any possible peer must be locally completing. for now using synchronous calls until the new functionality is available. then will change the code to use the new PML send flags. This commit was SVN r8867.	2006-01-31 23:21:46 +00:00
Graham Fagg	25375759c3	Arrgh. For very small message sizes and proc counts, reduce could call a linear function. This was implemented using a chain (tree followed with pipeline) by setting the chain fanout to a factor of size etc., but the chain data structure was fixed in length, and if it was exceeded the topo create returned a null, which isn't helpful in the cid next function of comdup... Anyway, two fixes: first, we do have a real linear function, so changed the decision function; second, altered the topo chain create to force chain fanouts of less than 1 to 1 and fanouts bigger than max to max. Next check-in will change chain to a dynamically allocated (reallocable) array, but we shouldn't ever use a chain fanout for a linear tree anyway. (Lesson: must rerun all tests for all data sizes when changing decision functions.) This commit was SVN r8662.	2006-01-08 02:41:09 +00:00
George Bosilca	479d510eaf	Use the common SM component to unmap the shared memory file. This commit was SVN r8623.	2005-12-31 15:07:48 +00:00
Jeff Squyres	54c4bd3ce2	Update to have public symbols be consistent; use new prefix rule (apparently we've been doing this in opal and orte, but not in ompi yet). All public symbols begin with "ompi_coll_tuned_" (not mca_coll_tuned_) except the component struct. Now this component passes the illegal symbol report with no hits. This commit was SVN r8589.	2005-12-22 13:49:33 +00:00
Jeff Squyres	2435970cb8	Enable the new "tuned" coll component in an attempt to get wider testing. Note that this effectively replaces the "basic" component as the baseline collective component. Please report any problems with this component. If you run into problems with this component, you can disable it with: --mca coll_tuned_priority 0 This commit was SVN r8575.	2005-12-21 12:43:03 +00:00
Brian Barrett	a5af07cd6b	fixes suggested by Ralf for supporting both Libtool 1 and 2 in Open MPI... This commit was SVN r8538.	2005-12-19 03:10:23 +00:00
Graham Fagg	8651658816	minor compile warnings fix This commit was SVN r8497.	2005-12-14 19:09:46 +00:00
George Bosilca	6f45b6175a	Header protection. This commit was SVN r8441.	2005-12-10 22:11:10 +00:00
George Bosilca	79486e5922	Protect the min function on Windows as it's defined by default in windows.h This commit was SVN r8437.	2005-12-10 22:02:14 +00:00
George Bosilca	b7353c707d	Remove unprotected header files. This commit was SVN r8432.	2005-12-10 17:04:46 +00:00
Graham Fagg	141d4ea30a	cleaned up ready for changes to move cached data off the MCW module to the component (where they belong) This commit was SVN r8407.	2005-12-08 03:14:57 +00:00
Jeff Squyres	6fbd321442	Fix a bunch of install locations for header files This commit was SVN r8406.	2005-12-08 00:54:44 +00:00
Jeff Squyres	5f949b567d	forgot to commit this long ago This commit was SVN r8378.	2005-12-03 15:38:42 +00:00
Edgar Gabriel	83cef7f8ac	In scatterv we unfortunately tested the wrong datatype for the displacement (for both the inter- and intra-communicator versions). The displacements in scatterv are given in multiples of the sendtype. This fix should probably make it to v1.0.1 as well? This commit was SVN r8251.	2005-11-23 21:42:45 +00:00
George Bosilca	1aa6d27ffe	Remove all the compilation warnings I found including unused variables and functions. This commit was SVN r8226.	2005-11-22 03:42:15 +00:00
Brian Barrett	20cea60b82	* fix "make distclean" error in PML * turns out (duh!) that there was a reason that the <projectdir>dir variable was set in the AM conditional. If not, stupid directories are created and not needed... duh. This commit was SVN r8205.	2005-11-20 07:41:09 +00:00
Brian Barrett	8faa1884f0	* The last of the build system optimizations. Combine the component and component/base Makefile.am files, reducing the time configure spends stamping out Makefiles at the end * Install base_impl.h file when devel-headers are being installed This commit was SVN r8200.	2005-11-20 01:03:01 +00:00
George Bosilca	2749870f2c	I'm in for the tuned collective module :) This commit was SVN r8146.	2005-11-13 23:04:14 +00:00
George Bosilca	d8d13f879f	When --disable-debug is specified we have to explicitly include the opal/util/output.h header. This commit was SVN r8133.	2005-11-12 04:03:19 +00:00
Graham Fagg	877f7bbe6a	File based dynamic up and tested... Lots of misc fixes: printfs->opal_output, handles fanin/out correctly for forced ops unused vars, correct calculations on meaning of 'msgsize' for decision functions (varies depending on algorithm), etc This commit was SVN r8113.	2005-11-11 04:49:29 +00:00
George Bosilca	3507d5e9cd	opal/util/output.h is required for optimized builds. This commit was SVN r8076.	2005-11-10 01:19:27 +00:00
Graham Fagg	6b99301893	extra verbose in debug mode to help occ This commit was SVN r8061.	2005-11-09 21:01:35 +00:00
Graham Fagg	bcf8744bf6	valgrind saved me from a nasty order of eval error... i.e. derefing slected_data before setting it. Anyway fixed and no memory leaks in coll tuned so far. This commit was SVN r8037.	2005-11-08 04:52:30 +00:00
Graham Fagg	5b3ba944a8	Enabled, and running... todos. turn the debug messages into ompi ignorables and inot do some ops in ompi_bug mode This commit was SVN r8036.	2005-11-08 04:43:17 +00:00
Graham Fagg	833b558046	Full configuration file based control of tuned collectives. (verbose on bad config file and even cleans up after itself enough to make valgrind happy). This commit was SVN r8035.	2005-11-08 03:36:38 +00:00
Graham Fagg	39207db7cd	removed the n-dimension rule base.. replacing it with simpler code for V1 This commit was SVN r8033.	2005-11-08 03:03:51 +00:00
Graham Fagg	dcd3450e06	simplified the building of different rule sets (also corrected some prototypes missing 'struct') This commit was SVN r8003.	2005-11-06 22:05:50 +00:00
Jeff Squyres	42ec26e640	Update the copyright notices for IU and UTK. This commit was SVN r7999.	2005-11-05 19:57:48 +00:00
Graham Fagg	9547a635a9	snapshot while switching systems but, dynamic rules from a user defined config file is almost there now This commit was SVN r7943.	2005-11-01 00:19:05 +00:00
Graham Fagg	fe03e068f2	allow forced algorithms (where the user or test suite knows better) to go through the dynamic decision rule interface. (forced algorithms are set with MCA params) fixed some silly verbose output with wrong func name in it etc updates to fixed dec rules. This commit was SVN r7940.	2005-10-31 20:45:50 +00:00
Edgar Gabriel	2ec5fa5d24	- The component will remove itself from the list of potential collective modules, if its priority is zero (the default value). Reason for that is + if there is no other module with a priority > 0, the hierarchical collective module has a problem anyway, since it has to rely on the coll modules of the subcommunicators. On the other hand, if its priority is zero, it won't be chosen anyway, and we can simply save the allreduce/allgather and comm_split operations which might occur during hierarchy detection. + to improve the startup times until we have the modex thing which we discussed with Jeff and Tim in Knoxville in place - adding an mca parameter indicating a symmetric configuration. This can speed up startup times, since each process can conclude from its data onto the data of the other processes -> no need for the allreduce operations. Per default this parameter is set to "no". This commit was SVN r7932.	2005-10-30 16:01:13 +00:00
Graham Fagg	5bb0d4a053	enable allreduce to be selected This commit was SVN r7888.	2005-10-26 23:55:37 +00:00
Graham Fagg	2587d7ade9	added some more linear functions. minor corrections on naming and debug info This commit was SVN r7887.	2005-10-26 23:51:56 +00:00
Graham Fagg	c3e1dc410d	Started to add basic linear functions Also started to add the allreduce algorithms as I test them (i.e. if it goes in its after testing from now on) This commit was SVN r7886.	2005-10-26 23:11:32 +00:00
Jeff Squyres	d47ce065e9	Minor Makefile.am fix for static builds. This commit was SVN r7882.	2005-10-26 15:57:58 +00:00
Edgar Gabriel	ba3bf6592f	fixing some warnings. No idea yet why the static builds fail... This commit was SVN r7879.	2005-10-26 12:56:56 +00:00
Edgar Gabriel	d009d8de57	opening the hierarchical collective component to the public. I am at this stage fairly confident that - it works in most scenarios (with symmetric hierarchies, with asymmetric hierarchies, without hierarchies - it just removes itself) - it does not create too many problems (I am not aware of any at least) - it does not slow down startup anymore dramatically (thanks to the fixes of Brian, Jeff, Tim and a significant reduction in the number of collective operations in the comm_query) Any feedback is highly welcome. This commit was SVN r7868.	2005-10-25 18:38:43 +00:00
Edgar Gabriel	00c04ab56a	moving the hierarch collective component to the new parameter registration interface. This commit was SVN r7867.	2005-10-25 18:34:47 +00:00
Graham Fagg	382f05c7ad	Infrastructure changes. Started to add static (fixed if) statement based decision rules based on gigE numbers. Added mca params so that a user can force a certain algorithm/segment/topo on a per collective basis (this is not in the fixed call path but only in the dynamic (at com create) call path). (these params can be used by test suites such as OCC to choose which algorithm they are using). This commit was SVN r7854.	2005-10-25 03:55:58 +00:00
Graham Fagg	d8e32464cb	Oops. Setting/reading the mca option from the right variable would help. This commit was SVN r7850.	2005-10-24 21:33:48 +00:00
Edgar Gabriel	3a7efaf4d9	fix for reduce and allreduce for an unsymmetric case This commit was SVN r7802.	2005-10-18 19:20:48 +00:00
Edgar Gabriel	818b4af554	- reverting the logic in the hierarchy detection stuff. This can reduce the number of collective operations and simplifies the logic significantly. - introducing a special case if size of comm == 1, avoiding thus collective operations as well ( i.e. no need for hierarchies) - fix for an unsymmetric case. Still to be tested. This commit was SVN r7799.	2005-10-18 18:17:50 +00:00
Brian Barrett	1302cb4072	The next in a long line of crazed build system changes from Brian. This was originally suggested by Ralf Wildenhues, to try to speed autogen, configure, and make (and possibly even make install). Use automake's include directive to drastically reduce the number of Makefile files (although the number of Makefile.am files is the same - most are just included in a top-level Makefile.am). Also use an Automake SUBDIRs feature to eliminate the dynamic-mca tree, which was no longer really needed. This makes adding a framework easier (since you don't have to remember the dynamic-mca tree) and makes building faster (as make doesn't have to recurse through the dynamic-mca tree) This commit was SVN r7777.	2005-10-17 00:21:10 +00:00
Edgar Gabriel	7e45f64065	reduce has now been tested quite extensively for all (predefined) operations and for all root nodes and passed all tests. First cut on barrier (which from my perspective does not make sense from the performance point of view) and on allreduce (which might make sense), This commit was SVN r7774.	2005-10-15 22:24:44 +00:00
Edgar Gabriel	3fab9c628c	switching the root and creating (if necessary) the new local leader sub-communicators seems to work as well. Thoroughly tested with bcast, not yet that exhaustively tested for the reduction. This commit was SVN r7773.	2005-10-15 21:13:44 +00:00
Edgar Gabriel	7d34770456	further bugfixes. The hierarchy detection works now as far as I can see (even in unsymmetric situations). Bcast and reduce work as well. Still to test: the code which generates new local leader communicators, in case the root of the operation is not yet part of the lleader comm. This commit was SVN r7772.	2005-10-15 19:36:54 +00:00
Edgar Gabriel	63554d245f	further bugfixes This commit was SVN r7771.	2005-10-15 18:44:57 +00:00
Edgar Gabriel	92c7b77cbc	minor bug fixes This commit was SVN r7770.	2005-10-15 18:32:40 +00:00
Edgar Gabriel	ba163c611c	checkpoint before moving to a real cluster. Most of the recoding should be done. This version also doesn't break ompi (at least if its not chosen :-) ). New features compared to the version from last Thursday (where bcast and reduce seemed to work in most scenarios): - clearer internal infrastructure - ability to handle all root processes with a (hopefully) minimal number of local leader communicators. This commit was SVN r7769.	2005-10-15 17:04:01 +00:00
Edgar Gabriel	84c070fc0f	get rid of the different modes how to store the colorarray for now. Might be reintroduced later as an optimization. This commit was SVN r7762.	2005-10-14 18:11:21 +00:00
Edgar Gabriel	6d14440972	checkpoint for moving again to another machine. major rewrite to clean up internal interfaces in progress. This commit was SVN r7761.	2005-10-14 17:41:44 +00:00
Edgar Gabriel	770aeaf97b	modifications towards adding new local-leader communicators. This commit was SVN r7760.	2005-10-14 12:18:29 +00:00
Graham Fagg	636b42afff	handle nonexistent recv buf in reduce for non root processes (basic allreduce does this for mpi_in_place case) This commit was SVN r7759.	2005-10-14 00:00:37 +00:00
Graham Fagg	61b8218d76	MPI_IN_PLACE fix for reduce. (actually a work around for an optimisation in the reduce for not saving ops on the first recv of each segment) Minor change in topo. This commit was SVN r7758.	2005-10-13 23:38:21 +00:00
Edgar Gabriel	48f2563b4c	checkpoint. Moving to another machine. This commit was SVN r7757.	2005-10-13 20:04:26 +00:00
Edgar Gabriel	4b05359b16	minor fixes when freeing the component This commit was SVN r7756.	2005-10-13 18:22:16 +00:00
Edgar Gabriel	0a5a346bbb	first cut on the reduce operation. This commit was SVN r7755.	2005-10-13 17:58:13 +00:00
Edgar Gabriel	30af775d40	further fixes. The first hierarchical MPI_Bcast works! It's just ~100 times slower than basic at the moment :-) This commit was SVN r7754.	2005-10-13 17:34:42 +00:00
Edgar Gabriel	460b5cb840	further corrections to the hierarchy detection algorithms. It seems to work now as far as my tests show... This commit was SVN r7753.	2005-10-13 16:21:13 +00:00
Edgar Gabriel	f5d16419b2	fix in the logic regarding protocol detection. This commit was SVN r7749.	2005-10-13 15:07:35 +00:00
Edgar Gabriel	3e5ad3e681	Updates This commit was SVN r7738.	2005-10-12 20:56:29 +00:00
Edgar Gabriel	25518b63c5	first version of coll_hierarch which does not crash the rest of the library as long as its not selected :-) This commit was SVN r7707.	2005-10-11 22:05:24 +00:00
Edgar Gabriel	0675c22dab	updating with Jeff's help to the recent autogen/configure system This commit was SVN r7705.	2005-10-11 21:50:16 +00:00
Edgar Gabriel	7b07dbc163	another round of fixes. Unfortunately, I also have to provide a trivial version of reduce and gather to make all this work.... This commit was SVN r7702.	2005-10-11 21:26:07 +00:00
Edgar Gabriel	c8adc2e65e	coding around the collective operations This commit was SVN r7698.	2005-10-11 20:34:17 +00:00
Edgar Gabriel	083d0b9630	Checkpoint: most of the coding should be done for the basic infrastructure. This commit was SVN r7696.	2005-10-11 19:45:21 +00:00
Graham Fagg	607bdf51b6	Last Cleanup BEFORE adding last two methods and final cross over points. - new mca param calls - move printfs to OPAL_OUTPUT This commit was SVN r7692.	2005-10-11 18:51:03 +00:00
Edgar Gabriel	b42d4ac780	Checkpoint: - update the hierarch stuff to use btl's instead of ptl's - start the new logic regarding how to handle local leader communicators This commit was SVN r7691.	2005-10-11 17:29:59 +00:00
Jeff Squyres	b22fab2826	Fix for a bug Galen noticed yesterday -- make the shared memory only be allocated the first time a sm coll is selected for a communicator, not before. This commit was SVN r7647.	2005-10-06 13:17:27 +00:00
Jeff Squyres	83b5a675f9	Don't automatically take the first entry off the selected component list; be sure to check its priority against the basic component and take the one with the higher priority. This commit was SVN r7621.	2005-10-04 17:09:45 +00:00
Jeff Squyres	b17c4334c4	- Remove all vestiges of using the built-in mcb_tree from the reduce_inorder() function -- we don't use the tree at all. - Add more relevant "volatile"'s for the control buffers in the fragment mpool (and associated casts where necessary) This commit was SVN r7616.	2005-10-04 14:52:59 +00:00
Jeff Squyres	c7fe54ba44	- Remove some silly compiler warnings - Move the "process 0" logic out of the main loop in reduce to make the code a bit less complex (at the price of slight code duplication, but it is now significantly easier to read) - Fix problem with uniqueness guarantee in the bootstrap mpool -- using the CID alone was not sufficient to guarantee uniqueness; now use (CID, rank 0 process name) tuple to check for uniqueness - Made a few debugging help changes in coll_sm.h; especially helps debugging on uniprocessors This commit was SVN r7599.	2005-10-03 21:34:58 +00:00
Jeff Squyres	2cedfeec53	- Eliminate some unused base globals - Move one base global to the basic component and make it an MCA parameter - Convert the basic component to use the new MCA param API This commit was SVN r7598.	2005-10-03 21:07:42 +00:00
Jeff Squyres	57fb96b018	Clarification of a help message This commit was SVN r7597.	2005-10-03 21:06:13 +00:00
Jeff Squyres	ab099fa8cb	Re-indent; real commit with some changes coming shortly. This commit was SVN r7596.	2005-10-03 19:56:39 +00:00
Jeff Squyres	10064df0e9	Remove compiler warning This commit was SVN r7578.	2005-10-02 10:43:53 +00:00
Jeff Squyres	37fc944b01	Use the right number of segments per in-use flag when calculating offsets. This commit was SVN r7571.	2005-09-30 23:12:23 +00:00
Jeff Squyres	934caaf449	Fix at least one segv; use the right number of segments (i.e., the number of segments in the fragment pool, not in the bootstrap pool) This commit was SVN r7565.	2005-09-30 18:01:15 +00:00
Jeff Squyres	fcef1774d5	Per advice from Ralf W., change the pkgdata declarations in Makefile.am's to be a slightly more correct (and, more importantly, less error-prone) construct. This commit was SVN r7554.	2005-09-30 13:32:39 +00:00
Jeff Squyres	bc181d7130	Remove the .ompi_ignore so that everyone starts compiling this, but lower the default priority to 0 so that it's not active unless you specifically ask for it (this component needs more testing by people other than me before we unleash it on the public). This commit was SVN r7545.	2005-09-29 18:05:47 +00:00
Edgar Gabriel	67dd52efb1	making the allreduce and reduce_scatter tests pass as well This commit was SVN r7532.	2005-09-28 15:12:05 +00:00
Edgar Gabriel	dbbbd416df	fixing MPI_IN_PLACE for the log-reduce algorithm. This commit was SVN r7526.	2005-09-27 21:51:55 +00:00
Jeff Squyres	d67c31f238	Remove useless compiler warnings. This commit was SVN r7418.	2005-09-17 10:54:48 +00:00
Jeff Squyres	10d02b2110	Make sure to copy the right amount out of the temp buffer. This commit was SVN r7400.	2005-09-15 22:06:36 +00:00
Jeff Squyres	15d0a95202	- Remove extra whitespace from Makefile.am's from when we removed Makefile.options - Sample in each of the three projects of how to link against the relevant libraries so that when components are loaded into a parent process' space, we don't rely on the libopal/liborte/libmpi symbols being in the parent's public symbol namespace -- instead, dynamically link to the relevant libraries, allowing the dynamic linker to pull those libraries in at run-time, if needed This commit was SVN r7397.	2005-09-15 20:56:18 +00:00
Jeff Squyres	3ecfe02b83	- Properly handle MPI_IN_PLACE - Return MPI_ERR_ARG, not EINVAL This commit was SVN r7391.	2005-09-15 19:33:54 +00:00
Jeff Squyres	2c1186cd19	Fix up the offsets for the non-root gatherv in the IN_PLACE case. This commit was SVN r7389.	2005-09-15 18:21:18 +00:00
Jeff Squyres	7ca22d9416	- Correct to use the right offsets - Copy back to the right location in the non-rank-0-IN_PLACE case This commit was SVN r7384.	2005-09-15 15:15:23 +00:00
Jeff Squyres	406f0575eb	- Remove useless error check - Ensure err is set to MPI_SUCCESS on the IN_PLACE case This commit was SVN r7383.	2005-09-15 15:14:00 +00:00
Jeff Squyres	cbfb062a7d	Fix silly mistake for IN_PLACE handling in scan This commit was SVN r7380.	2005-09-15 12:47:17 +00:00
Jeff Squyres	068b9c72a2	Bunches of changes - remove redundant OBJ_CONSTRUCT in bcast - fix up some macros in coll_sm.h - check to ensure that if there are too many processes in the communicator (i.e., if we couldn't fit a flag for each of them in the control segment), then fail selection - setup the in_use flags properly - adapt to new mpool API - first working copy of reduce -- not tree-based (but still NUMA-aware), and only processes in order from process 0 to process N-1 -- do not have a tree-based and/or commutative version yet (i.e., process the results in whatever order they arrive) Reduce now passes the new ibm reduce_big.c test. Woo hoo! Time to declare success for the evening (and run the intel test tomorrow). This commit was SVN r7379.	2005-09-15 02:18:16 +00:00
Jeff Squyres	5365ae84b9	Remove extra variable. Still working with George / Edgar on reduce_log_intra(). This commit was SVN r7368.	2005-09-14 11:52:20 +00:00
Jeff Squyres	e0c47dd0bc	Fix for allreduce in IN_PLACE cases This commit was SVN r7364.	2005-09-14 02:42:32 +00:00
Jeff Squyres	0fcd682c4c	MPI-2 7.3.3 description of MPI_Allgatherv is wrong -- can't just have all processes call MPI_Gatherv(MPI_IN_PLACE...) because IN_PLACE is only allowed to be used at the root. Non-root processes must use their receive buf as the send buf. This commit was SVN r7363.	2005-09-14 02:21:33 +00:00
Graham Fagg	0f75381e56	Added various barrier routines: recursive doubling, bruck, double ring, 2proc etc all pass tests This commit was SVN r7355.	2005-09-13 20:58:42 +00:00
Jeff Squyres	5dca18f903	First cut of handling MPI_IN_PLACE: - added relevant logic for everything except mca_coll_basic_reduce_log_intra() -- need some help from George / Edgar on this one... - replaced ompi_ddt_sndrcv() with ompi_ddt_copy_content_same_ddt() where relevant - removed some "if (size > 1)" conditionals, because the self coll module will always be chosen for collectives where size==1 Waiting for BA's tests to check the validity of this IN_PLACE stuff. We'll see how it goes! This commit was SVN r7351.	2005-09-13 20:06:54 +00:00
Jeff Squyres	bd95f5d474	Arrgh -- check the right argument for IN_PLACE. This commit was SVN r7350.	2005-09-13 19:56:43 +00:00
Jeff Squyres	7c09923751	Updates: - Handle MPI_IN_PLACE - Use ompi_ddt_copy_content_same_ddt() where relevant This commit was SVN r7349.	2005-09-13 19:39:49 +00:00

701 commits