- fixing line lengths and some of the comments
- possible bug fix (though I do not think any tests have exposed it so far):
temporary buffers were allocated as multiples of extent instead of
true_extent + (count - 1) * extent.
Everything is still passing Intel tests over tcp and btl mx up to 64 nodes.
This commit was SVN r13956.
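For reference, a minimal sketch of the allocation described above, using the standard MPI extent queries rather than Open MPI's internal ddt calls; the function and variable names are illustrative only:
    #include <stdlib.h>
    #include <mpi.h>

    /* Sketch: size a scratch buffer that holds `count` elements of an
     * arbitrary (possibly resized) datatype.  The span of the data is
     * true_extent + (count - 1) * extent, not count * extent. */
    static void *alloc_tmp_buffer(MPI_Datatype dtype, int count, char **base)
    {
        MPI_Aint lb, extent, true_lb, true_extent;
        MPI_Type_get_extent(dtype, &lb, &extent);
        MPI_Type_get_true_extent(dtype, &true_lb, &true_extent);

        char *raw = malloc(true_extent + (count - 1) * extent);
        *base = raw - true_lb;   /* address to hand to pack/unpack/reduce */
        return raw;              /* address to free later */
    }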
Currently 3 algorithms are available:
- non-overlapping (reduce followed by scatterv; works for non-commutative operations)
- recursive halving algorithm (copied from basic module)
- ring algorithm (similar to allreduce ring, for large messages)
This commit was SVN r13929.
The algorithm allows the user to specify the segment size to be used for computation/communication overlap.
The additional memory requirement of the algorithm is 2 x segment size.
It performed well for (really) large message sizes over MX and passed the Intel Allreduce_c and Allreduce_loc_c tests.
This commit was SVN r13832.
- the block sizes are computed in a more uniform way.
The first k blocks may be 1 element larger than the remaining blocks.
The algorithm passed Intel Allreduce_c and Allreduce_loc_c tests, and
IMB-3.2 Allreduce, over TCP and both btl and mtl MX (up to 128 processes).
The algorithm still only supports commutative operations.
This commit was SVN r13738.
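A minimal sketch of the block-size computation described above (names are illustrative):
    /* Sketch: split `count` elements over `num_blocks` as evenly as possible;
     * the first (count % num_blocks) blocks are one element larger. */
    static void block_size_and_offset(int count, int num_blocks, int i,
                                      int *bcount, int *boffset)
    {
        int base = count / num_blocks;
        int k    = count % num_blocks;        /* number of "large" blocks */
        *bcount  = base + (i < k ? 1 : 0);
        *boffset = i * base + (i < k ? i : k);
    }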
Outstanding requests can be limited using MCA parameters.
The implementation passed Intel, IMB-3.2, and mpi_test_suite tests over
TCP and MX up to 128 processes (64 nodes), on both 32-bit and 64-bit machines.
It is not activated by default, but it should be useful for really large
communicator sizes.
This commit was SVN r13720.
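A minimal sketch of the idea of capping outstanding requests; the actual MCA parameter names and Open MPI's request handling are not shown, and the fixed-size request array is an assumption:
    #include <stddef.h>
    #include <mpi.h>

    /* Sketch: keep at most `max_outstanding` isends in flight; wait for one
     * to complete before posting the next segment. */
    static void window_limited_isend(const char *buf, int num_segments, int segcount,
                                     MPI_Aint extent, MPI_Datatype dtype, int peer,
                                     int tag, MPI_Comm comm, int max_outstanding)
    {
        MPI_Request reqs[64];                 /* assumes max_outstanding <= 64 */
        int inflight = 0;
        for (int s = 0; s < num_segments; ++s) {
            if (inflight == max_outstanding) {
                int idx;
                MPI_Waitany(inflight, reqs, &idx, MPI_STATUS_IGNORE);
                reqs[idx] = reqs[--inflight]; /* compact the active set */
            }
            MPI_Isend(buf + (size_t)s * segcount * extent, segcount, dtype,
                      peer, tag, comm, &reqs[inflight++]);
        }
        MPI_Waitall(inflight, reqs, MPI_STATUSES_IGNORE);
    }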
The implementation passed the Intel MPI_Reduce_c, MPI_Reduce_loc_c, and MPI_Reduce_user_c tests
over TCP, BTL MX, and MTL MX, as well as the mpi_test_suite Reduce tests (up to 64 nodes).
The algorithm is still not activated by the decision function (it will be in the near future).
This commit was SVN r13657.
The step used to iterate through the buffer was a function of true_extent instead of extent.
This may or may not solve ticket #689, because I am still getting failures over BTL MX,
but I cannot reproduce the failures over MTL MX or TCP.
This commit was SVN r13459.
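A minimal sketch of the distinction being fixed here: the stride between consecutive elements is the extent, while true_extent only describes the span of the data without lb/ub padding:
    #include <mpi.h>

    /* Sketch: iterate over `count` elements of a datatype.  The step is
     * `extent`, NOT `true_extent`. */
    static void for_each_element(char *buf, int count, MPI_Datatype dtype)
    {
        MPI_Aint lb, extent;
        MPI_Type_get_extent(dtype, &lb, &extent);
        for (int i = 0; i < count; ++i) {
            char *elem = buf + (MPI_Aint)i * extent;
            (void)elem;                       /* ... operate on element i ... */
        }
    }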
MPICH2 for "small" commutative operations in the reduce_scatter basic
implementation. "small" is currently pretty big, as it doesn't take
much to beat reduce/scatterv. Need to do much more than this for
better all-around performance of MPI_Reduce_scatter, but this was enough
to solve the problems I was having.
This commit was SVN r13348.
- post isends in the reverse order of posting the irecvs.
If the messages arrive approximately in order, this should
minimize the time spent matching the requests.
I did not see any performance difference over MX up to 64 nodes, but
the change makes sense and may have some impact when we have (many)
more nodes.
This commit was SVN r13337.
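A minimal sketch of the posting order described above (the buffer and peer arrays are illustrative):
    #include <mpi.h>

    /* Sketch: post all irecvs in increasing peer order, then the isends in
     * the opposite order; if messages arrive roughly in posting order, this
     * should minimize the time spent matching the requests. */
    static void exchange_reverse_order(char **sbufs, char **rbufs, int n, int count,
                                       MPI_Datatype dtype, const int *peers, int tag,
                                       MPI_Comm comm, MPI_Request *reqs /* 2*n */)
    {
        for (int i = 0; i < n; ++i)
            MPI_Irecv(rbufs[i], count, dtype, peers[i], tag, comm, &reqs[i]);
        for (int i = n - 1; i >= 0; --i)
            MPI_Isend(sbufs[i], count, dtype, peers[i], tag, comm, &reqs[n + i]);
        MPI_Waitall(2 * n, reqs, MPI_STATUSES_IGNORE);
    }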
- Allreduce algorithms:
- Recursive doubling is used for small messages (up to 10KB) and can be used for
both commutative and non-commutative operations.
Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and
Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with
MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes.
- The ring algorithm performs well for larger messages but cannot be used for
non-commutative operations. It passed the same tests as recursive doubling, except
some of the non-commutative tests in the Intel benchmarks Allreduce_loc_c and Allreduce_user_c
(which was expected).
- MPI_Allreduce with new decision function passed all of the tests mentioned above.
- Cleaning up coll_tuned_util. Moving isendrecv to static inline just like sendrecv.
This commit was SVN r13252.
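A minimal sketch of the recursive doubling pattern on a power-of-two communicator, using plain MPI calls rather than the tuned module's internals; MPI_Reduce_local stands in for the internal reduction, and operand ordering for non-commutative operations is glossed over:
    #include <mpi.h>

    /* Sketch: `result` enters holding the local contribution and leaves
     * holding the reduced value on every rank; `tmp` is scratch space. */
    static void recursive_doubling_allreduce(void *result, void *tmp, int count,
                                             MPI_Datatype dtype, MPI_Op op,
                                             int tag, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        for (int mask = 1; mask < size; mask <<= 1) {
            int partner = rank ^ mask;
            MPI_Sendrecv(result, count, dtype, partner, tag,
                         tmp,    count, dtype, partner, tag,
                         comm, MPI_STATUS_IGNORE);
            MPI_Reduce_local(tmp, result, count, dtype, op);  /* result = tmp op result */
        }
    }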
- removing the static qualifier from ompi_coll_tuned_sendrecv
- adding an ompi_coll_tuned_isendrecv function, which posts isend and irecv requests
These changes are separate from but necessary for new algorithms I am working on.
This commit was SVN r13161.
- utilizing coll_tuned_util functions
- setting line length to 80.
This implementation uses standard sends (instead of synchronous ones).
The change improved our performance over MX several times over; however,
there is a small chance that the last message to be sent can be delayed
(until the next MPI call, which could be indefinitely).
If this turns out to be a problem, I will modify the algorithms to use a synchronous
send as the last operation (which will incur the performance penalty again).
This commit was SVN r13071.
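A minimal sketch of the trade-off described above; segment addressing is simplified, and the sync_last switch is hypothetical:
    #include <stddef.h>
    #include <mpi.h>

    /* Sketch: all segments go out as standard sends; if the delayed-last-
     * message issue ever shows up, the final segment could use MPI_Ssend
     * instead, at the performance cost mentioned above. */
    static void pipelined_send(const char *buf, int num_segments, int segcount,
                               MPI_Aint extent, MPI_Datatype dtype, int peer,
                               int tag, MPI_Comm comm, int sync_last)
    {
        for (int s = 0; s < num_segments; ++s) {
            const char *seg = buf + (size_t)s * segcount * extent;
            if (sync_last && s == num_segments - 1)
                MPI_Ssend(seg, segcount, dtype, peer, tag, comm);
            else
                MPI_Send(seg, segcount, dtype, peer, tag, comm);
        }
    }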
- in the allgather algorithms I replaced the irecv-isend-waitall sequence with
a call to ompi_coll_tuned_sendrecv
- most of the functions in the util code and the allgather decision function conform to an 80-character line width.
This commit was SVN r13069.
components that use configure.m4 for configuration or are always built.
The macro has not been needed since moving to configure types other than
configure.stub
Fixes trac:590
This commit was SVN r13031.
The following Trac tickets were found above:
Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
* Make sure that the pval always writes to the correct portion of the
lval. This only matters on 32-bit big-endian machines.
* On 32-bit machines, when assigning to pval, the other 4 bytes of lval
weren't being written, which could lead to bogus data.
We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes. Refs trac:587
This commit was SVN r12974.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
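For illustration only (the type and macro names below are not the actual Open MPI ones), the kind of construct being described:
    #include <stdint.h>

    /* With a plain union, a 4-byte pointer aliases the first 4 bytes of the
     * 64-bit lval -- the HIGH half on a 32-bit big-endian machine -- and a
     * bare `u.pval = p` leaves the other 4 bytes undefined.  Funnelling the
     * assignment through a macro lets the code zero the whole lval and target
     * the correct 4 bytes without casts all over the place. */
    typedef union {
        uint64_t lval;
        uint32_t ival;
        void    *pval;
    } ptr_value_t;

    #define SET_PVAL(u, p) do { (u).lval = 0; (u).pval = (p); } while (0)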
It contains four algorithms:
Bruck (ceil(log P) steps), Recursive Doubling (log P steps for power-of-two numbers of processes), Ring (P-1 steps),
and Neighbor Exchange (P/2 steps for an even number of processes).
All algorithms passed OCC, IMB-2.3, and Intel verification tests from ompi-tests/ for up to 56 processes.
The fixed decision function is based on results collected over MX on the Grig cluster at
the University of Tennessee at Knoxville.
I have also added (and commented out) a copy of the MPICH2 decision function for allgather
(from their IJHPCA 2005 paper).
This commit was SVN r12910.
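A minimal sketch of the ring allgather mentioned above (P-1 steps), treating blocks as raw bytes for brevity:
    #include <string.h>
    #include <mpi.h>

    /* Sketch: in step s each rank forwards the block it obtained in the
     * previous step to its right neighbour while receiving a new block from
     * its left neighbour; after size-1 steps every rank holds all blocks. */
    static void ring_allgather(const void *sbuf, void *rbuf, int blk_bytes,
                               int tag, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        char *r = (char *)rbuf;
        int right = (rank + 1) % size;
        int left  = (rank + size - 1) % size;
        memcpy(r + (size_t)rank * blk_bytes, sbuf, blk_bytes);
        for (int s = 0; s < size - 1; ++s) {
            int send_blk = (rank - s + size) % size;
            int recv_blk = (rank - s - 1 + size) % size;
            MPI_Sendrecv(r + (size_t)send_blk * blk_bytes, blk_bytes, MPI_BYTE, right, tag,
                         r + (size_t)recv_blk * blk_bytes, blk_bytes, MPI_BYTE, left,  tag,
                         comm, MPI_STATUS_IGNORE);
        }
    }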
Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it.
I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn).
This commit was SVN r12597.
- consistent argument checking (not allowing selection of an algorithm which
is not available)
- consistent way of computing the segcount (number of datatype elements per segment).
- small cleanups.
- more informative debugging messages.
This commit was SVN r12545.
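A minimal sketch of the segcount computation described above (parameter names are illustrative):
    #include <mpi.h>

    /* Sketch: turn a requested segment size in bytes into a segment count in
     * whole datatype elements, clamped to the message size. */
    static int compute_segcount(int segsize_bytes, int count, MPI_Datatype dtype)
    {
        int type_size;
        MPI_Type_size(dtype, &type_size);
        if (segsize_bytes <= 0 || segsize_bytes < type_size)
            return count;                     /* no segmentation */
        int segcount = segsize_bytes / type_size;
        return segcount > count ? count : segcount;
    }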
description. Most of the bcast algorithms can be completed using this
generic function once we create the tree structure. Add all kinds of
trees.
There are 2 versions of the generic bcast function: one uses overlapping
between receives (for intermediary nodes) followed by blocking sends to all
children, and another where all sends are non-blocking. I still have to
figure out which one gives the smallest overhead.
This commit was SVN r12530.
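A minimal sketch of the first (blocking-send) variant of such a generic tree bcast; the tree structure shown is illustrative, not the actual Open MPI one, and the non-blocking variant would replace the sends with MPI_Isend followed by a single MPI_Waitall:
    #include <mpi.h>

    #define SKETCH_MAX_FANOUT 32
    struct sketch_tree { int parent; int nchildren; int children[SKETCH_MAX_FANOUT]; };

    /* Sketch: receive the data from the parent (the root has none), then
     * forward it to every child. */
    static void tree_bcast(void *buf, int count, MPI_Datatype dtype,
                           const struct sketch_tree *t, int tag, MPI_Comm comm)
    {
        if (t->parent != MPI_PROC_NULL)
            MPI_Recv(buf, count, dtype, t->parent, tag, comm, MPI_STATUS_IGNORE);
        for (int c = 0; c < t->nchildren; ++c)
            MPI_Send(buf, count, dtype, t->children[c], tag, comm);
    }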
N gatherv's:
for (i = 0 ... size)
MPI_Gatherv(..., root = i, ...)
The new algorithm simply does (effectively):
MPI_Gatherv(..., root = 0, ...)
MPI_Bcast(..., root = 0, ...)
This commit was SVN r12469.
allocation logic is completely done outside the data-type engine (in the PML), so there is
no need for any special case inside the data-type engine. There are fewer arguments for
ompi_convertor_pack and ompi_convertor_unpack as well (the last field, free_after, is
not required anymore as there is no memory allocated in the engine itself). This change
affects all components using datatypes. I tested most of them, but it may happen that I
missed some ... If that's the case please let me know (don't shoot the pianist!!).
This commit was SVN r12331.
the default decision functions (for broadcast, reduce and barrier) are based on a
high-performance network (not TCP). They should give good performance (really good) for
any network having the following characteristics: small latency (5 microseconds) and good
bandwidth (more than 1 Gb/s).
+ Cleanup of the reduce algorithms, plus 2 new algorithms (binary and binomial). Now most
of the reduce algorithms use a generic tree-based function for completing the reduce.
+ Added macros for computing the trees (they are used for bcast and reduce right now).
+ Allow the usage of all 5 topologies.
+ Jelena's implementation of a binary tree that can be used for non-commutative operations.
Right now only the tree building function is there, it will get activated soon.
+ Some other minor cleanups.
This commit was SVN r12326.
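As an aside, a minimal sketch of the binomial tree relation that such tree-building macros typically encode (rooted at rank 0; names are illustrative):
    #include <mpi.h>

    /* Sketch: the parent of `rank` clears its lowest set bit; the children
     * add each power of two below that bit, as long as they stay inside the
     * communicator.  For rank 0 the parent is MPI_PROC_NULL. */
    static int binomial_children(int rank, int size, int *children /* >= 32 slots */)
    {
        int n = 0;
        for (int mask = 1; mask < size; mask <<= 1) {
            if (rank & mask)
                break;                        /* reached the lowest set bit */
            if (rank + mask < size)
                children[n++] = rank + mask;
        }
        return n;                             /* parent: rank & (rank - 1) */
    }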
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
when it's a size_t on most of the UNIXes.
As these only happen on Windows, I think we're set for now :)
This commit was SVN r12215.
size and displacement of data-types. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.
This commit was SVN r12146.
George: ompi_ddt_type_size() returns a signed int only because of the
MPI spec; it will never return a negative value. So casting the
return value out of it to a (uint32_t) is safe, and makes the
comparisons be between two unsigned values.
This commit was SVN r11639.
The following SVN revision numbers were found above:
r11619 --> open-mpi/ompi@8667648a1b
To do: macroize it, as we do it 10 different ways; add MCA params to control handling (push up size, no change, switch off segmenting).
This commit was SVN r11619.
I know it does not make much sense but one can play around with the
performance. Numbers are available at http://www.unixer.de/research/nbcoll/perf/.
This is the first step towards collv2. The next step includes the addition
of non-blocking functions to the MPI layer and the collv1 interface.
It implements all MPI-1 collective algorithms in a non-blocking manner.
However, the collv1 interface does not allow non-blocking collectives, so
all collectives are invoked in a blocking manner by the ompi glue layer.
I wanted to add LibNBC as a separate subdirectory, but I could not
convince the build system (and did not have the time). So the component looks
pretty messy. It would be great if somebody could explain to me how to move
all nbc*{c,h} and {hb,dict}*{c,h} to a separate subdirectory.
It's .ompi_ignored because I did not test it exhaustively yet.
This commit was SVN r11401.
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.
This commit was SVN r11270.
shared memory segments
* make sure to properly unlink the collectives sm bootstrap area at
shutdown
* Add missing / in the path for the mpool shared memory segment
* make sure to release the common_mmap structure in the SM btl
after unlinking the file during shutdown
This commit was SVN r10886.
Yes, this means it WAS possible for two nodes to choose two different algorithms
(discovered by Doug Gregor and figured out by George).
Also changed some names like size to comsize so we know which sizes we are using where.
This should be updated in all versions.
This commit was SVN r10601.
(1) As pointed out by Torsten after Jeff's comment yesterday that there are 15 collectives: nope, I have 16 but
miscounted them in my ifdefs (I had two #11s). Replaced them with an enum...
(2) Added a readonly MCA param for how many backend algorithms are available per collective (used by benchmarker/STS)
This allowed me to remove the tuned query internal functions and replace them with ompi_coll_tuned_forced_max_algorithms[COLL].
(3) I was reading the user forced MCA params for the collectives on each comm create (module init) but I then put the
values into a global set of variables (like ompi_coll_tuned_reduce_forced_algorithm).
To fix this and make the code neater:
(a) The component looks up the MCA param indices on open, if dynamic_rules is set, via the
ompi_coll_tuned_COLLECTIVE_intra_check_forced_init() call.
(b) Replaced the ompi_coll_ompi_coll_tuned_COLLECTIVE_forced_algorithm/segmentsize/etc globals with a struct that
is now cached on the module data hung off the communicator, i.e. done right.
(c) On module init if dynamic rules enabled we call a general getvalues routine (in coll_tuned_forced.c) to get the
CURRENT values using the MCA param indices and then put them on the modules data segment.
A shorter version of getvalues exists for barrier, which only needs the algorithm choice.
This commit was SVN r9663.
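For illustration only (the field names below are not the actual Open MPI ones), the shape of the per-communicator cache described in (b):
    /* Sketch: the forced selections read from the MCA param indices at module
     * init are cached in a small structure hung off the module data of each
     * communicator, instead of living in globals. */
    typedef struct {
        int algorithm;        /* forced backend algorithm, 0 = use decision function */
        int segsize;          /* forced segment size in bytes */
        int tree_fanout;      /* forced tree/chain fanout */
        int max_requests;     /* forced cap on outstanding requests */
    } forced_settings_t;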
flag, new flags to be included when convertor is initialized
- modified pml/btl module defs and added stub functions for diagnostic
output routines to dump state of queues / endpoints
- updates to data reliability pml
This commit was SVN r9329.
- move files out of the toplevel include/ and etc/, moving them into the
sub-projects
- rather than including config headers with <project>/include,
have them as <project>
- require all headers to be included with a project prefix, with
the exception of the config headers ({opal,orte,ompi}_config.h
mpi.h, and mpif.h)
This commit was SVN r8985.
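An illustrative example of the include convention described above (the specific headers shown are just typical examples):
    #include "ompi_config.h"                     /* config headers keep no prefix */
    #include "opal/util/output.h"                /* everything else is included   */
    #include "ompi/communicator/communicator.h"  /* with its project prefix       */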
be locally completing. For now, using synchronous calls until the new functionality is available; then the code
will be changed to use the new PML send flags.
This commit was SVN r8867.
This was implemented using a chain (a tree followed with a pipeline) by setting the chain fanout to a factor of the size, but the chain data structure was fixed in length and, if exceeded, the topo create returned NULL, which isn't helpful in the cid next function of comm dup...
Anyway, two fixes: first, we do have a real linear function, so the decision function was changed; second, the
topo chain create was altered to force chain fanouts of less than 1 to 1 and fanouts bigger than max to max.
The next check-in will change the chain to a dynamically allocated (reallocable) array, but we shouldn't ever use a chain fanout for a linear tree anyway.
(Lesson: must rerun all tests for all data sizes when changing decision functions.)
This commit was SVN r8662.
(apparently we've been doing this in opal and orte, but not in ompi
yet). All public symbols begin with "ompi_coll_tuned_" (not
mca_coll_tuned_) except the component struct. Now this component
passes the illegal symbol report with no hits.
This commit was SVN r8589.
testing. Note that this effectively replaces the "basic" component as
the baseline collective component. Please report any problems with
this component.
If you run into problems with this component, you can disable it with:
--mca coll_tuned_priority 0
This commit was SVN r8575.
displacement (for both the inter- and intra-communicator versions). The
displacements in scatterv are given in multiples of the sendtype.
This fix should probably make it into v1.0.1 as well?
This commit was SVN r8251.
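A small example of the convention being fixed: MPI_Scatterv displacements are offsets into the send buffer in units of sendtype, not in bytes (equal-sized blocks assumed for simplicity):
    #include <stdlib.h>
    #include <mpi.h>

    static void scatter_equal_blocks(const void *sendbuf, void *recvbuf, int block,
                                     MPI_Datatype type, int root, MPI_Comm comm)
    {
        int size;
        MPI_Comm_size(comm, &size);
        int *counts = malloc(size * sizeof(int));
        int *displs = malloc(size * sizeof(int));
        for (int i = 0; i < size; ++i) {
            counts[i] = block;
            displs[i] = i * block;            /* element offsets, not bytes */
        }
        MPI_Scatterv(sendbuf, counts, displs, type,
                     recvbuf, block, type, root, comm);
        free(counts);
        free(displs);
    }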
* turns out (duh!) that there was a reason that the <projectdir>dir
variable was set in the AM conditional. If not, stupid directories
are created and not needed... duh.
This commit was SVN r8205.
component/base Makefile.am files, reducing the time configure spends
stamping out Makefiles at the end
* Install base_impl.h file when devel-headers are being installed
This commit was SVN r8200.
Lots of misc fixes: printfs -> opal_output, handle fanin/fanout correctly for forced ops,
remove unused vars, correct calculations of the meaning of 'msgsize' for decision functions
(it varies depending on the algorithm), etc.
This commit was SVN r8113.
go through the dynamic decision rule interface.
(forced algorithms are set with MCA params)
Fixed some silly verbose output with the wrong function name in it, etc.
Updates to the fixed decision rules.
This commit was SVN r7940.
modules, if its priority is zero (the default value). The reasons for that are:
+ if there is no other module with a priority > 0, the hierarchical
collective module has a problem anyway, since it has to rely on the coll
modules of the subcommunicators. On the other hand, if its priority is
zero, it won't be chosen anyway, and we can simply save the
allreduce/allgather and comm_split operations which might occur during
hierarchy detection.
+ to improve the startup times until we have the modex thing which we
discussed with Jeff and Tim in Knoxville in place
- adding an MCA parameter indicating a symmetric configuration. This can
speed up startup times, since each process can draw conclusions from its own data about
the data of the other processes -> no need for the allreduce operations. By
default this parameter is set to "no".
This commit was SVN r7932.
stage fairly confident that
- it works in most scenarios (with symmetric hierarchies, with asymmetric
hierarchies, without hierarchies - it just removes itself)
- it does not create too many problems (I am not aware of any at least)
- it no longer slows down startup dramatically (thanks to the fixes of
Brian, Jeff, Tim and a significant reduction in the number of collective
operations in the comm_query)
Any feedback is highly welcome.
This commit was SVN r7868.
Started to add static (fixed if-statement based) decision rules based on GigE numbers.
Added MCA params so that a user can force a certain algorithm/segment size/topology on a per-collective basis
(this is not in the fixed call path but only in the dynamic (at comm create) call path).
(These params can be used by test suites such as OCC to choose which algorithm they are using.)
This commit was SVN r7854.
number of collective operations and simplifies the logic significantly.
- introducing a special case if the size of the comm == 1, thus avoiding collective
operations as well (i.e. no need for hierarchies)
- fix for an asymmetric case. Still to be tested.
This commit was SVN r7799.
originally suggested by Ralf Wildenhues, to try to speed autogen, configure,
and make (and possibly even make install). Use automake's include directive
to drastically reduce the number of Makefile files (although the number of
Makefile.am files is the same - most are just included in a top-level
Makefile.am). Also use an Automake SUBDIRs feature to eliminate the
dynamic-mca tree, which was no longer really needed. This makes adding
a framework easier (since you don't have to remember the dynamic-mca
tree) and makes building faster (as make doesn't have to recurse through
the dynamic-mca tree)
This commit was SVN r7777.
and for all root nodes and passed all tests.
First cut on barrier (which from my perspective does not make sense from the
performance point of view) and on allreduce (which might make sense).
This commit was SVN r7774.
done. This version also doesn't break ompi (at least if it's not chosen :-) ).
New features compared to the version from last Thursday (where bcast and
reduce seemed to work in most scenarios):
- clearer internal infrastructure
- ability to handle all root processes with a (hopefully) minimal number of
local leader communicators.
This commit was SVN r7769.
(actually a workaround for an optimisation in the reduce for not saving ops on the first recv of each segment)
Minor change in topo.
This commit was SVN r7758.
- update the hierarch stuff to use btl's instead of ptl's
- start the new logic regarding how to handle local leader communicators
This commit was SVN r7691.
reduce_inorder() function -- we don't use the tree at all.
- Add more relevant "volatile"'s for the control buffers in the
fragment mpool (and associated casts where necessary)
This commit was SVN r7616.
- Move the "process 0" logic out of the main loop in reduce to make
the code a bit less complex (at the price of slight code
duplication, but it iss now significantly easier to read)
- Fix problem with uniquenes guarantee in the bootstrap mpool -- using
the CID alone was not sufficient enough to guarantee uniquenes; now
use (CID, rank 0 process name) tuple to check for uniqueness
- Made a few debugging help changes in coll_sm.h; especially helps
debugging on uniprocessors
This commit was SVN r7599.
- Move one base global to the basic component and make it an MCA
parameter
- Convert the basic component to use the new MCA param API
This commit was SVN r7598.
lower the default priority to 0 so that it's not active unless you
specifically ask for it (this component needs more testing by people
other than me before we unleash it on the public).
This commit was SVN r7545.
Makefile.options
- Sample in each of the three projects of how to link against the
relevant libraries so that when components are loaded into a parent
process' space, we don't rely on the libopal/liborte/libmpi symbols
being in the parent's public symbol namespace -- instead,
dynamically link to the relevant libraries, allowing the dynamic
linker to pull those libraries in at run-time, if needed
This commit was SVN r7397.
- remove redundant OBJ_CONSTRUCT in bcast
- fix up some macros in coll_sm.h
- check to ensure that if there are too many processes in the
communicator (i.e., if we couldn't fit a flag for each of them in
the control segment), then fail selection
- setup the in_use flags properly
- adapt to new mpool API
- first working copy of reduce -- not tree-based (but still
NUMA-aware), and only processes in order from process 0 to process
N-1 -- do not have a tree-based and/or commutative version yet
(i.e., process the results in whatever order they arrive)
Reduce now passes the new ibm reduce_big.c test. Woo hoo! Time to
declare success for the evening (and run the intel test tomorrow).
This commit was SVN r7379.
all processes call MPI_Gatherv(MPI_IN_PLACE...) because IN_PLACE is
only allowed to be used at the root. Non-root processes must use
their receive buf as the send buf.
This commit was SVN r7363.
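A small example of the rule being fixed here: MPI_IN_PLACE is valid only at the root of MPI_Gatherv, so non-root ranks must pass their actual send buffer:
    #include <mpi.h>

    static void gatherv_in_place(void *buf, int count, MPI_Datatype type,
                                 const int *counts, const int *displs,
                                 int root, MPI_Comm comm)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);
        if (rank == root)
            MPI_Gatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                        buf, counts, displs, type, root, comm);
        else
            MPI_Gatherv(buf, count, type,
                        NULL, NULL, NULL, type, root, comm);
    }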
- added relevant logic for everything except
mca_coll_basic_reduce_log_intra() -- need some help from George /
Edgar on this one...
- replaced ompi_ddt_sndrcv() with ompi_ddt_copy_content_same_ddt()
where relevant
- removed some "if (size > 1)" conditionals, because the self coll
module will always be chosen for collectives where size==1
Waiting for BA's tests to check the validity of this IN_PLACE stuff.
We'll see how it goes!
This commit was SVN r7351.