openmpi

Author	SHA1	Message	Date
George Bosilca	8d92231de3	Deprecated comment. This commit was SVN r31472.	2014-04-21 23:30:05 +00:00
Ralph Castain	49d938de29	Merge one-sided updates to the trunk - written by Brian Barrett and Nathan Hjelm cmr=v1.7.5:reviewer=hjelmn:subject=Update one-sided to MPI-3 This commit was SVN r30816.	2014-02-25 17:36:43 +00:00
Nathan Hjelm	1a021b8f2d	coll/ml: add support for blocking and non-blocking allreduce, reduce, and allgather. The new collectives provide a significant performance increase over tuned for small and medium messages. We are initially setting the priority lower than tuned until this has had some time to soak in the trunk. Please set coll_ml_priority to 90 for MTT runs. Credit for this work goes to Manjunath Gorentla Venkata (ORNL), Pavel Shamis (ORNL), and Nathan Hjelm (LANL). Commit details (for reference): Import ORNL's collectives for MPI_Allreduce, MPI_Reduce, and MPI_Allgather. We need to take the basesmuma header into account when calculating the ptpcoll small message thresholds. Add a define to bcol.h indicating the maximum header size so we can take the header into account while not making ptpcoll dependent on information from basesmuma. This resolves an issue with allreduce where ptpcoll overwrites the header of the next buffer in the basesmuma bank. Fix reduce and make a sequential collective launcher in coll_ml_inlines.h The root calculation for reduce was wrong for any root != 0. There are four possibilities for the root: - The root is not the current process but is in the current hierarchy. In this case the root is the index of the global root as specified in the root vector. - The root is not the current process and is not in the next level of the hierarchy. In this case 0 must be the local root since this process will never communicate with the real root. - The root is not the current process but will be in the next level of the hierarchy. In this case the current process must be the root. - I am the root. The root is my index. Tested with IMB which rotates the root on every call to MPI_Reduce. Consider IMB the reproducer for the issue this commit solves. Make the bcast algorithm decision an enumerated variable Resolve various assert failures when destructing coll ml requests. 
Two issues: - Always reset the request to be invalid before returning it to the free list. This will avoid an assert in ompi_request_t's destructor. OMPI_REQUEST_FINI does this (and also releases the fortran handle index). - Never explicitly construct or destruct the superclass of an opal object. This screws up the class function tables and will cause either an assert failure or a segmentation fault when destructing coll ml requests. Cleanup allgather. I removed the duplicate non-blocking and blocking functions and modeled the cleanup after what I found in allreduce. Also cleaned up the code somewhat. Don't bother copying from the send to the receive buffer in bcol_basesmuma_allreduce_intra_fanin_fanout if the pointers are the same. This eliminates a warning about memcpy and aliasing and avoids an unnecessary call to memcpy. Always call CHECK_AND_RELEASE on memsync collectives. There was a call to OBJ_RELEASE on the collective communicator but because CHECK_AND_RECYLCE was never called there was no matching call to OBJ_RELEASE. This caused coll ml to leak communicators. Make allreduce use the sequential collective launcher in coll_ml_inlines.h Just launch the next collective in the component progress. I am a little unsure about this patch. There appears to be some sort of race between collectives that causes buffer exhaustion in some cases (IMB Allreduce is a reproducer). Changing progress to only launch the next bcol seems to resolve the issue but might not be the best fix. Note that I see little to no performance penalty for this change. Fix allreduce when there are extra sources. There was an issue with the buffer offset calculation when there are extra sources. In the case of extra sources == 1 the offset was set to buffer_size (just past the header of the next buffer). I adjusted the buffer size to take into account the maximum header size (see the earlier commit that added this) and simplified the offset calculation. Make reduce/allreduce non-blocking. 
This is required for MPI_Comm_idup to work correctly. This has been tested with various layouts using the ibm testsuite and imb and appears to have the same performance as the old blocking version. Fix allgather for non-contiguous layouts and simplify parsing the topology. Some things in this patch: - There were several comments to the effect that level 0 of the hierarchy MUST contain all of the ranks. At least one function made this assumption but it was not true. I changed the sbgp components and the coll ml initialization code to enforce this requirement. - Ensure that hierarchy level 0 has the ranks in the correct scatter gather order. This removes the need for a separate sort list and fixes the offset calculation for allgather. - There were several passes over the hierarchy to determine properties of the hierarchy. I eliminated these extra passes and the memory allocation associated with them and calculate the tree properties on the fly. The same DFS recursion also handles the re-order of level 0. All these changes have been verified with MPI_Allreduce, MPI_Reduce, and MPI_Allgather. All functions now pass all IBM/Open MPI, and IMB tests. coll/ml: correct pointer usage for MPI_BOTTOM Since contiguous datatypes are copied via memcpy (bypassing the convertor) we need to adjust for the lb of the datatype. This corrects problems found testing code that uses MPI_BOTTOM (NULL) as the send pointer. Add fallback collectives for allreduce and reduce. cmr=v1.7.5:reviewer=pasha This commit was SVN r30363.	2014-01-22 15:39:19 +00:00
Jeff Squyres	6c53711ac8	Provide Java MPI_Op callbacks via an intercept routine (just like how we do MPI::Op C++ callbacks). This commit was SVN r29262.	2013-09-26 21:36:44 +00:00
Jeff Squyres	9d87857c25	Warning squash: explicitly ignore the return value from asprintf. This commit was SVN r26681.	2012-06-27 17:40:24 +00:00
Jeff Squyres	253444c6d0	== Highlights == 1. New mpifort wrapper compiler: you can utilize mpif.h, use mpi, and use mpi_f08 through this one wrapper compiler 1. mpif77 and mpif90 still exist, but are sym links to mpifort and may be removed in a future release 1. The mpi module has been re-implemented and is significantly "mo' bettah" 1. The mpi_f08 module offers many, many improvements over mpif.h and the mpi module This stuff is coming from a VERY long-lived mercurial branch (3 years!); it'll almost certainly take a few SVN commits and a bunch of testing before I get it correctly committed to the SVN trunk. == More details == Craig Rasmussen and I have been working with the MPI-3 Fortran WG and Fortran J3 committees for a long, long time to make a prototype MPI-3 Fortran bindings implementation. We think we're at a stable enough state to bring this stuff back to the trunk, with the goal of including it in OMPI v1.7. Special thanks go out to everyone who has been incredibly patient and helpful to us in this journey: * Rolf Rabenseifner/HLRS (mastermind/genius behind the entire MPI-3 Fortran effort) * The Fortran J3 committee * Tobias Burnus/gfortran * Tony !Goetz/Absoft * Terry !Donte/Oracle * ...and probably others whom I'm forgetting :-( There's still opportunities for optimization in the mpi_f08 implementation, but by and large, it is as far along as it can be until Fortran compilers start implementing the new F08 dimension(..) syntax. Note that gfortran is currently unsupported for the mpi_f08 module and the new mpi module. gfortran users will a) fall back to the same mpi module implementation that is in OMPI v1.5.x, and b) not get the new mpi_f08 module. The gfortran maintainers are actively working hard to add the necessary features to support both the new mpi_f08 module and the new mpi module implementations. This will take some time. As mentioned above, ompi/mpi/f77 and ompi/mpi/f90 no longer exist. 
All the fortran bindings implementations have been collated under ompi/mpi/fortran; each implementation has its own subdirectory: {{{ ompi/mpi/fortran/ base/ - glue code mpif-h/ - what used to be ompi/mpi/f77 use-mpi-tkr/ - what used to be ompi/mpi/f90 use-mpi-ignore-tkr/ - new mpi module implementation use-mpi-f08/ - new mpi_f08 module implementation }}} There's also a prototype 6-function-MPI implementation under use-mpi-f08-desc that emulates the new F08 dimension(..) syntax that isn't fully available in Fortran compilers yet. We did that to prove it to ourselves that it could be done once the compilers fully support it. This directory/implementation will likely eventually replace the use-mpi-f08 version. Other things that were done: * ompi_info grew a few new output fields to describe what level of Fortran support is included * Existing Fortran examples in examples/ were renamed; new mpi_f08 examples were added * The old Fortran MPI libraries were renamed: * libmpi_f77 -> libmpi_mpifh * libmpi_f90 -> libmpi_usempi * The configury for Fortran was consolidated and significantly slimmed down. Note that the F77 env variable is now IGNORED for configure; you should only use FC. Example: {{{ shell$ ./configure CC=icc CXX=icpc FC=ifort ... }}} All of this work was done in a Mercurial branch off the SVN trunk, and hosted at Bitbucket. This branch has got to be one of OMPI's longest-running branches. Its first commit was Tue Apr 07 23:01:46 2009 -0400 -- it's over 3 years old! :-) We think we've pulled in all relevant changes from the OMPI trunk (e.g., Fortran implementations of the new MPI-3 MPROBE stuff for mpif.h, use mpi, and use mpi_f08, and the recent Fujitsu Fortran patches). I anticipate some instability when we bring this stuff into the trunk, simply because it touches a LOT of code in the MPI layer in the OMPI code base. We'll try our best to make it as pain-free as possible, but please bear with us when it is committed. This commit was SVN r26283.	
2012-04-18 15:57:29 +00:00
George Bosilca	a0c26fd715	Replace jumps with returns. This commit was SVN r21845.	2009-08-20 02:29:30 +00:00
George Bosilca	33e7cc864c	The ops should be indexed based on the MPI datatype index (which is actually the same as the Fortran value of the MPI type). This commit was SVN r21689.	2009-07-15 21:30:09 +00:00
Rainer Keller	6c5532072a	- Split the datatype engine into two parts: an MPI specific part in OMPI and a language agnostic part in OPAL. The convertor is completely moved into OPAL. This offers several benefits as described in RFC http://www.open-mpi.org/community/lists/devel/2009/07/6387.php namely: - Fewer basic types (int* and float* types, boolean and wchar) - Fixing naming scheme to ompi-nomenclature. - Usability outside of the ompi-layer. - Due to the fixed nature of simple opal types, their information is completely known at compile time and therefore constified - With fewer datatypes (22), the actual sizes of bit-field types may be reduced from 64 to 32 bits, allowing reorganizing the opal_datatype structure, eliminating holes and keeping data required in convertor (upon send/recv) in one cacheline... This has implications for the convertor-datastructure and other parts of the code. - Several performance tests have been run; the netpipe latency does not change with this patch on Linux/x86-64 on the smoky cluster. - Extensive tests have been done to verify correctness (no new regressions) using: 1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt: a. running both trunk and ompi-ddt resulted in no differences (except that MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB now run correctly). b. with --enable-memchecker and running under valgrind (one buglet when run with static found in test-suite, committed) 2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt: all passed (except for the dynamic/ tests failed!! as trunk/MTT) 3. compilation and usage of HDF5 tests on Jaguar using PGI and PathScale compilers. 4. compilation and usage on Scicortex. - Please note that for the heterogeneous case (-m32 compiled binaries/ompi), neither ompi-trunk nor ompi-ddt branch would successfully launch. This commit was SVN r21641.	2009-07-13 04:56:31 +00:00
Terry Dontje	0178b6c45f	Added padding to predefined handle structures to maintain library version to version compatibility. This commit was SVN r20627.	2009-02-24 17:17:33 +00:00
Jeff Squyres	4d8a187450	Two major things in this commit: * New "op" MPI layer framework * Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2) = Op framework = Add new "op" framework in the ompi layer. This framework replaces the hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples for pre-defined MPI_Ops, allowing components and modules to provide the back-end functions. The intent is that components can be written to take advantage of hardware acceleration (GPU, FPGA, specialized CPU instructions, etc.). Similar to other frameworks, components are intended to be able to discover at run-time if they can be used, and if so, elect themselves to be selected (or disqualify themselves from selection if they cannot run). If specialized hardware is not available, there is a default set of functions that will automatically be used. This framework is ''not'' used for user-defined MPI_Ops. The new op framework is similar to the existing coll framework, in that the final set of function pointers that are used on any given intrinsic MPI_Op can be a mixed bag of function pointers, potentially coming from multiple different op modules. This allows for hardware that only supports some of the operations, not all of them (e.g., a GPU that only supports single-precision operations). All the hard-coded back-end MPI_Op functions for (MPI_Op, MPI_Datatype) tuples still exist, but unlike coll, they're in the framework base (vs. being in a separate "basic" component) and are automatically used if no component is found at runtime that provides a module with the necessary function pointers. There is an "example" op component that will hopefully be useful to those writing meaningful op components. It is currently .ompi_ignore'd so that it doesn't impinge on other developers (it's somewhat chatty in terms of opal_output() so that you can tell when its functions have been invoked). See the README file in the example op component directory. 
Developers of new op components are encouraged to look at the following wiki pages: https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework = MPI_REDUCE_LOCAL = Part of the MPI-2.2 proposal listed here: https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24 is to add a new function named MPI_REDUCE_LOCAL. It is very easy to implement, so I added it (also because it makes testing the op framework pretty easy -- you can do it in serial rather than via parallel reductions). There's even a man page! This commit was SVN r20280.	2009-01-14 23:44:31 +00:00
Jeff Squyres	a7586bdd90	Cosmetic changes: * Update to 4 space tabs where relevant (and some irrelevant white space changes) * Move a few constants to the left of !=/== * Add a few {}'s around one-line blocks * Use BEGIN/END_C_DECLS * Change /< to / in a few places This commit was SVN r20177.	2008-12-31 14:50:54 +00:00
Rich Graham	afd71abde6	remove some useless qualifiers. This commit was SVN r18469.	2008-05-21 01:11:49 +00:00
Rich Graham	3b42d2268d	add functions to handle two different input buffers and a separate output buffer. User defined data types have no way to make use of these. This commit was SVN r18012.	2008-03-28 23:45:44 +00:00
Jeff Squyres	4fbcb75ce8	With 5 commits over a 16 hour period and 3 broken tarball builds and a still-broken trunk build on common platforms (e.g., 64 bit Linux RHEL4U4), I think it's clear that this code is not ready for prime-time. I'm backing out all the commits in the trunk/ompi/op tree from r17901 onwards. This code can be re-committed when it compiles and runs on common platforms. cd ompi/op svn merge -r 17907:17900 https://svn.open-mpi.org/svn/ompi/trunk/ompi/op . This commit was SVN r17908. The following SVN revision numbers were found above: r17901 --> open-mpi/ompi@b9520e61dc	2008-03-21 14:47:01 +00:00
Rich Graham	df4a6c3fc5	fix function prototypes for new 3 buffer routines. This commit was SVN r17906.	2008-03-21 13:44:15 +00:00
Rich Graham	b9520e61dc	get the sm optimized allreduce working for all but user defined operations. Added to the reduction operations a set of reduction functions that take 2 input buffers and one output buffer to avoid some extra memory copies. These can't be used with user defined operations. The intel c collective suite passes both original, and new (new, not the user defined operations). This commit was SVN r17901.	2008-03-20 23:51:16 +00:00
Galen Shipman	b3b3c98c89	missing include file This commit was SVN r17484.	2008-02-17 19:38:20 +00:00
George Bosilca	906e8bf1d1	Replace the ompi_pointer_array with opal_pointer_array. In the next step (sometime after the merge with the ORTE branch), the opal_pointer_array will become the only pointer_array implementation (the orte_pointer_array will be removed). This commit was SVN r17007.	2007-12-21 06:02:00 +00:00
Rainer Keller	d3372729bb	- Support for opt. MPI_REAL2 (who has that?) to make checks for MPI-implementations fail in the right way ,-] - check in configure.ac - BINARY INCOMPATIBLE change to mpif-common.h (if implemented the right way) Actually OMPI_F90_CHECK takes two arguments, not three. - Only have corresponding C-Type, if the opt. Fortran type is really supported, Otherwise pass ompi_mpi_unavailable to DECLARE_MPI_SYNONYM_DDT; - Reviewed by George and Jeff This commit was SVN r15133.	2007-06-19 05:03:11 +00:00
Rainer Keller	1feb5fb21a	- Initialization fixes of structure (o_f_to_c_index)... - Mainly indentation, except for ompi_op_create, here just don't nest into ifs... This commit was SVN r15131.	2007-06-18 23:03:56 +00:00
Sven Stork	037b01ce9e	- more symbols that need to be exported This commit was SVN r14415.	2007-04-18 14:53:56 +00:00
George Bosilca	3f0a7cad9e	The last patch for Windows support. Mostly casting and conversion to C++ friendly headers. This commit was SVN r11400.	2006-08-24 16:38:08 +00:00
George Bosilca	a297a7ae67	The MPI standard states that MPI_LONG_LONG and MPI_LONG_LONG_INT are synonyms. Thanks to Martin Audet for finding this one. This commit was SVN r9699.	2006-04-24 21:24:10 +00:00
Rainer Keller	b4e7f38360	- Well, well, the OMPI_OP_TYPE_CHAR was not supposed to stay in the enum. Actually, the current ordering of the enum is the nice one; however, at the moment for 1.0, signed_char is not supported. This commit was SVN r9246.	2006-03-10 16:02:45 +00:00
Rainer Keller	0fa295dc28	- Allow MPI_UNSIGNED_CHAR and MPI_SIGNED_CHAR for Reduction operations as described by MPI2, p77. This commit was SVN r9229.	2006-03-09 16:51:59 +00:00
Jeff Squyres	a192af34e7	Shame on me for not looking at the expanded collective datatype tables for the C++ bindings in MPI-2 p276-278 to see that MPI_BOOL should work with MPI_LAND, MPI_LOR, and MPI_LXOR. Thanks to Andy Selle for pointing this out. This commit was SVN r9200.	2006-03-04 18:35:33 +00:00
Brian Barrett	566a050c23	Next step in the project split, mainly source code re-arranging - move files out of toplevel include/ and etc/, moving it into the sub-projects - rather than including config headers with <project>/include, have them as <project> - require all headers to be included with a project prefix, with the exception of the config headers ({opal,orte,ompi}_config.h mpi.h, and mpif.h) This commit was SVN r8985.	2006-02-12 01:33:29 +00:00
George Bosilca	e20265bd2b	Don't let any code external to the data-type code check directly for the predefined data-types. Instead, use the newly provided data-type function ompi_ddt_is_predefined. This commit was SVN r8903.	2006-02-06 18:01:45 +00:00
Jeff Squyres	5f96a74e33	Make user-defined MPI::Op's be thread safe (the previous implementation was not thread safe). See lengthy comment in ompi/mpi/cxx/intercepts.cc::ompi_mpi_cxx_op_intercept() for a full explanation. This commit was SVN r8606.	2005-12-23 16:49:09 +00:00
George Bosilca	6fb4ce5e2e	Some dependency cleanups (they were on hold for a while). This commit was SVN r8425.	2005-12-09 05:14:18 +00:00
Jeff Squyres	42ec26e640	Update the copyright notices for IU and UTK. This commit was SVN r7999.	2005-11-05 19:57:48 +00:00
Jeff Squyres	4fc135fd2b	Looks like I forgot to put DDT support for the optional C datatypes MPI_UNSIGNED_LONG_LONG, MPI_LONG_LONG, and MPI_LONG_LONG_INT -- although I already had implementations of all the relevant functions for these types. Doh! This commit was SVN r7944.	2005-11-01 03:28:59 +00:00
Jeff Squyres	7bdfe6557b	- Update the checks in REDUCE, ALLREDUCE, SCAN, EXSCAN, and REDUCE_SCATTER to more thoroughly check the datatype/op combination to see if it's valid or not. If it's not, print a meaningful error message rather than "Invalid MPI_Op" indicating what specifically was wrong (therefore hopefully helping users track down where in the code the problem is, and/or telling us that there's a reduction operation combo that we don't support that we should) - The check for whether a datatype is intrinsic needed to be updated -- it's not sufficient to check that dtype->id < DT_MAX_PREDEFINED; you really need to check the PREDEFINED flag on the datatype. Thanks to George for this fix (only intrinsics have a meaningful value in dtype->id). This commit was SVN r7923.	2005-10-28 16:47:32 +00:00
Jeff Squyres	23ab9e0277	A better solution to the previous commit -- RETAIN/RELEASE the MPI_Op at the top-level MPI API function. This allows two kinds of scenarios: 1. MPI_Ireduce(..., op, ...); MPI_Op_free(op); MPI_Wait(...); For the non-blocking collectives that we're someday planning -- to make them analogous to non-blocking point-to-point stuff. 2. Thread 1: MPI_Reduce(..., op, ...); Thread 2: MPI_Op_free(op); Granted, for #2 to occur would tread a fine line between a correct and erroneous MPI program, but it is possible (as long as the Op_free was after MPI_reduce() had started to execute). It's more realistic with case #1, where the Op_free() could be executed in the same thread or a different thread. This commit was SVN r7870.	2005-10-25 19:20:42 +00:00
Jeff Squyres	f8fd10715c	- Minor style fix - Be sure to properly OBJ_CONSTRUCT the intrinsic MPI_Op's - RETAIN/RELEASE the op's when used in the invoke function This commit was SVN r7863.	2005-10-25 16:24:00 +00:00
Brian Barrett	1e2f7d6a3d	* make sure to expose ompi_op_t as an object This commit was SVN r7848.	2005-10-24 20:31:14 +00:00
Jeff Squyres	c568936a7c	Add support for MPI_OP_SUM, PROD, REPLACE with MPI_DOUBLE_COMPLEX. Need to consult with George -- might also need to add support for complex types on floats or long doubles... This commit was SVN r7716.	2005-10-12 02:31:28 +00:00
Jeff Squyres	1d20f800ba	Add another flag to MPI_Op's -- whether it is associative with floating point numbers or not (e.g., MAX is, but SUM is not). This commit was SVN r7230.	2005-09-08 09:47:27 +00:00
Brian Barrett	499e4de1e7	* rename ompi_object and ompi_class to opal_object and opal_class This commit was SVN r6321.	2005-07-03 16:06:07 +00:00
Jeff Squyres	4ab17f019b	Rename src -> ompi This commit was SVN r6269.	2005-07-02 13:43:57 +00:00
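
Commit 23ab9e0277 describes retaining the MPI_Op at the top-level API call and releasing it on completion, so that a call to MPI_Op_free made while a reduction is still in flight does not destroy the op. The following is a minimal sketch of that reference-counting idea in plain C. It is not Open MPI's implementation: `sketch_op_t` and every `sketch_*` function are hypothetical names invented for illustration, and the "non-blocking reduction" is only simulated by the order of the retain and release calls.

```c
#include <stdlib.h>

/* Hypothetical reference-counted op; NOT Open MPI's ompi_op_t. */
typedef struct {
    int refcount;     /* number of outstanding references */
    int *freed_flag;  /* set to 1 when the storage is released */
} sketch_op_t;

/* Create an op holding one reference (the user's handle). */
static sketch_op_t *sketch_op_create(int *freed_flag)
{
    sketch_op_t *op = malloc(sizeof(*op));
    if (op == NULL) {
        return NULL;
    }
    op->refcount = 1;
    op->freed_flag = freed_flag;
    *freed_flag = 0;
    return op;
}

/* Take an additional reference. */
static void sketch_op_retain(sketch_op_t *op)
{
    op->refcount++;
}

/* Drop one reference; free the storage when the count reaches zero. */
static void sketch_op_release(sketch_op_t *op)
{
    if (--op->refcount == 0) {
        *op->freed_flag = 1;
        free(op);
    }
}

/*
 * Simulate scenario 1 from the commit message:
 *   start a non-blocking reduction, free the op, then wait.
 * Returns 1 if the op stayed alive after the user's free and was
 * destroyed only once the simulated operation completed.
 */
static int sketch_run_scenario(void)
{
    int freed = 0;
    sketch_op_t *op = sketch_op_create(&freed);
    if (op == NULL) {
        return 0;
    }
    sketch_op_retain(op);            /* API entry: reduction takes a reference */
    sketch_op_release(op);           /* user frees the op handle early */
    int alive_in_flight = (freed == 0);
    sketch_op_release(op);           /* completion: reduction drops its reference */
    return alive_in_flight && freed == 1;
}
```

Calling `sketch_run_scenario()` exercises the early-free ordering; the op's storage is released only by the final `sketch_op_release`.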

41 commits
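
Commit 1d20f800ba adds a flag recording whether an MPI_Op is associative with floating-point numbers, noting that MAX is and SUM is not. A small sketch in plain C shows why: regrouping a floating-point sum can change the result, while regrouping a maximum cannot. The function names below are hypothetical and the code assumes IEEE 754 double arithmetic without value-changing optimizations such as `-ffast-math`.

```c
/* Sum grouped as (a + b) + c. */
static double sketch_sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Sum grouped as a + (b + c). */
static double sketch_sum_right(double a, double b, double c)
{
    return a + (b + c);
}

/* Two-argument maximum, written without libm to stay self-contained. */
static double sketch_max2(double x, double y)
{
    return (x > y) ? x : y;
}

/* Maximum grouped as max(max(a, b), c). */
static double sketch_max_left(double a, double b, double c)
{
    return sketch_max2(sketch_max2(a, b), c);
}

/* Maximum grouped as max(a, max(b, c)). */
static double sketch_max_right(double a, double b, double c)
{
    return sketch_max2(a, sketch_max2(b, c));
}
```

With a = 1e20, b = -1e20, c = 1.0, the left-grouped sum cancels the large terms first and keeps the 1.0, whereas the right-grouped sum absorbs the 1.0 into -1e20 before cancelling and yields 0.0. The two maximum groupings agree. This is the kind of difference a reduction tree can expose when it combines contributions in a different order.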