openmpi

Author	SHA1	Message	Date
Jeff Squyres	1ef988c3d9	A slight optimization: no longer call sched_yield() when polling for shmem progress (or the Windows equiv). Instead, poll hard on the condition, but periodically call opal_progress(). This allows badly-formed apps (e.g., the ibm test communicator/bsend_free) to actually complete. To be clear, there are far too many apps out there that assume that MPI collectives will actually progress the rest of MPI. I don't like putting in a feature to enable broken apps, but I have a dim recollection of this issue coming up before (apps "hanging" when testing the sm coll because they assumed that calling collectives would trigger other MPI progress). Rather than have people claim that OMPI is broken, I prefer to put in this "workaround". :-( Indeed, the bsend_free test ''may'' be coded that way for exactly that reason...? I don't remember offhand... This commit was SVN r21984.	2009-09-21 22:20:44 +00:00
Jeff Squyres	533633b8cb	Fixes trac:1988. The little bug that turned out to be huge. Yoinks. * Various cosmetic/style updates in the btl sm * Clean up concept of mpool module (I think that code was written way back when the concept of "modules" was fuzzy) * Bring over some old fixes from the /tmp/timattox-sm-coll/ tree to fix potential segv's when mmap'ed regions were at different addresses in different processes (thanks Tim!). * Change sm coll to no longer use mpool as its main source of shmem; rather, just mmap its own segment (because it's fixed size -- there was nothing to be gained by using mpool; shedding the use of mpool saved a lot of complexity in the sm coll setup). This effectively made Tim's fixes moot (because now everything is an offset into the mmap that is computed locally; there are no global pointers). :-) * Slightly updated common/sm to allow making mmap's for a specific set of procs (vs. ''all'' procs in the process). This potentially allows for same-host-inter-proc mmaps -- yay! * Fixed many, many things in the coll sm (particularly in reduce): * Fixed handling of MPI_IN_PLACE in reduce and allreduce * Fixed handling of non-contiguous datatypes in reduce * Changed the order of reductions to go from process (n-1)'s data to process 0's data, because that's how all other OMPI coll components work * Fixed lots of usage of ddt functions * When using a non-contiguous datatype, if the root process is not (n-1), now we used a 2nd convertor to copy from shmem to the rbuf (saves a memory copy vs. what was done before) * Lots and lots of little cleanups, clarifications, and minor optimizations (although still more could be done -- e.g., I think the use of write memory barriers is fairly sub-optimal; they could be ganged together at the root, for example) I'm marking this as "fixes trac:1988" and closing the ticket; if something is still broken, we can re-open the ticket. This commit was SVN r21967. 
The following Trac tickets were found above: Ticket 1988 --> https://svn.open-mpi.org/trac/ompi/ticket/1988	2009-09-15 00:25:21 +00:00
Rainer Keller	8e1b23779f	- Replace combinations of #if defined (c_plusplus) defined (__cplusplus) followed by extern "C" { and the closing counterpart by BEGIN_C_DECLS and END_C_DECLS. Notable exceptions are: - opal/include/opal_config_bottom.h: This is our generated code, that itself defines BEGIN_C_DECL and END_C_DECL - ompi/mpi/cxx/mpicxx.h: Here we do not include opal_config_bottom.h: - Belongs to external code: opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.c opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.h - opal/include/opal/prefetch.h: Has C++ specific macros that are protected: - Had #if ... } #endif _and_ END_C_DECLS (aka end up with 2x END_C_DECLS) ompi/mca/btl/openib/btl_openib.h - opal/event/event.h has #ifdef __cplusplus as BEGIN_C_DECLS... - opal/win32/ompi_process.h: had extern "C"\n {... opal/win32/ompi_process.h: dito - ompi/mca/btl/pcie/btl_pcie_lex.l: needed to add *_C_DECLS ompi/mpi/f90/test/align_c.c: dito - ompi/debuggers/msgq_interface.h: used #ifdef __cplusplus - ompi/mpi/f90/xml/common-C.xsl: Amend Tested on linux using --with-openib and --with-mx The following do not contain either opal_config.h, orte_config.h or ompi_config.h (but possibly other header files, that include one of the above): ompi/mca/bml/r2/bml_r2_ft.h ompi/mca/btl/gm/btl_gm_endpoint.h ompi/mca/btl/gm/btl_gm_proc.h ompi/mca/btl/mx/btl_mx_endpoint.h ompi/mca/btl/ofud/btl_ofud_endpoint.h ompi/mca/btl/ofud/btl_ofud_frag.h ompi/mca/btl/ofud/btl_ofud_proc.h ompi/mca/btl/openib/btl_openib_mca.h ompi/mca/btl/portals/btl_portals_endpoint.h ompi/mca/btl/portals/btl_portals_frag.h ompi/mca/btl/sctp/btl_sctp_endpoint.h ompi/mca/btl/sctp/btl_sctp_proc.h ompi/mca/btl/tcp/btl_tcp_endpoint.h ompi/mca/btl/tcp/btl_tcp_ft.h ompi/mca/btl/tcp/btl_tcp_proc.h ompi/mca/btl/template/btl_template_endpoint.h ompi/mca/btl/template/btl_template_proc.h ompi/mca/btl/udapl/btl_udapl_eager_rdma.h ompi/mca/btl/udapl/btl_udapl_endpoint.h 
ompi/mca/btl/udapl/btl_udapl_mca.h ompi/mca/btl/udapl/btl_udapl_proc.h ompi/mca/mtl/mx/mtl_mx_endpoint.h ompi/mca/mtl/mx/mtl_mx.h ompi/mca/mtl/psm/mtl_psm_endpoint.h ompi/mca/mtl/psm/mtl_psm.h ompi/mca/pml/cm/pml_cm_component.h ompi/mca/pml/csum/pml_csum_comm.h ompi/mca/pml/dr/pml_dr_comm.h ompi/mca/pml/dr/pml_dr_component.h ompi/mca/pml/dr/pml_dr_endpoint.h ompi/mca/pml/dr/pml_dr_recvfrag.h ompi/mca/pml/example/pml_example.h ompi/mca/pml/ob1/pml_ob1_comm.h ompi/mca/pml/ob1/pml_ob1_component.h ompi/mca/pml/ob1/pml_ob1_endpoint.h ompi/mca/pml/ob1/pml_ob1_rdmafrag.h ompi/mca/pml/ob1/pml_ob1_recvfrag.h ompi/mca/pml/v/pml_v_output.h opal/include/opal/prefetch.h opal/mca/timer/aix/timer_aix.h opal/util/qsort.h test/support/components.h This commit was SVN r21855. The following SVN revision numbers were found above: r2 --> open-mpi/ompi@58fdc18855	2009-08-20 11:42:18 +00:00
George Bosilca	23e8ce91ba	Rework the selection logic for the tuned collectives. All supported collectives now are able to use the dynamic rules. Moreover, these rules are loaded only once, and stored at the component level. All communicators are able to use these rules (not only MPI_COMM_WORLD as until now). A lot of minor corrections, memory management issues and reduction in the amount of memory used by the tuned collectives. This commit was SVN r21825.	2009-08-14 21:06:23 +00:00
Shiqing Fan	bce2f44154	Update related .windows files with proper compiling properties, in order to have a successful DSO build. This commit was SVN r21805.	2009-08-12 08:55:58 +00:00
Rainer Keller	6c5532072a	- Split the datatype engine into two parts: an MPI specific part in OMPI and a language agnostic part in OPAL. The convertor is completely moved into OPAL. This offers several benefits as described in RFC http://www.open-mpi.org/community/lists/devel/2009/07/6387.php namely: - Fewer basic types (int* and float* types, boolean and wchar - Fixing naming scheme to ompi-nomenclature. - Usability outside of the ompi-layer. - Due to the fixed nature of simple opal types, their information is completely known at compile time and therefore constified - With fewer datatypes (22), the actual sizes of bit-field types may be reduced from 64 to 32 bits, allowing reorganizing the opal_datatype structure, eliminating holes and keeping data required in convertor (upon send/recv) in one cacheline... This has implications to the convertor-datastructure and other parts of the code. - Several performance tests have been run, the netpipe latency does not change with this patch on Linux/x86-64 on the smoky cluster. - Extensive tests have been done to verify correctness (no new regressions) using: 1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt: a. running both trunk and ompi-ddt resulted in no differences (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run correctly). b. with --enable-memchecker and running under valgrind (one buglet when run with static found in test-suite, commited) 2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt: all passed (except for the dynamic/ tests failed!! as trunk/MTT) 3. compilation and usage of HDF5 tests on Jaguar using PGI and PathScale compilers. 4. compilation and usage on Scicortex. - Please note, that for the heterogeneous case, (-m32 compiled binaries/ompi), neither ompi-trunk, nor ompi-ddt branch would successfully launch. This commit was SVN r21641.	2009-07-13 04:56:31 +00:00
Jeff Squyres	92e40cb20a	Enable the coll sync component to barrier before each 1000th collective. This commit was SVN r21594.	2009-07-02 20:16:45 +00:00
Jeff Squyres	cad12fda5f	* Remove an extra blank line from the help file * Add the help file to the Makefile.am so that it gets installed This commit was SVN r21567.	2009-06-30 18:58:09 +00:00
Rainer Keller	fbb2834977	- Missed string.h to get rid of warnings... This commit was SVN r21265.	2009-05-22 23:47:49 +00:00
Rainer Keller	225a1d6d8e	- For memcpy and memset need string.h This commit was SVN r21259.	2009-05-21 22:36:06 +00:00
Greg Koenig	60485ff95f	This is a very large change to rename several #define values from OMPI_* to OPAL_*. This allows opal layer to be used more independent from the whole of ompi. NOTE: 9 "svn mv" operations immediately follow this commit. This commit was SVN r21180.	2009-05-06 20:11:28 +00:00
Shiqing Fan	cd565923d3	Completely remove ltdl support for Windows build. This commit was SVN r21170.	2009-05-05 18:59:13 +00:00
George Bosilca	039fed1973	Fix Coverity CID #264 . This commit was SVN r21162.	2009-05-05 13:54:55 +00:00
George Bosilca	db096d7d3a	Fix Coverity CID #304 . This commit was SVN r21159.	2009-05-05 13:47:47 +00:00
Rainer Keller	9736af1191	- Fix Coverity CID 182: Well, well, just do not "call" ompi_comm_rank twice but rather reuse variable... - Fix Coverity CID 1262: Using uninitialized value "(statuses[err_index]).MPI_ERROR" Sure, these statuses are only initialized after ompi_request_wait_all, so introduce a short-circuit label to jump to... This commit was SVN r21153.	2009-05-05 12:28:51 +00:00
Rainer Keller	221fb9dbca	... Delayed due to notifier commits earlier this day ... - Delete unnecessary header files using contrib/check_unnecessary_headers.sh after applying patches, that include headers, being "lost" due to inclusion in one of the now deleted headers... In total 817 files are touched. In ompi/mpi/c/ header files are moved up into the actual c-file, where necessary (these are the only additional #include), otherwise it is only deletions of #include (apart from the above additions required due to notifier...) - To get different MCAs (OpenIB, TM, ALPS), an earlier version was successfully compiled (yesterday) on: Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled This commit was SVN r21096.	2009-04-29 01:32:14 +00:00
Shiqing Fan	3d4e0472d6	Add windows support files into the tarball, including .windows, CMakeLists.txt files, and CMake modules. Thanks to Jeff for testing it on Linux. This commit was SVN r21069.	2009-04-24 16:39:33 +00:00
George Bosilca	c5b1bdd57c	Correctly deal with the error case. The problem is tricky: the MPI standard doesn't allow MPI_ERR_IN_STATUS to be returned from any function that returns only one completed request (a few exceptions here: wait_some and wait_all and the test versions). As we use a wait_all in these send_receive functions we should convert the MPI_ERR_IN_STATUS to the real error, i.e. the one coming from the MPI_ERROR field in the status corresponding to the failed request. This commit was SVN r20907.	2009-03-31 23:44:59 +00:00
Rainer Keller	d8cf4c0fec	- Get pgcc on XT to complain less: In case we use memcmp, strlen, strup and friends include <string.h> Also several constants.h are not included directly - Let's have mca_topo_base_cart_create return ompi-errors in ompi/mca/topo/base/topo_base_cart_create.c This commit was SVN r20773.	2009-03-13 02:10:32 +00:00
Jeff Squyres	14ee1b7ba2	Refs trac:1826: remove barriers before all non-rooted collective ops. This commit was SVN r20763. The following Trac tickets were found above: Ticket 1826 --> https://svn.open-mpi.org/trac/ompi/ticket/1826	2009-03-12 02:23:08 +00:00
Rainer Keller	ec0ed48718	- Revert r20739 This commit was SVN r20742. The following SVN revision numbers were found above: r20739 --> open-mpi/ompi@781caee0b6	2009-03-05 21:56:03 +00:00
Rainer Keller	781caee0b6	- First of two or three patches, in orte/util/proc_info.h: Adapt orte_process_info to orte_proc_info, and change orte_proc_info() to orte_proc_info_init(). - Compiled on linux-x86-64 - Discussed with Ralph This commit was SVN r20739.	2009-03-05 20:36:44 +00:00
Shiqing Fan	99b415a7e0	On windows, the mca_common_* libraries should be installed in bin, otherwise the libraries that are dependent on them, e.g. shared build of mca_btl_sm, couldn't be loaded at runtime. This commit fixes the problem. This commit was SVN r20735.	2009-03-05 14:57:35 +00:00
Rainer Keller	9dea63d63a	- Last of intrusive commits (promised)... err for now. Anyway, this is blocking the move: do not include pml.h if not really needed, aka none of the following used: mca_pml MCA_PML_CALL OMPI_ANY_TAG OMPI_ANY_SOURCE OMPI_PROC_NULL - Notable exceptions (deleting in one header->adding): - ompi/mca/mtl/psm/ - ompi/mca/osc/rdma/ - ompi/mca/btl/openib/btl_openib_endpoint.c depended on pml_base_sendreq.h - Tested on Linux/x86-64, this time including make check (thanks Jeff and Ralph) This commit was SVN r20725.	2009-03-04 17:06:51 +00:00
Rainer Keller	811f2bd9b4	- As discussed on RFC, move the ompi_bitmap to the opal layer. Add a check against a maximum (actually get rid of ifs internally to opal_bitmap.c) -- the functionality to set the current maximum size opal_bitmap_set_max_size() is currently only used in attribute.c to set the maximum OMPI_FORTRAN_HANDLE_MAX... Tested on linux/x86-64 with intel-tests with all_tests_no_perf_f run with 6 procs. Let's look into MTT as well... This commit was SVN r20708.	2009-03-03 22:25:13 +00:00
Rich Graham	7ef1550267	add an index to indicate which socket group I belong to. This commit was SVN r20672.	2009-03-02 14:39:54 +00:00
Rich Graham	daf7673aff	gather socket information - not debugged. This commit was SVN r20670.	2009-03-02 10:58:12 +00:00
Rainer Keller	96e1b9b747	- Header orte/mca/rml/rml.h is not needed if there is no occurrence of orte_rml or ORTE_RML. The rest compiles fine with -Wimplicit-function-declaration This commit was SVN r20639.	2009-02-26 03:52:31 +00:00
Terry Dontje	0178b6c45f	Added padding to predefined handle structures to maintain library version to version compatibility. This commit was SVN r20627.	2009-02-24 17:17:33 +00:00
Shiqing Fan	2148220ce4	Update the share libs dependency for windows build. This commit was SVN r20625.	2009-02-23 17:49:46 +00:00
Jeff Squyres	3742c3550c	Add "sync" collective component. This component is totally deactivated by default. It is activated by setting either of the following two MCA parameters to values greater than 0: * coll_sync_barrier_before * coll_sync_barrier_after If _before is >0, then the sync coll collective will insert itself before the underlying collective operations and invoke a barrier before every Nth collective operation (N == coll_sync_barrier_before). Similar for _after. Note that N is a _per communicator_ value; not global to the MPI process. If both are 0 (which is the default), this component returns NULL for the comm query, meaning that it is not inserted into the coll module stack. The intent of this component is to provide a workaround for applications with large numbers of collectives of short messages that can cause unbounded unexpected messages. Specifically, it is possible for some iterative collective communication patterns to cause unbounded unexpected messages. Forcing a barrier before or after every Nth collective operation would prevent that behavior by forcing applications to synchronize (and thereby consume any outstanding unexpected messages caused by collectives on the same communicator). Open MPI still needs to bound unexpected messages resource consumption at the receiver, but this is a viable workaround for at least some symptoms of the problem. Additionally, there has been anecdotal evidence of some applications that "perform better" when they put barriers after other collective operations. This could be due to many factors -- including shortening the unexpected message queue. Putting this component in Open MPI allows people to try this with their own applications and give real world feedback on this kind of behavior. This commit was SVN r20584.	2009-02-18 23:32:44 +00:00
Rainer Keller	d81443cc5a	- On the way to get the BTLs split out and lessen dependency on orte: Often, orte/util/show_help.h is included, although no functionality is required -- instead, most often opal_output.h, or orte/mca/rml/rml_types.h Please see orte_show_help_replacement.sh committed next. - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration actually showed two missing #include "orte/util/show_help.h" in orte/mca/odls/base/odls_base_default_fns.c and in orte/tools/orte-top/orte-top.c Manually added these. Let's give MTT the last word. This commit was SVN r20557.	2009-02-14 02:26:12 +00:00
Jeff Squyres	8b29e27ead	Some minor valgrind-inspired cleanups: fix some memory leaks This commit was SVN r20543.	2009-02-13 03:45:32 +00:00
Tim Mattox	9b83df22ec	Fix some "is proc on local node?" logic that got accidentally flipped by r20496 for the sm BTL, openib BTL on iWarp, and the sm & sm2 coll modules. This commit was SVN r20515. The following SVN revision numbers were found above: r20496 --> open-mpi/ompi@4cdf91a8d4	2009-02-11 15:02:38 +00:00
Ralph Castain	4cdf91a8d4	Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself). The prior ompi_proc_t structure had a uint8_t flag field in it, where only one bit was used to flag that a proc was "local". In that context, "local" was constrained to mean "local to this node". This commit provides a greater degree of granularity on the term "local", to include tests to see if the proc is on the same socket, PC board, node, switch, CU (computing unit), and cluster. Add #define's to designate which bits stand for which local condition. This was added to the OPAL layer to avoid conflicting with the proposed movement of the BTLs. To make it easier to use, a set of macros have been defined - e.g., OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in the code base to clearly indicate which sense of locality is being considered. All locations in the code base that looked at the current proc_t field have been changed to use the new macros. Also modify the orte_ess modules so that each returns a uint8_t (to match the ompi_proc_t field) that contains a complete description of the locality of this proc. Obviously, not all environments will be capable of providing such detailed info. Thus, getting a "false" from a test for "on_local_socket" may simply indicate a lack of knowledge. This commit was SVN r20496.	2009-02-10 02:20:16 +00:00
Jeff Squyres	4d8a187450	Two major things in this commit: * New "op" MPI layer framework * Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2) = Op framework = Add new "op" framework in the ompi layer. This framework replaces the hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples for pre-defined MPI_Ops, allowing components and modules to provide the back-end functions. The intent is that components can be written to take advantage of hardware acceleration (GPU, FPGA, specialized CPU instructions, etc.). Similar to other frameworks, components are intended to be able to discover at run-time if they can be used, and if so, elect themselves to be selected (or disqualify themselves from selection if they cannot run). If specialized hardware is not available, there is a default set of functions that will automatically be used. This framework is ''not'' used for user-defined MPI_Ops. The new op framework is similar to the existing coll framework, in that the final set of function pointers that are used on any given intrinsic MPI_Op can be a mixed bag of function pointers, potentially coming from multiple different op modules. This allows for hardware that only supports some of the operations, not all of them (e.g., a GPU that only supports single-precision operations). All the hard-coded back-end MPI_Op functions for (MPI_Op, MPI_Datatype) tuples still exist, but unlike coll, they're in the framework base (vs. being in a separate "basic" component) and are automatically used if no component is found at runtime that provides a module with the necessary function pointers. There is an "example" op component that will hopefully be useful to those writing meaningful op components. It is currently .ompi_ignore'd so that it doesn't impinge on other developers (it's somewhat chatty in terms of opal_output() so that you can tell when its functions have been invoked). See the README file in the example op component directory. 
Developers of new op components are encouraged to look at the following wiki pages: https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework = MPI_REDUCE_LOCAL = Part of the MPI-2.2 proposal listed here: https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24 is to add a new function named MPI_REDUCE_LOCAL. It is very easy to implement, so I added it (also because it makes testing the op framework pretty easy -- you can do it in serial rather than via parallel reductions). There's even a man page! This commit was SVN r20280.	2009-01-14 23:44:31 +00:00
Edgar Gabriel	1072812bcf	not every element in the pointer array list contains a valid entry. Thus, do not try to free elements if the list returns NULL. This commit was SVN r20275.	2009-01-14 19:11:30 +00:00
George Bosilca	01adc999c5	Correctly forward the right module if we call another collective function. Kudos to Edgar for figuring out this tricky bug. This commit was SVN r20267.	2009-01-14 03:22:54 +00:00
Jeff Squyres	11b375f8b5	CIDs 1080-1090: assert() checks were not sufficient to check for NEGATIVE_RETURNS from _reg_int() because those are not always checked. So replace them with real if() checks. This commit was SVN r20195.	2009-01-03 15:56:25 +00:00
Jeff Squyres	f13ea32830	Remove the code checking the MCA "coll" parameter for a list of coll components to use. This code was rendered obsolete (albeit harmless) by the MCA base improvements that only open the components that were specified by each framework's MCA parameter. This commit was SVN r20176.	2008-12-31 13:40:51 +00:00
Jeff Squyres	759a295cc9	Gaah -- missed one s/m/component/g This commit was SVN r20175.	2008-12-31 13:35:37 +00:00
Jeff Squyres	955d1e132d	Rename a variable to be "component" (not "m"), to emphasize that it is the component struct, not a module. This commit was SVN r20174.	2008-12-31 13:32:46 +00:00
Jeff Squyres	865900dd27	Nothing of substance; just indenting changes (''finally'' update this framework base to 4 space tabs!). This commit was SVN r20173.	2008-12-31 12:17:08 +00:00
Jeff Squyres	ce313fa391	Minor fixes to a few comments This commit was SVN r20172.	2008-12-31 11:34:27 +00:00
Jeff Squyres	d533215dac	Fix a comment to reflect the right version number This commit was SVN r20169.	2008-12-30 12:39:32 +00:00
Nysal Jan	ee8ec6f6b5	Remove dead/redundant code. Minimize number of calloc invocations This commit was SVN r20121.	2008-12-12 10:55:50 +00:00
Shiqing Fan	a5281f0434	- 1/4 commit for Windows Visual Studio and CCP support: CMakeLists and .windows files. In contribs preconfigured and precompiled parts. This commit was SVN r20108.	2008-12-10 20:59:20 +00:00
Rolf vandeVaart	137729d2f9	Fix warnings (thanks Jeff) from previous fix. This is extra fix for ticket #1554. This commit was SVN r19728.	2008-10-10 14:35:52 +00:00
Tim Mattox	de623ea161	Remove a redundant if & goto. This commit was SVN r19724.	2008-10-09 15:07:56 +00:00
Rolf vandeVaart	aad4427caa	Fix the implementation of MPI_Reduce_scatter on intercommunicators. We still do an interreduce but it is now followed by an intrascatterv. This fixes trac:1554. This commit was SVN r19723. The following Trac tickets were found above: Ticket 1554 --> https://svn.open-mpi.org/trac/ompi/ticket/1554	2008-10-09 14:35:20 +00:00
Rolf vandeVaart	13e8975f83	In the case where we detect a value of 0 in the recvcount array, fall back to the simpler algorithms. This is not the optimal solution, but it works. This commit was SVN r19702.	2008-10-07 19:44:51 +00:00
Rolf vandeVaart	0a0ddfc934	Handle MPI_IN_PLACE correctly in the ompi_coll_tuned_reduce_scatter_intra_ring function. We were not adjusting the sendbuf in this case so we were reducing garbage. This fixes ticket #1506. This commit was SVN r19673.	2008-10-02 20:01:27 +00:00
George Bosilca	325d006577	Mostly cleanups, and eventually a little bit more scalable add_procs. There was an argument that was barely used, and on return at the PML level it contained nothing usable. It has been removed, so now we're using less memory ... This commit was SVN r19657.	2008-09-30 15:47:43 +00:00
George Bosilca	6a9514ee08	Make the code match the comment. I checked with Jelena, and based on the papers we published this is the expected algorithm for the specified message and communicator size. This commit closes ticket #1330. This commit was SVN r19563.	2008-09-15 23:28:40 +00:00
Edgar Gabriel	ef2bb46e45	no need to create and free the groups. We just want to translate the ranks and we can use the internal group structures right away for that operation. Fixes an issue with groups that have not been freed previously, due to the fact that ompi_group_free was not visible here (I know, this could have been solved also by setting OMPI_DECLSPEC on ompi_group_free, but this solution should be faster.) This commit was SVN r19362.	2008-08-19 13:59:58 +00:00
Edgar Gabriel	149ecb8d7d	1. debug the four new algorithms 2. fix a bug in the initial communicator creation of llcomm 3. fix a bug which showed up as the result of fixing issue number 2: we have to check now whether llcomm has really been created before freeing the according llcomm in hierarch_destruct. This commit was SVN r19361.	2008-08-18 21:54:35 +00:00
Edgar Gabriel	7cbc4a4077	adding four different algorithms for a hierarchical bcast which try to generate an overlap between the different layers. Why four versions? Because there is right now always the trade-off between using non-blocking operations on a layer with a trivial, linear algorithm and using the more sophisticated algorithms in a blocking manner. - bcast_intra_seg uses the bcast of lcomm and llcomm, similarly to the original algorithm in hierarch. However, it can segment the message, such that we might get an overlap between the two layers. This overlap is based on the assumption that a process might be done early with a bcast and can start the next one. - bcast_intra_seg1: replaces the llcomm->bcast by isend/irecvs to increase the overlap, keeps the lcomm->bcast however - bcast_intra_seg2: replaces lcomm->bcast by isend/irecvs to increase the overlap, keeps however llcomm->bcast - bcast_intra_seg3: replaces both lcomm->bcast and llcomm->bcast by isend/irecvs The code is lightly tested, more testing to follow right now. This commit was SVN r19358.	2008-08-18 16:05:44 +00:00
George Bosilca	a6e3a47102	Fix typo. This commit was SVN r19312.	2008-08-17 20:08:38 +00:00
Rich Graham	e64f028d62	add missing header file for errno. This commit was SVN r19246.	2008-08-12 01:34:13 +00:00
Jeff Squyres	54ab811426	Fix CID 1036: minor resource leak on error This commit was SVN r19236.	2008-08-11 20:37:36 +00:00
Rainer Keller	ee1fe9015a	- Make sure, that the *param_index are > 0 (here, we don't pass errors up...). Coverity CID 1080 - 1090 - Really make sure, the user does not specify stupid negative values. This commit was SVN r19233.	2008-08-11 11:21:04 +00:00
George Bosilca	3c8d43deed	Remove unused variable (Coverity fix 178). This commit was SVN r19195.	2008-08-06 14:09:43 +00:00
George Bosilca	567c691354	Remove unused variable (Coverity fix 177). This commit was SVN r19194.	2008-08-06 14:08:34 +00:00
George Bosilca	f6ebdf8896	Remove unused variable (Coverity fix 176). This commit was SVN r19193.	2008-08-06 14:07:20 +00:00
George Bosilca	c021427002	Remove unused variable (Coverity fix 175). This commit was SVN r19192.	2008-08-06 14:06:08 +00:00
George Bosilca	6c8017e9b7	Remove unused variable (Coverity fix 174). This commit was SVN r19191.	2008-08-06 14:04:54 +00:00
George Bosilca	afc79d1651	Remove unused variable (Coverity fix 173). This commit was SVN r19190.	2008-08-06 14:03:33 +00:00
George Bosilca	5e3a5b7c13	Remove unused variable (Coverity fix 172). This commit was SVN r19188.	2008-08-06 14:01:33 +00:00
George Bosilca	d897710e4f	Remove unused variable (Coverity fix 171). This commit was SVN r19187.	2008-08-06 14:00:22 +00:00
George Bosilca	417b727006	Remove unused variable (Coverity fix 170). This commit was SVN r19186.	2008-08-06 13:59:03 +00:00
George Bosilca	4f91b7806c	Remove unused variable (Coverity fix 169). This commit was SVN r19185.	2008-08-06 13:57:43 +00:00
Rainer Keller	23c2292478	- Fix variable set but not used Coverity CID1058 This commit was SVN r19184.	2008-08-06 13:57:38 +00:00
Jeff Squyres	0af7ac53f2	Fixes trac:1392, #1400 * add "register" function to mca_base_component_t * converted coll:basic and paffinity:linux and paffinity:solaris to use this function * we'll convert the rest over time (I'll file a ticket once all this is committed) * add 32 bytes of "reserved" space to the end of mca_base_component_t and mca_base_component_data_2_0_0_t to make future upgrades [slightly] easier * new mca_base_component_t size: 196 bytes * new mca_base_component_data_2_0_0_t size: 36 bytes * MCA base version bumped to v2.0 * '''We now refuse to load components that are not MCA v2.0.x''' * all MCA frameworks versions bumped to v2.0 * be a little more explicit about version numbers in the MCA base * add big comment in mca.h about versioning philosophy This commit was SVN r19073. The following Trac tickets were found above: Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392	2008-07-28 22:40:57 +00:00
Jeff Squyres	d37a25a2d0	Remove per http://www.open-mpi.org/community/lists/devel/2008/07/4386.php This commit was SVN r18972.	2008-07-22 00:57:23 +00:00
Edgar Gabriel	798f47b430	Fixes ticket #1334 hierarch disables itself now if the pml module used is not ob1. The reason is, that the multi-level hierarchy detection algorithm checks the names of the btl modules used. In case there are no btl's, we would segfault. Furthermore, three minor changes: - the 2-level hierarchy detection is now the default (sm vs. everything else in the world). - add udapl to the list of protocols checked for by the multi-level hierarch detection - some of the verbose statements of hierarch were inaccurate. Fixed those comments/messages. This commit was SVN r18817.	2008-07-07 18:44:48 +00:00
Ralph Castain	9613b3176c	Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach. I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive. This commit was SVN r18619.	2008-06-09 14:53:58 +00:00
Ralph Castain	c992e99035	Remove the tags from orte_output_open and the filtering operation from orte_output - this will be handled differently to improve the XML output interface This commit was SVN r18557.	2008-06-03 14:24:01 +00:00
Rolf vandeVaart	18879285c7	Fix the selection logic to prevent memory leaks. More work may be done in the priority logic but for now we just fix the leaks and preserve current behavior. This commit fixes trac:1307. This commit was SVN r18504. The following Trac tickets were found above: Ticket 1307 --> https://svn.open-mpi.org/trac/ompi/ticket/1307	2008-05-27 14:16:39 +00:00
Rolf vandeVaart	5baa733ad5	Fix another warning (using a variable before it was initialized.) Thanks Jeff for pointing this out. This commit was SVN r18489.	2008-05-23 13:57:55 +00:00
Rich Graham	b08839f9f5	change reduce-scatter/gather for non-power of 2. Spreading out the load for the non-power of 2 phase of the reduction. This commit was SVN r18486.	2008-05-22 21:42:42 +00:00
Rich Graham	f2a4b67809	automate the allreduce selection logic. This commit was SVN r18484.	2008-05-22 20:53:35 +00:00
Rich Graham	5900415a25	for non-powers of 2, distribute the work on the first step among all the procs doing the work. This commit was SVN r18480.	2008-05-22 18:50:53 +00:00
George Bosilca	c31cc5b270	Remove a warning about line being unused. This commit was SVN r18472.	2008-05-21 20:46:22 +00:00
Edgar Gabriel	0500420bec	fixing a bug in the inter-communicator scatter operation, where we accidentally used rcount instead of scounts. This commit was SVN r18466.	2008-05-20 21:17:19 +00:00
Rolf vandeVaart	74d0259480	Add new implementation of barrier. This shows better performance on some clusters. However, no decision logic is changed by this commit so default behavior has not changed. This is only selectable by runtime parameters. This commit was SVN r18464.	2008-05-20 17:37:41 +00:00
Rolf vandeVaart	71091a19c3	Fix bug in spacing of code per https://svn.open-mpi.org/trac/ompi/wiki/CodingStyle . This commit was SVN r18463.	2008-05-20 14:11:10 +00:00
Rolf vandeVaart	763f5259a8	Fix memory leak of 88 bytes that occurred on each call to MPI_Comm_dup. Need to release the items and the item list after selecting the collective modules that are being used. Reviewed by Jeff Squyres. This commit was SVN r18457.	2008-05-19 21:34:01 +00:00
Jeff Squyres	7154776465	Removed unused variable / compiler warning. This commit was SVN r18454.	2008-05-19 13:41:45 +00:00
Rolf vandeVaart	375406e1fa	Remove the ignore files as decided at Tuesday's developers conference call. Now, hierarchical collectives will be compiled in but the priority is still at 0 requiring a user to set mca parameters to enable them. This commit was SVN r18440.	2008-05-15 01:26:52 +00:00
Jeff Squyres	671f0c379d	Remove a whole pile of orte/util/show_help.h's that I missed. :-( This commit was SVN r18437.	2008-05-14 11:32:33 +00:00
Jeff Squyres	e7ecd56bd2	This commit represents a bunch of work on a Mercurial side branch. As such, the commit message back to the master SVN repository is fairly long. = ORTE Job-Level Output Messages = Add two new interfaces that should be used for all new code throughout the ORTE and OMPI layers (we already make the search-and-replace on the existing ORTE / OMPI layers): * orte_output(): (and corresponding friends ORTE_OUTPUT, orte_output_verbose, etc.) This function sends the output directly to the HNP for processing as part of a job-specific output channel. It supports all the same outputs as opal_output() (syslog, file, stdout, stderr), but for stdout/stderr, the output is sent to the HNP for processing and output. More on this below. * orte_show_help(): This function is a drop-in-replacement for opal_show_help(), with two differences in functionality: 1. the rendered text help message output is sent to the HNP for display (rather than outputting directly into the process' stderr stream) 1. the HNP detects duplicate help messages and does not display them (so that you don't see the same error message N times, once from each of your N MPI processes); instead, it counts "new" instances of the help message and displays a message every ~5 seconds when there are new ones ("I got X new copies of the help message...") opal_show_help and opal_output still exist, but they only output in the current process. The intent for the new orte_* functions is that they can apply job-level intelligence to the output. As such, we recommend that all new ORTE and OMPI code use the new orte_* functions, not the opal_* functions. === New code === For ORTE and OMPI programmers, here's what you need to do differently in new code: * Do not include opal/util/show_help.h or opal/util/output.h. Instead, include orte/util/output.h (this one header file has declarations for both the orte_output() series of functions and orte_show_help()). 
* Effectively s/opal_output/orte_output/gi throughout your code. Note that orte_output_open() takes a slightly different argument list (as a way to pass data to the filtering stream -- see below), so if you explicitly call opal_output_open(), you'll need to slightly adapt to the new signature of orte_output_open(). * Literally s/opal_show_help/orte_show_help/. The function signature is identical. === Notes === * orte_output'ing to stream 0 will do similar to what opal_output'ing did, so leaving a hard-coded "0" as the first argument is safe. * For systems that do not use ORTE's RML or the HNP, the effect of orte_output_* and orte_show_help will be identical to their opal counterparts (the additional information passed to orte_output_open() will be lost!). Indeed, the orte_* functions simply become trivial wrappers to their opal_* counterparts. Note that we have not tested this; the code is simple but it is quite possible that we mucked something up. = Filter Framework = Messages sent via the new orte_* functions described above and messages output via the IOF on the HNP will now optionally be passed through a new "filter" framework before being output to stdout/stderr. The "filter" OPAL MCA framework is intended to allow preprocessing to messages before they are sent to their final destinations. The first component that was written in the filter framework was to create an XML stream, segregating all the messages into different XML tags, etc. This will allow 3rd party tools to read the stdout/stderr from the HNP and be able to know exactly what each text message is (e.g., a help message, another OMPI infrastructure message, stdout from the user process, stderr from the user process, etc.). Filtering is not active by default. Filter components must be specifically requested, such as: {{{ $ mpirun --mca filter xml ... }}} There can only be one filter component active. 
= New MCA Parameters = The new functionality described above introduces two new MCA parameters: * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that help messages will be aggregated, as described above. If set to 0, all help messages will be displayed, even if they are duplicates (i.e., the original behavior). * '''orte_base_show_output_recursions''': An MCA parameter to help debug one of the known issues, described below. It is likely that this MCA parameter will disappear before v1.3 final. = Known Issues = * The XML filter component is not complete. The current output from this component is preliminary and not real XML. A bit more work needs to be done to configure.m4 search for an appropriate XML library/link it in/use it at run time. * There are possible recursion loops in the orte_output() and orte_show_help() functions -- e.g., if RML send calls orte_output() or orte_show_help(). We have some ideas how to fix these, but figured that it was ok to commit before feature freeze with known issues. The code currently contains sub-optimal workarounds so that this will not be a problem, but it would be good to actually solve the problem rather than have hackish workarounds before v1.3 final. This commit was SVN r18434.	2008-05-13 20:00:55 +00:00
Rolf vandeVaart	0e32dd1022	Add MPI_Alltoallv to tuned collectives and add a pairwise implementation of MPI_Alltoallv. However, do not change the default behavior for now. The only way to use new pairwise implementation is via mca parameters. This commit was SVN r18394.	2008-05-07 02:31:24 +00:00
Rich Graham	4d1ae7b05f	accidentally made a change in the wrong place. This commit was SVN r18262.	2008-04-23 17:32:05 +00:00
Rich Graham	293dd6ad4e	add myself to list of people building this module. This commit was SVN r18261.	2008-04-23 17:25:36 +00:00
Rich Graham	7658cc79e4	Pass in the correct module to the reduction call. This commit was SVN r18260.	2008-04-23 17:23:30 +00:00
Tim Mattox	0215474cb8	Fix two bugs in coll_sm_module.c from bit-rot: Fixed a selection bug, and removed a bogus "free(proc)" call which ultimately caused MPI_Finalize to crash. This commit was SVN r18235.	2008-04-22 18:41:21 +00:00
Rich Graham	df35223603	add selection logic for barrier and reduce. This commit was SVN r18215.	2008-04-19 22:40:04 +00:00
Rich Graham	bee8b42f29	remove debug code that would not let people run. Add infrastructure for blocking-barrier. This commit was SVN r18214.	2008-04-19 01:34:04 +00:00
Rich Graham	6c77fa4921	add a blocking shared memory algorithm. This commit was SVN r18185.	2008-04-16 22:10:23 +00:00
Rich Graham	249445d61f	added reduce-scatter followed by gather to root. This commit was SVN r18133.	2008-04-11 13:49:08 +00:00
Rich Graham	a6bdbfab97	implement allreduce as reduce-scatter, followed by an allgather. This commit was SVN r18132.	2008-04-11 04:06:29 +00:00
Rich Graham	70f3aab5f2	remove some code that is not needed. This commit was SVN r18128.	2008-04-10 17:32:04 +00:00
Rich Graham	5c7db1e315	remove 2 race conditions in the buffer recycling logic. This commit was SVN r18127.	2008-04-10 17:20:52 +00:00
Edgar Gabriel	4964434205	reverting commit 18122, since the commit was executed accidentally in the wrong directory. The UH copyrights do belong in this file (i.e. because of the fix which is in the 1.2 branch, the UH copyright notes are in the header there already), but I want to have the proper log for that. This commit was SVN r18124.	2008-04-10 15:09:31 +00:00
Edgar Gabriel	f87830767a	the verification of recvcount==0 and rank = root was breaking inter-communicator scatter, since the root (root==MPI_ROOT) might very well have recvcount=0. The same fix has been applied to gather.c just the other way round. Fixes the bug reported on the mailing list by Martin Audet. If there is a 1.2.7, this fix might be worth porting over. Please note that while the test works now for basic and for inter, we get a 0byte malloc warning from the inter module, which we still have to fix in a separate patch. This commit was SVN r18122.	2008-04-10 14:58:51 +00:00
Rich Graham	c6783549ef	getting old This commit was SVN r18110.	2008-04-09 16:55:16 +00:00
Rich Graham	1a20c3ce51	more debug. This commit was SVN r18109.	2008-04-09 16:19:52 +00:00
Rich Graham	e7e18303f6	more debug. This commit was SVN r18108.	2008-04-09 15:10:58 +00:00
Rich Graham	b14c6b17d5	adding debug output. This commit was SVN r18107.	2008-04-09 13:32:01 +00:00
Rich Graham	10434fb2f1	add barrier synchronization at the end of the module init, to avoid initializing shared memory variables in use. This commit was SVN r18105.	2008-04-09 03:44:40 +00:00
Rich Graham	19bb1a2e86	fix initialization bug. This commit was SVN r18104.	2008-04-08 23:34:06 +00:00
Rich Graham	a69a8d9626	initialize the flags. This commit was SVN r18102.	2008-04-08 22:16:39 +00:00
Rich Graham	8765a2bbdd	more debug code. This commit was SVN r18101.	2008-04-08 20:38:20 +00:00
Rich Graham	08becf33b5	add more debugging. This commit was SVN r18100.	2008-04-08 18:44:50 +00:00
Rich Graham	aa1b7dd406	more debug This commit was SVN r18099.	2008-04-08 03:56:47 +00:00
Rich Graham	0c18bdeff7	more debug code. This commit was SVN r18098.	2008-04-08 03:04:20 +00:00
Rich Graham	9d5a7238df	Add some debugging code. This commit was SVN r18097.	2008-04-07 23:20:15 +00:00
Rich Graham	fa696734d5	add some debug code. This commit was SVN r18096.	2008-04-07 21:03:23 +00:00
Rich Graham	1b54e8b76e	fix buffer management for nb-barrier. This commit was SVN r18081.	2008-04-05 21:59:04 +00:00
Rich Graham	94f8fd365c	a few reduction optimizations. Add bcast. This commit was SVN r18075.	2008-04-02 19:02:33 +00:00
George Bosilca	a00ca20446	More cleanups. This commit was SVN r18069.	2008-04-02 06:38:33 +00:00
Rich Graham	eb5d6096f1	add reduction routine - fix buffer recycling logic which was totally broken. This commit was SVN r18065.	2008-04-01 22:56:18 +00:00
Rich Graham	90e53ca9ee	debug the pipeline algorithm. This commit was SVN r18008.	2008-03-28 15:10:07 +00:00
Rich Graham	e2ad9c4be2	adjust to change in orte_process_info. This commit was SVN r17986.	2008-03-27 01:25:28 +00:00
Rich Graham	441fb9fb9e	checkpoint. This commit was SVN r17985.	2008-03-27 01:16:32 +00:00
Ralph Castain	cca449e379	Move an OMPI RML tag to the OMPI layer This commit was SVN r17950.	2008-03-25 13:30:48 +00:00
Ralph Castain	dc7f45dafd	Remove the obsolete and largely unused orte_system_info structure. The only fields that were used in that struct were nodeid and nodename - these have been transferred to the orte_process_info structure. Only one place used the user name field - session_dir, when formulating the name of the top-level directory. Accordingly, the code for getting the user's id has been moved to the session_dir code. This commit was SVN r17926.	2008-03-23 23:10:15 +00:00
Rich Graham	a7c836a2b0	fix location of the restrict key word. Make the tag in the fan-in/fan-out algorithm be fragment based. This commit was SVN r17903.	2008-03-21 01:40:36 +00:00
Rich Graham	2c66d396b7	take care of some bit-rot with the fanin-fanout method. This commit was SVN r17902.	2008-03-21 01:08:49 +00:00
Rich Graham	b9520e61dc	get the sm optimized allreduce working for all but user defined operations. Added to the reduction operations a set of reduction functions that take 2 input buffers and one output buffer to avoid some extra memory copies. These can't be used with user defined operations. The intel c collective suite passes both original, and new (new, not the user defined operations). This commit was SVN r17901.	2008-03-20 23:51:16 +00:00
Edgar Gabriel	570bbea5e0	fixing the allgather problem reported on the mailing list. The problem was that at one location we had the local size instead of the remote size as a receive argument. This commit was SVN r17849.	2008-03-17 19:42:18 +00:00
Rich Graham	27182afb67	get the timers in correctly. This commit was SVN r17832.	2008-03-16 03:25:16 +00:00
Rich Graham	afcd1016fd	move temp buffer allocation out of the iteration loop - i.e. always use the same temp loop. The algorithm is rather synchronous already... This commit was SVN r17831.	2008-03-16 03:20:46 +00:00
Rich Graham	a1766b29f6	fix some barrier addressing errors. This commit was SVN r17830.	2008-03-15 22:46:19 +00:00
Rich Graham	0453e7d2f4	bug in management memory allocation - too much memory allocated. This commit was SVN r17829.	2008-03-15 18:12:20 +00:00
Rich Graham	3c2f1eb8bf	reduce the number of temp buffers used. This commit was SVN r17828.	2008-03-15 17:23:04 +00:00
Rich Graham	0f9d642d51	temp buffer pointers are computed when they are set up. A bit more efficient, but more important, it is much easier to play around with memory layout now. This commit was SVN r17827.	2008-03-15 16:36:35 +00:00
Rich Graham	e3e336b5ab	check point This commit was SVN r17826.	2008-03-15 13:31:21 +00:00
Rich Graham	ebcf928c24	add some diagnostics. This commit was SVN r17789.	2008-03-07 22:27:41 +00:00
Rich Graham	9131461511	move some test code to another machine. This commit was SVN r17785.	2008-03-07 19:18:02 +00:00
Rich Graham	c230b65543	fix a couple of bugs. Recursive doubling seems to be working. This commit was SVN r17777.	2008-03-07 02:51:38 +00:00
Rich Graham	70157166f9	checkpoint - compiles, now need to debug. This commit was SVN r17775.	2008-03-07 00:39:59 +00:00
Rich Graham	4eace9d020	starting to implement recursive doubling algorithm. This commit was SVN r17765.	2008-03-06 18:38:58 +00:00
Rich Graham	67ad9b6d6b	increase max data segments size. This commit was SVN r17677.	2008-03-02 19:11:09 +00:00
Rich Graham	53126fa7bd	add calls to opal_progress() This commit was SVN r17673.	2008-02-29 23:25:09 +00:00
Rich Graham	d37db14901	get the shared memory collectives working again with the new version of orte. This commit was SVN r17672.	2008-02-29 22:28:57 +00:00
Rich Graham	c253a7bda1	simplify the code a bit. This commit was SVN r17664.	2008-02-29 03:55:12 +00:00
Rich Graham	1632d8b299	revert to an older (not previously checked in) version to get around a regression. This commit was SVN r17663.	2008-02-29 03:12:12 +00:00
Rich Graham	827e8d877e	fix bug in node type, and some memory copy optimizations. This commit was SVN r17661.	2008-02-29 01:20:11 +00:00
Rich Graham	940d6732c9	remove compiler warnings. This commit was SVN r17656.	2008-02-28 22:01:19 +00:00

571 commits