openmpi

Автор	SHA1	Сообщение	Дата
George Bosilca	56c653ebcd	Add some comments. This commit was SVN r22008.	2009-09-24 00:08:28 +00:00
Edgar Gabriel	9abeaad6e2	so here is what happens: in the v1.2 series the cid's could never go above the max. allowed for a particular pml. Because of that, pml_add_comm never checked for the cid, and in fact pml_add_comm was called in comm_set, which is before we knew the cid. in the v1.3 series (and trunk) we check now the cid to detect overflow, and because of that pml_add_comm has been moved after the cid allocation routine, namely into the comm_activate routine. in the v1.2 series, the comm_activate contained a synchronization step of the old communicator in order to prevent incoming fragments on the new communicator, with the main problem being that the allreduce in the communicator allocation finished at different times on different processes, and thus, this scenario could and did really occur. in the v1.3 series, the comm_activate does not contain the synchronization step anymore, since we introduced the new queue for fragments with unknown cid. The problem is however, that whether a fragment is known or not is decided by using ompi_comm_lookup(), which will return something useful as soon as the cid allocation finished, even before pml_add_comm has been called. So there is a small time gap where we will not post a message into queue for unknown cid's, but we can also not look up the process structure belonging to the rank in that comm ( that is in pml_ob1_match_recv_frag or something like that). The current fix reintroduces the synchronization step in comm_activate, and ensures that no fragment can be received for a new communicator before the synchronization occurs , and thus comm_nextcid() and pml_add_comm has been called. It seems to be the safest and easiest way for now. Welcome back, v1.2. This commit was SVN r21970.	2009-09-17 14:37:02 +00:00
Jeff Squyres	c879170c9e	Actually, invoke the error on MPI_COMM_WORLD if you have an invalid communicator. :-) This commit was SVN r21942.	2009-09-04 07:40:28 +00:00
Jeff Squyres	a211c55cce	Fix some attribute error detection problems reported by Lisandro Dalcin. This commit was SVN r21941.	2009-09-04 05:18:49 +00:00
Jeff Squyres	11d44cec1b	Fix MPI_COMM_SPAWN[_MULTIPLE] to only check the info handles for errors on the root. Thanks to Federico Golfre Andreasi for reporting the problem. This commit was SVN r21838.	2009-08-19 13:24:12 +00:00
Jeff Squyres	c3afac1d50	Fix comment typo This commit was SVN r21824.	2009-08-14 12:09:19 +00:00
Jeff Squyres	b8332ea2b2	Patch from Kiril to make the parameter checking on MPI_CART_CREATE a bit more relaxed. This commit was SVN r21816.	2009-08-13 22:06:38 +00:00
Rainer Keller	6c5532072a	- Split the datatype engine into two parts: an MPI specific part in OMPI and a language agnostic part in OPAL. The convertor is completely moved into OPAL. This offers several benefits as described in RFC http://www.open-mpi.org/community/lists/devel/2009/07/6387.php namely: - Fewer basic types (int* and float* types, boolean and wchar - Fixing naming scheme to ompi-nomenclature. - Usability outside of the ompi-layer. - Due to the fixed nature of simple opal types, their information is completely known at compile time and therefore constified - With fewer datatypes (22), the actual sizes of bit-field types may be reduced from 64 to 32 bits, allowing reorganizing the opal_datatype structure, eliminating holes and keeping data required in convertor (upon send/recv) in one cacheline... This has implications to the convertor-datastructure and other parts of the code. - Several performance tests have been run, the netpipe latency does not change with this patch on Linux/x86-64 on the smoky cluster. - Extensive tests have been done to verify correctness (no new regressions) using: 1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt: a. running both trunk and ompi-ddt resulted in no differences (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run correctly). b. with --enable-memchecker and running under valgrind (one buglet when run with static found in test-suite, commited) 2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt: all passed (except for the dynamic/ tests failed!! as trunk/MTT) 3. compilation and usage of HDF5 tests on Jaguar using PGI and PathScale compilers. 4. compilation and usage on Scicortex. - Please note, that for the heterogeneous case, (-m32 compiled binaries/ompi), neither ompi-trunk, nor ompi-ddt branch would successfully launch. This commit was SVN r21641.	2009-07-13 04:56:31 +00:00
Rainer Keller	b572dc3591	- As discussed revert r21330, Fortran-configure info should not end up in OPAL - Will post an updated patch for the OMPI_ALIGNMENT_ parts (within C). This commit was SVN r21342. The following SVN revision numbers were found above: r21330 --> open-mpi/ompi@95596d1814	2009-06-01 19:02:34 +00:00
Rainer Keller	95596d1814	- Move alignment and size output generated by configure-tests into the OPAL namespace, eliminating cases like opal/util/arch.c testing for ompi_fortran_logical_t. As this is processor- and compiler-related information (e.g. does the compiler/architecture support REAL*16) this should have been on the OPAL layer. - Unifies f77 code using MPI_Flogical instead of opal_fortran_logical_t - Tested locally (Linux/x86-64) with mpich and intel testsuite but would like to get this week-ends MTT output - PLEASE NOTE: configure-internal macro-names and ompi_cv_ variables have not been changed, so that external platform (not in contrib/) files still work. This commit was SVN r21330.	2009-05-30 15:54:29 +00:00
Edgar Gabriel	d93def71ea	second part of the 'running out of cids problem', this time focusing on what happens when hierarch is used. . Two major items: - modify the comm_activate step to take an additional argument, indicating whether the new communicatio has to go through the collective selection step. This is not required sometimes (e.g. when a process calls MPI_COMM_SPLIT with color=MPI_UNDEFINED), and contributed significantly to the exhaustion of cids. - when freeing a communicator, check whether we can reuse the block of cids assigned to that comm. This only works if the current front of the cid assignment (cid_block_start) is right ater the block of cids assigned to this comm. Fixes trac:1904 Fixes trac:1926 This commit was SVN r21296. The following Trac tickets were found above: Ticket 1904 --> https://svn.open-mpi.org/trac/ompi/ticket/1904 Ticket 1926 --> https://svn.open-mpi.org/trac/ompi/ticket/1926	2009-05-27 15:21:07 +00:00
Edgar Gabriel	0bc8164a11	fix the group_compare operation which failed to recognize unequal groups in case the first process of the group was not represented at all in the second group. Also added some cleanup of the code w.r.t. booleans vs. ints. Thanks for Geoffrey Irving for reporting the bug and providing the initial solution. This commit was SVN r21192.	2009-05-08 13:51:28 +00:00
Rainer Keller	2941cb1494	- Fix Coverity CID 525 and 526 --- and some more; - due to the <= with we could overrun the array - we didn't correctly test at _all_, since we never marked the ranks already excluded / included... - when returning in error, we should free (elements_int_list)... This commit was SVN r21186.	2009-05-07 16:45:18 +00:00
Greg Koenig	60485ff95f	This is a very large change to rename several #define values from OMPI_* to OPAL_*. This allows opal layer to be used more independent from the whole of ompi. NOTE: 9 "svn mv" operations immediately follow this commit. This commit was SVN r21180.	2009-05-06 20:11:28 +00:00
Rainer Keller	221fb9dbca	... Delayed due to notifier commits earlier this day ... - Delete unnecessary header files using contrib/check_unnecessary_headers.sh after applying patches, that include headers, being "lost" due to inclusion in one of the now deleted headers... In total 817 files are touched. In ompi/mpi/c/ header files are moved up into the actual c-file, where necessary (these are the only additional #include), otherwise it is only deletions of #include (apart from the above additions required due to notifier...) - To get different MCAs (OpenIB, TM, ALPS), an earlier version was successfully compiled (yesterday) on: Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled This commit was SVN r21096.	2009-04-29 01:32:14 +00:00
George Bosilca	05ee4c280e	Mismatch between the reported subversion and the one in the mpi.h. Thanks to Rob Egan for the report. This commit was SVN r20985.	2009-04-14 05:29:07 +00:00
Jeff Squyres	bf8defc475	Shaun Jackson noted that MPI_STATUS_IGNORE is actually (effectively) NULL, so testing for NULL as a bad status parameter here is a bad idea. This commit was SVN r20891.	2009-03-28 01:24:41 +00:00
Rainer Keller	d8cf4c0fec	- Get pgcc on XT to complain less: In case we use memcmp, strlen, strup and friends include <string.h> Also several constants.h are not included directly - Let's have mca_topo_base_cart_create return ompi-errors in ompi/mca/topo/base/topo_base_cart_create.c This commit was SVN r20773.	2009-03-13 02:10:32 +00:00
Rainer Keller	9dea63d63a	- Last of intrusive commits (promised)... err for now. Anyway, this is blocking the move: do not include pml.h if not really needed, aka none of the following used: mca_pml MCA_PML_CALL OMPI_ANY_TAG OMPI_ANY_SOURCE OMPI_PROC_NULL - Notable exceptions (deleting in one header->adding): - ompi/mca/mtl/psm/ - ompi/mca/osc/rdma/ - ompi/mca/btl/openib/btl_openib_endpoint.c depended on pml_base_sendreq.h - Tested on Linux/x86-64, this time including make check (thanks Jeff and Ralph) This commit was SVN r20725.	2009-03-04 17:06:51 +00:00
Terry Dontje	0178b6c45f	Added padding to predefined handle structures to maintain library version to version compatibility. This commit was SVN r20627.	2009-02-24 17:17:33 +00:00
Jeff Squyres	f1a6d170dc	Revert part of r20537: per lengtyh discussion on the phone and the devel list, it ''is'' within in the spirit of MPI to allow MPI_REQUEST_NULL to be passed to MPI_REQUEST_GET_STATUS. I filed a ticket proposal with MPI-2.2 to make this officially accepted: https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/137 Plus, r20537 didn't revert out all of the machinery for allowing MPI_REQUEST_NULL or inactive requests, anyway. So this commit simply removes the parameter check that was added in r20537, and we're back to where we were before this whole conversation. :-) This commit was SVN r20616. The following SVN revision numbers were found above: r20537 --> open-mpi/ompi@38aab37bb3	2009-02-20 19:57:46 +00:00
Jeff Squyres	7e210fdaf8	Return MPI_ERR_COMM and MPI_ERR_WIN, respectively, for MPI_COMM\|WIN_SET\|GET_ERRHANDLER if a bad MPI handle is passed. Thanks to Lisandro Dalcín for reporting the issue. This commit was SVN r20615.	2009-02-20 19:53:48 +00:00
Rainer Keller	d81443cc5a	- On the way to get the BTLs split out and lessen dependency on orte: Often, orte/util/show_help.h is included, although no functionality is required -- instead, most often opal_output.h, or orte/mca/rml/rml_types.h Please see orte_show_help_replacement.sh commited next. - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration actually showed two missing #include "orte/util/show_help.h" in orte/mca/odls/base/odls_base_default_fns.c and in orte/tools/orte-top/orte-top.c Manually added these. Let's have MTT the last word. This commit was SVN r20557.	2009-02-14 02:26:12 +00:00
Jeff Squyres	44092c6a21	Don't allow freeing of predefined datatypes. Thanks to Lisandro Dalcín for reporting the issue. This commit was SVN r20538.	2009-02-13 00:00:55 +00:00
Jeff Squyres	38aab37bb3	Be a little tougher looking for MPI_*_NULL cases in some functions. Thanks to Lisandro Dalcín for reporting the issue. This commit was SVN r20537.	2009-02-12 23:57:41 +00:00
Jeff Squyres	c596a1bcb3	Fix MPI_File_c2f -- ensure that if you invoke MPI_File_c2f(MPI_FILE_NULL), you actually get 0, not -1. Thanks for Lisandro Dalcin for the bug report. This commit was SVN r20511.	2009-02-11 00:48:12 +00:00
Jeff Squyres	90c28810f4	Fix CID 1122: comm->c_name is a char array (not a pointer), so comparing it to NULL is not useful. This commit was SVN r20444.	2009-02-05 15:31:10 +00:00
Jeff Squyres	73ea7a9aa5	Fix CIDs 1211, 1212, 1214: fix error checking in MPI_REDUCE_LOCAL. This commit was SVN r20435.	2009-02-05 02:18:03 +00:00
Jeff Squyres	4d8a187450	Two major things in this commit: * New "op" MPI layer framework * Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2) = Op framework = Add new "op" framework in the ompi layer. This framework replaces the hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples for pre-defined MPI_Ops, allowing components and modules to provide the back-end functions. The intent is that components can be written to take advantage of hardware acceleration (GPU, FPGA, specialized CPU instructions, etc.). Similar to other frameworks, components are intended to be able to discover at run-time if they can be used, and if so, elect themselves to be selected (or disqualify themselves from selection if they cannot run). If specialized hardware is not available, there is a default set of functions that will automatically be used. This framework is ''not'' used for user-defined MPI_Ops. The new op framework is similar to the existing coll framework, in that the final set of function pointers that are used on any given intrinsic MPI_Op can be a mixed bag of function pointers, potentially coming from multiple different op modules. This allows for hardware that only supports some of the operations, not all of them (e.g., a GPU that only supports single-precision operations). All the hard-coded back-end MPI_Op functions for (MPI_Op, MPI_Datatype) tuples still exist, but unlike coll, they're in the framework base (vs. being in a separate "basic" component) and are automatically used if no component is found at runtime that provides a module with the necessary function pointers. There is an "example" op component that will hopefully be useful to those writing meaningful op components. It is currently .ompi_ignore'd so that it doesn't impinge on other developers (it's somewhat chatty in terms of opal_output() so that you can tell when its functions have been invoked). See the README file in the example op component directory. Developers of new op components are encouraged to look at the following wiki pages: https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework = MPI_REDUCE_LOCAL = Part of the MPI-2.2 proposal listed here: https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24 is to add a new function named MPI_REDUCE_LOCAL. It is very easy to implement, so I added it (also because it makes testing the op framework pretty easy -- you can do it in serial rather than via parallel reductions). There's even a man page! This commit was SVN r20280.	2009-01-14 23:44:31 +00:00
Jeff Squyres	895edd04f8	Fix CID 468: remove some dead code. r_proc_list was set to NULL but never used. This commit was SVN r20272.	2009-01-14 18:15:17 +00:00
Jeff Squyres	d1c6f3f89a	* Fix a truckload of Cisco copyrights to be the same as the rest of the code base. * Fix a few misspellings in other copyrights. This commit was SVN r20241.	2009-01-11 02:30:00 +00:00
Jeff Squyres	a9850c96c5	Cosmetic change. This commit was SVN r20203.	2009-01-05 19:07:06 +00:00
Jeff Squyres	611ebeab33	Cosmetic: expunge some more old 2-space-indent code (re-indent with "indent(1)"). This commit was SVN r20179.	2009-01-02 12:55:17 +00:00
Nysal Jan	ee8ec6f6b5	Remove dead/redundant code. Minimize number of calloc invocations This commit was SVN r20121.	2008-12-12 10:55:50 +00:00
Shiqing Fan	d06604c258	Get rid of the compiler warning message when --enable-picky is used. Do the checks according to inter/intracommunicator flags. This commit was SVN r20063.	2008-12-03 17:44:21 +00:00
Shiqing Fan	abd21b6d17	- An update for memchecker : 1. fix a bug in pml_ob1_recvreq/sendreq.c, buffer was made defined where the request has already been released. 2. complete memchecker support for collective functions. 3. change the wrongly spelled function name of memchecker, i.e. '_isaddressible' should be '_isaddressable' This commit was SVN r20043.	2008-11-27 16:34:02 +00:00
George Bosilca	82d1d5d785	The patch for "Unexpected message queue for unknown CID's required" ticket #1460 . I'm unable to split it in two parts, my patch and Edgar's one. So I just update copyright information for both of us. What this patch do: - it use the unexpected queue create by commit r19562 to dispatch the unexpected message to the right communicator (once this communicator is created and initialized). - delay the PML comm_add until we have the context_id for the new communicator. - only do the PML comm_add on processes that really belong to the new communicator. Please read the lengthy comment in the source code for the reason behind this. This commit was SVN r19929. The following SVN revision numbers were found above: r19562 --> open-mpi/ompi@acd3406aa7	2008-11-04 21:58:06 +00:00
Jeff Squyres	57a3dce9ba	LANL noticed that calling MPI_ABORT invokes opal_output(0, ...) unconditionally, which can result in a flood of messages to the user if all MPI processes invoke abort. Additionally, some users were confused because they saw the MPI_ABORT opal_output() messages from ''some'' MPI processes, but not ''all'' of them (despite the fact that every MPI process supposedly invoked MPI_ABORT). The reason is that calling MPI_ABORT triggers ORTE to kill all MPI processes, so it's a race condition as to whether a) all MPI processes actually invoke MPI_ABORT, and/or b) whether every process is able to opal_output() before they are killed. This commit does two simple things: * Now use orte_show_help() for the MPI_ABORT message, so they are aggregated. * Add a note in the message that calling MPI_ABORT kills all processes, so you might not see all output, yadda yadda yadda. This commit was SVN r19735.	2008-10-14 19:23:03 +00:00
Jeff Squyres	d0a8be6d2f	Fix CID 1117: ensure to check return values. This commit was SVN r19583.	2008-09-19 13:27:30 +00:00
Nysal Jan	4b68803260	Should be coords(i) >= dims(i) Refs trac:1463 This commit was SVN r19500. The following Trac tickets were found above: Ticket 1463 --> https://svn.open-mpi.org/trac/ompi/ticket/1463	2008-09-05 04:20:48 +00:00
Jeff Squyres	9a98423bbc	[Re-]Fix #1463 with a little thing that I like to call "the right way". Don't modify coords in the top-level API function because coords is an IN variable. Instead, as Nysal noted, the real cause of the problem was a missing ! down in topo_base_cart_rank.c. Put a comment down in topo_base_cart_rank.c explaining what's going on so that the code is not so cryptic. Refs trac:1363. This commit was SVN r19487. The following Trac tickets were found above: Ticket 1363 --> https://svn.open-mpi.org/trac/ompi/ticket/1363	2008-09-03 08:24:27 +00:00
Jeff Squyres	008fa8c5cc	Fixes trac:1236, #1237 . * Various changes to enable 0-dimensional cartesian communicators: * Set various mtc_* members to NULL when there are 0 dimensions (and don't bother trying to memcpy these arrays when duplicating the communicator -- because they're NULL) * adjust topo_base_cart_sub to correctly handle 0 dimensions (simplified it a bit) * adjust a few error codes to return ERR_OUT_OF_RESOURCE * adjust error checking of CART_CREATE, CART_RANK * Allow MPI_GRAPH_CREATE to accept 0 == nnodes. * Bump reported MPI version in mpi.h to 2.1 This commit was SVN r19461. The following Trac tickets were found above: Ticket 1236 --> https://svn.open-mpi.org/trac/ompi/ticket/1236	2008-08-31 19:31:10 +00:00
Jeff Squyres	59cb626b7c	Fixes trac:1463: ensure periodic dimensions are handled proprly for MPI_CART_RANK. This commit was SVN r19459. The following Trac tickets were found above: Ticket 1463 --> https://svn.open-mpi.org/trac/ompi/ticket/1463	2008-08-31 18:39:05 +00:00
George Bosilca	697dc524c1	Deal with the ticket #1239 and #712 . This will upgrade the Open MPI support for the F90 type create functions to the requirements of MPI 2.1 standard. Advice to implementors. An application may often repeat a call to MPI_TYPE_CREATE_F90_xxxx with the same combination of (xxxx,p,r). The application is not allowed to free the returned predefined, unnamed datatype handles. To prevent the creation of a potentially huge amount of handles, the MPI implementation should return the same datatype handle for the same (REAL/COMPLEX/INTEGER,p,r) combination. Checking for the combination (p,r) in the preceding call to MPI_TYPE_CREATE_F90_xxxx and using a hash-table to find formerly generated handles should limit the overhead of finding a previously generated datatype with same combination of (xxxx,p,r). (End of advice to implementors.) This commit fixes trac:1239, and #712. This commit was SVN r19458. The following Trac tickets were found above: Ticket 1239 --> https://svn.open-mpi.org/trac/ompi/ticket/1239	2008-08-31 18:36:32 +00:00
Jeff Squyres	93746cd594	Fixed CID 807: Remove unused variable This commit was SVN r19239.	2008-08-11 20:50:09 +00:00
Rolf vandeVaart	e105b3f254	Finish work related to ticket #1392 where the versions were bumped from v1.0.0 to v2.0.0. This change fixed #1439. This commit was SVN r19175.	2008-08-06 12:16:54 +00:00
Rainer Keller	82580701fb	- We may know the *_name is < MPI_MAX_OBJECT_NAME; Prevent does not. Fix Coverity issues CID1068 and CID1069 This commit was SVN r19167.	2008-08-06 07:59:59 +00:00
Ralph Castain	a0ae63f19e	Ensure we call close_port after comm_spawn[_multiple]. Cleanout the port name in close_port This commit was SVN r19068.	2008-07-28 16:40:11 +00:00
Jeff Squyres	74aa9689e4	From an initial patch from George, update all the set/get errhandler functions to use atomics in order to be thread safe. This commit was SVN r18807.	2008-07-03 19:28:02 +00:00
Jeff Squyres	51d833e8d1	Minor fixes and comment clarifications for MPI-2.1-mandated handling of strings. We mostly did the Right Things already; I simplified the code a bit and also had us not write to more characters in the C bindings than we're supposed to (per language in the MPI-2.1 spec). Fixes trac:1238. This commit was SVN r18705. The following Trac tickets were found above: Ticket 1238 --> https://svn.open-mpi.org/trac/ompi/ticket/1238	2008-06-21 19:33:47 +00:00

1 2 3 4 5 ...

270 Коммитов