openmpi

Author	SHA1	Message	Date
Shiqing Fan	2326f14be5	Remove the unnecessary PROJECT command, I somehow misunderstood how it should be used on Windows.... This commit was SVN r20634.	2009-02-25 16:07:43 +00:00
Terry Dontje	0178b6c45f	Added padding to predefined handle structures to maintain library version to version compatibility. This commit was SVN r20627.	2009-02-24 17:17:33 +00:00
Eugene Loh	463f11f993	Improve shared-memory allocation: * compute mmap-file size more wisely and pass requested size to allocator * change MCA parameters: - get rid of mpool_sm_per_peer_size - get rid of mpool_sm_max_size - set default mpool_sm_min_size to 0 * no longer pad sm allocations to page boundaries * have sm_btl_first_time_init check return codes on free-list creations Have mca_btl_sm_prepare_src() check to see if it can allocate an EAGER fragment rather than a MAX fragment if the smaller size works. Remove ompi/class/ompi_[circular_buffer_]fifo.h and references thereto. Remove opal/util/pow2.[c|h] and references thereto. This commit was SVN r20614.	2009-02-20 19:51:57 +00:00
George Bosilca	e0638c84c8	Update the test to check that all data is exposed via the convertor_raw interface. This commit was SVN r20383.	2009-01-28 23:07:02 +00:00
George Bosilca	ecdcda9268	Move the datatype creation functions outside the test itself. Add a test for the newly added raw functionality. This commit was SVN r20374.	2009-01-28 15:42:30 +00:00
Shiqing Fan	a5281f0434	- 1/4 commit for Windows Visual Studio and CCP support: CMakeLists and .windows files. In contribs preconfigured and precompiled parts. This commit was SVN r20108.	2008-12-10 20:59:20 +00:00
Kenneth Matney	94f8189532	Under gcc 4.2.4, make check was failing without the <stdio.h>. Moreover, I could not figure out why <time.h> would need to be included twice. So, I substituted the former for the latter, in the superfluous instantiation. This commit was SVN r19859.	2008-10-31 12:18:57 +00:00
Kenneth Matney	68248a32ef	Add #include for stdio.h to allow make check to run with gcc 4.2.4 (on Cray XT platform). This commit was SVN r19605.	2008-09-22 18:00:30 +00:00
George Bosilca	2bd9ddfc28	The datatype dump function is always visible so we don't need a fake one. This commit was SVN r19158.	2008-08-05 14:45:42 +00:00
George Bosilca	e6f700bf04	Reenable the ddt_test as #1242 is now closed. This commit was SVN r19145.	2008-08-04 15:57:02 +00:00
Brian Barrett	8cff3131d6	Remove memory tests, as they're out of date This commit was SVN r18656.	2008-06-14 14:01:05 +00:00
Jeff Squyres	1f226b5898	Adjust the comment to be correct, per http://www.open-mpi.org/community/lists/devel/2008/06/4095.php. This commit was SVN r18604.	2008-06-06 01:23:58 +00:00
Ralph Castain	7c7b9b0486	Do a little cleanup on the opal graph class and opal carto framework to conform to OMPI naming conventions and avoid potential conflict with user applications - no change in functionality, passes carto test program This commit was SVN r18407.	2008-05-07 19:33:49 +00:00
Ralph Castain	dc7f45dafd	Remove the obsolete and largely unused orte_system_info structure. The only fields that were used in that struct were nodeid and nodename - these have been transferred to the orte_process_info structure. Only one place used the user name field - session_dir, when formulating the name of the top-level directory. Accordingly, the code for getting the user's id has been moved to the session_dir code. This commit was SVN r17926.	2008-03-23 23:10:15 +00:00
Jeff Squyres	0fbb399f13	Remove ddt_test from "make check" per #1242 . This commit was SVN r17818.	2008-03-14 14:21:47 +00:00
Jeff Squyres	4133b46ec5	Re-enable "make dist", at least until #1232 is fixed. This commit was SVN r17796.	2008-03-09 21:36:10 +00:00
Jeff Squyres	498190e326	Add checks to ensure that opal_init() completes successfully so that we fail gracefully (and don't segv) if opal_init() fails. This commit was SVN r17760.	2008-03-06 14:55:32 +00:00
Tim Prins	2e1bda6d23	Remove the now-unused arithmetic interface to the dss This commit was SVN r17654.	2008-02-28 21:36:51 +00:00
Ralph Castain	d70e2e8c2b	Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer This commit was SVN r17632.	2008-02-28 01:57:57 +00:00
Jeff Squyres	04e026fa98	Fix "make check"; manually include <string.h> since the datatype header files were re-orged to have fewer dependencies This commit was SVN r17427.	2008-02-12 13:02:53 +00:00
Shiqing Fan	f5792bbda5	merging the memchecker into trunk. This commit was SVN r17424.	2008-02-12 08:46:27 +00:00
Sharon Melamed	98e8de264d	Wrapped the carto API in carto_base_wrapers.c This commit was SVN r17380.	2008-02-05 19:29:16 +00:00
Sharon Melamed	025b68becf	Move the carto framework to the trunk. This commit was SVN r17177.	2008-01-23 09:20:34 +00:00
George Bosilca	7eca186568	Fix a typo related to the conversion from ompi_pointer_array to opal_pointer_array. This commit was SVN r17023.	2007-12-22 05:32:40 +00:00
George Bosilca	906e8bf1d1	Replace the ompi_pointer_array with opal_pointer_array. The next step (sometime after the merge with the ORTE branch), the opal_pointer_array will become the only pointer_array implementation (the orte_pointer_array will be removed). This commit was SVN r17007.	2007-12-21 06:02:00 +00:00
Rich Graham	27a748e7eb	change all instances of ompi_free_list_init to ompi_free_list_init_new. Header and payload data are specified separately at this stage. This commit was SVN r16633.	2007-11-01 23:38:50 +00:00
Ralph Castain	54b2cf747e	These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC. The commit has been tested for C/R and Cray operations, and on Odin (SLURM, rsh) and RoadRunner (TM). I tried to update all environments, but obviously could not test them. I know that Windows needs some work, and have highlighted what is known to be needed in the odls process component. This represents a lot of work by Brian, Tim P, Josh, and myself, with much advice from Jeff and others. For posterity, I have appended a copy of the email describing the work that was done: As we have repeatedly noted, the modex operation in MPI_Init is the single greatest consumer of time during startup. To-date, we have executed that operation as an ORTE stage gate that held the process until a startup message containing all required modex (and OOB contact info - see #3 below) info could be sent to it. Each process would send its data to the HNP's registry, which assembled and sent the message when all processes had reported in. In addition, ORTE had taken responsibility for monitoring process status as it progressed through a series of "stage gates". The process reported its status at each gate, and ORTE would then send a "release" message once all procs had reported in. The incoming changes revamp these procedures in three ways: 1. eliminating the ORTE stage gate system and cleanly delineating responsibility between the OMPI and ORTE layers for MPI init/finalize. The modex stage gate (STG1) has been replaced by a collective operation in the modex itself that performs an allgather on the required modex info. The allgather is implemented using the orte_grpcomm framework since the BTL's are not active at that point. 
At the moment, the grpcomm framework only has a "basic" component analogous to OMPI's "basic" coll framework - I would recommend that the MPI team create additional, more advanced components to improve performance of this step. The other stage gates have been replaced by orte_grpcomm barrier functions. We tried to use MPI barriers instead (since the BTL's are active at that point), but - as we discussed on the telecon - these are not currently true barriers so the job would hang when we fell through while messages were still in process. Note that the grpcomm barrier doesn't actually resolve that problem, but Brian has pointed out that we are unlikely to ever see it violated. Again, you might want to spend a little time on an advanced barrier algorithm as the one in "basic" is very simplistic. Summarizing this change: ORTE no longer tracks process state nor has direct responsibility for synchronizing jobs. This is now done via collective operations within the MPI layer, albeit using ORTE collective communication services. I -strongly- urge the MPI team to implement advanced collective algorithms to improve the performance of this critical procedure. 2. reducing the volume of data exchanged during modex. Data in the modex consisted of the process name, the name of the node where that process is located (expressed as a string), plus a string representation of all contact info. The nodename was required in order for the modex to determine if the process was local or not - in addition, some people like to have it to print pretty error messages when a connection failed. The size of this data has been reduced in three ways: (a) reducing the size of the process name itself. The process name consisted of two 32-bit fields for the jobid and vpid. This is far larger than any current system, or system likely to exist in the near future, can support. 
Accordingly, the default size of these fields has been reduced to 16-bits, which means you can have 32k procs in each of 32k jobs. Since the daemons must have a vpid, and we require one daemon/node, this also restricts the default configuration to 32k nodes. To support any future "mega-clusters", a configuration option --enable-jumbo-apps has been added. This option increases the jobid and vpid field sizes to 32-bits. Someday, if necessary, someone can add yet another option to increase them to 64-bits, I suppose. (b) replacing the string nodename with an integer nodeid. Since we have one daemon/node, the nodeid corresponds to the local daemon's vpid. This replaces an often lengthy string with only 2 (or at most 4) bytes, a substantial reduction. (c) when the mca param requesting that nodenames be sent to support pretty error messages, a second mca param is now used to request FQDN - otherwise, the domain name is stripped (by default) from the message to save space. If someone wants to combine those into a single param somehow (perhaps with an argument?), they are welcome to do so - I didn't want to alter what people are already using. While these may seem like small savings, they actually amount to a significant impact when aggregated across the entire modex operation. Since every proc must receive the modex data regardless of the collective used to send it, just reducing the size of the process name removes nearly 400MBytes of communication from a 32k proc job (admittedly, much of this comm may occur in parallel). So it does add up pretty quickly. 3. routing RML messages to reduce connections. The default messaging system remains point-to-point - i.e., each proc opens a socket to every proc it communicates with and sends its messages directly. A new option uses the orteds as routers - i.e., each proc only opens a single socket to its local orted. 
All messages are sent from the proc to the orted, which forwards the message to the orted on the node where the intended recipient proc is located - that orted then forwards the message to its local proc (the recipient). This greatly reduces the connection storm we have encountered during startup. It also has the benefit of removing the sharing of every proc's OOB contact with every other proc. The orted routing tables are populated during launch since every orted gets a map of where every proc is being placed. Each proc, therefore, only needs to know the contact info for its local daemon, which is passed in via the environment when the proc is fork/exec'd by the daemon. This alone removes ~50 bytes/process of communication that was in the current STG1 startup message - so for our 32k proc job, this saves us roughly 32k*50 = 1.6MBytes sent to 32k procs = 51GBytes of messaging. Note that you can use the new routing method by specifying -mca routed tree - if you so desire. This mode will become the default at some point in the future. There are a few minor additional changes in the commit that I'll just note in passing: * propagation of command line mca params to the orteds - fixes ticket #1073. See note there for details. * requiring of "finalize" prior to "exit" for MPI procs - fixes ticket #1144. See note there for details. * cleanup of some stale header files This commit was SVN r16364.	2007-10-05 19:48:23 +00:00
Shiqing Fan	0f468f3668	- Remove the solution and project files, will commit them later. This commit was SVN r15705.	2007-07-31 17:07:02 +00:00
Shiqing Fan	4d7b349cdb	- Add VC8 solution and project files. - If one wants to use this solution, remember to unload the project 'orte-restart' which is currently not working for Windows. This commit was SVN r15680.	2007-07-30 11:05:34 +00:00
Tim Prins	7445a11f61	Remove duplicate tests. The current version of the dss tests are in orte/test/unit/dss Remove defunct testing matrix This commit was SVN r15535.	2007-07-20 13:37:44 +00:00
Ralph Castain	511457feb5	Remove stale test code. At least we were wise enough to have eliminated this code from the "make check" tree, but almost none of it compiles and of what does compile, nothing seems to really work. This commit was SVN r15446.	2007-07-16 16:34:14 +00:00
Jeff Squyres	f72b52bb1d	s/ifdef/if/ for OMPI_C_HAVE_VISIBILITY to enable static builds. This commit was SVN r14985.	2007-06-11 13:20:56 +00:00
George Bosilca	29dd535c01	Remove all references to the orte_bitmap as well as the files. This commit was SVN r14928.	2007-06-06 20:24:07 +00:00
Brian Barrett	42c74b2cf7	fix test case so that condition variables work right, at least on PTHREADS. I'm pretty sure condition variables are wrong for Solaris threads. This commit was SVN r14877.	2007-06-05 19:24:17 +00:00
Brian Barrett	60571567a4	Better fix than r14831 -- ddt_pack was removed from TESTS because it calls MPI_INIT and that causes problems during make distcheck. Instead put it in check_PROGRAMS which lets it get built, but doesn't run it. This commit was SVN r14832. The following SVN revision numbers were found above: r14831 --> open-mpi/ompi@9258c5200a	2007-06-01 14:34:06 +00:00
Rainer Keller	9258c5200a	- As we need to reconfigure anyhow, get rid of autogen warning. This commit was SVN r14831.	2007-06-01 08:20:38 +00:00
Brian Barrett	c7937ec02e	until I figure out why MPI_INIT failed during make distcheck This commit was SVN r14816.	2007-05-31 02:31:12 +00:00
George Bosilca	07f51ae5dc	Make the test a little bit more difficult. This commit was SVN r14814.	2007-05-30 22:40:16 +00:00
Brian Barrett	f02a9525dc	add pack / unpack test This commit was SVN r14801.	2007-05-30 17:41:15 +00:00
Sven Stork	fc932f1fb4	- changes to get the tests running with visibility enabled This commit was SVN r14730.	2007-05-23 15:02:36 +00:00
Brian Barrett	21e00f6f0c	Clean up a couple of configure things: * Require Autoconf 2.60 or higher and remove some cruft required for AC 2.59 or the AC 2.59 / AC 2.60 mix * Remove a bunch of now unnecessary AC_SUBST calls * Use the libtool-provided variables for the -I and library to use when compiling against ltdl Fixes trac:1000 This commit was SVN r14652. The following Trac tickets were found above: Ticket 1000 --> https://svn.open-mpi.org/trac/ompi/ticket/1000	2007-05-15 04:23:48 +00:00
George Bosilca	f2a6b9394f	Deal with the include spree. Protect "environ" on Windows. Some other minor modifications in order to make it compile [again] on Windows. This commit was SVN r14188.	2007-04-01 16:16:54 +00:00
Jeff Squyres	4d8ee3d1e1	Add missing #include; fix the build for some picky compilers. This commit was SVN r13696.	2007-02-17 11:54:40 +00:00
George Bosilca	beb9be3fe4	Don't import the datatype debug output if we're not in debug mode. This commit was SVN r13650.	2007-02-14 16:47:12 +00:00
George Bosilca	06044db69a	Add another test for the data-type engine. This test packs and unpacks the data in a way similar to the multi-network OB1 PML. This commit was SVN r13632.	2007-02-13 09:30:19 +00:00
Jeff Squyres	9fb004ab8e	remove the legal_numbits tests This commit was SVN r13575.	2007-02-09 03:18:33 +00:00
Brian Barrett	6f8b366acb	Rename liborte to libopen-rte and libopal to libopen-pal per telecon today and bug #632. Refs trac:632 This commit was SVN r12762. The following Trac tickets were found above: Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632	2006-12-05 18:27:24 +00:00
George Bosilca	56748d5f57	Correctly initialize the unpack buffer. This commit was SVN r12529.	2006-11-10 05:11:02 +00:00
Sven Stork	9cf5b3709c	- Add comment for volatile. This commit was SVN r12436.	2006-11-06 14:00:43 +00:00
Sven Stork	27420fbda3	- Make counter volatile to prevent the compiler from performing optimisations. Without this a compiler could assume that the counter is not updated by the malloc call and remove the test in the assert and always trigger the assertion. This commit was SVN r12419.	2006-11-03 10:46:18 +00:00

477 commits