openmpi

Author	SHA1	Message	Date
Ralph Castain	f966d9f972	Fix visibility issues with opal_graph functions. Fix the carto test so it can compile - need to update input file so it can run This commit was SVN r21403.	2009-06-09 15:02:57 +00:00
Rainer Keller	fc65875542	- As in r21238, do not use printf %z for size_t... This commit was SVN r21239. The following SVN revision numbers were found above: r21238 --> open-mpi/ompi@b2f8095ba7	2009-05-14 14:11:31 +00:00
Greg Koenig	60485ff95f	This is a very large change to rename several #define values from OMPI_* to OPAL_*. This allows opal layer to be used more independent from the whole of ompi. NOTE: 9 "svn mv" operations immediately follow this commit. This commit was SVN r21180.	2009-05-06 20:11:28 +00:00
Rainer Keller	7663fb47f0	- In the included headers, the string.h is missing. - For size_t, Posix offers %z length modifier, get rid of warning (or need to cast...) This commit was SVN r21165.	2009-05-05 15:42:31 +00:00
Ralph Castain	e1673778be	Replace missing headers This commit was SVN r21136.	2009-05-01 15:09:10 +00:00
Jeff Squyres	80a1ae45ba	Add missing header This commit was SVN r21122.	2009-04-30 11:36:35 +00:00
Rainer Keller	221fb9dbca	... Delayed due to notifier commits earlier this day ... - Delete unnecessary header files using contrib/check_unnecessary_headers.sh after applying patches, that include headers, being "lost" due to inclusion in one of the now deleted headers... In total 817 files are touched. In ompi/mpi/c/ header files are moved up into the actual c-file, where necessary (these are the only additional #include), otherwise it is only deletions of #include (apart from the above additions required due to notifier...) - To get different MCAs (OpenIB, TM, ALPS), an earlier version was successfully compiled (yesterday) on: Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled This commit was SVN r21096.	2009-04-29 01:32:14 +00:00
Shiqing Fan	3d4e0472d6	Add windows support files into the tarball, including .windows, CMakeLists.txt files, and CMake modules. Thanks to Jeff for testing it on Linux. This commit was SVN r21069.	2009-04-24 16:39:33 +00:00
Rainer Keller	ec0ed48718	- Revert r20739 This commit was SVN r20742. The following SVN revision numbers were found above: r20739 --> open-mpi/ompi@781caee0b6	2009-03-05 21:56:03 +00:00
Rainer Keller	781caee0b6	- First of two or three patches, in orte/util/proc_info.h: Adapt orte_process_info to orte_proc_info, and change orte_proc_info() to orte_proc_info_init(). - Compiled on linux-x86-64 - Discussed with Ralph This commit was SVN r20739.	2009-03-05 20:36:44 +00:00
Jeff Squyres	8fe40fb4a1	r20701 was a lie; we ''do'' need the libraries when compiling in debug mode, because some functions are not inlined. This commit was SVN r20736. The following SVN revision numbers were found above: r20701 --> open-mpi/ompi@b440c92455	2009-03-05 15:30:50 +00:00
Ralph Castain	1d4bbee096	Fix bitmap test so make tarball can succeed This commit was SVN r20713.	2009-03-04 12:26:45 +00:00
Rainer Keller	811f2bd9b4	- As discussed on RFC, move the ompi_bitmap to the opal layer. Add a check against a maximum (actually get rid of ifs internally to opal_bitmap.c) -- the functionality to set the current maximum size opal_bitmap_set_max_size() is currently only used in attribute.c to set the maximum OMPI_FORTRAN_HANDLE_MAX... Tested on linux/x86-64 with intel-tests with all_tests_no_perf_f run with 6 procs. Let's look into MTT as well... This commit was SVN r20708.	2009-03-03 22:25:13 +00:00
Jeff Squyres	b440c92455	We don't need to link against any of the OMPI libraries; this test just slurps in .h files. This commit was SVN r20701.	2009-03-03 17:06:46 +00:00
Shiqing Fan	2326f14be5	Remove the unnecessary PROJECT command, I somehow misunderstood how it should be used on Windows.... This commit was SVN r20634.	2009-02-25 16:07:43 +00:00
Terry Dontje	0178b6c45f	Added padding to predefined handle structures to maintain library version to version compatibility. This commit was SVN r20627.	2009-02-24 17:17:33 +00:00
Eugene Loh	463f11f993	Improve shared-memory allocation: * compute mmap-file size more wisely and pass requested size to allocator * change MCA parameters: - get rid of mpool_sm_per_peer_size - get rid of mpool_sm_max_size - set default mpool_sm_min_size to 0 * no longer pad sm allocations to page boundaries * have sm_btl_first_time_init check return codes on free-list creations Have mca_btl_sm_prepare_src() check to see if it can allocate an EAGER fragment rather than a MAX fragment if the smaller size works. Remove ompi/class/ompi_[circular_buffer_]fifo.h and references thereto. Remove opal/util/pow2.[c|h] and references thereto. This commit was SVN r20614.	2009-02-20 19:51:57 +00:00
George Bosilca	e0638c84c8	Update the test to check that all data is exposed via the convertor_raw interface. This commit was SVN r20383.	2009-01-28 23:07:02 +00:00
George Bosilca	ecdcda9268	Move the datatype creation functions outside the test itself. Add a test for the newly added raw functionality. This commit was SVN r20374.	2009-01-28 15:42:30 +00:00
Shiqing Fan	a5281f0434	- 1/4 commit for Windows Visual Studio and CCP support: CMakeLists and .windows files. In contribs preconfigured and precompiled parts. This commit was SVN r20108.	2008-12-10 20:59:20 +00:00
Kenneth Matney	94f8189532	Under gcc 4.2.4, make check was failing without the <stdio.h>. Moreover, I could not figure out why <time.h> would need to be included twice. So, I substituted the former for the latter, in the superfluous instantiation. This commit was SVN r19859.	2008-10-31 12:18:57 +00:00
Kenneth Matney	68248a32ef	Add #include for stdio.h to allow make check to run with gcc 4.2.4 (on Cray XT platform). This commit was SVN r19605.	2008-09-22 18:00:30 +00:00
George Bosilca	2bd9ddfc28	The datatype dump function is always visible so we don't need a fake one. This commit was SVN r19158.	2008-08-05 14:45:42 +00:00
George Bosilca	e6f700bf04	Reenable the ddt_test as #1242 is now closed. This commit was SVN r19145.	2008-08-04 15:57:02 +00:00
Brian Barrett	8cff3131d6	Remove memory tests, as they're out of date This commit was SVN r18656.	2008-06-14 14:01:05 +00:00
Jeff Squyres	1f226b5898	Adjust the comment to be correct, per http://www.open-mpi.org/community/lists/devel/2008/06/4095.php. This commit was SVN r18604.	2008-06-06 01:23:58 +00:00
Ralph Castain	7c7b9b0486	Do a little cleanup on the opal graph class and opal carto framework to conform to OMPI naming conventions and avoid potential conflict with user applications - no change in functionality, passes carto test program This commit was SVN r18407.	2008-05-07 19:33:49 +00:00
Ralph Castain	dc7f45dafd	Remove the obsolete and largely unused orte_system_info structure. The only fields that were used in that struct were nodeid and nodename - these have been transferred to the orte_process_info structure. Only one place used the user name field - session_dir, when formulating the name of the top-level directory. Accordingly, the code for getting the user's id has been moved to the session_dir code. This commit was SVN r17926.	2008-03-23 23:10:15 +00:00
Jeff Squyres	0fbb399f13	Remove ddt_test from "make check" per #1242. This commit was SVN r17818.	2008-03-14 14:21:47 +00:00
Jeff Squyres	4133b46ec5	Re-enable "make dist", at least until #1232 is fixed. This commit was SVN r17796.	2008-03-09 21:36:10 +00:00
Jeff Squyres	498190e326	Add checks to ensure that opal_init() completes successfully so that we fail gracefully (and don't segv) if opal_init() fails. This commit was SVN r17760.	2008-03-06 14:55:32 +00:00
Tim Prins	2e1bda6d23	Remove the now-unused arithmetic interface to the dss This commit was SVN r17654.	2008-02-28 21:36:51 +00:00
Ralph Castain	d70e2e8c2b	Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer This commit was SVN r17632.	2008-02-28 01:57:57 +00:00
Jeff Squyres	04e026fa98	Fix "make check"; manually include <string.h> since the datatype header files were re-orged to have fewer dependencies This commit was SVN r17427.	2008-02-12 13:02:53 +00:00
Shiqing Fan	f5792bbda5	merging the memchecker into trunk. This commit was SVN r17424.	2008-02-12 08:46:27 +00:00
Sharon Melamed	98e8de264d	Wrapped the carto API in carto_base_wrapers.c This commit was SVN r17380.	2008-02-05 19:29:16 +00:00
Sharon Melamed	025b68becf	Move the carto framework to the trunk. This commit was SVN r17177.	2008-01-23 09:20:34 +00:00
George Bosilca	7eca186568	Fix a typo related to the conversion from ompi_pointer_array to opal_pointer_array. This commit was SVN r17023.	2007-12-22 05:32:40 +00:00
George Bosilca	906e8bf1d1	Replace the ompi_pointer_array with opal_pointer_array. In the next step (sometime after the merge with the ORTE branch), the opal_pointer_array will become the only pointer_array implementation (the orte_pointer_array will be removed). This commit was SVN r17007.	2007-12-21 06:02:00 +00:00
Rich Graham	27a748e7eb	change all instances of ompi_free_list_init to ompi_free_list_init_new. Header and payload data are specified separately at this stage. This commit was SVN r16633.	2007-11-01 23:38:50 +00:00
Ralph Castain	54b2cf747e	These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC. The commit has been tested for C/R and Cray operations, and on Odin (SLURM, rsh) and RoadRunner (TM). I tried to update all environments, but obviously could not test them. I know that Windows needs some work, and have highlighted what is known to be needed in the odls process component. This represents a lot of work by Brian, Tim P, Josh, and myself, with much advice from Jeff and others. For posterity, I have appended a copy of the email describing the work that was done: As we have repeatedly noted, the modex operation in MPI_Init is the single greatest consumer of time during startup. To-date, we have executed that operation as an ORTE stage gate that held the process until a startup message containing all required modex (and OOB contact info - see #3 below) info could be sent to it. Each process would send its data to the HNP's registry, which assembled and sent the message when all processes had reported in. In addition, ORTE had taken responsibility for monitoring process status as it progressed through a series of "stage gates". The process reported its status at each gate, and ORTE would then send a "release" message once all procs had reported in. The incoming changes revamp these procedures in three ways: 1. eliminating the ORTE stage gate system and cleanly delineating responsibility between the OMPI and ORTE layers for MPI init/finalize. The modex stage gate (STG1) has been replaced by a collective operation in the modex itself that performs an allgather on the required modex info. The allgather is implemented using the orte_grpcomm framework since the BTL's are not active at that point. 
At the moment, the grpcomm framework only has a "basic" component analogous to OMPI's "basic" coll framework - I would recommend that the MPI team create additional, more advanced components to improve performance of this step. The other stage gates have been replaced by orte_grpcomm barrier functions. We tried to use MPI barriers instead (since the BTL's are active at that point), but - as we discussed on the telecon - these are not currently true barriers so the job would hang when we fell through while messages were still in process. Note that the grpcomm barrier doesn't actually resolve that problem, but Brian has pointed out that we are unlikely to ever see it violated. Again, you might want to spend a little time on an advanced barrier algorithm as the one in "basic" is very simplistic. Summarizing this change: ORTE no longer tracks process state nor has direct responsibility for synchronizing jobs. This is now done via collective operations within the MPI layer, albeit using ORTE collective communication services. I -strongly- urge the MPI team to implement advanced collective algorithms to improve the performance of this critical procedure. 2. reducing the volume of data exchanged during modex. Data in the modex consisted of the process name, the name of the node where that process is located (expressed as a string), plus a string representation of all contact info. The nodename was required in order for the modex to determine if the process was local or not - in addition, some people like to have it to print pretty error messages when a connection failed. The size of this data has been reduced in three ways: (a) reducing the size of the process name itself. The process name consisted of two 32-bit fields for the jobid and vpid. This is far larger than any current system, or system likely to exist in the near future, can support. 
Accordingly, the default size of these fields has been reduced to 16-bits, which means you can have 32k procs in each of 32k jobs. Since the daemons must have a vpid, and we require one daemon/node, this also restricts the default configuration to 32k nodes. To support any future "mega-clusters", a configuration option --enable-jumbo-apps has been added. This option increases the jobid and vpid field sizes to 32-bits. Someday, if necessary, someone can add yet another option to increase them to 64-bits, I suppose. (b) replacing the string nodename with an integer nodeid. Since we have one daemon/node, the nodeid corresponds to the local daemon's vpid. This replaces an often lengthy string with only 2 (or at most 4) bytes, a substantial reduction. (c) when the mca param requesting that nodenames be sent to support pretty error messages, a second mca param is now used to request FQDN - otherwise, the domain name is stripped (by default) from the message to save space. If someone wants to combine those into a single param somehow (perhaps with an argument?), they are welcome to do so - I didn't want to alter what people are already using. While these may seem like small savings, they actually amount to a significant impact when aggregated across the entire modex operation. Since every proc must receive the modex data regardless of the collective used to send it, just reducing the size of the process name removes nearly 400MBytes of communication from a 32k proc job (admittedly, much of this comm may occur in parallel). So it does add up pretty quickly. 3. routing RML messages to reduce connections. The default messaging system remains point-to-point - i.e., each proc opens a socket to every proc it communicates with and sends its messages directly. A new option uses the orteds as routers - i.e., each proc only opens a single socket to its local orted. 
All messages are sent from the proc to the orted, which forwards the message to the orted on the node where the intended recipient proc is located - that orted then forwards the message to its local proc (the recipient). This greatly reduces the connection storm we have encountered during startup. It also has the benefit of removing the sharing of every proc's OOB contact with every other proc. The orted routing tables are populated during launch since every orted gets a map of where every proc is being placed. Each proc, therefore, only needs to know the contact info for its local daemon, which is passed in via the environment when the proc is fork/exec'd by the daemon. This alone removes ~50 bytes/process of communication that was in the current STG1 startup message - so for our 32k proc job, this saves us roughly 32k*50 = 1.6MBytes sent to 32k procs = 51GBytes of messaging. Note that you can use the new routing method by specifying -mca routed tree - if you so desire. This mode will become the default at some point in the future. There are a few minor additional changes in the commit that I'll just note in passing: * propagation of command line mca params to the orteds - fixes ticket #1073. See note there for details. * requiring of "finalize" prior to "exit" for MPI procs - fixes ticket #1144. See note there for details. * cleanup of some stale header files This commit was SVN r16364.	2007-10-05 19:48:23 +00:00
Shiqing Fan	0f468f3668	- Remove the solution and project files, will commit them later. This commit was SVN r15705.	2007-07-31 17:07:02 +00:00
Shiqing Fan	4d7b349cdb	- Add VC8 solution and project files. - If one wants to use this solution, remember to unload the project 'orte-restart' which is currently not working for Windows. This commit was SVN r15680.	2007-07-30 11:05:34 +00:00
Tim Prins	7445a11f61	Remove duplicate tests. The current version of the dss tests are in orte/test/unit/dss Remove defunct testing matrix This commit was SVN r15535.	2007-07-20 13:37:44 +00:00
Ralph Castain	511457feb5	Remove stale test code. At least we were wise enough to have eliminated this code from the "make check" tree, but almost none of it compiles and of what does compile, nothing seems to really work. This commit was SVN r15446.	2007-07-16 16:34:14 +00:00
Jeff Squyres	f72b52bb1d	s/ifdef/if/ for OMPI_C_HAVE_VISIBILITY to enable static builds. This commit was SVN r14985.	2007-06-11 13:20:56 +00:00
George Bosilca	29dd535c01	Remove all references to the orte_bitmap as well as the files. This commit was SVN r14928.	2007-06-06 20:24:07 +00:00
Brian Barrett	42c74b2cf7	fix test case so that condition variables work right, at least on PTHREADS. I'm pretty sure condition variables are wrong for Solaris threads. This commit was SVN r14877.	2007-06-05 19:24:17 +00:00
Brian Barrett	60571567a4	Better fix than r14831 -- ddt_pack was removed from TESTS because it calls MPI_INIT and that causes problems during make distcheck. Instead put it in check_PROGRAMS which lets it get built, but doesn't run it. This commit was SVN r14832. The following SVN revision numbers were found above: r14831 --> open-mpi/ompi@9258c5200a	2007-06-01 14:34:06 +00:00
Rainer Keller	9258c5200a	- As we need to reconfigure anyhow, get rid of autogen warning. This commit was SVN r14831.	2007-06-01 08:20:38 +00:00

491 commits