openmpi

Author	SHA1	Message	Date
Josh Hursey	99144db970	Improve checkpoint/restart support by allowing a checkpoint to progress when the process is not in the MPI library. This involves creating a separate thread for polling for a checkpoint request. This thread is active when the MPI process is not in the MPI library, and paused when the MPI process is in the library. Some MPI C interface files saw some spacing changes to conform to the coding standards of Open MPI. Changed MPI C interface files to use {{{OPAL_CR_ENTER_LIBRARY()}}} and {{{OPAL_CR_EXIT_LIBRARY()}}} instead of just {{{OPAL_CR_TEST_CHECKPOINT_READY()}}}. This will allow the checkpoint/restart system more flexibility in how it is to behave. Fixed the configure check for {{{--enable-ft-thread}}} so it has a known dependence on {{{--enable-mpi-thread}}} (and/or {{{--enable-progress-thread}}}). Added a line for Checkpoint/Restart support to {{{ompi_info}}}. Added some options to choose at runtime whether or not to use the checkpoint polling thread. By default, if the user asked for it to be compiled in, then it is used. But some users will want the ability to toggle its use at runtime. There are still some places for improvement, but the feature works correctly. As always with Checkpoint/Restart, it is compiled out unless explicitly asked for at configure time. Further, if it was configured in, then it is not used unless explicitly asked for by the user at runtime. This commit was SVN r17516.	2008-02-19 22:15:52 +00:00
Jeff Squyres	fe6ba96dd6	Be a little friendlier for mercurial checkouts. This commit was SVN r17271.	2008-01-28 03:04:53 +00:00
Gleb Natapov	bd47da4699	Initial XRC support by Mellanox. This commit was SVN r16787.	2007-11-28 07:18:59 +00:00
Ethan Mallove	005652c9d4	* Embed ident strings into the Open MPI libraries using one of the following methods (in order of precedence): 1. #pragma ident <ident string> (e.g., Intel and Sun) 2. #ident <ident string> (e.g., GCC) 3. static const char ident[] = <ident string> (all others) By default, the ident string used is the standard Open MPI version string. Only the following libraries will get the embedded version strings (e.g., DSOs will not): * libmpi.so * libmpi_cxx.so * libmpi_f77.so * libopen-pal.so * libopen-rte.so * Added two new configure options: * `--with-package-name="STRING"` (defaults to "Open MPI username@hostname Distribution"). `STRING` is displayed by `ompi_info` next to the "Package" heading. * `--with-ident-string="STRING"` (defaults to the standard Open MPI version string - e.g., X.Y.Zr######). `%VERSION%` will expand to the Open MPI version string if it is supplied to this configure option. This commit was SVN r16644.	2007-11-03 02:40:22 +00:00
Jeff Squyres	74fd678de8	Fix a help message to also show the default value. This commit was SVN r16369.	2007-10-06 14:25:38 +00:00
Ralph Castain	54b2cf747e	These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC. The commit has been tested for C/R and Cray operations, and on Odin (SLURM, rsh) and RoadRunner (TM). I tried to update all environments, but obviously could not test them. I know that Windows needs some work, and have highlighted what is known to be needed in the odls process component. This represents a lot of work by Brian, Tim P, Josh, and myself, with much advice from Jeff and others. For posterity, I have appended a copy of the email describing the work that was done: As we have repeatedly noted, the modex operation in MPI_Init is the single greatest consumer of time during startup. To-date, we have executed that operation as an ORTE stage gate that held the process until a startup message containing all required modex (and OOB contact info - see #3 below) info could be sent to it. Each process would send its data to the HNP's registry, which assembled and sent the message when all processes had reported in. In addition, ORTE had taken responsibility for monitoring process status as it progressed through a series of "stage gates". The process reported its status at each gate, and ORTE would then send a "release" message once all procs had reported in. The incoming changes revamp these procedures in three ways: 1. eliminating the ORTE stage gate system and cleanly delineating responsibility between the OMPI and ORTE layers for MPI init/finalize. The modex stage gate (STG1) has been replaced by a collective operation in the modex itself that performs an allgather on the required modex info. The allgather is implemented using the orte_grpcomm framework since the BTL's are not active at that point. 
At the moment, the grpcomm framework only has a "basic" component analogous to OMPI's "basic" coll framework - I would recommend that the MPI team create additional, more advanced components to improve performance of this step. The other stage gates have been replaced by orte_grpcomm barrier functions. We tried to use MPI barriers instead (since the BTL's are active at that point), but - as we discussed on the telecon - these are not currently true barriers so the job would hang when we fell through while messages were still in process. Note that the grpcomm barrier doesn't actually resolve that problem, but Brian has pointed out that we are unlikely to ever see it violated. Again, you might want to spend a little time on an advanced barrier algorithm as the one in "basic" is very simplistic. Summarizing this change: ORTE no longer tracks process state nor has direct responsibility for synchronizing jobs. This is now done via collective operations within the MPI layer, albeit using ORTE collective communication services. I -strongly- urge the MPI team to implement advanced collective algorithms to improve the performance of this critical procedure. 2. reducing the volume of data exchanged during modex. Data in the modex consisted of the process name, the name of the node where that process is located (expressed as a string), plus a string representation of all contact info. The nodename was required in order for the modex to determine if the process was local or not - in addition, some people like to have it to print pretty error messages when a connection failed. The size of this data has been reduced in three ways: (a) reducing the size of the process name itself. The process name consisted of two 32-bit fields for the jobid and vpid. This is far larger than any current system, or system likely to exist in the near future, can support. 
Accordingly, the default size of these fields has been reduced to 16-bits, which means you can have 32k procs in each of 32k jobs. Since the daemons must have a vpid, and we require one daemon/node, this also restricts the default configuration to 32k nodes. To support any future "mega-clusters", a configuration option --enable-jumbo-apps has been added. This option increases the jobid and vpid field sizes to 32-bits. Someday, if necessary, someone can add yet another option to increase them to 64-bits, I suppose. (b) replacing the string nodename with an integer nodeid. Since we have one daemon/node, the nodeid corresponds to the local daemon's vpid. This replaces an often lengthy string with only 2 (or at most 4) bytes, a substantial reduction. (c) when the mca param requesting that nodenames be sent to support pretty error messages, a second mca param is now used to request FQDN - otherwise, the domain name is stripped (by default) from the message to save space. If someone wants to combine those into a single param somehow (perhaps with an argument?), they are welcome to do so - I didn't want to alter what people are already using. While these may seem like small savings, they actually amount to a significant impact when aggregated across the entire modex operation. Since every proc must receive the modex data regardless of the collective used to send it, just reducing the size of the process name removes nearly 400MBytes of communication from a 32k proc job (admittedly, much of this comm may occur in parallel). So it does add up pretty quickly. 3. routing RML messages to reduce connections. The default messaging system remains point-to-point - i.e., each proc opens a socket to every proc it communicates with and sends its messages directly. A new option uses the orteds as routers - i.e., each proc only opens a single socket to its local orted. 
All messages are sent from the proc to the orted, which forwards the message to the orted on the node where the intended recipient proc is located - that orted then forwards the message to its local proc (the recipient). This greatly reduces the connection storm we have encountered during startup. It also has the benefit of removing the sharing of every proc's OOB contact with every other proc. The orted routing tables are populated during launch since every orted gets a map of where every proc is being placed. Each proc, therefore, only needs to know the contact info for its local daemon, which is passed in via the environment when the proc is fork/exec'd by the daemon. This alone removes ~50 bytes/process of communication that was in the current STG1 startup message - so for our 32k proc job, this saves us roughly 32k*50 = 1.6MBytes sent to 32k procs = 51GBytes of messaging. Note that you can use the new routing method by specifying -mca routed tree - if you so desire. This mode will become the default at some point in the future. There are a few minor additional changes in the commit that I'll just note in passing: * propagation of command line mca params to the orteds - fixes ticket #1073. See note there for details. * requiring of "finalize" prior to "exit" for MPI procs - fixes ticket #1144. See note there for details. * cleanup of some stale header files This commit was SVN r16364.	2007-10-05 19:48:23 +00:00
Mohamad Chaarawi	59a7bf8a9f	Merging in the Sparse Groups.. This commit includes config changes.. This commit was SVN r15764.	2007-08-04 00:41:26 +00:00
Josh Hursey	dadca7da88	Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD). This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158 More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912 The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158	2007-03-16 23:11:45 +00:00
Rainer Keller	31e12cbe71	- Get --coverage to work with new gcov and cleanup the generated files. - Use regexp to check for optimizations in flags. This commit was SVN r13211.	2007-01-19 14:28:52 +00:00
Brian Barrett	e7a7a64e4c	Implement MPI::SEEK_{SET, END, POS} for the C++ bindings, working around some issues with the C #defines SEEK_{SET, END, POS}. The workaround involves some hackery that should work in almost every common use case for the C stdio constants (and all the legal issues of the MPI constants). The one issue is that the C stdio constants are now const ints instead of #defines, which means that #ifdef checks will fail for the constants. Behavior can be disabled at either configure time or build time. Refs trac:387 This commit was SVN r12121. The following Trac tickets were found above: Ticket 387 --> https://svn.open-mpi.org/trac/ompi/ticket/387	2006-10-15 23:50:24 +00:00
Brian Barrett	bc9c6d65c6	The last of the Alpha fixes. The Alpha sh shell's builtin test doesn't like == that much... Refs trac:380 This commit was SVN r11860. The following Trac tickets were found above: Ticket 380 --> https://svn.open-mpi.org/trac/ompi/ticket/380	2006-09-28 03:45:27 +00:00
Jeff Squyres	8226dab86c	Fixes trac:377 Add --enable-orterun-prefix-by-default (and a synonym: --enable-mpirun-prefix-by-default) to make orterun always behave as if "--prefix $prefix" was given on the command line (where $prefix is the value given to the --prefix option to configure). This prevents many rsh/ssh users from needing to modify their shell startup files to set the LD_LIBRARY_PATH for Open MPI (they will still need to set PATH or otherwise find the OMPI executables to mpicc/mpirun/etc. their MPI applications). Also added --noprefix option to orterun to disable this behavior. Finally, note that even if --enable-orterun-prefix-by-default is specified, if the user specifies --prefix or /path/to/mpirun, these options will override the default value of the prefix ($prefix). This commit was SVN r11669. The following Trac tickets were found above: Ticket 377 --> https://svn.open-mpi.org/trac/ompi/ticket/377	2006-09-15 02:52:08 +00:00
Jeff Squyres	c9d244a298	Rename some OMPI_* macros to be OPAL_* macros. This commit was SVN r11598.	2006-09-08 23:42:32 +00:00
Jeff Squyres	c068bc155a	First steps towards IPv6 support. Mainly to support the guys working on it, even though there's no other IPv6 code in the tree yet. This commit was SVN r11561.	2006-09-08 00:10:40 +00:00
Jeff Squyres	5f356edb64	Bring over changes from the /tmp/fortran-stuff series: - Make the F90 bindings compile and link properly with gfortran 4.0, 4.1, Intel 9.0, PGI 6.1, Sun (don't know version offhand -- the most current as of this writing, I think), and NAG 5.2, although some have limitations (e.g., NAG can't seem to handle the medium and large sizes) - Building the F90 "small" module size is now the default, even for developers - Split up mpif.h into multiple files because parts of it were toxic to the F90 bindings - Properly specify unsized/unshaped arrays to make the bindings work on all known compilers - Make ompi_info show Fortran 90 bindings size - XML somewhat lags the generated scripts as of this commit, but functionality was my main goal -- the XML can be updated later (if at all). This commit was SVN r10118.	2006-05-30 14:37:41 +00:00
Jeff Squyres	0d092abb81	Clarify help string, per change to 1.0 branch. This commit was SVN r9906.	2006-05-12 03:06:54 +00:00
Brian Barrett	52369307f8	Add a feature to the build system that Terry from Sun and I talked about in San Jose. Allow the configure option --disable-binaries to build OMPI, but not build or install the support binaries (so basically, just build the libraries). This commit was SVN r9777.	2006-04-29 02:16:41 +00:00
Jeff Squyres	79a3678924	Fix typo This commit was SVN r9723.	2006-04-26 11:52:25 +00:00
Brian Barrett	5bd1be7ac4	* clean up some configure --help output as reported on OMPI mailing list. This should probably go to 1.1... This commit was SVN r9722.	2006-04-26 02:07:19 +00:00
Brian Barrett	2ad29df0a1	* Add option to disable the adding of -g to CFLAGS/CXXFLAGS when --enable-debug is given. Generally not useful, unless you're on a platform without a debugger... This commit was SVN r9684.	2006-04-22 19:23:26 +00:00
Jeff Squyres	f8e634d6ca	Bring over /tmp/f90-stuff branch to the trunk. svn merge -r 9453:9609 https://svn.open-mpi.org/svn/ompi/tmp/f90-stuff . Several improvements over the current F90 MPI bindings: - The capability to make 4 sizes of the F90 bindings: - trivial: only the F90-specific MPI functions (sizeof and a few others) - small: (this is the default) all MPI functions that do not take choice buffers - medium: small + all MPI functions that take one choice buffer (e.g., MPI_SEND) - large: all MPI functions, but those that take 2 choice buffers (e.g., MPI_GATHER) only allow both buffers to be of the same type - Remove all non-standard MPI types (LOGICALx, CHARACTERx) - Remove use of selected__kind() and only use MPI-defined types (INTEGERx, etc.) - Decrease complexity of the F90 configure and build system This commit was SVN r9610.	2006-04-11 03:33:38 +00:00
George Bosilca	aef1358808	First import of peruse. As it looks like SVN doesn't like to simultaneously import 2 directories having the same name, I have to split the import in 2. I start with the test and the configure. This commit was SVN r9372.	2006-03-23 04:54:10 +00:00
Brian Barrett	b1d2424013	Merge in present work on the MPI-2 onesided chapter. The current code is not complete, but stable enough that it will have no impact on general development, so into the trunk it goes. Changes in this commit include: - Remove the --with option for disabling MPI-2 onesided support. It complicated code, and has no real reason for existing - add a framework osc (OneSided Communication) for encapsulating all the MPI-2 onesided functionality - Modify the MPI interface functions for the MPI-2 onesided chapter to properly call the underlying framework and do the required error checking - Created an osc component pt2pt, which is layered over the BML/BTL for communication (although it also uses the PML for long message transfers). Currently, all support functions, all communication functions (Put, Get, Accumulate), and the Fence synchronization function are implemented. The PWSC active synchronization functions and Lock/Unlock passive synchronization functions are still not implemented This commit was SVN r8836.	2006-01-28 15:38:37 +00:00
Brian Barrett	22c50c2cca	* allow loading of configure options from an external file, for use in cross-compile environment This commit was SVN r8706.	2006-01-16 23:38:42 +00:00
Brian Barrett	129451277e	* add missing constant when using the MPI-2 onesided shell functions. This should probably go to the branch This commit was SVN r8222.	2005-11-21 23:43:48 +00:00
Brian Barrett	660d2f61b6	Don't add external declarations for the PMPI_W{TICK,TIME} functions if profiling isn't enabled. It appears that some compilers (g95) will try to resolve the symbols if they are prototyped. This commit was SVN r8110.	2005-11-11 00:12:40 +00:00
Jeff Squyres	42ec26e640	Update the copyright notices for IU and UTK. This commit was SVN r7999.	2005-11-05 19:57:48 +00:00
Brian Barrett	65bcc283c0	* Change the --enable-{cxx,f77,f90} options to --enable-mpi-{cxx,f77,f90} so that people aren't confused about what they are actually disabling. This should go to the 1.0 branch This commit was SVN r7851.	2005-10-25 02:53:54 +00:00
Jeff Squyres	e72e1f0050	Fix some incorrect fortran parameter values This commit was SVN r7584.	2005-10-02 14:59:27 +00:00
Jeff Squyres	67cde6c212	- Minor cleanups - Add --enable-trace which turns on some internal tracing and dumps a file per process in the session directory tree. Meant for internal developer tracing, not for tracing MPI applications in the traditional sense. This commit was SVN r7229.	2005-09-08 09:44:50 +00:00
Brian Barrett	77dafc7826	* Make Fortran 90 turned on by default (unless it's a developer build, in which case, skip it, since it takes so bloody long to compile) * Disable the XGrid PLS when compiling in 64 bit mode, as Tiger only ships with XGrid libraries for 32bit apps * ompi_config.h and orte_config.h (and supporting headers) are now only installed if --with-devel-headers is enabled. Since they are no longer needed for MPI applications, it doesn't make sense to install them if we are only installing mpi.h and mpif.h. Also, since we are no longer including ompi_config.h in mpi.h, there is no longer a need to do the dumb sed trick on install This commit was SVN r7042.	2005-08-26 00:11:30 +00:00
Jeff Squyres	4eba48b430	Bunches of fixes for the f90 bindings - fix the --with-f90-max-array-dim configure switch - fix configure test to find the supported f90 linker switch to find fortran modules - Unbelievably, some versions of sh (cough cough Solaris 9 cough cough) actually seem to internally perform a "cd" into a subdirectory when you run "./foo/bar", such that if you try to source a script in the top-level directory in the bar script (i.e., ". some_script" in the bar script), it will try to run it in the "foo" subdirectory, rather than the top-level directory! #$@#$%#$% So we have to pass in the pwd to the scripts so that they know where some_script is. - Reworked much of ompi/mpi/f90/Makefile.am for lots of reasons. See the internal comments (mostly having to do with dependency stuff -- Libtool does not apparently support F90, so we can only build the F90 library statically. This commit was SVN r6993.	2005-08-24 02:11:02 +00:00
Brian Barrett	1fe9356d37	* Clean up the --with-platform option to automagically set a whole bunch of flags to configure. Now don't need to specify the contrib/platform part of the path if you don't want to * Add "optimized" platform setting that will undo all the performance- affecting things that a developer build sets up. This commit was SVN r6946.	2005-08-20 20:43:59 +00:00
Brian Barrett	b0b6ddd078	* add --enable-heterogeneous (default: enabled) to enable heterogeneous support in OMPI. Currently only enables/disables the architecture sharing modex in ob1 pml. * Add sds framework to ompi_info * Figure out table ids to use for Portals BTL at configure time, since we should use 30 & 31 on Red Storm, but the reference implementation only supports 0-8. * Some bug fixes in Portals UTCP sds This commit was SVN r6650.	2005-07-28 16:16:13 +00:00
Brian Barrett	14b89e0e50	Bunch more updates from operation Red Storm: * Add ability to completely disable libltdl (the dlopen code to load dynamic shared objects) to configure: --disable-dlopen * Added MCA param (component_disable_dlopen) to disable DSO loading at runtime * Made the event library behave in some not-completely-erroneous way on platforms where it has absolutely no eventops support (ie, no select, poll, or epoll) * Disabled orte_wait, opal_few, and opal_daemon_init code on platforms without fork, waitpid support. All non-init functions will return OPMI_ERR_NOT_SUPPORTED * Disable orteprobe tool when fork or pipe aren't supported This commit was SVN r6490.	2005-07-14 18:05:30 +00:00
Brian Barrett	4d580fa706	* disable TCP ptl and oob components if there is no TCP support (look at sockaddr_in - seems to be a good indicator) * disable util/if code if no inet devices (again, no sockaddr_in) * add enable/disable flag to disable stacktrace pretty-print code (defaults to enabled). Seems there's something funky going on with the preprocessor on Red Storm that was causing problems - this was the easiest fix * clean up a bunch of the configure.m4 files to remove bogus comments, properly comment them, fix the dumb logic for happy/unhappy * Create a macro for testing both header and library for a package, since we seem to do this kind of test quite often. Handles the -I and -L search paths properly (including stripping out /usr and /usr/local if not needed) * Converted mvapi components to configure.m4, using the nice new ompi_check_package macro (above) This commit was SVN r6454.	2005-07-13 04:16:03 +00:00
Jeff Squyres	08082be721	Silly mistake (how on earth did it live this long?): adding -DNDEBUG to CXXFLAGS should use $CXXFLAGS, not $CFLAGS. This commit was SVN r5787.	2005-05-19 23:52:13 +00:00
Prabhanjan Kambadur	ddead64bcf	1. Moving WRAPPER__FLAGS initialization to configure.ac instead of having it in config/ompi_setup_cxx.m4 2. Adding --enable-coverage option. This will add the flags -ftest-coverage and -fprofile-arcs to the flags. Also, one needs to compile with debug and static only to enable code coverage 3. Adding the coverage flag options to WRAPPER__FLAGS so that mpicc and co., will add these to the executables when they are compiled This commit was SVN r5416.	2005-04-18 16:38:27 +00:00
Brian Barrett	e3587652b7	* Add support for using ptys for stdout when doing I/O forwarding. This is enough to make user applications be line buffered instead of block buffered, which makes output come much faster :) This commit was SVN r5400.	2005-04-15 21:18:20 +00:00
Jeff Squyres	6f15d1071c	Add --with-f90-max-array-dim configure option to specify how many dimensions the f90 MPI bindings should support (they are strongly typed, and the number of dimensions of choice arguments must be specified -- it cannot be arbitrary). The default is four. Note that even though increasing this value has essentially a linear effect on the code, the multiplier constant is fairly large (only a small number of functions have 2 choice buffers, so the exponential factor is relatively small). Increasing this value increases the amount of time f90 compilers will spend compiling src/mpi/f90/mpi.f90 (some compilers will crash if it is too big). This commit was SVN r5268.	2005-04-12 10:17:52 +00:00
Jeff Squyres	b3a75f27f6	Fix some typos in processing configure options This commit was SVN r5081.	2005-03-29 02:47:43 +00:00
Jeff Squyres	3f5541349a	Add UC copyright This commit was SVN r5009.	2005-03-24 12:43:37 +00:00
Craig E Rasmussen	74af2a5fe4	Set default f90 MPI bindings to disabled. This commit was SVN r4460.	2005-02-17 21:29:50 +00:00
Brian Barrett	be7d989b0e	* remove the code to disable the event signal handling code, since we really need it for anything in the startup code to work This commit was SVN r3803.	2004-12-14 03:10:48 +00:00
Jeff Squyres	3966e30902	Remove every part of MPI-2 one-sided functionality from the tree with #if OMPI_WANT_MPI2_ONE_SIDED and some automake conditionals. Also had to add some AC_SUBSTs to eliminate part of mpif.h (otherwise the "external" statements would have made undefined symbols). All the MPI-2 one-sided functionality (including the skeleton top-level MPI API functions that only invoke an MPI exception) can be re-enabled with --enable-mpi2-one-sided. This commit was SVN r3802.	2004-12-14 02:35:03 +00:00
Jeff Squyres	616269a9be	Add HLRS copyright This commit was SVN r3665.	2004-11-28 20:09:25 +00:00
Jeff Squyres	e9ed717748	First cut at copyrights: IU, UTK, and some OSU. LANL and HLRS still pending. This commit was SVN r3655.	2004-11-22 01:38:40 +00:00
Brian Barrett	2660d42f90	* fix typo that broke things in the no signals in the event library case. This commit was SVN r2749.	2004-09-17 17:27:08 +00:00
Brian Barrett	857fd5740f	* Test enabling signal handling in the event library. Because this might have some nasty side effect we don't know about, make it a configure option for now. Also add a harmless signal handler to the pcm open (since pcm_open will have a signal handler eventually for SIGCHLD, I think). Use --enable-event-signals / --disable-event-signals to control behavior. This commit was SVN r2748.	2004-09-17 17:04:05 +00:00
Jeff Squyres	31bacaee5a	Fix a minor problem with one of the configure command line options, and add more description to its help message This commit was SVN r2728.	2004-09-16 19:42:52 +00:00
Jeff Squyres	161bab95f0	Simplify / clarify some of the logic for configure options; make --enable-picky not be the default for users This commit was SVN r2356.	2004-08-28 10:38:40 +00:00
Brian Barrett	000644007f	* C++ MPI bindings. MPI:: only This commit was SVN r1712.	2004-07-14 14:11:03 +00:00
Jeff Squyres	3fc37a863e	Change default for users to not have developer-picky compiler options. This commit was SVN r1276.	2004-06-15 19:36:36 +00:00
David Daniel	2f96ba71fe	renaming files This commit was SVN r1192.	2004-06-07 15:40:19 +00:00


104 Commits