openmpi

Author	SHA1	Message	Date
Rainer Keller	fd28b392bf	- An intrusive commit yet again (sorry): with the separation we get bitten by headers depending on having already included the corresponding [opal\|orte\|ompi]_config.h header. When separating, things like [OPAL\|ORTE\|OMPI]_DECLSPEC are missed. Script to add the corresponding header in front of all following (taking care of possible #ifdef HAVE_...) - Including some minor cleanups to - ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H - ompi/mca/btl/btl.h -- include _after_ #ifndef MCA_BTL_H - ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for orte/util/output.h - ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h - ompi/mca/btl/btl.h -- reorder to fit - ompi/mca/bml/bml.h -- reorder to fit - ompi/runtime/ompi_mpi_finalize.c -- reorder to fit - ompi/request/request.h -- additionally need ompi/constants.h - Tested on linux/x86-64 This commit was SVN r20720.	2009-03-04 15:35:54 +00:00
George Bosilca	5f6896ce5b	No memory leaks (this is an improvement for r20706). This commit was SVN r20707. The following SVN revision numbers were found above: r20706 --> open-mpi/ompi@248bbb8a2f	2009-03-03 22:14:05 +00:00
George Bosilca	248bbb8a2f	Give a small chance to those with an "IP guru" admin-sys to define what exactly is a private IPv4 address. By default we abide by RFC1918 and RFC3330, but we have the opportunity to change them. Based on a patch from Camille Coti. This commit was SVN r20706.	2009-03-03 22:06:09 +00:00
Jeff Squyres	cfcca7d80e	Fix some typos in comments submitted by Bert Wesarg. This commit was SVN r20695.	2009-03-03 12:50:46 +00:00
Eugene Loh	463f11f993	Improve shared-memory allocation: * compute mmap-file size more wisely and pass requested size to allocator * change MCA parameters: - get rid of mpool_sm_per_peer_size - get rid of mpool_sm_max_size - set default mpool_sm_min_size to 0 * no longer pad sm allocations to page boundaries * have sm_btl_first_time_init check return codes on free-list creations Have mca_btl_sm_prepare_src() check to see if it can allocate an EAGER fragment rather than a MAX fragment if the smaller size works. Remove ompi/class/ompi_[circular_buffer_]fifo.h and references thereto. Remove opal/util/pow2.[c\|h] and references thereto. This commit was SVN r20614.	2009-02-20 19:51:57 +00:00
Jeff Squyres	67a5374a61	Re CID 1180: Actually, it would be better to also print something in the case of an error, too... This commit was SVN r20443.	2009-02-05 15:26:44 +00:00
Jeff Squyres	598e530de9	Fix CID 1180: ensure to check the output from snprintf, since we pass it to write(). This commit was SVN r20442.	2009-02-05 15:24:48 +00:00
Jeff Squyres	bb3d258562	Round up a few places where PATH_MAX was used instead of OMPI_PATH_MAX. Thanks to Andrea Iob for the bug report. This commit was SVN r20360.	2009-01-27 22:57:50 +00:00
Ralph Castain	88a0af9726	Revise the way we output resolved hostnames to make life easier for the Eclipse folks. Store aliases for individual nodes (only when requested to show resolved hostnames) and then report them out as part of the display-map option. This commit was SVN r20284.	2009-01-15 18:11:50 +00:00
Jeff Squyres	d1c6f3f89a	* Fix a truckload of Cisco copyrights to be the same as the rest of the code base. * Fix a few misspellings in other copyrights. This commit was SVN r20241.	2009-01-11 02:30:00 +00:00
Jeff Squyres	ad7cfe63a3	Fix CID 1180: check for negative return from snprintf. This commit was SVN r20192.	2009-01-03 15:33:54 +00:00
George Bosilca	80fd24c948	Small cleanups: remove an unused dependency to signal.h and include output.h. This commit was SVN r20155.	2008-12-18 22:39:49 +00:00
Tim Mattox	4fa13a1a4d	Fix two typos inside of comments. This commit was SVN r20112.	2008-12-10 21:18:13 +00:00
Ralph Castain	728a24c8ec	After considerable patience and help with debugging/testing from Tim M and Jeff S, return a completed and pretty well tested patch of the IOF to the trunk. This commit includes the previously reverted r20074, r20068, and r20064, as well as changes to fix those commits. Basically, the remaining problem turned out to be: 1. closing stdout/stderr during orte_finalize of mpirun 2. inadvertently setting up a write event on fd = -1 3. devising a scheme to more accurately track when the stdin write event was active vs closed so it only got released once This passed prelim MTT testing by Jeff and Tim, but should soak for awhile before migrating to 1.3. This commit was SVN r20106. The following SVN revision numbers were found above: r20064 --> open-mpi/ompi@a07660aea8 r20068 --> open-mpi/ompi@ec930d14a9 r20074 --> open-mpi/ompi@2940309613	2008-12-10 20:40:47 +00:00
Jeff Squyres	d7f3dd2230	Add a comment explaining exactly what is returned by this function because we wasted a good amount of time today assuming that it was returning the actual netmask. Specifically, we were confused why it returned 0x18 instead of 0xffffff00 for a class C subnet (the head-smacking moment wasn't until [much] later when we converted 0x18 to decimal, which is 24. Then the Clue Light(tm) went on). This commit was SVN r20002.	2008-11-14 22:59:41 +00:00
George Bosilca	6344b8dffe	Force an explicit cast to keep the compilers quiet. This commit was SVN r19975.	2008-11-11 14:58:53 +00:00
Rolf vandeVaart	cad49da72d	Fix the tcp btl so it makes use of the btl_tcp_if_include and btl_tcp_if_exclude parameters on the connecting side also. Also move define of IF_NAMESIZE into if.h file. And lastly, add one verbose debug message which may be useful if we run into other issues like this. This commit fixes trac:1573. This commit was SVN r19932. The following Trac tickets were found above: Ticket 1573 --> https://svn.open-mpi.org/trac/ompi/ticket/1573	2008-11-05 18:45:42 +00:00
Shiqing Fan	a456c057d6	- Skip the loopback address on windows. This commit was SVN r19862.	2008-10-31 17:02:41 +00:00
Matthias Jurenz	47bf6c213c	Fixed memory leak in 'opal_vsnprintf()' This commit was SVN r19843.	2008-10-29 12:59:11 +00:00
George Bosilca	579d70edad	We should use #ifdef and not #if This commit was SVN r19504.	2008-09-05 12:44:19 +00:00
Shiqing Fan	32243829d8	Add the BEGIN/END_C_DECLS declarations. This commit was SVN r19445.	2008-08-28 13:06:14 +00:00
Ralph Castain	04fe1e1875	Avoid using the "access" function for security reasons as per its documentation. Also, check to ensure it is a file (and not something else) when we are looking for a file Fixes trac:1272 This commit was SVN r19373. The following Trac tickets were found above: Ticket 1272 --> https://svn.open-mpi.org/trac/ompi/ticket/1272	2008-08-20 15:30:25 +00:00
Rainer Keller	5eb4ebe8fa	- Fix buffer size warning Coverity CID 8 This commit was SVN r19198.	2008-08-06 14:53:43 +00:00
Rainer Keller	dbafe83999	- Update the warn_unused result from allocating functions - Set __opal_attribute_nonnull__ where an argument must not be null - Mark unused functions This commit was SVN r19107.	2008-07-31 15:46:09 +00:00
Ralph Castain	a62b2a0150	Per the July technical meeting: Standardize the handling of the orte launch agent option across PLMs. This has been a consistent complaint I have received - each PLM would register its own MCA param to get input on the launch agent for remote nodes (in fact, one or two didn't, but most did). This would then get handled in various and contradictory ways. Some PLMs would accept only a one-word input. Others accepted multi-word args such as "valgrind orted", but then some would error by putting any prefix specified on the cmd line in front of the incorrect argument. For example, while using the rsh launcher, if you specified "valgrind orted" as your launch agent and had "--prefix foo" on your cmd line, you would attempt to execute "ssh foo/valgrind orted" - which obviously wouldn't work. This was all -very- confusing to users, who had to know which PLM was being used so they could even set the right mca param in the first place! And since we don't warn about non-recognized or non-used mca params, half of the time they would wind up not doing what they thought they were telling us to do. To solve this problem, we did the following: 1. removed all mca params from the individual plms for the launch agent 2. added a new mca param "orte_launch_agent" for this purpose. To further simplify for users, this comes with a new cmd line option "--launch-agent" that can take a multi-word string argument. The value of the param defaults to "orted". 3. added a PLM base function that processes the orte_launch_agent value and adds the contents to a provided argv array. This can subsequently be harvested at-will to handle multi-word values 4. modified the PLMs to use this new function. All the PLMs except for the rsh PLM required very minor change - just called the function and moved on. The rsh PLM required much larger changes as - because of the rsh/ssh cmd line limitations - we had to correctly prepend any provided prefix to the correct argv entry. 5. added a new opal_argv_join_range function that allows the caller to "join" argv entries between two specified indices Please let me know of any problems. I tried to make this as clean as possible, but cannot compile all PLMs to ensure all is correct. This commit was SVN r19097.	2008-07-30 18:26:24 +00:00
Ralph Castain	83e7c19d33	Remove deprecated function - this was incorporated into the paffinity framework a long time ago. Fortunately, nobody was actually using it! This commit was SVN r18990.	2008-07-23 03:43:31 +00:00
Jeff Squyres	480c17c332	Fix a minor memory leak. This commit was SVN r18857.	2008-07-10 00:37:08 +00:00
Ralph Castain	9613b3176c	Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach. I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive. This commit was SVN r18619.	2008-06-09 14:53:58 +00:00
Ralph Castain	ca91ec525b	Add a suffix to the opal_output stream descriptor object - we can now output both a prefix and a suffix for a given stream. Default the suffix to NULL. Remove lingering references to a filtering system as this will no longer be implemented. This commit was SVN r18586.	2008-06-04 20:52:20 +00:00
Jeff Squyres	e7ecd56bd2	This commit represents a bunch of work on a Mercurial side branch. As such, the commit message back to the master SVN repository is fairly long. = ORTE Job-Level Output Messages = Add two new interfaces that should be used for all new code throughout the ORTE and OMPI layers (we already make the search-and-replace on the existing ORTE / OMPI layers): * orte_output(): (and corresponding friends ORTE_OUTPUT, orte_output_verbose, etc.) This function sends the output directly to the HNP for processing as part of a job-specific output channel. It supports all the same outputs as opal_output() (syslog, file, stdout, stderr), but for stdout/stderr, the output is sent to the HNP for processing and output. More on this below. * orte_show_help(): This function is a drop-in-replacement for opal_show_help(), with two differences in functionality: 1. the rendered text help message output is sent to the HNP for display (rather than outputting directly into the process' stderr stream) 1. the HNP detects duplicate help messages and does not display them (so that you don't see the same error message N times, once from each of your N MPI processes); instead, it counts "new" instances of the help message and displays a message every ~5 seconds when there are new ones ("I got X new copies of the help message...") opal_show_help and opal_output still exist, but they only output in the current process. The intent for the new orte_* functions is that they can apply job-level intelligence to the output. As such, we recommend that all new ORTE and OMPI code use the new orte_* functions, not the opal_* functions. === New code === For ORTE and OMPI programmers, here's what you need to do differently in new code: * Do not include opal/util/show_help.h or opal/util/output.h. Instead, include orte/util/output.h (this one header file has declarations for both the orte_output() series of functions and orte_show_help()).
* Effectively s/opal_output/orte_output/gi throughout your code. Note that orte_output_open() takes a slightly different argument list (as a way to pass data to the filtering stream -- see below), so if you explicitly call opal_output_open(), you'll need to slightly adapt to the new signature of orte_output_open(). * Literally s/opal_show_help/orte_show_help/. The function signature is identical. === Notes === * orte_output'ing to stream 0 will do similar to what opal_output'ing did, so leaving a hard-coded "0" as the first argument is safe. * For systems that do not use ORTE's RML or the HNP, the effect of orte_output_* and orte_show_help will be identical to their opal counterparts (the additional information passed to orte_output_open() will be lost!). Indeed, the orte_* functions simply become trivial wrappers to their opal_* counterparts. Note that we have not tested this; the code is simple but it is quite possible that we mucked something up. = Filter Framework = Messages sent via the new orte_* functions described above and messages output via the IOF on the HNP will now optionally be passed through a new "filter" framework before being output to stdout/stderr. The "filter" OPAL MCA framework is intended to allow preprocessing to messages before they are sent to their final destinations. The first component that was written in the filter framework was to create an XML stream, segregating all the messages into different XML tags, etc. This will allow 3rd party tools to read the stdout/stderr from the HNP and be able to know exactly what each text message is (e.g., a help message, another OMPI infrastructure message, stdout from the user process, stderr from the user process, etc.). Filtering is not active by default. Filter components must be specifically requested, such as: {{{ $ mpirun --mca filter xml ... }}} There can only be one filter component active.
= New MCA Parameters = The new functionality described above introduces two new MCA parameters: * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that help messages will be aggregated, as described above. If set to 0, all help messages will be displayed, even if they are duplicates (i.e., the original behavior). * '''orte_base_show_output_recursions''': An MCA parameter to help debug one of the known issues, described below. It is likely that this MCA parameter will disappear before v1.3 final. = Known Issues = * The XML filter component is not complete. The current output from this component is preliminary and not real XML. A bit more work needs to be done to configure.m4 search for an appropriate XML library/link it in/use it at run time. * There are possible recursion loops in the orte_output() and orte_show_help() functions -- e.g., if RML send calls orte_output() or orte_show_help(). We have some ideas how to fix these, but figured that it was ok to commit before feature freeze with known issues. The code currently contains sub-optimal workarounds so that this will not be a problem, but it would be good to actually solve the problem rather than have hackish workarounds before v1.3 final. This commit was SVN r18434.	2008-05-13 20:00:55 +00:00
Jeff Squyres	db2695ccab	Make the symbols be visible. This commit was SVN r18201.	2008-04-18 00:26:17 +00:00
Ralph Castain	fa082cafa9	Shift the architecture calculation from the ompi/datatype engine to the opal/util area. This allows us to compute the architecture earlier in the launch and communicate it outside of the modex. Note: this is an early preliminary step in the movement of portions of the datatype engine to the opal layer. This commit was SVN r18198.	2008-04-17 20:43:56 +00:00
Ralph Castain	e7487ad533	Implement the seq rmaps module that sequentially maps process ranks to a list of hosts in a hostfile. Restore the "do-not-launch" functionality so users can test a mapping without launching it. Add a "do-not-resolve" cmd line flag to mpirun so the opal/util/if.c code does not attempt to resolve network addresses, thus enabling a user to test a hostfile mapping without hanging on network resolve requests. Add a function to hostfile to generate an ordered list of host names from a hostfile This commit was SVN r18190.	2008-04-17 13:50:59 +00:00
George Bosilca	b359d84661	Use the correct prefix. This commit was SVN r18048.	2008-03-31 21:42:59 +00:00
George Bosilca	be2454e0c5	Default the temporary directory to /tmp if no special environment variables are set. This commit was SVN r18046.	2008-03-31 20:15:49 +00:00
George Bosilca	ee784b601e	For consistency reasons always use opal_home_directory and opal_tmp_directory. This commit was SVN r18043.	2008-03-31 18:13:41 +00:00
George Bosilca	028c7391d3	Coverity fix: Replace strcpy by strncpy. This commit was SVN r17961.	2008-03-25 22:39:24 +00:00
George Bosilca	3997639ec6	Hide what should be hidden, and expose the others. Plus some indentation. This commit was SVN r17856.	2008-03-18 03:00:08 +00:00
George Bosilca	210631962c	Add two convenience functions in order to make sure we get these environment variables in a consistent manner. These functions retrieve the user and the temporary directories (based on the system). This commit was SVN r17815.	2008-03-13 17:56:44 +00:00
Rainer Keller	32dcd9e551	- Adding #include <stdbool.h> with protection in r17488 and r17504 seemed to be the right thing(tm), but broke the Sun Studio C++ compiler under Linux (ticket 747). This patch should allow inclusion into C and C++ from other header files without problems. This commit was SVN r17792. The following SVN revision numbers were found above: r17488 --> open-mpi/ompi@d53131f261 r17504 --> open-mpi/ompi@b22e8e7567	2008-03-08 12:53:10 +00:00
Ralph Castain	d70e2e8c2b	Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer This commit was SVN r17632.	2008-02-28 01:57:57 +00:00
Rainer Keller	d53131f261	- Need stdbool.h if included in userland; additionally protect stdbool / stdarg.h This commit was SVN r17488.	2008-02-18 08:11:57 +00:00
Adrian Knoth	601fb4389d	Cosmetics for r17150. Closes trac:1201 This commit was SVN r17151. The following SVN revision numbers were found above: r17150 --> open-mpi/ompi@4b50f02126 The following Trac tickets were found above: Ticket 1201 --> https://svn.open-mpi.org/trac/ompi/ticket/1201	2008-01-17 12:29:12 +00:00
Adrian Knoth	4b50f02126	Only free res iff it's been allocated before. Re #1201 This patch fixes the segfault, so closing the ticket might be possible. It's a very conservative patch. Perhaps the freeaddrinfo spec says that it will never allocate res in case of errors, but for now, I neither have the spec nor the will to rely on it. This commit was SVN r17150.	2008-01-17 10:01:52 +00:00
Jeff Squyres	00131df353	Fix typo in incorrect variable name; only noticed now because someone actually compiled on a system without syslog support (Brian B.). :-) This commit was SVN r16863.	2007-12-06 11:36:44 +00:00
Torsten Hoefler	e985812e1f	fixing a comment to be more detailed about opal_output_open functionality ... This commit was SVN r16370.	2007-10-06 17:33:57 +00:00
Tim Prins	e25bb7f187	Some platforms (such as FreeBSD) need libutil.h included for openpty. Thanks to Karol Mroz for pointing this out. This commit was SVN r16163.	2007-09-19 21:59:22 +00:00
George Bosilca	d1364c53de	Don't allocate the temporary buffer on the stack. It takes way too much space. This commit was SVN r16127.	2007-09-14 02:09:38 +00:00
George Bosilca	2c8c75ef94	Coverity blame list: - Remove memory leaks - uninitialized return This commit was SVN r16126.	2007-09-14 02:08:37 +00:00
George Bosilca	921d79c2b8	Remove a few memory leaks. Close the files where we're done with them. This commit was SVN r16125.	2007-09-14 02:06:26 +00:00

273 Commits