openmpi

Author	SHA1	Message	Date
Josh Hursey	f88aa6c273	This commit cleans up the AMCA parameter implementation a bit. * Remove the 'opal_mca_base_param_use_amca_sets' global variable * Harness the fact that you can (read should) call the cmd_line functions before initializing opal_init_util(). This pushes the MCA/GMCA/AMCA command line options into the environment before OPAL inits and starts to use these values. By putting the cmd_line parse before opal_init_util in orterun and orted we only parse the MCA parameter files once, and correctly (alleviating the need to 'recache' the files on init.) Small bits of cleanup. This commit was SVN r15219.	2007-06-27 01:03:31 +00:00
Rainer Keller	15c03e8acc	- Apply patch 31_manpages_lintian.dpatch Thanks to Dirk Eddelbuettel <edd@debian.org> This commit was SVN r15215.	2007-06-26 21:13:10 +00:00
Sven Stork	0edcf1d47e	- export required symbol This commit was SVN r15190.	2007-06-25 14:27:04 +00:00
Josh Hursey	84f102c343	Fix/Cleanup the Checkpoint Error propagation through the Snapc Full component. This commit was SVN r15175.	2007-06-22 16:14:25 +00:00
Jeff Squyres	bd56dc7e5d	Fixes trac:1060 Per suggestion, if we don't find a valid shell via getpwuid(), also check the $SHELL environment variable. Also perform a few minor cleanups along the way. This commit was SVN r15156. The following Trac tickets were found above: Ticket 1060 --> https://svn.open-mpi.org/trac/ompi/ticket/1060	2007-06-21 11:40:42 +00:00
Josh Hursey	78df098aee	If we can not checkpoint, then make sure we return an error This commit was SVN r15151.	2007-06-20 21:05:19 +00:00
Josh Hursey	dd021e7121	Remove some leftover debugging that must have been accidentally left in r15142. This commit was SVN r15145. The following SVN revision numbers were found above: r15142 --> open-mpi/ompi@a3998a1676	2007-06-20 14:06:13 +00:00
Josh Hursey	db59235af5	Fix an AMCA parameter regression introduced as a side effect of r14449 (and, due to lack of in-code documentation, of r14661). The {{{opal_mca_base_param_use_amca_sets}}} flag tells the orted that it should not look at the parameter files just yet since it may have an AMCA parameter file to look at first. So we need to set this to {{{false}}} before initializing the MCA params, then quickly turn around and re-init them when we have the full information. This commit fixes trac:1058 This commit was SVN r15144. The following SVN revision numbers were found above: r14449 --> open-mpi/ompi@0ba47105ed r14661 --> open-mpi/ompi@df86202202 The following Trac tickets were found above: Ticket 1058 --> https://svn.open-mpi.org/trac/ompi/ticket/1058	2007-06-20 14:00:40 +00:00
George Bosilca	a3998a1676	Allow the symbols required by TotalView to be exported even when the visibility feature is on. This commit was SVN r15142.	2007-06-19 22:35:23 +00:00
Josh Hursey	edb2cbd150	In r15007 the --bootproxy orted argument was removed to support daemon reuse. The SnapC Full local Coordinator used this argument to attach to the job the daemon would be launching. So once this option was removed C/R support broke. This commit has the local coordinator attach to the job just before it is launched by the ODLS module. This is a much cleaner solution, and will eventually allow the SnapC modules to attach to multiple jobs launched on a single machine. This commit fixes the C/R regression introduced in r15007. This commit was SVN r15121. The following SVN revision numbers were found above: r15007 --> open-mpi/ompi@85df3bd92f	2007-06-18 15:39:04 +00:00
Josh Hursey	5719182a4e	Fix a break introduced in r14706 when RANK_KEY changed types. This commit was SVN r15120. The following SVN revision numbers were found above: r14706 --> open-mpi/ompi@d9acc93efa	2007-06-18 14:57:53 +00:00
Shiqing Fan	2a77d46117	Fix a small bug. This commit was SVN r15119.	2007-06-18 12:50:29 +00:00
George Bosilca	55cf6fc866	Be a little bit more verbose: tell which file we have trouble with... This commit was SVN r15115.	2007-06-17 04:59:15 +00:00
George Bosilca	99e701062a	The Windows job scheduler PLS. Initial commit as I have to move to another Windows cluster. Right now it's not in a usable state. This commit was SVN r15113.	2007-06-17 04:54:07 +00:00
Ralph Castain	e653da1d11	Where or where did that patch go??? Ah - there it went! ;-) Fix singleton operations - allow multiple xcasts to be queued. This commit was SVN r15097.	2007-06-15 13:45:29 +00:00
George Bosilca	35e824377e	There seems to be a subtle race condition when we fail to spawn a child. Marking the child as failed solves the issue. This commit was SVN r15087.	2007-06-14 22:36:47 +00:00
George Bosilca	a4d99ddef6	More synchronizations for the Windows version. The problem came from the multiple threads accessing the OOB/registry asynchronously via the callbacks. The quickest solution (but definitely not the cleanest) is to serialize these callbacks in such a way that at any given time only one thread can execute a callback. This commit was SVN r15086.	2007-06-14 22:35:38 +00:00
George Bosilca	fb9ff5cc75	Don't remove the tcp events from the list, they will remove themselves in the destructor. This commit was SVN r15085.	2007-06-14 22:33:09 +00:00
Josh Hursey	6cdfefad87	Fix portals BTL and cnos RML. Both were failing due to interface changes that were never applied to them properly. This commit was SVN r15082.	2007-06-14 18:49:41 +00:00
Ralph Castain	fde15ac97d	Bring the TM launcher online This commit was SVN r15076.	2007-06-14 12:33:34 +00:00
George Bosilca	95a607b945	A more Windows friendly version. As the socket event will be generated through the win dll using multiple threads, we have to ensure that the oob callbacks happen only in a synchronous way or really bad things happen with the current design (blocking messages from a receive callback). This commit was SVN r15069.	2007-06-14 04:38:06 +00:00
George Bosilca	de324502bc	Update the Windows wait functions. The most important change is for the event registration, which in the case of a process dead detection should be marked as fire once and taking long time. This commit was SVN r15068.	2007-06-14 04:35:46 +00:00
George Bosilca	8dfa06a617	Only output when the user requests it. This commit was SVN r15067.	2007-06-14 04:33:18 +00:00
George Bosilca	13a693faa0	Update the Windows process ODLS. This commit was SVN r15066.	2007-06-14 04:32:19 +00:00
Pak Lui	de0f1eef89	No major changes here. Just updates to remove unused code and comments. This commit was SVN r15051.	2007-06-13 17:23:03 +00:00
Pak Lui	03a93a38c5	Added an option for daemonizing orted. The existing behavior to --no-daemonize for gridengine is not changed. This commit was SVN r15050.	2007-06-13 17:11:37 +00:00
Ralph Castain	5adef03179	Clean up a diagnostic so it only outputs when requested This commit was SVN r15048.	2007-06-13 15:53:10 +00:00
George Bosilca	18c2bb0ed6	Don't forget to set the name argument before spawning the daemon. This commit was SVN r15047.	2007-06-13 15:45:34 +00:00
Pak Lui	8e7daea11f	bring inline more changes with r15007. This commit was SVN r15044. The following SVN revision numbers were found above: r15007 --> open-mpi/ompi@85df3bd92f	2007-06-13 15:30:18 +00:00
Ralph Castain	425fed95ff	Bring the SGE component online This commit was SVN r15043.	2007-06-13 15:02:47 +00:00
Rainer Keller	7e0b400f3f	- Small Fix. This commit was SVN r15037.	2007-06-13 10:43:03 +00:00
George Bosilca	278ec7fd4f	I wonder how this one compiled before ... or how do I manage to miss it ... This commit was SVN r15032.	2007-06-12 23:24:39 +00:00
George Bosilca	9d342ccb61	Shorter warning message. This commit was SVN r15031.	2007-06-12 23:22:09 +00:00
George Bosilca	715f6012cf	The DSS pack function can use the const attribute for the src field as it is never modified by the pack functions directly. Enforce it all over the code base. This commit was SVN r15026.	2007-06-12 22:47:14 +00:00
George Bosilca	649ab84654	Don't do SIGPIPE handling on Windows. This commit was SVN r15025.	2007-06-12 22:44:39 +00:00
George Bosilca	9e89abbd57	HAVE_SYS_TYPES_H require an ifdef. This commit was SVN r15024.	2007-06-12 22:43:18 +00:00
George Bosilca	432185d617	Forgot to remove the MCA parameter corresponding to the 2 unused fields in the RSH PLS component. This commit was SVN r15023.	2007-06-12 22:41:38 +00:00
George Bosilca	49e7bf3069	Be a little bit more clear when we fail to identify the shell. This commit was SVN r15022.	2007-06-12 22:40:44 +00:00
George Bosilca	5b7796dfcd	Remove 2 unused fields. This commit was SVN r15021.	2007-06-12 22:39:57 +00:00
George Bosilca	16c38cabe1	Update the Windows ODLS component. This commit was SVN r15020.	2007-06-12 22:37:04 +00:00
George Bosilca	bf6f30a42c	Make the Windows PLS component match the current requirements for a PLS module. This commit was SVN r15019.	2007-06-12 22:34:56 +00:00
Ralph Castain	af64009368	Bring the CNOS component of the PLS back online This commit was SVN r15018.	2007-06-12 22:17:05 +00:00
Jeff Squyres	54064f6fa1	Fix a warning that Tim P. found this morning. The warning was indicative of overly-complex code anyway. So I removed the "first" bool and simply use a sentinel value in seq_min to indicate that nothing has changed. Note that this is "correct enough" for the moment -- more fixes will come in this area with tickets #1049 and/or #1051. This commit was SVN r15013.	2007-06-12 17:30:54 +00:00
Brian Barrett	84d1512fba	Add the potential for doing some basic error checking on mutexes during single threaded builds. In its default configuration, all this does is ensure that there's at least a good chance of threads building based on non-threaded development (since the variable names will be checked). There is also code to make sure that a "mutex" is never "double locked" when using the conditional macro mutex operations. This is off by default because there are a number of places in both ORTE and OMPI where this alarm spews mega bytes of errors on a simple test. So we have some work to do on our path towards thread support. Also removed the macro versions of the non-conditional thread locks, as the only places they were used, the author of the code intended to use the conditional thread locks. So now you have upper-case macros for conditional thread locks and lowercase functions for non-conditional locks. Simple, right? :). This commit was SVN r15011.	2007-06-12 16:25:26 +00:00
Ralph Castain	4e8081ed1e	Cleanup a now unnecessary variable This commit was SVN r15010.	2007-06-12 14:23:33 +00:00
Tim Prins	1467558157	Cleanup a couple warnings. Update svn:ignore This commit was SVN r15009.	2007-06-12 14:11:06 +00:00
Ralph Castain	85df3bd92f	Bring in the generalized xcast communication system along with the correspondingly revised orted launch. I will send a message out to developers explaining the basic changes. In brief: 1. generalize orte_rml.xcast to become a general broadcast-like messaging system. Messages can now be sent to any tag on the daemons or processes. Note that any message sent via xcast will be delivered to ALL processes in the specified job - you don't get to pick and choose. At a later date, we will introduce an augmented capability that will use the daemons as relays, but will allow you to send to a specified array of process names. 2. extended orte_rml.xcast so it supports more scalable message routing methodologies. At the moment, we support three: (a) direct, which sends the message directly to all recipients; (b) linear, which sends the message to the local daemon on each node, which then relays it to its own local procs; and (c) binomial, which sends the message via a binomial algo across all the daemons, each of which then relays to its own local procs. The crossover points between the algos are adjustable via MCA param, or you can simply demand that a specific algo be used. 3. orteds no longer exhibit two types of behavior: bootproxy or VM. Orteds now always behave like they are part of a virtual machine - they simply launch a job if mpirun tells them to do so. This is another step towards creating an "orteboot" functionality, but also provided a clean system for supporting message relaying. Note one major impact of this commit: multiple daemons on a node cannot be supported any longer! Only a single daemon/node is now allowed. This commit is known to break support for the following environments: POE, Xgrid, Xcpu, Windows. It has been tested on rsh, SLURM, and Bproc. Modifications for TM support have been made but could not be verified due to machine problems at LANL. Modifications for SGE have been made but could not be verified. The developers for the non-verified environments will be separately notified along with suggestions on how to fix the problems. This commit was SVN r15007.	2007-06-12 13:28:54 +00:00
Brian Barrett	27ad954265	Fix a couple of problems with the way we were using orte_process_name_t structures in the system. Instead of using memcmp, use the ns function. This won't cause a problem as long as all three elements of the name are ints, but if they have different sizes, alignment and padding rules can cause memcmp() to compare padding space, which rarely holds a sane value. This commit was SVN r14998.	2007-06-11 19:12:11 +00:00
Brian Barrett	1d11cc4b2d	Fix mis-declared variable type This commit was SVN r14994.	2007-06-11 16:48:35 +00:00
Shiqing Fan	d9fa58dc33	Add two more arguments to call. The definition of the function has been modified with 2 additional arguments. This commit was SVN r14990.	2007-06-11 14:27:36 +00:00
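
The r15007 message above mentions a "binomial" xcast mode that relays a message across all the daemons. As a rough illustration of how a binomial-tree broadcast fans out in general, here is a minimal Python sketch. It is not the orte_rml.xcast code: the function names, the zero-based daemon ranks, and the choice of rank 0 as the origin are all assumptions made for the example.

```python
def binomial_children(rank, size):
    """Ranks that `rank` forwards to in a binomial-tree broadcast rooted at rank 0.

    Hypothetical helper for illustration only; not part of Open MPI.
    """
    children = []
    step = 1
    # Skip the rounds that happen before this rank has received the message.
    while step <= rank:
        step *= 2
    # In every later round, forward to the rank `step` positions away.
    while rank + step < size:
        children.append(rank + step)
        step *= 2
    return children


def delivery_counts(size):
    """Count how many times each rank is sent the message across the whole tree."""
    counts = [0] * size
    for rank in range(size):
        for child in binomial_children(rank, size):
            counts[child] += 1
    return counts


if __name__ == "__main__":
    for rank in range(8):
        print(rank, "->", binomial_children(rank, 8))
```

With 8 ranks the origin forwards to only three peers and the rest of the delivery is done by relays, which is why this pattern tends to scale better than sending directly to every recipient.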

1284 Commits