openmpi

Author	SHA1	Message	Date
Ralph Castain	96bfeb591c	Ensure flag is passed to remote daemons This commit was SVN r26383.	2012-05-03 22:31:25 +00:00
Ralph Castain	45fee2b491	Resolve the case where only the HNP is in the system (i.e., single-node operation) This commit was SVN r26382.	2012-05-03 18:00:01 +00:00
Ralph Castain	b2f77bf08f	Extend the iof by adding two new components to support map-reduce IO chaining. Add a mapreduce tool for running such applications. Fix the state machine to support multiple jobs being simultaneously launched as this is not only required for mapreduce, but can happen under comm-spawn applications as well. This commit was SVN r26380.	2012-05-02 21:00:22 +00:00
Ralph Castain	3461809341	Fix reporting of launch progress so the numbers are correct and appear when they should This commit was SVN r26342.	2012-04-26 00:10:09 +00:00
Ralph Castain	71805bf7e4	Clear out the startup_timeout event if the job did in fact start. Have ORTE_TERMINATE use the job state macro so debug will show where it was called This commit was SVN r26334.	2012-04-25 01:05:17 +00:00
Ralph Castain	4d16790836	Fix collectives for jobs running across partial allocations This commit was SVN r26267.	2012-04-13 00:38:47 +00:00
Ralph Castain	bd8b4f7f1e	Sorry for mid-day commit, but I had promised on the call to do this upon my return. Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code. Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch. This commit was SVN r26242.	2012-04-06 14:23:13 +00:00
Ralph Castain	b2f1bade37	Fix the -H localhost issue This commit was SVN r26071.	2012-02-29 16:56:00 +00:00
Ralph Castain	b3aabf1565	Cleanup the --without-hwloc build. Thanks to Paul Hargrove for reporting it broken. This commit was SVN r25931.	2012-02-15 11:08:57 +00:00
Ralph Castain	bba6508b4b	Handle the default hostfile case a little better... This commit was SVN r25928.	2012-02-15 03:33:49 +00:00
Ralph Castain	3f31feee6f	Handle the case where a user's rankfile specifies only cpus, and not socket:cpu pairs. This commit was SVN r25803.	2012-01-27 12:21:45 +00:00
Ralph Castain	9d556e2f17	Allow daemons to use PMI to get their name where PMI support is available while using the standard grpcomm and other capabilities. Remove the GNI code from the alps ess component as that component should only be for alps/cnos installations. This commit was SVN r25737.	2012-01-18 20:56:53 +00:00
Ralph Castain	bf103de66c	My apologies for doing this outside of the usual time restrictions, but we need to get this in so we can make progress. Move the ORTE-level debugger code back into orterun and out of the ORTE library to resolve symbol conflicts. This commit was SVN r25713.	2012-01-11 15:53:09 +00:00
Ralph Castain	840841bb8f	Missed a couple This commit was SVN r25686.	2011-12-29 23:30:19 +00:00
Ralph Castain	af7fb68cfb	If we forward envars in rsh, then we have to be very careful about both duplicate entries and disallowed characters on the cmd line. To aid with detecting duplicates, make all cmd line options be given in their mca variant. Check anything we might add for semi-colons and protect those values with quotes. This commit was SVN r25685.	2011-12-29 23:25:25 +00:00
Ralph Castain	2dd2694f25	Fix comm_spawn in oversubscribed conditions. If oversubscription is allowed, let nodes flow into the mapper even if they are oversubscribed, constrained by the slots_max absolute ceiling. Cleanup error messages when comm_spawn fails so it correctly and succinctly reports the error. This commit was SVN r25659.	2011-12-15 18:04:48 +00:00
Ralph Castain	912abe8a6c	Catch one more use-case This commit was SVN r25649.	2011-12-14 21:03:19 +00:00
Ralph Castain	f531b09a8d	Correctly handle -host and -hostfile options. Ensure the initial vm launch constrains itself to the union of specified hosts if those options are given. Get oversubscribe set correctly for that case. This commit was SVN r25648.	2011-12-14 20:01:15 +00:00
Ralph Castain	07655e2945	Handle the case where the allocator "fibs" to us about the node names. In some cases (ahem...you know who you are!), the allocator will tell us a node number (e.g., "16"). However, the daemon will return a node name (e.g., "nid0016") - leaving us not recognizing its location. So provide a new parameter (can't have too many!) that handles this situation by stripping the prefix from the returned node name. Also do a little cleanup to ensure we cleanly exit from errors, without generating too many annoying messages. This commit was SVN r25562.	2011-12-02 14:10:08 +00:00
Ralph Castain	641e17f26c	A better way of handling fqdn allocations. Prior method was wrong as it equated "node1" with "node10", which definitely caused problems. Detect the addition of fqdn nodes in the allocation. If not found, then strip all incoming hostnames from daemons of any domain info when matching those names against the names in the node pool. Leave some protection and "live" diagnostic output in place so we can continue to detect problems across all environments. This commit was SVN r25557.	2011-12-01 14:24:43 +00:00
Ralph Castain	512aea79bc	Print the right nodename value, fix the strange case This commit was SVN r25556.	2011-12-01 02:31:56 +00:00
Ralph Castain	44394c6b34	Add a little more protection This commit was SVN r25555.	2011-12-01 00:30:56 +00:00
Ralph Castain	c4ea7a252a	Add a little protection against badly formed node names so we don't segfault if they are encountered This commit was SVN r25554.	2011-11-30 23:33:59 +00:00
Ralph Castain	c56acf60ca	Although we never really thought about it, we made an unconscious assumption in the mapper system - we assumed that the daemons would be placed on nodes in the order that the nodes appear in the allocation. In other words, we assumed that the launch environment would map processes in node order. Turns out, this isn't necessarily true. The Cray, for example, launches processes in a toroidal pattern, thus causing the daemons to wind up somewhere other than what we thought. Other environments (e.g., slurm) are also capable of such behavior, depending upon the default mapping algorithm they are told to use. Resolve this problem by making the daemon-to-node assignment in the affected environments when the daemon calls back and tells us what node it is on. Order the nodes in the mapping list so they are in daemon-vpid order as opposed to the order in which they show in the allocation. For environments that don't exhibit this mapping behavior (e.g., rsh), this won't have any impact. Also, clean up the vm launch procedure a little bit so it more closely aligns with the state machine implementation that is coming, and remove some lingering "slave" code. This commit was SVN r25551.	2011-11-30 19:58:24 +00:00
Ralph Castain	b475421c16	As promised, rationalize the rsh support. Remove rshbase and the base rsh support, centralizing all rsh support into the rsh component. Remove the "slave" launch support as that experiment is complete. Fix tree spawn and make that the default method for rsh launch, turning it "off" for qrsh as that system does not support tree spawn. This commit was SVN r25507.	2011-11-26 02:33:05 +00:00
Ralph Castain	9b59d8de6f	This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. Brian dealt with this in the past by creating platform files and using "no-build" to block the components. This was clunky, but acceptable when only one organization was using that option. However, that number has now expanded to at least two more locations. Accordingly, make --without-rte-support actually work by adding appropriate configury to prevent components from building when they shouldn't. While doing so, remove two frameworks (db and rmcast) that are no longer used as ORCM comes to a close (besides, they belonged in ORCM now anyway). Do some minor cleanups along the way. This commit was SVN r25497.	2011-11-22 21:24:35 +00:00
Ralph Castain	6310361532	At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here: https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation. In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions: 1. I have at long last bowed my head in submission to the system admins of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior. 2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation. 3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so. As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes. This commit was SVN r25476.	2011-11-15 03:40:11 +00:00
Ralph Castain	f00753881e	Handle the case where mpirun -is- of the same topology as the compute nodes. This commit was SVN r25412.	2011-11-01 22:26:03 +00:00
Ralph Castain	d28dd55d33	Minimize the amount of topology info returned by the daemons. Most clusters, especially at scale, use the same node topology on every node, so there is no reason to return the topology from every daemon. Borrow a page from the --hetero-apps page and let users indicate that the node topology differs by adding a --hetero-nodes option to mpirun. If the option is set, then every daemon returns topology info. If not set, then only daemon vpid=1 returns it. We always want one daemon to return the topology as the head node is often different from the compute nodes. Having one daemon return the compute node topology allows us to detect any such difference. All compute nodes are then set to the same topology. This commit was SVN r25408.	2011-11-01 18:43:10 +00:00
Ralph Castain	24a46f2acb	These were missed by prior commit - need to remove lingering references to OPAL_HWLOC_HAVE_XML This commit was SVN r25272.	2011-10-12 16:54:03 +00:00
Ralph Castain	92c7372e20	Per the RFC from Jeff, move hwloc from opal/mca/common to its own static framework ala libevent. Have ORTE daemons collect the topology info at startup and, if --enable-hwloc-xml is set, send that info back to the HNP for later use. The HNP only retains unique topology "templates" to reduce memory footprint. Have the daemon include the local topology info in the nidmap buffer sent to each app so the apps don't all hammer the local system to discover it for themselves. Remove the sysinfo framework as hwloc replaces that functionality. This commit was SVN r25124.	2011-09-11 19:02:24 +00:00
George Bosilca	a4245b8d63	Remove some warnings related to the resilience patch. This commit was SVN r25097.	2011-08-27 00:15:34 +00:00
Wesley Bland	4e7ff0bd5e	By popular demand the epoch code is now disabled by default. To enable the epochs and the resilient orte code, use the configure flag: --enable-resilient-orte This will define both: ORTE_ENABLE_EPOCH ORTE_RESIL_ORTE This commit was SVN r25093.	2011-08-26 22:16:14 +00:00
Wesley Bland	09274cd047	Make sure that the epoch is initialized everywhere so we don't get weird output during valgrind. This shouldn't have caused any problems with any actual execution. Just extra warnings in valgrind. This commit was SVN r25015.	2011-08-08 15:11:55 +00:00
Ralph Castain	1ee7c39982	Fix some major bit-rot on scalable launch. If static ports are provided, then daemons can connect back to the HNP via the routed connection tree instead of doing so directly. In order to do that at scale, the node list must be passed as a regular expression - otherwise, the orted command line gets too long. Over the course of time, usage of static ports got corrupted in several places, the "parent" info got incorrectly reset, etc. So correct all that and get the regex-based wireup going again. Also, don't pass node lists if static ports aren't enabled - they are of no value to the orted and just create the possibility of overly-long cmd lines. This commit was SVN r24860.	2011-07-07 18:54:30 +00:00
Wesley Bland	84be81df95	Standardize the initialization of the EPOCH's. Everyone will be starting at MIN anyway (until we implement restart of course) so there's no reason to set the epoch to INVALID and then immediately reset them to MIN. This way there's less room to make mistakes later. This commit was SVN r24829.	2011-06-28 14:20:33 +00:00
Wesley Bland	e1ba09ad51	Add a resilience to ORTE. Allows the runtime to continue after a process (or ORTED) failure. Note that more work will be necessary to allow the MPI layer to take advantage of this. Per RFC: http://www.open-mpi.org/community/lists/devel/2011/06/9299.php This commit was SVN r24815.	2011-06-23 20:38:02 +00:00
Ralph Castain	30fb002524	Take the first small step towards rationalizing rsh support. Create a new "rshbase" component that contains a simple rsh module - no tree spawn, uses all the base functions for launch support. Extend the base rsh support functions to include those functions in common across all rsh modules. Only a minor change made to the current rsh module to avoid a naming conflict. Otherwise, left it alone to avoid creating conflicts with other external work. The current rsh module remains the default for rsh/ssh support, and continues to contain the support for SGE and Loadleveler. This commit was SVN r24593.	2011-03-30 01:15:07 +00:00
Ralph Castain	c1396b278c	Resolve the rsh confusion by splitting the initial search for a launch agent from the actual setup of the launch agent values in the plm base globals. Have each aspiring rsh-clone call lookup to see if their desired launch agent is available - if not, then reject that plm component. If so, then setup the actual launch agent values only when the module init function is called. This resolves the current conflict between the rsh and rshd components. Hopefully, it may avoid future problems in this area -provided- any new uses of rsh-like launchers abide by the lookup-and-then-setup rule. This commit was SVN r24550.	2011-03-22 02:23:09 +00:00
Ralph Castain	9b38525d1e	Remove unused include files This commit was SVN r24394.	2011-02-16 00:32:47 +00:00
Ralph Castain	5120e6aec3	Redefine the rmaps framework to allow multiple mapper modules to be active at the same time. This allows users to map the primary job one way, and map any comm_spawn'd job in a different way. Modules are given the opportunity to map a job in priority order, with the round-robin mapper having the highest default priority. Priority of each module can be defined using mca param. When called, each mapper checks to see if it can map the job. If npernode is provided, for example, then the loadbalance mapper accepts the assignment and performs the operation - all mappers before it will "pass" as they can't map npernode requests. Also remove the stale and never completed topo mapper. This commit was SVN r24393.	2011-02-15 23:24:31 +00:00
Ralph Castain	a3607ff35d	Make it easier to send a kill-local-procs command for an arbitrary number of procs This commit was SVN r24386.	2011-02-15 13:26:11 +00:00
Ralph Castain	a9dca25ca5	Remove the distinction between local and global restarts - leave it up to the error strategy to decide which to do. Cleanup the heartbeat handling so it is associated with the proc, not a node. Cleanup handling of recovery options so that defaults do not override user values iff they are provided. This commit was SVN r24382.	2011-02-14 20:49:12 +00:00
Ralph Castain	9ea2b196ce	Convert the opal_event framework to use direct function calls instead of hiding functions behind function pointers. Eliminate the opal_object_t abstraction of libevent's event struct so it can be directly passed to the libevent functions. Note: the ompi_check_libfca.m4 file had to be modified to avoid it stomping on global CPPFLAGS and the like. The file was also relocated to the ompi/config directory as it pertains solely to an ompi-layer component. Forgive the mid-day configure change, but I know Shiqing is working the windows issues and don't want to cause him unnecessary redo work. This commit was SVN r23966.	2010-10-28 15:22:46 +00:00
Shiqing Fan	a3d9c91ff7	Exclude stdbool.h for Windows, and use the definition in opal. Immigrate the socket pair support from libevent. Fix other minor things and make it compile. This commit was SVN r23951.	2010-10-26 14:53:50 +00:00
Ralph Castain	86c7365e8e	Clean up a few initialization issues - don't think these are impacting the shared memory situation as it didn't fix the problem. Setup the event API to support multiple bases in preparation for splitting the OMPI and ORTE events. Holding here pending shared memory resolution. This commit was SVN r23943.	2010-10-26 02:41:42 +00:00
Ralph Castain	fceabb2498	Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac. This is a fairly intrusive change, but outside of the moving of opal/event to opal/mca/event, the only changes involved (a) changing all calls to opal_event functions to reflect the new framework instead, and (b) ensuring that all opal_event_t objects are properly constructed since they are now true opal_objects. Note: Shiqing has just returned from vacation and has not yet had a chance to complete the Windows integration. Thus, this commit almost certainly breaks Windows support on the trunk. However, I want this to have a chance to soak for as long as possible before I become less available a week from today (going to be at a class for 5 days, and thus will only be sparingly available) so we can find and fix any problems. Biggest change is moving the libevent code from opal/event to a new opal/mca/event framework. This was done to make it much easier to update libevent in the future. New versions can be inserted as a new component and tested in parallel with the current version until validated, then we can remove the earlier version if we so choose. This is a statically built framework ala installdirs, so only one component will build at a time. There is no selection logic - the sole compiled component simply loads its function pointers into the opal_event struct. I have gone thru the code base and converted all the libevent calls I could find. However, I cannot compile nor test every environment. It is therefore quite likely that errors remain in the system. Please keep an eye open for two things: 1. compile-time errors: these will be obvious as calls to the old functions (e.g., opal_evtimer_new) must be replaced by the new framework APIs (e.g., opal_event.evtimer_new) 2. run-time errors: these will likely show up as segfaults due to missing constructors on opal_event_t objects. It appears that it became a typical practice for people to "init" an opal_event_t by simply using memset to zero it out. This will no longer work - you must either OBJ_NEW or OBJ_CONSTRUCT an opal_event_t. I tried to catch these cases, but may have missed some. Believe me, you'll know when you hit it. There is also the issue of the new libevent "no recursion" behavior. As I described on a recent email, we will have to discuss this and figure out what, if anything, we need to do. This commit was SVN r23925.	2010-10-24 18:35:54 +00:00
Ralph Castain	2c1a658232	Fix debugger attach This commit was SVN r23921.	2010-10-22 20:07:24 +00:00
Ralph Castain	554aede041	Fix a situation where we were unlocking a thread that isn't locked for the main launch - it is only used for dynamic spawns. This commit was SVN r23682.	2010-08-28 14:03:17 +00:00
Rainer Keller	12ed573e5e	- Include <strings.h> for rindex(3). Thanks to Paul Hargrove. Please CMR:v1.5 This commit was SVN r23671.	2010-08-26 13:42:36 +00:00


227 commits