openmpi

Автор	SHA1	Сообщение	Дата
George Bosilca	ba879c2c51	Remove the unused map. This commit was SVN r26960.	2012-08-07 12:06:13 +00:00
Ralph Castain	53b1a1c976	Cleanly error out when someone asks to map-to <object> if that object doesn't exist on a node. This commit was SVN r26950.	2012-08-04 21:52:36 +00:00
Ralph Castain	96f6f94c24	Ensure we don't get trapped in an infinite loop when ranking bynode if something isn't right This commit was SVN r26948.	2012-08-03 21:45:10 +00:00
Ralph Castain	431d5361ed	For those who really preferred our prior mode of operation that mapped procs and only launched daemons on the nodes that had procs on them, introduce the "novm" state machine component. This recreates the old mode of operation by re-ordering the launch sequence so that we allocate, then map, and then launch daemons only on the reqd nodes (instead of across the entire allocation). This commit was SVN r26946.	2012-08-03 16:30:05 +00:00
Shiqing Fan	12d99a9ebb	Update the hwloc build on Windows and related files. This commit was SVN r26818.	2012-07-20 12:14:28 +00:00
Jeff Squyres	2ba10c37fe	Per RFC, bring in the following changes: * Remove paffinity, maffinity, and carto frameworks -- they've been wholly replaced by hwloc. * Move ompi_mpi_init() affinity-setting/checking code down to ORTE. * Update sm, smcuda, wv, and openib components to no longer use carto. Instead, use hwloc data. There are still optimizations possible in the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old carto-based code found out how many NUMA nodes were ''available'' -- not how many were used ''in this job''. The new hwloc-using code computes the same value -- it was not updated to calculate how many NUMA nodes are used ''by this job.'' * Note that I cannot compile the smcuda and wv BTLs -- I ''think'' they're right, but they need to be verified by their owners. * The openib component now does a bunch of stuff to figure out where "near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors (I do not have a NUMA machine with an OpenFabrics device that is a non-uniform distance from multiple different NUMA nodes). * Completely rewrite the OMPI_Affinity_str() routine from the "affinity" mpiext extension. This extension now understands hyperthreads; the output format of it has changed a bit to reflect this new information. * Bunches of minor changes around the code base to update names/types from maffinity/paffinity-based names to hwloc-based names. * Add some helper functions into the hwloc base, mainly having to do with the fact that we have the hwloc data reporting ''all'' topology information, but sometimes you really only want the (online \| available) data. This commit was SVN r26391.	2012-05-07 14:52:54 +00:00
Ralph Castain	bd8b4f7f1e	Sorry for mid-day commit, but I had promised on the call to do this upon my return. Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code. Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch. This commit was SVN r26242.	2012-04-06 14:23:13 +00:00
Ralph Castain	811413e9bc	Correctly handle multiple cpu-set ranges. Correctly support optional binding directives combined with cpu-set. This commit was SVN r26187.	2012-03-23 14:50:41 +00:00
Ralph Castain	ce0caf7567	Support -cpu-set by binding to the specified cpus in the absence of any other binding directive. Allows users to subdivide nodes for multiple parallel mpirun invocations. This commit was SVN r26186.	2012-03-23 14:05:52 +00:00
Ralph Castain	bba6508b4b	Handle the default hostfile case a little better... This commit was SVN r25928.	2012-02-15 03:33:49 +00:00
Ralph Castain	f14c4be580	Correct the ordering logic so the list gets correctly built in daemon vpid order This commit was SVN r25818.	2012-01-30 16:25:07 +00:00
Shiqing Fan	bfbd3c67a5	Add a windows file into the tarball. This commit was SVN r25811.	2012-01-29 10:12:02 +00:00
Ralph Castain	3f31feee6f	Handle the case where a user's rankfile specifies only cpus, and not socket:cpu pairs. This commit was SVN r25803.	2012-01-27 12:21:45 +00:00
Ralph Castain	ef94e606c7	Add some debug This commit was SVN r25791.	2012-01-26 19:23:32 +00:00
Shiqing Fan	2c9a4beffd	Add and remove a few components for windows build. This commit was SVN r25775.	2012-01-25 09:01:27 +00:00
Ralph Castain	477582abef	Grrrr....fix ALL the cases where the membind warning occurs. This commit was SVN r25715.	2012-01-11 23:51:18 +00:00
Ralph Castain	167ad944c4	Surprise, surprise - hwloc treats memory binding as at the thread, not process, level. Thus, hwloc always sets the membind proc-level support flag to false, and indicates actual memory binding support via the thread-level flag. So...just to be safe, test -both- flags and issue the "no support" warning ONLY if both are false. This commit was SVN r25709.	2012-01-11 01:12:57 +00:00
Ralph Castain	2dd2694f25	Fix comm_spawn in oversubscribed conditions. IF oversubscription is allowed, let nodes flow into the mapper even if they are oversubscribed, constrained by the slots_max absolute ceiling. Cleanup error messages when comm_spawn fails so it correctly and succintly reports the ereror. This commit was SVN r25659.	2011-12-15 18:04:48 +00:00
Ralph Castain	e683b2f9c7	Minor touchup - reset the pointer to the end of the list each time to ensure we get the nodes in correct daemon order This commit was SVN r25651.	2011-12-14 22:16:52 +00:00
Ralph Castain	f531b09a8d	Correctly handle -host and -hostfile options. Ensure the initial vm launch constrains itself to the union of specified hosts if those options are given. Get oversubscribe set correctly for that case. This commit was SVN r25648.	2011-12-14 20:01:15 +00:00
Nathan Hjelm	be11acf727	bug fix. don't add node to allocated_nodes twice This commit was SVN r25619.	2011-12-12 19:14:41 +00:00
Ralph Castain	15facc4ba6	Fix comm_spawn yet again...add another test This commit was SVN r25579.	2011-12-06 20:15:40 +00:00
Ralph Castain	90b7f2a7bf	The rest of the multi app_context fix. Remove the restriction on number of app_contexts that can have zero np specified as multiple mappers now support that use-case. Update the ranking algorithms to respect and track bookmarks. Ensure we properly set the oversubscribed flag on a per-node basis. This commit was SVN r25578.	2011-12-06 17:28:29 +00:00
Ralph Castain	d9c7764e9b	Remove some debug This commit was SVN r25575.	2011-12-05 22:04:50 +00:00
Ralph Castain	df2f594aa8	Some cleanup associated with multiple app_contexts. Ensure nodes only get entered once into the map. Correctly handle bookmarks. Cleanup tracking of slots_inuse and correct detection of oversubscription. Still need to resolve the ranking issue so it starts at the bookmark, but that will come next. This commit was SVN r25574.	2011-12-05 22:01:08 +00:00
Ralph Castain	07655e2945	Handle the case where the allocator "fibs" to us about the node names. In some cases (ahem...you know who you are!), the allocator will tell us a node number (e.g., "16"). However, the daemon will return a node name (e.g., "nid0016") - leaving us not recognizing its location. So provide a new parameter (can't have too many!) that handles this situation by stripping the prefix from the returned node name. Also do a little cleanup to ensure we cleanly exit from errors, without generating too many annoying messages. This commit was SVN r25562.	2011-12-02 14:10:08 +00:00
Ralph Castain	c56acf60ca	Although we never really thought about it, we made an unconscious assumption in the mapper system - we assumed that the daemons would be placed on nodes in the order that the nodes appear in the allocation. In other words, we assumed that the launch environment would map processes in node order. Turns out, this isn't necessarily true. The Cray, for example, launches processes in a toroidal pattern, thus causing the daemons to wind up somewhere other than what we thought. Other environments (e.g., slurm) are also capable of such behavior, depending upon the default mapping algorithm they are told to use. Resolve this problem by making the daemon-to-node assignment in the affected environments when the daemon calls back and tells us what node it is on. Order the nodes in the mapping list so they are in daemon-vpid order as opposed to the order in which they show in the allocation. For environments that don't exhibit this mapping behavior (e.g., rsh), this won't have any impact. Also, clean up the vm launch procedure a little bit so it more closely aligns with the state machine implementation that is coming, and remove some lingering "slave" code. This commit was SVN r25551.	2011-11-30 19:58:24 +00:00
Ralph Castain	866edf6a89	Now that George has found his problem, we no longer need the bozo check. Interesting how these platform-specific issues surface... This commit was SVN r25493.	2011-11-18 17:43:14 +00:00
Ralph Castain	1e5e9bde77	Add protection against a bozo case where we could end up in an infinite loop while calculating ranks This commit was SVN r25491.	2011-11-18 15:35:55 +00:00
George Bosilca	61f273b987	Do not tolerate uninitialized variables. This commit was SVN r25489.	2011-11-18 10:19:24 +00:00
Ralph Castain	6310361532	At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here: https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation. In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions: 1. I have at long last bowed my head in submission to the system admin's of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior. 2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation. 3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so. As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes. This commit was SVN r25476.	2011-11-15 03:40:11 +00:00
Ralph Castain	fcee46b063	Add an option for printing a diffable process map for testing mappers This commit was SVN r25428.	2011-11-03 14:22:07 +00:00
Ralph Castain	648c85b41b	Add a simple pattern mapper as an example of how to use the topology info to create desired mappings. Let the user specify a pattern based on resource types, and map that pattern across all available nodes as resources permit. Don't automatically display the topology for each node when --display-devel-map is set as it can overwhelm the reader. Use a separate flag --display-topo to get it. This commit was SVN r25396.	2011-10-29 15:12:45 +00:00
Ralph Castain	2958f3de34	Add some clarifying comments and a small efficiency improvement This commit was SVN r25322.	2011-10-18 18:30:43 +00:00
Ralph Castain	ae8e556d14	Okay, once again let's fix the vpid calculator. Identified problem with prior commit (some rmaps components already place their procs in the jdata->procs array, and others don't), so account for those variations. This commit was SVN r25315.	2011-10-18 15:50:11 +00:00
George Bosilca	f28890fbb7	Revert r25302 as it break the --bynode option. This commit was SVN r25311. The following SVN revision numbers were found above: r25302 --> open-mpi/ompi@d7a8553179	2011-10-18 02:48:17 +00:00
Ralph Castain	d7a8553179	Fix the mapping algo for computing vpids - it was borked for bynode operations when using nperxxx directives This commit was SVN r25302.	2011-10-17 19:49:04 +00:00
Ralph Castain	92c7372e20	Per the RFC from Jeff, move hwloc from opal/mca/common to its own static framework ala libevent. Have ORTE daemons collect the topology info at startup and, if --enable-hwloc-xml is set, send that info back to the HNP for later use. The HNP only retains unique topology "templates" to reduce memory footprint. Have the daemon include the local topology info in the nidmap buffer sent to each app so the apps don't all hammer the local system to discover it for themselves. Remove the sysinfo framework as hwloc replaces that functionality. This commit was SVN r25124.	2011-09-11 19:02:24 +00:00
Wesley Bland	4e7ff0bd5e	By popular demand the epoch code is now disabled by default. To enable the epochs and the resilient orte code, use the configure flag: --enable-resilient-orte This will define both: ORTE_ENABLE_EPOCH ORTE_RESIL_ORTE This commit was SVN r25093.	2011-08-26 22:16:14 +00:00
Wesley Bland	09274cd047	Make sure that the epoch is initialized everywhere so we don't get weird output during valgrind. This shouldn't have caused any problems with any actual execution. Just extra warnings in valgrind. This commit was SVN r25015.	2011-08-08 15:11:55 +00:00
Wesley Bland	e1ba09ad51	Add a resilience to ORTE. Allows the runtime to continue after a process (or ORTED) failure. Note that more work will be necessary to allow the MPI layer to take advantage of this. Per RFC: http://www.open-mpi.org/community/lists/devel/2011/06/9299.php This commit was SVN r24815.	2011-06-23 20:38:02 +00:00
Ralph Castain	7f2d2e3de7	Track the app_context rank - will equal overall rank for single app_context jobs This commit was SVN r24778.	2011-06-16 20:31:30 +00:00
Ralph Castain	e039c7b7ea	Avoid crashing when debugging rmaps and a non-string resource constraint is given This commit was SVN r24770.	2011-06-10 16:27:30 +00:00
Ralph Castain	bd8d9a943a	Add diagnostics This commit was SVN r24748.	2011-06-05 19:17:56 +00:00
Ralph Castain	8f401a0563	Enable the ability to constrain applications to hosts on the basis of resources. This commit was SVN r24736.	2011-05-28 22:18:19 +00:00
Ralph Castain	dc6f616599	Enable VM launch. For some time, ORTE has had the ability to launch daemons on all nodes prior to launching an application. It has largely been used outside of the OMPI community, and so was never explicitly turned "on" inside OMPI releases. Nevertheless, the code has been there. Allowing VM launches does not require ANY changes to existing PLM components. All that was required was to have orterun launch the daemons as a separate call to orte_plm.spawn -prior- to launching the applications. The rest of the VM support code resides in the rmaps framework: (a) a check when asked to map a job to see if it is the daemon job, and (b) a separate "setup_virtual_machine" mapper in the rmaps base that creates the required map so the PLM's will do the right thing. In order to support those users who have no RM allocation but like to give the allocation in the form of a -host or -hostfile argument to their application, there is a little more code in orterun and the setup_virtual_machine mapper to capture information passed in that manner. This has been tested with rsh and slurm environments, and, since there is nothing environment-specific in the implementation, should work in others as well - but needs to be proven. This commit was SVN r24524.	2011-03-12 22:50:53 +00:00
Ralph Castain	1297acde13	George raised some valid concerns about the extensibility of the revised rmaps framework. Address those by: 1. removing the enum of mapper values 2. change the req_mapper and last_mapper fields to char* so they can hold the component name instead of a mapper flag 3. revise the selection logic in the mapper components to reflect the change. Components now look for their name in the req_mapper field, or to see if other criteria (e.g., npernode) are set that mandate their doing the mapping Several MCA params resided in the rmaps base for historical reasons - they have been in the base since at least the original 1.2 release (and perhaps earlier). However, George correctly pointed out that they really should reside in their respective components. Accordingly, move them to the components, but register synonyms to the old names to avoid breaking backward compatibility. These revisions retain the current functionality of allowing comm_spawn'd jobs to use different mappers than the original job, and for the errmgr to utilize the resilient mapper to recover processes regardless of how they were originally mapped. Given the large number of possible combinations, I am sure that someone will find a corner-case combination of values and selection criteria that cause either no mapper to be selected, or one other than the intended to be used. No one can test all the ways people will use this system, so I expect debugging to continue for awhile. The ability of comm_spawn'd jobs to exploit this functionality relies on changes to the orte_dpm component - this will be committed separately. This commit was SVN r24520.	2011-03-12 05:30:09 +00:00
Ralph Castain	3b4421d8e3	Separately track requested and last-used mapper so we don't lose that info This commit was SVN r24502.	2011-03-09 18:51:36 +00:00
George Bosilca	9bbe00bdc3	Set the return code from the processes upstream. This commit was SVN r24483.	2011-03-03 00:02:21 +00:00
George Bosilca	c6a5f9706a	Thomas's patch: Assume we won't fail unless notified by a child. This commit was SVN r24482.	2011-03-02 23:50:01 +00:00

1 2 3 4 5

225 Коммитов