Ralph Castain
11de735e8a
Complete the revamp of hostfile support in non-managed environments. Working at the app level, ensure that we utilize only those nodes specified for that app, but fall back to the default hostfile (if available) for those with no specification, further falling back to the local host if the default hostfile is not present or is empty.
...
This commit was SVN r27230.
2012-09-04 16:34:05 +00:00
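The fallback order described in this commit message can be sketched roughly as follows. This is an illustration only, not the ORTE implementation: the function name `select_nodes` and its parameters are hypothetical, and each hostfile is assumed to have already been parsed into a list of node names.

```python
def select_nodes(app_hosts, default_hostfile_nodes, local_host="localhost"):
    """Pick the nodes one app may use (hypothetical helper, not ORTE code).

    Order taken from the commit message:
      1. nodes specified for this app, if any
      2. otherwise the default hostfile, if present and non-empty
      3. otherwise the local host
    """
    if app_hosts:
        return list(app_hosts)
    if default_hostfile_nodes:
        return list(default_hostfile_nodes)
    return [local_host]


# Example: one app with its own hosts, one without.
print(select_nodes(["node1", "node2"], ["nodeA"]))
print(select_nodes([], ["nodeA"]))
print(select_nodes([], None))
```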
Ralph Castain
1b659de132
Get staged execution working on multi-node setups. Improve efficiency by only remapping if not all procs in the job have been mapped yet.
...
This commit was SVN r27181.
2012-08-29 20:35:52 +00:00
Ralph Castain
98580c117b
Introduce staged execution. If you don't have adequate resources to run everything without oversubscribing, don't want to oversubscribe, and aren't using MPI, then staged execution lets you (a) run as many procs as there are available resources, and (b) start additional procs as others complete and free up resources. Adds a new mapper as well as a new state machine.
...
Remove some stale configure.m4's we no longer need.
Optimize the nidmaps a bit by only sending info that has changed each time, instead of sending a complete copy of everything. Makes no difference for the typical MPI job - only impacts things like staged execution where we are sending multiple (possibly many) launch messages.
This commit was SVN r27165.
2012-08-28 21:20:17 +00:00
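As a rough illustration of the staged execution idea in this commit (fill the available slots, then start another proc each time one finishes), the small simulation below computes start times for a queue of procs. It is a sketch under assumptions: the name `staged_start_times`, the use of known per-proc durations, and the single pool of slots are all hypothetical simplifications, not the actual mapper or state machine.

```python
import heapq


def staged_start_times(durations, slots):
    """Simulate staged execution (hypothetical model, not ORTE code).

    durations: run time of each proc, in launch order
    slots:     number of procs that may run at once without oversubscribing
    Returns the start time of each proc.
    """
    if slots < 1:
        raise ValueError("need at least one slot")
    running = []  # min-heap holding the finish times of running procs
    starts = []
    for d in durations:
        if len(running) < slots:
            start = 0.0  # a slot is still free at launch time
        else:
            start = heapq.heappop(running)  # wait for the earliest finisher
        starts.append(start)
        heapq.heappush(running, start + d)
    return starts


# Four procs, two slots: the third proc starts when the second one finishes.
print(staged_start_times([3, 1, 2, 1], slots=2))
```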
Ralph Castain
e0c39c94e8
Complete the cleanup of the preload files system. Remove the dest_dir option as moving things to arbitrary locations - especially absolute paths - can prove disastrous. Remove the preload_libs option as these can be treated as just files. Cleanup some of the pack/unpack code as the dss handles NULL strings just fine. Deal a little better with absolute paths, noting that tar now strips the leading '/' for us (showing my age as it didn't used to do so).
...
Remove the odls_base_state.c file as that code is now covered by the new broadcast form of preload_files.
This commit was SVN r27127.
2012-08-24 02:28:29 +00:00
Ralph Castain
b4a544ad2a
Per discussion with Josh, use the --preload-xxx cmd line options to broadcast files to all nodes. Add --set-cwd-to-session-dir option to start procs in their session directories. Add OMPI_FILE_LOCATION envar to tell procs where their prepositioned files went.
...
This commit was SVN r27125.
2012-08-23 21:28:05 +00:00
Ralph Castain
e4d82b8912
Turn off the common port by default for now until we get rollup working properly on ALL platforms
...
This commit was SVN r27060.
2012-08-15 22:13:04 +00:00
Ralph Castain
cb48fd52d4
Implement the MPI_Info part of MPI-3 Ticket 313. Add an MPI_info object MPI_INFO_GET_ENV that contains a number of run-time related pieces of info. This includes all the required ones in the ticket, plus a few that specifically address recent user questions:
...
"num_app_ctx" - the number of app_contexts in the job
"first_rank" - the MPI rank of the first process in each app_context
"np" - the number of procs in each app_context
Still need clarification on the MPI_Init portion of the ticket. Specifically, does the ticket call for returning an error if someone calls MPI_Init more than once in a program? We set a flag to tell us that we have been initialized, but currently never check it.
This commit was SVN r27005.
2012-08-12 01:28:23 +00:00
Ralph Castain
431d5361ed
For those who really preferred our prior mode of operation that mapped procs and only launched daemons on the nodes that had procs on them, introduce the "novm" state machine component. This recreates the old mode of operation by re-ordering the launch sequence so that we allocate, then map, and then launch daemons only on the reqd nodes (instead of across the entire allocation).
...
This commit was SVN r26946.
2012-08-03 16:30:05 +00:00
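The difference between the two launch modes described in this commit can be pictured with a small sketch. The helper `daemon_nodes` and its arguments are hypothetical; it only shows which nodes would receive a daemon once mapping has been done, assuming the map is a simple rank-to-node dictionary.

```python
def daemon_nodes(allocation, proc_map, novm=False):
    """Return the nodes that get a daemon (hypothetical helper, not ORTE code).

    allocation: list of node names in the allocation
    proc_map:   dict mapping process rank -> node name
    novm:       if True, launch daemons only on nodes that host procs;
                otherwise launch across the entire allocation
    """
    if not novm:
        return list(allocation)
    used = set(proc_map.values())
    # Preserve allocation order while keeping only the required nodes.
    return [node for node in allocation if node in used]


allocation = ["n0", "n1", "n2"]
proc_map = {0: "n0", 1: "n2"}
print(daemon_nodes(allocation, proc_map))             # whole allocation
print(daemon_nodes(allocation, proc_map, novm=True))  # only nodes with procs
```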
Shiqing Fan
660188307c
fix an export declaration name
...
This commit was SVN r26895.
2012-07-27 13:26:24 +00:00
Shiqing Fan
8c4a3e1269
correct the symbol dllexports for windows build
...
This commit was SVN r26827.
2012-07-22 08:54:50 +00:00
Ralph Castain
e335de3564
Refactor ompi_info, splitting it into parts according to the layer involved. Thus, we call down to the opal layer to get those frameworks and components, and down to the orte layer to get those. Still some abstraction breaks, but they mostly involve renaming of OMPI_foo labels that have been around since before we split the build system by layer.
...
This commit was SVN r26695.
2012-06-28 18:23:34 +00:00
Ralph Castain
0dfe29b1a6
Roll in the rest of the modex change. Eliminate all non-modex API access of RTE info from the MPI layer - in some cases, the info was already present (either in the ompi_proc_t or in the orte_process_info struct) and no call was necessary. This removes all calls to orte_ess from the MPI layer. Calls to orte_grpcomm remain required.
...
Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework.
Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one.
This commit was SVN r26678.
2012-06-27 14:53:55 +00:00
Ralph Castain
b990c65a53
Remove another antiquated dss function - the 'size' API isn't used anywhere since the GPR went away
...
This commit was SVN r26646.
2012-06-25 13:33:45 +00:00
Ralph Castain
abe7dd8274
Cleanup the dss by removing unused functions
...
This commit was SVN r26644.
2012-06-23 21:20:09 +00:00
Ralph Castain
019857b616
Ensure that we don't attempt to use common ports if --disable-static was specified.
...
This commit was SVN r26620.
2012-06-20 03:14:11 +00:00
Ralph Castain
9b026c6695
For now, run MTT with the use_common_port option enabled. This would be the desirable scenario for users, especially at scale, so let's see if it creates any issues.
...
This commit was SVN r26609.
2012-06-15 15:46:38 +00:00
Ralph Castain
96c778656a
Improve launch performance on clusters that use dedicated nodes by instructing the orteds to use the same port as the HNP, thus allowing them to "rollup" their initial callback via the routed network. This substantially reduces the HNP bottleneck and the number of ports opened by the HNP.
...
Restore enable-static-ports option by default - the Cray will have to disable it to get around their library issues, but that's just a warning problem as opposed to blocking the build.
This commit was SVN r26606.
2012-06-15 10:15:07 +00:00
Ralph Castain
9506ac1617
Remove debug
...
This commit was SVN r26592.
2012-06-11 20:02:53 +00:00
Ralph Castain
269cb2b8d9
Some cleanup to remove calls to opal_progress when running with orte progress threads, and to ensure that all orte-related events are in the orte event base.
...
This commit was SVN r26591.
2012-06-11 19:59:53 +00:00
Ralph Castain
0442a807c0
Default the OOB to the "ud" component IFF the HNP finds itself on a node with a supported Infiniband device. Ensure that the daemons all pick the matching component by dictating the selection via mca param on the orted cmd line.
...
This commit was SVN r26582.
2012-06-08 01:23:08 +00:00
Ralph Castain
d6279fc971
Fix the debugger daemon launch support to fit the new state machine. Treat debugger daemons just like any other job, except that we map them only to nodes where an app process currently exists (as opposed to every node in the system). Trigger breakpoint and rank0 release only after the debugger daemons are in position.
...
This commit was SVN r26556.
2012-06-06 02:01:23 +00:00
Ralph Castain
9883f42caf
Add missing commit
...
This commit was SVN r26501.
2012-05-28 02:20:20 +00:00
Ralph Castain
be6ed9c2df
Allow partial use of allocations by specifying the max number of daemons (i.e., max VM size) for the job
...
This commit was SVN r26499.
2012-05-27 16:48:19 +00:00
Ralph Castain
83d69b6c95
Enable the ORTE progress thread for apps (not needed in the tools as they already continuously loop in the event lib). This appears to be working, at least for MPI apps that only use shared memory (a simple "hello"). More testing is required to identify where problems will occur - this is only intended to allow further development.
...
In order to use the progress thread, you must configure with:
--enable-orte-progress-threads --enable-event-thread-support
This commit was SVN r26457.
2012-05-20 15:14:43 +00:00
Jeff Squyres
dab7d36a81
Fix location of the default hostfile. Thanks to Götz Waschk for
...
identifying the problem.
This commit was SVN r26441.
2012-05-15 16:13:39 +00:00
Jeff Squyres
2ba10c37fe
Per RFC, bring in the following changes:
...
* Remove paffinity, maffinity, and carto frameworks -- they've been
wholly replaced by hwloc.
* Move ompi_mpi_init() affinity-setting/checking code down to ORTE.
* Update sm, smcuda, wv, and openib components to no longer use carto.
Instead, use hwloc data. There are still optimizations possible in
the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old
carto-based code found out how many NUMA nodes were ''available''
-- not how many were used ''in this job''. The new hwloc-using
code computes the same value -- it was not updated to calculate how
many NUMA nodes are used ''by this job.''
* Note that I cannot compile the smcuda and wv BTLs -- I ''think''
they're right, but they need to be verified by their owners.
* The openib component now does a bunch of stuff to figure out where
"near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT
BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors
(I do not have a NUMA machine with an OpenFabrics device that is a
non-uniform distance from multiple different NUMA nodes).
* Completely rewrite the OMPI_Affinity_str() routine from the
"affinity" mpiext extension. This extension now understands
hyperthreads; the output format of it has changed a bit to reflect
this new information.
* Bunches of minor changes around the code base to update names/types
from maffinity/paffinity-based names to hwloc-based names.
* Add some helper functions into the hwloc base, mainly having to do
with the fact that we have the hwloc data reporting ''all''
topology information, but sometimes you really only want the
(online | available) data.
This commit was SVN r26391.
2012-05-07 14:52:54 +00:00
Ralph Castain
b2f77bf08f
Extend the iof by adding two new components to support map-reduce IO chaining. Add a mapreduce tool for running such applications.
...
Fix the state machine to support multiple jobs being simultaneously launched as this is not only required for mapreduce, but can happen under comm-spawn applications as well.
This commit was SVN r26380.
2012-05-02 21:00:22 +00:00
Ralph Castain
47a5e30095
Ensure debug output levels if we are debugging
...
This commit was SVN r26358.
2012-04-29 00:03:28 +00:00
Ralph Castain
f3e3704c9e
Per request from Brian, enable mapping of stddiag output (output from opal_output calls) to stderr of the local process. This allows you to obtain that output in a local window (for example, when using xterm for each process) instead of having it automatically forwarded to mpirun. Turn this on automatically whenever someone uses the -xterm option, and allow it to be set manually using the orte_map_stddiag_to_stderr mca param.
...
This commit was SVN r26352.
2012-04-27 14:39:34 +00:00
Ralph Castain
3b5b185c86
Don't double free timer events
...
This commit was SVN r26341.
2012-04-25 17:36:12 +00:00
Ralph Castain
ddfbde587f
Change the default to "abort" the job when any process exits with a non-zero status. Add the required code to ensure the orted tells the HNP about the problem.
...
This commit was SVN r26270.
2012-04-13 21:19:46 +00:00
Ralph Castain
4d16790836
Fix collectives for jobs running across partial allocations
...
This commit was SVN r26267.
2012-04-13 00:38:47 +00:00
Ralph Castain
5d14fa7546
Fix mpi_abort, minimize error output.
...
This commit was SVN r26266.
2012-04-11 14:37:08 +00:00
Ralph Castain
9cd4c06488
Get things to build and run when --disable-orte is specified
...
This commit was SVN r26263.
2012-04-10 21:50:01 +00:00
Ralph Castain
14d5525fb1
Some minor cleanups. Get singletons working. Cleanup abort handling so it gets properly identified.
...
This commit was SVN r26261.
2012-04-10 19:08:54 +00:00
Ralph Castain
48de3a2501
Minor fixes so orte_progress_thread can work
...
This commit was SVN r26248.
2012-04-06 15:50:49 +00:00
Ralph Castain
bd8b4f7f1e
Sorry for mid-day commit, but I had promised on the call to do this upon my return.
...
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.
Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.
This commit was SVN r26242.
2012-04-06 14:23:13 +00:00
Ralph Castain
ca3ff58c76
Ensure we get a non-zero exit status when we can't find the specified fork agent. Output a better error message, and ensure we don't multiply report the problem.
...
This commit was SVN r26191.
2012-03-24 00:49:38 +00:00
Josh Hursey
a595525366
Add the caller's name to the 'comm failed' error message, so we know between which two peers the communication failed.
...
This commit was SVN r26117.
2012-03-08 21:55:19 +00:00
Ralph Castain
3d718863a8
Fix typo - thanks Pascal
...
This commit was SVN r26064.
2012-02-28 14:33:55 +00:00
Ralph Castain
d7d8a8cdf7
Some cleanup of the tmpdir session directory specifications. Remove the --tmpdir option from orterun as it was confusing. Create an orte_local_tmpdir_base mca param in its place. Clarify the role of the local vs remote vs global tmpdir base params, and ensure that you don't set conflicting options.
...
Remove the OMPI_PREFIX_ENV environmental variable as that was totally confusing as a way of setting a tmpdir base location.
This commit was SVN r25941.
2012-02-16 16:10:01 +00:00
Ralph Castain
3da1787c06
Allow there to be no default hostfile without generating an error
...
This commit was SVN r25930.
2012-02-15 04:16:05 +00:00
Ralph Castain
a5e4dc6803
In accordance with prior releases, we are supposed to default to looking at the openmpi-default-hostfile as a default hostfile. Restore that behavior, but ignore the file if it is empty. Allow the user to ignore any MCA param setting pointing to a default hostfile by setting the param to "none" (via cmd line or whatever) - this allows them to override a setting in the system default MCA param file.
...
This commit was SVN r25851.
2012-02-01 17:40:44 +00:00
Ralph Castain
11a37d3978
Fix the default
...
This commit was SVN r25733.
2012-01-17 21:09:27 +00:00
Ralph Castain
fd0d9f73c6
Make preload_binaries an MCA param so it can be set in the default MCA parameters for a system
...
This commit was SVN r25728.
2012-01-17 17:16:05 +00:00
Ralph Castain
ce7ddd0e10
Create the debugger attach fifo unless the user requests that we periodically poll instead.
...
This commit was SVN r25714.
2012-01-11 19:44:22 +00:00
Ralph Castain
bf103de66c
My apologies for doing this outside of the usual time restrictions, but we need to get this in so we can make progress.
...
Move the ORTE-level debugger code back into orterun and out of the ORTE library to resolve symbol conflicts.
This commit was SVN r25713.
2012-01-11 15:53:09 +00:00
Ralph Castain
437c52d2bf
Routing must be enabled by default
...
This commit was SVN r25657.
2011-12-15 17:13:52 +00:00
Ralph Castain
7510339725
Remove stale orte_vm_launch param. Add a param that allows users to specify envars to forward/set so they can do it in the MCA param file instead of only via mpirun cmd line.
...
This commit was SVN r25580.
2011-12-06 21:31:22 +00:00
Ralph Castain
df2f594aa8
Some cleanup associated with multiple app_contexts. Ensure nodes only get entered once into the map. Correctly handle bookmarks. Cleanup tracking of slots_inuse and correct detection of oversubscription.
...
Still need to resolve the ranking issue so it starts at the bookmark, but that will come next.
This commit was SVN r25574.
2011-12-05 22:01:08 +00:00