Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.
Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.
This commit was SVN r26242.
Brian dealt with this in the past by creating platform files and using "no-build" to block the components. This was clunky, but acceptable when only one organization was using that option. However, that number has now expanded to at least two more locations.
Accordingly, make --without-rte-support actually work by adding appropriate configury to prevent components from building when they shouldn't. While doing so, remove two frameworks (db and rmcast) that are no longer used as ORCM comes to a close (besides, they belonged in ORCM now anyway). Do some minor cleanups along the way.
This commit was SVN r25497.
https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement
The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.
In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:
1. I have at long last bowed my head in submission to the system admin's of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior.
2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation.
3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.
As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.
This commit was SVN r25476.
Fix a termination issue that caused procs local to mpirun to not be killed if they weren't calling into the library. Thanks to Terry Dontje for spending countless hours chasing his tail on this one! :-(
This commit was SVN r25285.
To enable the epochs and the resilient orte code, use the configure flag:
--enable-resilient-orte
This will define both:
ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE
This commit was SVN r25093.
refactoring to improve readability and to compute num_procs_alive
correctly and to remove the use of loop iteration variables for
two loops nested one inside another (causing MPI_Comm_spawn_multiple
to fail).
This commit was SVN r24903.
Also, fix an error in mpirx debugger module - the pointer array object is the pointer to the object itself, not the object "super" like in an opal_list.
This commit was SVN r24894.
Over the course of time, usage of static ports got corrupted in several places, the "parent" info got incorrectly reset, etc. So correct all that and get the regex-based wireup going again.
Also, don't pass node lists if static ports aren't enabled - they are of no value to the orted and just create the possibility of overly-long cmd lines.
This commit was SVN r24860.
Everyone will be starting at MIN anyway (until we implement restart of course)
so there's no reason to set the epoch to INVALID and then immediately reset them
to MIN. This way there's less room to make mistakes later.
This commit was SVN r24829.
Rename the memusage sensor plugin to "resusage" as it will soon be updated to include full process stat monitoring.
Extend the heartbeat sensor to report node and process stats in the heartbeat.
Store the process and node stats in their respective orte_xxx_t object.
This commit was SVN r24629.
Provide a new MCA param that allows the user to direct that we abort the job once a process exits with non-zero status. No recovery is allowed in such cases to avoid trying to restart a process that has already exited MPI.
This commit was SVN r24614.
I want to thank Hugo Meyer for reporting this/these bugs.
Notes:
* Moved over a patch from the stabilization branch that makes sure we close the peer socket in the OOB TCP component fully during shutdown (after the de-registration sync). It also ensures that we free the rml_uri only after we are done communicating with the peer (in the odls_base deregister sync operation).
* When an error is detected while delivering messages, we really want to bail out of the loop since the error manager is likely mutating the orte_local_children data structure, so it is no longer safe to iterate over in the orte_odls_base_default_deliver_message() function.
* When the HNP is hosting processes make sure it accounts for processes that may have failed locally in the ErrMgr HNP component by decrementing the num_local_procs. This makes it match the orted ErrMgr component accounting. This is what was causing the modex to fail (the number of participants was wrong on a rolling recovery.
* The crmig and autor features of the hnp ErrMgr component now check for the jobid from both the 'job' parameter and from the process name (since one may be there and not the other). This caused some additional error messages during startup.
* If we fail to migrate (e.g., due to invalid node specification), print only the error message, not the error and success messages. This can be misleading.
This commit was SVN r24317.
Restore the use of override_oversubscribe to indicate that the data source for resources on the backend nodes used in mapping is unreliable. In this situation (e.g., data came from hostfile, or we are just using localhost because nothing was provided), we don't trust the oversubscribe condition passed by the mapper. Instead, we check locally to ensure we set sched_yield correctly.
This commit was SVN r24130.