openmpi/orte/mca
rhc54 2414244171 Merge pull request #1872 from rhc54/topic/continuous
Add support for continuously operating applications
2016-07-13 15:29:31 -07:00
..
common configury: clean up .so version numbers 2015-12-18 12:50:23 -05:00
dfs mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
errmgr Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. 2016-07-13 15:28:33 -07:00
ess Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field 2016-07-05 14:59:50 -07:00
filem Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field 2016-07-05 14:59:50 -07:00
grpcomm Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need. 2016-05-01 11:30:25 -07:00
iof Remove stale map-reduce support 2016-06-12 07:41:57 -07:00
notifier ORTE: update for the new opal_progress_thread API 2015-08-07 10:13:40 -07:00
odls Cleanup the forced termination a bit by restoring the delay before issuing the sigkill, and eliminating the large time loss spent checking if the proc died. The latter is responsible for a large number of test timeouts in MTT 2016-06-02 17:48:21 -07:00
oob orte: fixup hostname max length usage 2016-04-25 07:08:23 +02:00
plm Fix a bug in the handling of nper<foo> when -host or -hostfile was given. Correctly mark slots as "given" when we auto-assign them. Ensure we don't set the number of procs when using nper<foo> so the PPR mapper can correctly assign them. 2016-07-12 09:27:02 -07:00
ras Enable simulation of large-scale clusters by allowing multiple daemons/node. Specifying the ras_base_multiplier parameter to be greater than 1 will cause ORTE to replicate each allocated node by that factor. A daemon will be spawned for each replica, thus letting ORTE function as if it were on a much larger cluster. 2016-05-29 18:56:18 -07:00
rmaps Fix a bug in the handling of nper<foo> when -host or -hostfile was given. Correctly mark slots as "given" when we auto-assign them. Ensure we don't set the number of procs when using nper<foo> so the PPR mapper can correctly assign them. 2016-07-12 09:27:02 -07:00
rml Add a timeout cmd line option and an option to report state info upon timeout to assist with debugging Jenkins tests 2016-05-28 08:36:25 -07:00
routed Release child object when we are recording someone's relatives. 2016-02-15 20:50:42 -08:00
rtc Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given 2015-09-04 16:54:40 -07:00
schizo Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. 2016-07-13 15:28:33 -07:00
snapc mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
sstore mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
state Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. 2016-07-13 15:28:33 -07:00
Makefile.am Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
mca.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00