openmpi/orte/runtime
Ralph Castain 503e1274a9 Per the discussion on the telecon, change the -host behavior so we only run one instance if no slots were provided and the user didn't specify #procs to run. However, if no slots are given and the user does specify #procs, then let the number of slots default to the #found processing elements
Ensure the returned exit status is non-zero if we fail to map

If no -np is given, but -host and/or -hostfile was given, then error out with a message telling the user that this combination is not supported.

If -np is given, and -host is given with only one instance of each host, then default the #slots to the detected #pe's and enforce oversubscription rules.

If -np is given, and -host is given with more than one instance of a given host, then set the #slots for that host to the number of times it was given and enforce oversubscription rules. Alternatively, the #slots can be specified via "-host foo:N". I therefore believe that row #7 on Jeff's spreadsheet is incorrect.

With that one correction, this now passes all the given use-cases on that spreadsheet.

Make things behave under unmanaged allocations more like their managed cousins: if the #slots is given, then omitting -np fills all the given slots.
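
The -host slot-counting rules above can be sketched roughly as follows. This is a minimal illustration, not the ORTE implementation: `host_entry_t`, `parse_host_list`, `effective_slots`, and the `detected_pes` parameter are hypothetical names invented for this example. The message does not say what happens when one host is given both as `foo:N` and as repeated bare entries, so the sketch simply assumes the explicit count wins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HOSTS 16

/* Hypothetical record for one distinct host; not an ORTE type. */
typedef struct {
    char name[64];
    int listed;          /* how many times the host appeared in -host */
    int explicit_slots;  /* sum of any "host:N" counts, 0 if none given */
} host_entry_t;

/* Parse a comma-separated -host value such as "foo,bar,foo" or "foo:4".
 * Returns the number of distinct hosts, or -1 on malformed input. */
static int parse_host_list(const char *arg, host_entry_t *hosts, int max_hosts)
{
    char buf[256];
    int nhosts = 0;

    if (strlen(arg) >= sizeof(buf)) {
        return -1;
    }
    strcpy(buf, arg);

    for (char *tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        int given = 0;
        int i;
        char *colon = strchr(tok, ':');

        if (colon != NULL) {          /* "host:N" form */
            *colon = '\0';
            given = atoi(colon + 1);
            if (given <= 0) {
                return -1;
            }
        }
        if (strlen(tok) >= sizeof(hosts[0].name)) {
            return -1;
        }

        /* Look for an earlier entry with the same host name. */
        for (i = 0; i < nhosts; i++) {
            if (strcmp(hosts[i].name, tok) == 0) {
                break;
            }
        }
        if (i == nhosts) {            /* first time we see this host */
            if (nhosts == max_hosts) {
                return -1;
            }
            strcpy(hosts[i].name, tok);
            hosts[i].listed = 0;
            hosts[i].explicit_slots = 0;
            nhosts++;
        }
        hosts[i].listed++;
        hosts[i].explicit_slots += given;
    }
    return nhosts;
}

/* Slot count for one host when -np is given:
 *   - "host:N" given        -> N (assumed to take precedence)
 *   - host listed > 1 time  -> number of times listed
 *   - host listed once      -> detected processing elements */
static int effective_slots(const host_entry_t *h, int detected_pes)
{
    assert(h != NULL);
    if (h->explicit_slots > 0) {
        return h->explicit_slots;
    }
    if (h->listed > 1) {
        return h->listed;
    }
    return detected_pes;
}
```

Under these assumptions, "-host foo,bar,foo" on nodes with 8 detected processing elements would give foo 2 slots and bar 8 slots, with oversubscription rules then enforced against those counts.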

Fixes #1344
2016-03-29 11:21:57 -07:00
..
data_type_support Per the discussion on the telecon, change the -host behavior so we only run one instance if no slots were provided and the user didn't specify #procs to run. However, if no slots are given and the user does specify #procs, then let the number of slots default to the #found processing elements 2016-03-29 11:21:57 -07:00
help-orte-runtime.txt Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Makefile.am Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_cr.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_cr.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_data_server.c Cleanup warnings in opal and orte layers when building optimized on Mac 2015-12-17 07:51:24 -08:00
orte_data_server.h Bring the MPI_Publish and friends online 2015-09-02 12:04:07 -07:00
orte_finalize.c Fix a number of issues, some of which have lingered for a long time: 2016-03-01 06:53:00 -08:00
orte_globals.c Do not push child processes into separate process groups so that any host RM can still "see" them, and ensure that any signal sent to the orted's themselves will be provided to all child processes. Forward all signals from mpirun to the child processes, removing the old MCA parameter required to turn that behavior "on". 2016-03-06 17:55:09 -08:00
orte_globals.h Do not push child processes into separate process groups so that any host RM can still "see" them, and ensure that any signal sent to the orted's themselves will be provided to all child processes. Forward all signals from mpirun to the child processes, removing the old MCA parameter required to turn that behavior "on". 2016-03-06 17:55:09 -08:00
orte_info_support.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_info_support.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_init.c Fix a number of issues, some of which have lingered for a long time: 2016-03-01 06:53:00 -08:00
orte_locks.c initialize common symbols from orte 2015-05-08 10:11:58 +09:00
orte_locks.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_mca_params.c Do not push child processes into separate process groups so that any host RM can still "see" them, and ensure that any signal sent to the orted's themselves will be provided to all child processes. Forward all signals from mpirun to the child processes, removing the old MCA parameter required to turn that behavior "on". 2016-03-06 17:55:09 -08:00
orte_quit.c Convert the orte_job_data pointer array to a hash table so it doesn't grow forever as we run lots and lots of jobs in the persistent DVM. 2016-02-21 11:55:49 -08:00
orte_quit.h Refactor the ORTE DVM code so that external codes can submit multiple jobs using only a single connection to the HNP. 2016-02-13 08:10:44 -08:00
orte_wait.c Resolve a race condition that prevented the sigchild callback from being registered before short-lived apps terminated 2015-10-23 21:02:31 -07:00
orte_wait.h more c99 updates 2015-06-25 10:14:13 -06:00
runtime_internals.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
runtime.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00