openmpi/orte/util
Ralph Castain 7de4d6922b Change the behavior of cpus-per-rank. We previously counted each cpu against the #slots. However, IBM has pointed out that "slot" is equated to the number of processes allowed to run on each node, and not the number of cpus on the node. This has been a continuing source of confusion, so make the distinction a "hard" one.
Each process occupies a "slot". We automatically set #slots = #cpus if nothing else is told to us. If you want to run more procs than slots, you must tell us to allow oversubscription.

A process can utilize multiple pe's if that option is given. If you try to bind more than one proc to a given pe, then we will error out unless you tell us to allow overloading.
2016-08-22 15:54:41 -07:00
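
The slot/cpu distinction described in this commit message can be sketched as a small check. This is a minimal illustration under stated assumptions, not ORTE's implementation: `node_model_t`, `effective_slots`, `check_mapping`, and the result codes are hypothetical names, and the overload test assumes each proc binds to its own set of pe's, so a pe is shared only when the procs together need more pe's than the node has.

```c
#include <stdbool.h>

/* Hypothetical model of a single node; not an ORTE data structure. */
typedef struct {
    int cpus;   /* number of pe's (processing elements) on the node */
    int slots;  /* max procs allowed; <= 0 means "nothing was specified" */
} node_model_t;

/* Hypothetical result codes for this sketch. */
typedef enum {
    MAP_OK,
    MAP_ERR_OVERSUBSCRIBED,  /* more procs than slots */
    MAP_ERR_OVERLOADED       /* more than one proc bound to some pe */
} map_result_t;

/* If no slot count was given, #slots defaults to #cpus. */
static int effective_slots(const node_model_t *node)
{
    return node->slots > 0 ? node->slots : node->cpus;
}

/* Each proc occupies exactly one slot, however many pe's it uses. */
static map_result_t check_mapping(const node_model_t *node,
                                  int nprocs, int pes_per_proc,
                                  bool allow_oversubscribe,
                                  bool allow_overload)
{
    if (nprocs > effective_slots(node) && !allow_oversubscribe) {
        return MAP_ERR_OVERSUBSCRIBED;
    }
    /* Assumption: procs bind to disjoint pe's where possible, so a pe
     * is shared only if the total pe demand exceeds the pe count. */
    if (nprocs * pes_per_proc > node->cpus && !allow_overload) {
        return MAP_ERR_OVERLOADED;
    }
    return MAP_OK;
}
```

In this model, a node with 8 cpus and 4 slots accepts 4 procs using 2 pe's each, because only 4 slots are consumed; counting each cpu against the slots, as the earlier behavior did, would have needed 8.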
..
comm Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
dash_host Change the behavior of cpus-per-rank. We previously counted each cpu against the #slots. However, IBM has pointed out that "slot" is equated to the number of processes allowed to run on each node, and not the number of cpus on the node. This has been a continuing source of confusion, so make the distinction a "hard" one. 2016-08-22 15:54:41 -07:00
hostfile Refactor the ORTE DVM code so that external codes can submit multiple jobs using only a single connection to the HNP. 2016-02-13 08:10:44 -08:00
attr.c Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. 2016-07-13 15:28:33 -07:00
attr.h Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. 2016-07-13 15:28:33 -07:00
context_fns.c Silence Coverity warning 2016-03-14 09:42:43 -07:00
context_fns.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
error_strings.c Add support for PMIx tool connections and queries. Initially only support a request to list all known namespaces (jobids) from ORTE, but other folks will extend that support to include additional information 2016-06-29 19:19:19 -07:00
error_strings.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
help-regex.txt Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
hnp_contact.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
hnp_contact.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
listener.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
listener.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Makefile.am configury: clean the flex generated .c files 2016-06-01 11:13:31 +09:00
name_fns.c orte_util_snprintf_jobid: return ORTE_SUCCESS or ORTE_ERROR 2016-01-18 09:44:33 +09:00
name_fns.h sentinel: fix sentinel to proc_name conversion 2016-02-10 15:44:07 +09:00
nidmap.c The node index isn't normally passed with the packed node object, so we need to set it on the remote end as the orted needs to pass it down to the procs. Refactor the registration code to better package proc-level info - we will separate out the node and app levels in a subsequent change. 2016-08-12 12:06:23 -07:00
nidmap.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
parse_options.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
parse_options.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
pre_condition_transports.c Improve the transport key print statement to ensure that we don't get zero fields as this can be a problem for PSM 2016-04-28 20:11:12 -07:00
pre_condition_transports.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
proc_info.c Update the session dir structure. Restore the creation of a top-level dir based on userid so that everything is contained under the user's top-level dir. Make the next level down (the "job family" level) be either the pid (indicated by a name of "pid.N") or the job family if not launched by mpirun. This allows for proper rendezvous by direct-launched procs. 2016-08-15 22:46:46 -05:00
proc_info.h Update the session dir structure. Restore the creation of a top-level dir based on userid so that everything is contained under the user's top-level dir. Make the next level down (the "job family" level) be either the pid (indicated by a name of "pid.N") or the job family if not launched by mpirun. This allows for proper rendezvous by direct-launched procs. 2016-08-15 22:46:46 -05:00
regex.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
regex.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
session_dir.c Update the session dir structure. Restore the creation of a top-level dir based on userid so that everything is contained under the user's top-level dir. Make the next level down (the "job family" level) be either the pid (indicated by a name of "pid.N") or the job family if not launched by mpirun. This allows for proper rendezvous by direct-launched procs. 2016-08-15 22:46:46 -05:00
session_dir.h Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field 2016-07-05 14:59:50 -07:00
show_help.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
show_help.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00