openmpi/orte/util
Ralph Castain b59ae14a2a Fix static port and partial allocation operations
Fix static port wireup by recording the TCP port mpirun is using and correctly passing the regex of hosts to the daemons. Do a better job of closing sockets on failed connection attempts. Correctly identify the remote host in the associated error message.

Fix partial allocation operations by not attempting to set #slots on nodes that were not used, and thus don't have a daemon or topology assigned to them.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-28 10:09:44 -08:00
..
comm Revise the routed framework to be multi-select so it can support the new conduit system. Update all calls to rml.send* to the new syntax. Define an orte_mgmt_conduit for admin and IOF messages, and an orte_coll_conduit for all collective operations (e.g., xcast, modex, and barrier). 2016-10-23 21:52:39 -07:00
dash_host Extend the -host:N syntax to accept "*" or "auto" to indicate "auto-detect the #cpus and set #slots to that value" 2017-01-24 10:21:01 -08:00
hostfile util/hostfile: plug a memory leak 2017-01-06 15:38:45 +09:00
attr.c Deprecate the --slot-list parameter in favor of --cpu-list. Remove the --cpu-set param (mark it as deprecated) and use --cpu-list instead as it was confusing having the two params. The --cpu-list param defines the cpus to be used by procs of this job, and the binding policy will be overlaid on top of it. 2017-01-24 13:33:22 -08:00
attr.h Deprecate the --slot-list parameter in favor of --cpu-list. Remove the --cpu-set param (mark it as deprecated) and use --cpu-list instead as it was confusing having the two params. The --cpu-list param defines the cpus to be used by procs of this job, and the binding policy will be overlaid on top of it. 2017-01-24 13:33:22 -08:00
compress.c Compress the xcast message if bigger than a defined size to further improve launch performance at scale 2017-01-19 22:08:02 -08:00
compress.h Compress the xcast message if bigger than a defined size to further improve launch performance at scale 2017-01-19 22:08:02 -08:00
context_fns.c Silence Coverity warning 2016-03-14 09:42:43 -07:00
context_fns.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
error_strings.c Repair rsh/ssh tree spawn 2017-01-27 11:35:00 -08:00
error_strings.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
help-regex.txt Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
hnp_contact.c Revise the routed framework to be multi-select so it can support the new conduit system. Update all calls to rml.send* to the new syntax. Define an orte_mgmt_conduit for admin and IOF messages, and an orte_coll_conduit for all collective operations (e.g., xcast, modex, and barrier). 2016-10-23 21:52:39 -07:00
hnp_contact.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
listener.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
listener.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Makefile.am Compress the xcast message if bigger than a defined size to further improve launch performance at scale 2017-01-19 22:08:02 -08:00
name_fns.c orte_util_snprintf_jobid: return ORTE_SUCCESS or ORTE_ERROR 2016-01-18 09:44:33 +09:00
name_fns.h sentinel: fix sentinel to proc_name conversion 2016-02-10 15:44:07 +09:00
nidmap.c Fix static port and partial allocation operations 2017-01-28 10:09:44 -08:00
nidmap.h Next step in reducing launch time: begin reducing the size of the launch message itself. Start by expressing the daemon map as a set of three regular expression strings. On an 8k cluster, this reduces the nidmap contribution from over 200kBytes to 21 bytes in size. 2017-01-23 19:54:47 -08:00
parse_options.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
parse_options.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
pre_condition_transports.c Enable PSM to support dynamic processes 2016-09-02 10:22:04 -07:00
pre_condition_transports.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
proc_info.c Clean out old cruft from the ORCM project 2016-09-21 00:13:30 -07:00
proc_info.h Clean out old cruft from the ORCM project 2016-09-21 00:13:30 -07:00
regex.c Next step in reducing launch time: begin reducing the size of the launch message itself. Start by expressing the daemon map as a set of three regular expression strings. On an 8k cluster, this reduces the nidmap contribution from over 200kBytes to 21 bytes in size. 2017-01-23 19:54:47 -08:00
regex.h Next step in reducing launch time: begin reducing the size of the launch message itself. Start by expressing the daemon map as a set of three regular expression strings. On an 8k cluster, this reduces the nidmap contribution from over 200kBytes to 21 bytes in size. 2017-01-23 19:54:47 -08:00
session_dir.c Fix the session directory cleanup - only remove the jobfam session dir level if we are the local daemon and are cleaning up our own session directory. 2016-12-03 09:59:18 -08:00
session_dir.h Several fixes related to session directories: 2016-09-05 07:48:44 +03:00
show_help.c Fix static port and partial allocation operations 2017-01-28 10:09:44 -08:00
show_help.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00