From 76d4c1843ede232f5ffbd7f5722d7a87a30b34f8 Mon Sep 17 00:00:00 2001
From: Jeff Squyres
Date: Thu, 8 Nov 2018 11:50:47 -0500
Subject: [PATCH 1/2] orte-rmaps-base: update out-of-slots show_help message

Update the show_help message for when there are not enough slots to
run an application.

Also, remove a bunch of copies of this message in various show_help
text files that aren't used/referred to anywhere in the code.

Signed-off-by: Jeff Squyres
(cherry picked from commit 430c659908f9c1ba1ff652379a694314718ff3d8)
---
 orte/mca/rmaps/base/help-orte-rmaps-base.txt  | 33 ++++++++++++++++---
 .../rmaps/rank_file/help-rmaps_rank_file.txt  | 10 +-----
 .../rmaps/round_robin/help-orte-rmaps-rr.txt  | 10 +-----
 orte/mca/rmaps/seq/help-orte-rmaps-seq.txt    | 12 +------
 orte/mca/rtc/base/help-orte-rtc-base.txt      |  9 +----
 5 files changed, 32 insertions(+), 42 deletions(-)

diff --git a/orte/mca/rmaps/base/help-orte-rmaps-base.txt b/orte/mca/rmaps/base/help-orte-rmaps-base.txt
index 88dcab07a9..db28a746cc 100644
--- a/orte/mca/rmaps/base/help-orte-rmaps-base.txt
+++ b/orte/mca/rmaps/base/help-orte-rmaps-base.txt
@@ -10,7 +10,7 @@
 # University of Stuttgart. All rights reserved.
 # Copyright (c) 2004-2005 The Regents of the University of California.
 # All rights reserved.
-# Copyright (c) 2011-2015 Cisco Systems, Inc. All rights reserved.
+# Copyright (c) 2011-2018 Cisco Systems, Inc. All rights reserved.
 # Copyright (c) 2011 Los Alamos National Security, LLC.
 # All rights reserved.
 # Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
@@ -23,12 +23,35 @@
 # This is the US/English general help file for Open RTE's orterun.
 #
 [orte-rmaps-base:alloc-error]
-There are not enough slots available in the system to satisfy the %d slots
-that were requested by the application:
+There are not enough slots available in the system to satisfy the %d
+slots that were requested by the application:
+
   %s
 
-Either request fewer slots for your application, or make more slots available
-for use.
+Either request fewer slots for your application, or make more slots
+available for use.
+
+A "slot" is the Open MPI term for an allocatable unit where we can
+launch a process. The number of slots available are defined by the
+environment in which Open MPI processes are run:
+
+  1. Hostfile, via "slots=N" clauses (N defaults to number of
+     processor cores if not provided)
+  2. The --host:N command line parameter (N defaults to 1 if not
+     provided)
+  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
+  4. If neither a hostfile, the --hosts command line parameter, nor an
+     RM is present, Open MPI defaults to the number of processor
+     cores
+
+In all the above cases, if you want Open MPI to default to the number
+of hardware threads instead of the number of processor cores, use the
+--use-hwthread-cpus option.
+
+Alternatively, you can use the --oversubscribe option to ignore the
+number of available slots when deciding the number of processes to
+launch.
+#
 [orte-rmaps-base:not-all-mapped-alloc]
 Some of the requested hosts are not included in the current allocation for the
 application:
diff --git a/orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt b/orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt
index ce1705acd8..f357bf20f3 100644
--- a/orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt
+++ b/orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt
@@ -1,6 +1,6 @@
 # Copyright (c) 2004-2005 The Regents of the University of California.
 # All rights reserved.
-# Copyright (c) 2011 Cisco Systems, Inc. All rights reserved.
+# Copyright (c) 2011-2018 Cisco Systems, Inc. All rights reserved.
 # Copyright (c) 2013 Los Alamos National Security, LLC.
 # All rights reserved.
 # $COPYRIGHT$
@@ -90,14 +90,6 @@ some systems may require using full hostnames, such as
 [bad-index]
 Rankfile claimed host %s by index that is bigger than number of allocated hosts.
 #
-[orte-rmaps-rf:alloc-error]
-There are not enough slots available in the system to satisfy the %d slots
-that were requested by the application:
-  %s
-
-Either request fewer slots for your application, or make more slots available
-for use.
-#
 [bad-rankfile]
 Error, invalid rank (%d) in the rankfile (%s)
 #
diff --git a/orte/mca/rmaps/round_robin/help-orte-rmaps-rr.txt b/orte/mca/rmaps/round_robin/help-orte-rmaps-rr.txt
index 2adb978127..ca459dd7c5 100644
--- a/orte/mca/rmaps/round_robin/help-orte-rmaps-rr.txt
+++ b/orte/mca/rmaps/round_robin/help-orte-rmaps-rr.txt
@@ -11,6 +11,7 @@
 # Copyright (c) 2004-2005 The Regents of the University of California.
 # All rights reserved.
 # Copyright (c) 2017 Intel, Inc. All rights reserved.
+# Copyright (c) 2018 Cisco Systems, Inc. All rights reserved.
 # $COPYRIGHT$
 #
 # Additional copyrights may follow
@@ -19,15 +20,6 @@
 #
 # This is the US/English general help file for Open RTE's orterun.
 #
-[orte-rmaps-rr:alloc-error]
-There are not enough slots available in the system to satisfy the %d slots
-that were requested:
-
-  application: %s
-  host: %s
-
-Either request fewer slots for your application, or make more slots available
-for use.
 [orte-rmaps-rr:multi-apps-and-zero-np]
 RMAPS found multiple applications to be launched, with
 at least one that failed to specify the number of processes to execute.
diff --git a/orte/mca/rmaps/seq/help-orte-rmaps-seq.txt b/orte/mca/rmaps/seq/help-orte-rmaps-seq.txt
index 5fbe109593..fbab660928 100644
--- a/orte/mca/rmaps/seq/help-orte-rmaps-seq.txt
+++ b/orte/mca/rmaps/seq/help-orte-rmaps-seq.txt
@@ -10,6 +10,7 @@
 # University of Stuttgart. All rights reserved.
 # Copyright (c) 2004-2005 The Regents of the University of California.
 # All rights reserved.
+# Copyright (c) 2018 Cisco Systems, Inc. All rights reserved.
 # $COPYRIGHT$
 #
 # Additional copyrights may follow
@@ -18,19 +19,8 @@
 #
 # This is the US/English general help file for Open RTE's orterun.
 #
-[orte-rmaps-seq:alloc-error]
-There are not enough slots available in the system to satisfy the %d slots
-that were requested by the application:
-
-  %s
-
-Either request fewer slots for your application or make more slots
-available for use. If oversubscription is intended, add
---oversubscribe to the command line.
-#
 [orte-rmaps-seq:resource-not-found]
 The specified hostfile contained a node (%s) that is not in your
 allocation. We therefore cannot map a process rank to it. Please
 check your allocation and hostfile to ensure the hostfile only
 contains allocated nodes.
-
diff --git a/orte/mca/rtc/base/help-orte-rtc-base.txt b/orte/mca/rtc/base/help-orte-rtc-base.txt
index ade22e57b2..8414cc5885 100644
--- a/orte/mca/rtc/base/help-orte-rtc-base.txt
+++ b/orte/mca/rtc/base/help-orte-rtc-base.txt
@@ -10,7 +10,7 @@
 # University of Stuttgart. All rights reserved.
 # Copyright (c) 2004-2005 The Regents of the University of California.
 # All rights reserved.
-# Copyright (c) 2011-2014 Cisco Systems, Inc. All rights reserved.
+# Copyright (c) 2011-2018 Cisco Systems, Inc. All rights reserved.
 # Copyright (c) 2011 Los Alamos National Security, LLC.
 # All rights reserved.
 # Copyright (c) 2014 Intel, Inc. All rights reserved.
@@ -22,13 +22,6 @@
 #
 # This is the US/English general help file for Open RTE's orterun.
 #
-[orte-rtc-base:alloc-error]
-There are not enough slots available in the system to satisfy the %d slots
-that were requested by the application:
-  %s
-
-Either request fewer slots for your application, or make more slots available
-for use.
 [orte-rtc-base:not-all-mapped-alloc]
 Some of the requested hosts are not included in the current allocation for the
 application:
From 8be14b9b07ac4740224cd6923ecd287b0462a35f Mon Sep 17 00:00:00 2001
From: Jeff Squyres
Date: Thu, 8 Nov 2018 14:21:47 -0800
Subject: [PATCH 2/2] orte-rmaps-base: slightly amend help message

Follow on to 430c659908: clarify the help message and fix one typo.

Signed-off-by: Jeff Squyres
(cherry picked from commit e9bf318dcb2f337267211f37e6d59c9f8bf5d8be)
---
 orte/mca/rmaps/base/help-orte-rmaps-base.txt | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/orte/mca/rmaps/base/help-orte-rmaps-base.txt b/orte/mca/rmaps/base/help-orte-rmaps-base.txt
index db28a746cc..0d4724aeec 100644
--- a/orte/mca/rmaps/base/help-orte-rmaps-base.txt
+++ b/orte/mca/rmaps/base/help-orte-rmaps-base.txt
@@ -37,12 +37,11 @@ environment in which Open MPI processes are run:
 
   1. Hostfile, via "slots=N" clauses (N defaults to number of
      processor cores if not provided)
-  2. The --host:N command line parameter (N defaults to 1 if not
-     provided)
+  2. The --host command line parameter, via a ":N" suffix on the
+     hostname (N defaults to 1 if not provided)
   3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
-  4. If neither a hostfile, the --hosts command line parameter, nor an
-     RM is present, Open MPI defaults to the number of processor
-     cores
+  4. If none of a hostfile, the --host command line parameter, or an
+     RM is present, Open MPI defaults to the number of processor cores
 
 In all the above cases, if you want Open MPI to default to the number
 of hardware threads instead of the number of processor cores, use the