be more careful about closing framewworks as part of
orte_finalize. Owing to recent restructuring in opal to handle
finalize in a more general fashion, the missing framework
closes were causing meltdowns as the mca vars subsystem
was cleaning itself up.
This problem was recently reported by Siegmar:
https://www.mail-archive.com/users@lists.open-mpi.org//msg32946.html
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Add the --pset option for app_contexts so the user can provide a string
name for each app_context. Use the new PMIx pset attribute to store the
names in the PMIx local storage for retrieval
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
This commit fixes an compilation error when configured
with `--enable-timing`.
Procedures in the function `orte_ess_base_app_setup`
in `orte/mca/ess/base/ess_base_std_app.c` are moved
to `orte/mca/ess/pmi/ess_pmi_module.c`
and `orte/mca/ess/singleton/ess_singleton_module.c`
in the recent commit 57f6b94fa5.
In `ess_pmi_module.c`, the first argument of the
`OPAL_TIMING_ENV_NEXT` macro should have been adapted
to the destination function but was not.
In `ess_singleton_module.c`, `OPAL_TIMING_ENV_INIT`
was not used in the destination function originally.
So `OPAL_TIMING_ENV_NEXT` cannot be used in the function.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Update the show_help message for when there are not enough slots to
run an application.
Also, remove a bunch of copies of this message in various show_help
text files that aren't used/referred to anywhere in the code.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
If we detect that we are being debugged by an MPIR-based debugger, then
print a warning that OMPI's MPIR support has been deprecated and will be
removed in a subsequent release.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
using the standard $USER and $HOSTNAME environment variables
to make reproducible builds possible.
See https://reproducible-builds.org/ for why this is good.
This helps improve issue #3759
Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
Thanks to @hjelmn for debugging it and providing the patch
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit efa8bcc17078c89f1c9d6aabed35c90973a469bf)
Correctly aggregate slots across -H entries from each app. Take into
account any -H entry when computing nprocs when no value was given.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Do not reorder the available host list as this causes the head node process assignment to differ from those computed on the other nodes
Signed-off-by: Ralph H Castain <rhc@open-mpi.org>
We only need to pass a custom range if the target is a single process.
Otherwise, we let the range be "session".
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
According to clang on MacOS, passing a bool parameter -- which
undergoes default parameter promotion -- to va_start() results in
undefined behavior. So just change these params to int and avoid the
issue.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit disables the use of both the builtin and hand-written
atomics if proper C11 atomic support is detected. This is the first
step towards requiring the availability of C11 atomics for the C
compiler used to build Open MPI.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
When a job terminates normally but with a non zero exit code,
display the error message to stderr.
Thanks Emre Brookes for the bug report.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
We have adequate protection to ensure that we only utilize the PMIx
features related to "instant on" when they are available, so this param
is no longer required and causes confusion.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
This commit removes some code that protected the odls/alps component
from closing alps file descriptors. For some unknown reason leaving
these file descriptors open causes can cause an orted to hang when
launching apps.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
In some scenarios, we can have a daemon sharing the node with mpirun. In
those cases, we need to avoid race conditions in cleanup
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Per suggestion by @bangerth, allow mpirun to execute as root if two
envars are set to specific values
Per conversation with @jsquyres, name the envars OMPI_ALLOW_RUN_AS_ROOT
and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM
Fixes#4451
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Since version hwloc 2.0.0 has a new organization of NUMA nodes on the
topology tree. This commit adds the detection of local NUMA object for
hwloc => 2.0.0, which fixes the procs bindings policy for rmaps mindist
component.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>