* The user can set `-mca odls_base_sigkill_timeout 30` to have ORTE wait
30 seconds before sending SIGTERM, then another 30 seconds before sending
SIGKILL to the remaining processes. This usually happens on an abnormal
termination. Sometimes the user wants to delay the cleanup to give the
system time to write out a core file or run other diagnostics.
* The problem is that child processes may complete while ORTE is
in this loop. The resulting SIGCHLD interrupts the `sleep` call.
Without the loop, the sleep could effectively be ignored in this case.
- `sleep` returns the amount of time remaining to sleep. If it was
interrupted by a signal, it returns a positive number less than or
equal to the parameter passed to it. If it slept the whole time,
it returns 0.
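The retry pattern described above can be sketched as follows. This is a minimal illustration rather than the ORTE code itself; the helper name `example_sleep_full` is hypothetical.

```c
#include <unistd.h>

/* Hypothetical helper: sleep for the whole interval even if signals
 * such as SIGCHLD interrupt sleep() along the way. */
static unsigned int example_sleep_full(unsigned int seconds)
{
    unsigned int remaining = seconds;

    /* sleep() returns 0 after a full sleep, or the number of unslept
     * seconds when a signal handler interrupted it. Retry with the
     * remainder until nothing is left. */
    while (remaining > 0) {
        remaining = sleep(remaining);
    }
    return remaining;
}
```

Calling `example_sleep_full(30)` in place of a bare `sleep(30)` keeps the configured delay intact when children exit during the wait.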
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit 0e8a97c598d841d472047ea1025931813c3ef8a9)
Remove the code for multiple OOB progress threads, as it is an optimization
that nobody uses. It also turns out to have a race condition that can cause
a segfault on finalize, so it may be just as well that nobody is using it.
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 41eb41c3f224abcacec78f4d77c138197c36f172)
(cherry picked from commit a2f35c1834ab2fcb216285621d177a179e33dfe7)
INTERNAL: STL-59403
The OFI (libfabric) MTL does not respect the maximum message size
parameter that OFI provides in the fi_info data.
This patch adds this missing max_msg_size field to the mca_ofi_module_t
structure and adds a length check to the low-level send routines.
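A length check of this kind might look like the sketch below. The struct, the error codes, and the function name are hypothetical stand-ins, not the actual MTL definitions; only the `max_msg_size` field name comes from the description above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the module structure; only the field named
 * in the commit message is shown. */
typedef struct {
    size_t max_msg_size;   /* limit the provider reported in fi_info */
} example_ofi_module_t;

/* Hypothetical return codes for this sketch. */
enum { EXAMPLE_SUCCESS = 0, EXAMPLE_ERR_TOO_LARGE = -1 };

/* Reject a send whose length exceeds the provider's advertised limit. */
static int example_check_send_length(const example_ofi_module_t *module,
                                     size_t length)
{
    if (length > module->max_msg_size) {
        return EXAMPLE_ERR_TOO_LARGE;
    }
    return EXAMPLE_SUCCESS;
}
```

A send routine would call such a check before handing the buffer to libfabric and return the error to the caller instead of posting an oversized message.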
(cherry-picked from commit 3aca4af548a3d781b6b52f89f4d6c7e66d379609)
Change-Id: Ie50445e5edfb0f30916de0836db0edc64ecf7c60
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Reviewed-by: Brendan Cunningham <brendan.cunningham@intel.com>
- Refactoring and newly added functionality broke compilation
of the ikrit module.
- This commit restores compilation.
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 991082abf2da3a76849be021c5f7ecced8052709)
Running processes via mpirun in Podman containers has shown
that the CMA btl_vader_single_copy_mechanism does not work when user
namespaces are involved.
Creating containers with Podman requires at least user namespaces to be
able to do unprivileged mounts in a container.
Even when the containers run with user namespace user ID mappings that
result in the same user ID inside and outside of all involved
containers, the kernel check that allows ptrace (and thus
process_vm_{read,write}v()) fails if the IDs are not in the same
user namespace.
One workaround is to specify '--mca btl_vader_single_copy_mechanism none'.
This commit adds code to automatically skip CMA if user namespaces
are detected and fall back to MCA_BTL_VADER_EMUL.
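One way to detect differing user namespaces on Linux is to compare the inode behind `/proc/self/ns/user` across processes. The sketch below assumes that approach; the function names are hypothetical, and a stricter check would also compare the device ID.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: identify the caller's user namespace by the
 * inode that /proc/self/ns/user resolves to. Returns 0 when the file
 * is unavailable (for example, on a kernel without user namespaces). */
static ino_t example_user_ns_id(void)
{
    struct stat buf;

    if (stat("/proc/self/ns/user", &buf) < 0) {
        return 0;
    }
    return buf.st_ino;
}

/* CMA is only worth attempting when both peers report the same
 * namespace id; otherwise fall back to the emulated path. */
static bool example_can_use_cma(ino_t local_ns, ino_t peer_ns)
{
    return local_ns == peer_ns;
}
```

Each process would publish its own id to its peers, and a mismatch would select the emulated single-copy mechanism for that peer.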
Signed-off-by: Adrian Reber <areber@redhat.com>
(cherry picked from commit fc68d8a90fe86284e9dc730f878b55c0412f01d2)
This commit fixes a compilation error when configured
with `--enable-timing`.
Procedures in the function `orte_ess_base_app_setup`
in `orte/mca/ess/base/ess_base_std_app.c` were moved
to `orte/mca/ess/pmi/ess_pmi_module.c`
and `orte/mca/ess/singleton/ess_singleton_module.c`
in the recent commit 57f6b94fa5.
In `ess_pmi_module.c`, the first argument of the
`OPAL_TIMING_ENV_NEXT` macro should have been adapted
to the destination function but was not.
In `ess_singleton_module.c`, `OPAL_TIMING_ENV_INIT`
was not used in the destination function originally,
so `OPAL_TIMING_ENV_NEXT` cannot be used in that function.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit 8e7d874e14a5485dceff836419e36b6b24a66f48)
- Fix MPIR_Breakpoint standard violation by returning void
instead of a void*.
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
(cherry picked from commit 067adfa417f95396c713f6e6597619fac94f0048)
This commit changes how the single-copy emulation in the vader btl
operates. Before this change the BTL set its put and get limits
based on the max send size. After this change the limits are unset
and the put or get operation is fragmented internally.
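The internal fragmentation can be pictured with the sketch below, which splits one large copy into pieces no larger than a fragment size. It is an illustration only: the function name is hypothetical, and a plain `memcpy` stands in for the BTL's emulated transfer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `size` bytes in pieces of at most
 * `frag_size` bytes and return the number of fragments used.
 * Returns 0 when frag_size is 0 or there is nothing to copy. */
static size_t example_fragmented_copy(void *dst, const void *src,
                                      size_t size, size_t frag_size)
{
    size_t fragments = 0;

    if (0 == frag_size) {
        return 0;
    }

    for (size_t offset = 0; offset < size; ) {
        size_t len = size - offset;
        if (len > frag_size) {
            len = frag_size;   /* clamp to one fragment */
        }
        memcpy((char *) dst + offset, (const char *) src + offset, len);
        offset += len;
        ++fragments;
    }
    return fragments;
}
```

With this shape the caller no longer needs to know the fragment size, which is why the put and get limits can be left unset.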
References #6568
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit ae91b11de2314ab11a9842d9738cd14f8f1e393b)
If both types of interfaces are enabled, don't error out if one of them
isn't able to open listener sockets. Only one interface family may be
available on some machines, but someone might want to build the code to
run more generally.
Refs https://github.com/pmix/prrte/pull/249
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit 06d188ebf3646760f50d4513361b50642af9cec4)
elements that can be merged into a larger UINT1 type.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 82d632278ae5ab4337984d5ef4793f818c4dd437)
This patch fixes the merge of contiguous elements into larger but more
compact datatypes, and allows contiguous elements to have their
blocklen increased instead of the count. The idea is to always maximize
the blocklen, i.e. the contiguous part of the datatype.
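The idea of maximizing the blocklen can be sketched as below. The struct and function are hypothetical simplifications, not the datatype engine's real element descriptor: here `extent` is assumed to be the byte distance between the starts of consecutive blocks.

```c
#include <stddef.h>

/* Hypothetical, simplified element description. */
typedef struct {
    size_t    count;      /* number of blocks */
    size_t    blocklen;   /* basic elements per block */
    ptrdiff_t extent;     /* bytes between the starts of two blocks */
    size_t    elem_size;  /* size in bytes of one basic element */
} example_elem_t;

/* When the blocks sit back to back (the extent equals the bytes in one
 * block), fold the count into the blocklen so the element describes a
 * single, larger contiguous block. */
static void example_maximize_blocklen(example_elem_t *e)
{
    if (e->count > 1 &&
        e->extent == (ptrdiff_t) (e->blocklen * e->elem_size)) {
        e->blocklen *= e->count;
        e->extent   *= (ptrdiff_t) e->count;
        e->count     = 1;
    }
}
```

For example, 4 blocks of 2 four-byte elements with an 8-byte extent become 1 block of 8 elements, while the same blocks with a 16-byte extent are left unchanged because there are gaps between them.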
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 41e6f55807b01ad5c04e8387a3699cf743931f6a)
* Fix #6618
- See comments on Issue #6618 for finer details.
* The `plm/rsh` component uses the highest priority `routed` component
to construct the launch tree. The remote orteds will activate all
available `routed` components when updating routes. This makes it
possible for the parent vpid on the remote `orted` to not match
the one expected in the tree launch. The result is that the
remote orted tries to contact its parent with the wrong contact
information and orted wireup fails.
* This fix forces the orteds to use the same `routed` component as
the HNP used when constructing the tree, if tree launch is enabled.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
These variables were renamed in
904276bb44caec207638247f23139bc21bc6a09e; update them to use the new
names.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 2ab8109be15a7739caa72ec8f863e8e01c2c9a0f)
open-mpi/ompi@0fe756d416 introduced
a bug in the coll/hcoll component. The ompi_requests allocated by
libhcoll would be treated as coll_base_nbc_request during the
ompi_coll_base_retain_<> call. This would later lead to a
segv in the request cleanup.
Fix: since the libhcoll interface does not distinguish between
blocking and non-blocking requests, use coll_base_nbc_request all the
time and initialize it properly in
coll/hcoll/get_coll_handle(). It is still within 2 cache lines.
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>