simply based on some local state. This is the second
part of the patch proposed for open-mpi/ompi#1183.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
It turns out that the approach of having the HNP do the
fork/exec of MPI ranks on the head node in a SLURM environment
introduces problems when users/sysadmins want to use the SLURM
scancel tool or the sbatch --signal option to signal a job.
This commit disables use of the HNP fork/exec procedure when
a job is launched into a SLURM controlled allocation.
Update NEWS with a blurb about the new ras framework MCA parameter.
Related to #3998
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
This removes a copy-and-paste error where we were setting
OPAL_ASM_SYNC_HAVE_64BIT more than once.
References #3993. Close when on master and v3.0.x.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
There was a typo in the shmem_g C11 generic interface
in shmem.h.in.
Thanks to @nspark for reporting the problem and
specifying the fix.
Fixes #3968
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
Ignore errors caused by remote side having exited when closing CUDA IPC mappings.
open-mpi/ompi#3244
Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com>
The `ompi_comm_set` function never sets its first argument `ncomm`
to `NULL`, so the `NULL` check in its callers is unnecessary.
Furthermore, if the variable is initialized to `NULL`, the `NULL`
check may obscure the real return code when an error occurs.
Also, a `NULL` check is added inside the `ompi_comm_set` function to
avoid a segmentation fault in an out-of-memory condition.
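
The masking effect described above can be sketched as follows. This is
an illustration only: every `sketch_` name and error value is
hypothetical and is not the Open MPI API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real Open MPI types or error codes. */
enum {
    SKETCH_SUCCESS             =  0,
    SKETCH_ERR_OUT_OF_RESOURCE = -2,
    SKETCH_ERR_BAD_PARAM       = -5
};

typedef struct { int size; } sketch_comm_t;

/* Callee: on success *ncomm is always non-NULL; on failure it is left
 * untouched. The allocation check lives here, not in the callers. */
static int sketch_comm_set(sketch_comm_t **ncomm, int size)
{
    assert(NULL != ncomm);
    if (size <= 0) {
        return SKETCH_ERR_BAD_PARAM;
    }
    sketch_comm_t *c = malloc(sizeof(*c));
    if (NULL == c) {
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    c->size = size;
    *ncomm = c;
    return SKETCH_SUCCESS;
}

/* Old-style caller: the variable starts as NULL, so on any failure the
 * NULL test fires first and replaces the real return code. */
static int sketch_caller_with_null_check(int size)
{
    sketch_comm_t *newcomm = NULL;
    int rc = sketch_comm_set(&newcomm, size);
    if (NULL == newcomm) {
        return SKETCH_ERR_OUT_OF_RESOURCE;   /* masks SKETCH_ERR_BAD_PARAM */
    }
    free(newcomm);
    return rc;
}

/* Caller that trusts the return code and propagates it unchanged. */
static int sketch_caller_rc_only(int size)
{
    sketch_comm_t *newcomm = NULL;
    int rc = sketch_comm_set(&newcomm, size);
    if (SKETCH_SUCCESS != rc) {
        return rc;
    }
    free(newcomm);
    return SKETCH_SUCCESS;
}
```

With an invalid size, the first caller reports an out-of-resource error
while the second reports the bad-parameter code the callee returned.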
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Per discussion at the Summer 2017 developers meeting, generate
the AUTHORS list at make dist time, rather than trying to
keep it up to date and merge on the branches by hand. While
most of the data is generated from git, the organization list
was maintained by hand. The general feeling at the meeting was
that the organization list was not adding value and there were
concrete cases where it involved much chasing by the RMs, so
it has been removed.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
This changes the default to 0 to avoid yields during progress under srun.
Under mpirun, ompi_mpi_yield_when_idle is set to 1 when oversubscribed
and to 0 otherwise. The default, however, was 1, and srun uses the default.
Now srun and mpirun have the same latency in non-oversubscribed cases.
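
As a rough illustration of why the flag affects latency, here is a
minimal sketch of a polling loop that optionally yields the CPU when no
event is pending. The `sketch_` names and the fake poll function are
hypothetical; this is not the Open MPI progress engine.

```c
#include <sched.h>
#include <stdbool.h>

/* Hypothetical event source: reports an event on the third poll. */
static int sketch_poll_calls;

static int sketch_poll_once(void)
{
    return (++sketch_poll_calls >= 3) ? 1 : 0;
}

/* Spin until an event arrives. When yield_when_idle is true, give up
 * the CPU after every empty poll; this helps oversubscribed nodes but
 * adds scheduling delay otherwise. Returns the number of yields. */
static int sketch_progress_until_event(bool yield_when_idle)
{
    int yields = 0;
    while (0 == sketch_poll_once()) {
        if (yield_when_idle) {
            sched_yield();   /* POSIX call declared in <sched.h> */
            ++yields;
        }
    }
    return yields;
}
```

With yielding disabled the loop never leaves the CPU between polls,
which is the behavior wanted when each rank has its own core.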
Signed-off-by: Piotr Lesnicki <piotr.lesnicki@atos.net>
This fix is related to issue #1877, and prevents the OMPI library from
interfering with user-level random values.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Update to support passing of HWLOC shmem topology to client procs
Update use of distance API per @bgoglin
Have the openib component look up its object in the distance matrix
Bring usnic up-to-date
Restore binding for hwloc2
Signed-off-by: Ralph Castain <rhc@open-mpi.org>