Update the OPAL glue configure code to correctly link the opal/pmix3
component to the hwloc used by OMPI instead of defaulting to the
system-level hwloc. Required a corresponding update to the PMIx hwloc
configure code so we treat hwloc the same way we handle libevent in
embedded scenarios. Roll to PMIx v3.1.2 for plugging of memory leaks and
addition of faster PMIx_Get response
Signed-off-by: Ralph Castain <rhc@pmix.org>
clang 5.0 on trusty is busted with respect to C11 atomics
This can be evidenced with the simple program below.
This test was added into OPAL_PROG_CC_C11_HELPER() and disable
C11 atomics if it fails.
_Atomic uint32_t a;
uint32_t b;
atomic_fetch_xor_explicit(&a, b, memory_order_relaxed);
Refs. open-mpi/ompi#6264
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@d1fadebc65)
Valgrind warns that *newtype is uninitialized when calling from
Fortran as e.g.
use mpi
integer :: t, err
call MPI_Type_create_f90_integer(5, t, err)
Since newtype is intent(out), this should not happen. There is
no reason to convert the type using PMPI_Type_f2c, only to over-
write it immediately afterwards. The other type_create_* functions
did not convert newtype.
The valgrind warnings:
==28441== Conditional jump or move depends on uninitialised value(s)
==28441== at 0x581B555: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [...])
==28441== by 0x400C46: main (in [...])
==28441==
==28441== Conditional jump or move depends on uninitialised value(s)
==28441== at 0x581B563: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [..])
==28441== by 0x400C46: main (in [...])
==28441==
==28441== Use of uninitialised value of size 8
==28441== at 0x581B577: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [...])
==28441== by 0x400C46: main (in [...])
==28441==
Signed-off-by: Risto Toijala <risto.toijala@gmail.com>
(cherry picked from commit f14a0f4fc9)
This commit fixes a bug where add_procs can incorrectly return an
error when going through the dynamic add_procs path. This doesn't
happen normally, only when pml/ob1 is not in use.
References #6201
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 30b8336cb4)
The PMIX_MODEX and PMIX_INFO_ARRAY macros were removed from the PMIx 3.1 standard.
Open MPI does not really need them (they are only used to be reported as not supported),
so smply #ifdef protect them to support an external PMIx v3.1
The change only need to be done in ext3x/ext3x.c.
But since this file is automatically generated from pmix3x/pmix3x.c, we have to update
the latter file.
Refs. open-mpi/ompi#6247
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(back-ported from commit open-mpi/ompi@950ba16aa1)
Fixes scoll_basic failures with shmem_verifier, caused by recent changes
in handling of zero-size collectives.
- Check for zero-size length only for fixed size collect (shmem_fcollect),
but not for variable-size collect (shmem_collect)
- Add 'nlong_type' parameter to internal broadcast function, to indicate
whether the 'nlong' parameter is valid on non-root PEs, since it's
used by shmem_collect algorithm. Before this change, some components
assumed it's true (scoll_mpi) while others assumed it's false
(scoll_basic).
- In scoll_basic, if nlong_type==false, do not exit if nlong==0, since
this parameter may not be the same on all PEs.
- In scoll_mpi, fallback to scoll_basic if nlong_type==false, since MPI
requires the 'count' argument of MPI_Bcast to be valid on all ranks.
(Picked from master 939162e)
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
This is a cherry-pick of master (2820aef). The propagation is intended to resolve issue #6130
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Adding the implementations of the functions that were removed
from the MPI standard to the build list, regardless of the
state of the OMPI_ENABLE_MPI1_COMPAT.
According to the README, we want the OMPI_ENABLE_MPI1_COMPAT
configure flag to control which MPI prototypes are exposed in
mpi.h, NOT, which are built into the mpi library. Those will
remain in the mpi library until a future major release (5.0?)
NOTE: for the Fortran implementations, we instead define
OMPI_OMIT_MPI1_COMPAT_DECLS to 0 instead of
OMPI_ENABLE_MPI1_COMPAT to 1. I'm not sure why, but
this seems to work correctly.
Also changing the removed MPI_Errhandler_create implementation
to use the non removed MPI_Comm_errhandler_function prototype
(prototype remains unchanged from MPI_Comm_errhandler_fn)
NOTE: This commit is *NOT* a cherry-pick from master, because
on master, we are no longer building those symbols by
default, but on v4.0.x we _ARE_ still building these
symbols by default. This is because the v4.0.x branch
is to remain backwards compatible with v3.0.x, while at
the same time removing the "removed" symbols from mpi.h
(unless the user configures with --enable-mpi1-compatibility)
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
This commit fixes edge cases of `r = 38` and `r = 308`.
As defined in the MPI standard, `TYPE_CREATE_F90_REAL` and
`TYPE_CREATE_F90_COMPLEX` must be consistent with the Fortran
`SELECTED_REAL_KIND` function. The `SELECTED_REAL_KIND` function is
defined based on the `RANGE` function. The `RANGE` function returns
`INT(MIN(LOG10(HUGE(X)), -LOG10(TINY(X))))` for a real value `X`.
The old code considers only `INT(LOG10(HUGE(X)))` using `*_MAX_10_EXP`.
This commit adds `INT(-LOG10(TINY(X)))` part using `*_MIN_10_EXP`.
This bug affected the following `p`-`r` combinations.
| p | r | expected | returned | expected | returned |
| :------------ | --: | :-------- | :-------- | :------- | :-------- |
| MPI_UNDEFINED | 38 | REAL8 | REAL4 | COMPLEX16 | COMPLEX8 |
| 0 <= p <= 6 | 38 | REAL8 | REAL4 | COMPLEX16 | COMPLEX8 |
| MPI_UNDEFINED | 308 | REAL16 | REAL8 | COMPLEX32 | COMPLEX16 |
| 0 <= p <= 15 | 308 | REAL16 | REAL8 | COMPLEX32 | COMPLEX16 |
MPICH returns the same result as Open MPI with this fix.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit 6fb01f64fe)
Ensure that MPI extensions with mpif.h bindings have names that are
<=26 characters long. 26 is the magic number that still allows us to
have an "include ..." line in the user-facing mpif-ext.h header file
that includes this extension's header file without going over 72
characters.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@c0faf34855)
Per discussion on https://github.com/open-mpi/ompi/pull/6030
and https://github.com/open-mpi/ompi/pull/6145, move
around where MPI extension header files are installed (specifically:
the installation tree path does not need to match the source tree
path).
For reference, header files were installed like this :
- <prefix>/include/openmpi/ompi/mpiext/pcollreq/mpif-h/mpiext_pcollreq_mpifh.h
- <prefix>/include/openmpi/ompi/mpiext/pcollreq/c/mpiext_pcollreq_c.h
and they are now installed like this :
- <prefix>/include/openmpi/mpiext/mpiext_pcollreq_mpifh.h
- <prefix>/include/openmpi/mpiext/mpiext_pcollreq_c.h
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@975e3cd0c9)
remove whitespace around '=' when setting btl_uct_LIBS
Thanks Ake Sandgren for reporting this
Refs. open-mpi/ompi#6173
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@b89deeb1bb)
The intention of lowering the priority when all processes are local
was to favor Vader BTL. However, in builds including the OFI MTL it
gets selected instead.
Reviewed-by: Spruit, Neil R <neil.r.spruit@intel.com>
Reviewed-by: Gopalakrishnan, Aravind <aravind.gopalakrishnan@intel.com>
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
(cherry picked from commit fc8582c560)
Use the PRIsize_t macro to correctly print a size_t
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@78aa6fdd1d)
AC_CHECK_DECLS take a comma separated list of macros/symbols,
so replace the whitespace separator with a comma.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@b715dd2657)
Use PMIX_* macros instead of OPAL_* macros
master does things differently, so this is a one-off commit
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Though not a recommended configuration it is possible to use Open MPI
over UCX over uGNI. This configuration had some issues related to the
connection management and tl selection. This commit fixes those
issues.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit e07a64c52d)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
AC_CHECK_DECLS take a comma separated list of macros/symbols,
so replace the whitespace separator with a comma.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit b715dd2657)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>