Moving to a model where we have users actively _enable_ SEP feature for use
rather than opening SEP by default if provider supports it. This allows us to
not regress (either functionally or for performance reasons) any apps that were
working correctly on regular endpoints.
Also, providing MCA to specify number of OFI contexts to create and default
this value to 1 (Given btl/ofi also creates one by default, this reduces the
incidence of a scenario where we allocate all available contexts by default and
if btl/ofi asks for one more, then provider breaks as it doesn't support it).
While at it, spruce up README on SEP content.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
clang 5.0 on trusty is busted with respect to C11 atomics
This can be evidenced with the simple program below.
This test was added into OPAL_PROG_CC_C11_HELPER() and disable
C11 atomics if it fails.
_Atomic uint32_t a;
uint32_t b;
atomic_fetch_xor_explicit(&a, b, memory_order_relaxed);
Refs. open-mpi/ompi#6264
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Update the OPAL glue configure code to correctly link the opal/pmix4
component to the hwloc used by OMPI instead of defaulting to the
system-level hwloc. Required a corresponding update to the PMIx hwloc
configure code so we treat hwloc the same way we handle libevent in
embedded scenarios.
Signed-off-by: Ralph Castain <rhc@pmix.org>
This commit fixes the ordering of the teardown for
opal_finalize_util. The installdirs and if frameworks need to come
down before the MCA system.
Fixes#6259
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Valgrind warns that *newtype is uninitialized when calling from
Fortran as e.g.
use mpi
integer :: t, err
call MPI_Type_create_f90_integer(5, t, err)
Since newtype is intent(out), this should not happen. There is
no reason to convert the type using PMPI_Type_f2c, only to over-
write it immediately afterwards. The other type_create_* functions
did not convert newtype.
The valgrind warnings:
==28441== Conditional jump or move depends on uninitialised value(s)
==28441== at 0x581B555: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [...])
==28441== by 0x400C46: main (in [...])
==28441==
==28441== Conditional jump or move depends on uninitialised value(s)
==28441== at 0x581B563: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [..])
==28441== by 0x400C46: main (in [...])
==28441==
==28441== Use of uninitialised value of size 8
==28441== at 0x581B577: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0)
==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0)
==28441== by 0x400BA1: MAIN__ (in [...])
==28441== by 0x400C46: main (in [...])
==28441==
Signed-off-by: Risto Toijala <risto.toijala@gmail.com>
The PMIX_MODEX and PMIX_INFO_ARRAY macros were removed from the PMIx 3.1 standard.
Open MPI does not really need them (they are only used to be reported as not supported),
so smply #ifdef protect them to support an external PMIx v3.1
Refs. open-mpi/ompi#6247
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This commit fixes a bug where add_procs can incorrectly return an
error when going through the dynamic add_procs path. This doesn't
happen normally, only when pml/ob1 is not in use.
References #6201
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
be more careful about closing framewworks as part of
orte_finalize. Owing to recent restructuring in opal to handle
finalize in a more general fashion, the missing framework
closes were causing meltdowns as the mca vars subsystem
was cleaning itself up.
This problem was recently reported by Siegmar:
https://www.mail-archive.com/users@lists.open-mpi.org//msg32946.html
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
As pointed out in issue #6186 we have some challenges with
English spelling. Fixing title in the wiki page resulted
in it being renamed. So fix links to the now correctly
spelled title.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Fixes scoll_basic failures with shmem_verifier, caused by recent changes
in handling of zero-size collectives.
- Check for zero-size length only for fixed size collect (shmem_fcollect),
but not for variable-size collect (shmem_collect)
- Add 'nlong_type' parameter to internal broadcast function, to indicate
whether the 'nlong' parameter is valid on non-root PEs, since it's
used by shmem_collect algorithm. Before this change, some components
assumed it's true (scoll_mpi) while others assumed it's false
(scoll_basic).
- In scoll_basic, if nlong_type==false, do not exit if nlong==0, since
this parameter may not be the same on all PEs.
- In scoll_mpi, fallback to scoll_basic if nlong_type==false, since MPI
requires the 'count' argument of MPI_Bcast to be valid on all ranks.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
Allow MPI extensions to generate Fortran headers using Autoconf.
For example, allow following files.
```
ompi/mpiext/example/mpif-h/mpiext_example_mpifh.h.in
ompi/mpiext/example/use-mpi/mpiext_example_usempi.h.in
ompi/mpiext/example/use-mpi-f08/mpiext_example_usempif08.h.in
```
Generated MPI extension C headers are already allowed in commit
6a7d5271c4.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
If the default value of `ofc_type_size` is `$ac_cv_sizeof_int`,
`OMPI_SIZEOF_FORTRAN_*` of all unavailable types become `sizeof(int)`.
This leads `OMPI_SIZEOF_FORTRAN_REAL2 == OMPI_SIZEOF_FORTRAN_REAL`
to become true unintentionally and `OMPI_DATATYPE_MPI_REAL2` has a
wrong value in `ompi/datatype/ompi_datatype_internal.h`. This is not
an actual bug because datatypes for unavailable types are not used.
However it is confusing. I looked the source tree and the history but
could find any basis of `$ac_cv_sizeof_int`.
If we don't use `implicit none` in `OMPI_FORTRAN_GET_KIND_VALUE`, and
if a Fortran compiler does not support `ISO_C_BINDING` completely,
a random value is set in `value` and the fallback route is not used.
It is not our intention.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
opal_string_copy() takes care of all the string computations.
Specifically: when we converted to opal_string_copy(), we accidentally
left the *source* length as the argument, not the *target* length,
which resulted in one less character being copied than intended (as
was showing up in MTT C++ testing results).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>