Update the OPAL_CHECK_OFI configury macro:
- Make it safe to call the macro multiple times:
- The checks only execute the first time it is invoked
- On subsequent invocations, it just emits a friendly "checking..."
message so that configure output is sensible/logical
- With the goal of ultimately removing opal/mca/common/ofi, rename the
output variables from OPAL_CHECK_OFI to be
opal_ofi_{happy|CPPFLAGS|LDFLAGS|LIBS}.
- Update btl/ofi, btl/usnic, and mtl/ofi for these new conventions.
- Also, don't use AC_REQUIRE to invoke OPAL_CHECK_OFI because that
causes the macro to be invoked at a fairly random time, which makes
configure stdout confusing / hard to grok.
- Remove a little left-over cruft in OPAL_CHECK_OFI, too (which
resulted in an indenting change, making the change to
opal_check_ofi.m4 look larger than it really is).
Thanks Alastair McKinstry for the report and initial fix.
Thanks Rashika Kheria for the reminder.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Now that all use of libibverbs is gone from Open MPI, and all
verbs-based configury is also removed, update README to remove all
references to --with-verbs.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Now that all components that use libibverbs are gone, remove
OPAL_CHECK_VERBS and the confusingly-named OPAL_CHECK_OPENFABRICS
(which really just checked for verbs things -- not all the possible
OpenFabrics APIs/libraries).
The only code left in Open MPI that calls verbs is hwloc -- and that's
just the API that takes an IBV device and returns topological
information about it. Since nothing in the Open MPI code base uses
the "ibv_*" API any more, we have no need for this hwloc functionality
so we'll even remove the --with-verbs configure options.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The verbs and verbs_usnic components are now no longer necessary / no
longer used anywhere in the code base.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
So long sshmem/verbs! After many years of (mostly) faithful service,
it is time to remove the sshmem verbs component. It has been fully
replaced by other components, such as the UCX PML and OFI MTL.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
So long BTL openib! After many years of (mostly) faithful service, it
is time to remove the openib BTL. It has been fully replaced by other
components, such as the UCX PML and OFI MTL.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
PMIx is removing the --enable-embedded-libevent and
--enable-embedded-hwloc flags as they are confusing users. Instead, we
will use the --enable-embedded-mode to handle both of these options.
Update the embedded configury to handle it.
Signed-off-by: Ralph Castain <rhc@pmix.org>
It doesn't seem like the BTL was using an uninitialized pointer. But simply
setting the rcache pointer to NULL after destroying it makes the valgrind
errors go away.
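
The change described above follows the common destroy-then-clear idiom. A minimal sketch of that idiom, using hypothetical stand-in names (`sketch_rcache_t`, `sketch_module_t`, `sketch_module_finalize`) rather than the actual BTL structures, which are not shown in this message:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real Open MPI types. */
typedef struct sketch_rcache {
    int entries;
} sketch_rcache_t;

typedef struct sketch_module {
    sketch_rcache_t *rcache;
} sketch_module_t;

/* Allocate a registration cache for the module. Returns 0 on success. */
static int sketch_module_init(sketch_module_t *module)
{
    assert(NULL != module);
    module->rcache = calloc(1, sizeof(*module->rcache));
    return (NULL != module->rcache) ? 0 : -1;
}

/* Destroy the cache and clear the pointer so that a later teardown path
 * sees NULL instead of a dangling pointer. Safe to call more than once. */
static void sketch_module_finalize(sketch_module_t *module)
{
    assert(NULL != module);
    if (NULL != module->rcache) {
        free(module->rcache);
        module->rcache = NULL;
    }
}
```

Because the pointer is cleared, calling `sketch_module_finalize()` twice is harmless in this sketch; without the assignment, the second call would free the same memory again.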
Fixes Issue #6345
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
`short float` support of the Intel C++ Compiler (group of C and C++
compilers), at least versions 18.0 and 19.0, is half-baked. It can
compile declarations of `short float` variables and expressions of
`sizeof(short float)` but cannot compile operations of `short float`
variables. In this situation, `AC_CHECK_TYPES(short float)` defines
`HAVE_SHORT_FLOAT` as 1 and compilation errors occur in
`ompi/mca/op/base/op_base_functions.c`. To avoid this error
tentatively, we disable `short float` support when using the Intel
C++ Compiler.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
`MPIX_C_FLOAT16` is defined as a synonym for `MPIX_SHORT_FLOAT`
if the C compiler supports `_Float16`, which is defined in
ISO/IEC JTC 1/SC 22/WG 14 N1945 (ISO/IEC TS 18661-3:2015).
The name and meaning are the same as in MPICH. This may be
a transitional datatype until the MPI Forum decides a proper
name for the type.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
This extension provides additional MPI datatypes `MPIX_SHORT_FLOAT`,
`MPIX_C_SHORT_FLOAT_COMPLEX`, and `MPIX_CXX_SHORT_FLOAT_COMPLEX`
for `short float` (C/C++), `short float _Complex` (C), and
`std::complex<short float>` (C++), respectively, or their alternate
types like `_Float16`.
See `ompi/mpiext/shortfloat/README.txt` for details.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
... and add `MPI_COMPLEX4`.
This commit changes values of existing `OMPI_DATATYPE_MPI_*` macros.
This change does not affect ABI compatibility of `libmpi.so` and the
like because these values are only used in OMPI internal code.
On the other hand, `ompi_datatype_t::id` values of existing datatypes
are not changed, and 73 is newly assigned to `MPI_COMPLEX4` to
retain ABI compatibility.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
... and `ompi_mpi_c_short_float_complex` and `ompi_mpi_cxx_sfltcplex`.
These are Open MPI internal variables intended to be defined as
`MPI_SHORT_FLOAT`, `MPI_C_SHORT_FLOAT_COMPLEX`, and
`MPI_CXX_SHORT_FLOAT_COMPLEX` in the future.
`OMPI_DATATYPE_MPI_C_SHORT_FLOAT_COMPLEX` is also required to
support `MPI_COMPLEX4` in the next commit.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
The type `short float`, which is proposed in ISO/IEC JTC 1/SC 22 WG 14
(C WG), is not supported by most compilers yet. But some compilers
(including gcc 7 for AArch64 and clang 6) support `_Float16`, which
is defined in ISO/IEC TS 18661-3:2015 (ISO/IEC JTC 1/SC 22/WG 14 N1945)
as an extension for C. If it is detected in `configure`, it is used
as an alternate type of `short float` in Open MPI internal code.
This commit adds a `configure` option `--enable-alt-short-float=TYPE`.
It can be used to specify a type other than `short float` and `_Float16`
as the alternate type.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
The type `short float` is proposed for the C language in ISO/IEC JTC
1/SC 22 WG 14 (C WG), mainly for IEEE 754-2008 binary16, a.k.a.
half-precision floating point or FP16.
With this commit, `short float` and `short float _Complex` are detected
in `configure` and used in Open MPI internal code. `MPI_SHORT_FLOAT`
and its complex number version are not added yet.
This commit changes values of existing `OPAL_DATATYPE_*` macros.
This change does not affect ABI compatibility of `libmpi.so` and the
like because these values are only used in OPAL and OMPI internal code.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Reset ptypes when cloning a datatype in order to prevent
a double free() in the opal_datatype_t destructor.
This fixes a bug introduced in open-mpi/ompi@7c938f070f
Fixes open-mpi/ompi#6346
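
The double free arises when a clone is a byte-wise copy that still points at the original's heap buffer. A minimal sketch of the reset, assuming a simplified hypothetical struct (`sketch_datatype_t`) in which `ptypes` is an owned, lazily allocated array; the real `opal_datatype_t` has many more fields:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a datatype object. */
typedef struct sketch_datatype {
    size_t size;
    size_t *ptypes; /* owned by this object; freed in the destructor */
} sketch_datatype_t;

/* Copy the description, then reset ptypes so the clone does not share
 * (and later free) the source's buffer. */
static void sketch_datatype_clone(const sketch_datatype_t *src,
                                  sketch_datatype_t *dst)
{
    assert(NULL != src && NULL != dst);
    memcpy(dst, src, sizeof(*dst));
    dst->ptypes = NULL;
}

/* Release the owned buffer; free(NULL) is a no-op. */
static void sketch_datatype_destruct(sketch_datatype_t *dt)
{
    assert(NULL != dt);
    free(dt->ptypes);
    dt->ptypes = NULL;
}
```

With the reset in place, destroying both the clone and the original frees the buffer exactly once.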
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
The issue was a little complicated due to the internal stack used in the
convertor. The main issue was that when we ran out of iov space to save
the raw description of the data while handling a repetition (loop),
instead of saving the current position and bailing out directly, we
kept reading the next predefined type element. It worked in
most cases, except the one identified by the HDF5 test. However, the
biggest issue here was the drop in performance for all ensuing calls to
the convertor pack/unpack, as instead of handling contiguous loops as a
whole (and minimizing the number of memory copies) we copied data
description by data description.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
When compiling mpi.h with a modern C++ compiler and a high degree of
pickiness (e.g., -Wold-style-cast), casting using (void*) in the
OMPI_PREDEFINED_GLOBAL and MPI_STATUS*_IGNORE macros will emit
warnings. So if we're compiling with a C++ compiler, use C++'s
static_cast<> instead of (void*).
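
A minimal sketch of the approach, assuming a hypothetical helper macro `SKETCH_CAST` and stand-in names; the macros in mpi.h are named and structured differently:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pick the cast syntax based on the compiler's
 * language. A C++ compiler sees static_cast<>, a C compiler a C cast. */
#if defined(__cplusplus)
#define SKETCH_CAST(type, value) (static_cast<type>(value))
#else
#define SKETCH_CAST(type, value) ((type) (value))
#endif

/* Stand-ins in the style of a predefined global and an "ignore" sentinel. */
typedef struct sketch_status {
    int source;
    int tag;
} sketch_status_t;

static int sketch_global = 42;

#define SKETCH_GLOBAL_PTR    SKETCH_CAST(void *, &sketch_global)
#define SKETCH_STATUS_IGNORE SKETCH_CAST(sketch_status_t *, 0)

/* Returns nonzero if the caller supplied a real status object. */
static int sketch_wants_status(const sketch_status_t *status)
{
    return status != SKETCH_STATUS_IGNORE;
}

/* Small self-check exercising both macros. */
static void sketch_self_check(void)
{
    sketch_status_t st = { 0, 0 };
    assert(!sketch_wants_status(SKETCH_STATUS_IGNORE));
    assert(sketch_wants_status(&st));
    assert(42 == *SKETCH_CAST(int *, SKETCH_GLOBAL_PTR));
}
```

The same header text then compiles as C and, under a C++ compiler, expands without any old-style casts.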
Thanks to @shadow-fax for identifying the issue.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit fixes a bug introduced in
f62d26ddbc8cda4d985cceee531a2ec32406d1f6. That commit changed how
vader allocates fragment memory from the shared memory
segment. Unfortunately, the values used for the fragment sizes did not
include space for the fragment header. This can cause an overrun of
data from one fragment to the header of the next fragment.
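
The arithmetic behind the overrun can be sketched as follows. The header layout and function names here are hypothetical and only illustrate that the per-fragment reservation must cover the header as well as the payload:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment header; the real vader header differs. */
typedef struct sketch_frag_hdr {
    size_t len;
    int tag;
} sketch_frag_hdr_t;

/* Bytes to reserve in the segment for one fragment that carries up to
 * payload_size bytes: the header plus the payload, not the payload alone. */
static size_t sketch_frag_alloc_size(size_t payload_size)
{
    return sizeof(sketch_frag_hdr_t) + payload_size;
}

/* Offset of the index-th fragment when fragments are packed back to back. */
static size_t sketch_frag_offset(size_t index, size_t payload_size)
{
    return index * sketch_frag_alloc_size(payload_size);
}

/* End offset of a fragment's payload; it must not reach into the header
 * of the following fragment. */
static size_t sketch_frag_payload_end(size_t index, size_t payload_size)
{
    size_t end = sketch_frag_offset(index, payload_size)
                 + sizeof(sketch_frag_hdr_t) + payload_size;
    assert(end <= sketch_frag_offset(index + 1, payload_size));
    return end;
}
```

If `sketch_frag_alloc_size()` returned only `payload_size`, the assertion in `sketch_frag_payload_end()` would fail for any non-empty header, which is the overrun described above.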
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Example:
For the list of hosts `a01,b00,a00`, the generated regex was
`a[2:1.0],b[2:0]`: the `a`-host prefixes were moved to the beginning,
which breaks the host ordering.
This commit fixes the regex for that case to `a[2:1],b[2:0],a[2:0]`
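
One way to preserve ordering is to merge only adjacent hosts that share a prefix and digit width. The following is an illustrative sketch under that assumption, not the actual generator: all function names are hypothetical, every host is assumed to end in digits, and range compression is omitted.

```c
#include <assert.h>
#include <ctype.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Split "a01" into prefix length (1), digit count (2), and value (1). */
static void sketch_split(const char *host, size_t *prefix_len,
                         size_t *digits, long *value)
{
    size_t len = strlen(host);
    size_t start = len;
    while (start > 0 && isdigit((unsigned char) host[start - 1])) {
        --start;
    }
    *prefix_len = start;
    *digits = len - start;
    *value = (*digits > 0) ? strtol(host + start, NULL, 10) : 0;
}

/* Append formatted text to out; returns -1 if it would not fit. */
static int sketch_append(char *out, size_t out_len, size_t *used,
                         const char *fmt, ...)
{
    va_list ap;
    int n;
    va_start(ap, fmt);
    n = vsnprintf(out + *used, out_len - *used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t) n >= out_len - *used) {
        return -1;
    }
    *used += (size_t) n;
    return 0;
}

/* Build the regex, merging only *adjacent* hosts with the same prefix and
 * digit count so the original host order is kept. Returns 0 on success. */
static int sketch_hosts_to_regex(const char *const *hosts, size_t nhosts,
                                 char *out, size_t out_len)
{
    size_t used = 0, i = 0;
    assert(NULL != out && out_len > 0);
    out[0] = '\0';
    while (i < nhosts) {
        size_t plen, digits, j;
        long value;
        sketch_split(hosts[i], &plen, &digits, &value);
        if (0 == digits) {
            return -1; /* this sketch assumes numeric suffixes */
        }
        if (0 != sketch_append(out, out_len, &used, "%s%.*s[%zu:%ld",
                               (i > 0) ? "," : "", (int) plen, hosts[i],
                               digits, value)) {
            return -1;
        }
        for (j = i + 1; j < nhosts; ++j) {
            size_t plen2, digits2;
            long value2;
            sketch_split(hosts[j], &plen2, &digits2, &value2);
            if (plen2 != plen || digits2 != digits
                || 0 != strncmp(hosts[i], hosts[j], plen)) {
                break; /* different group: stop merging here */
            }
            if (0 != sketch_append(out, out_len, &used, ",%ld", value2)) {
                return -1;
            }
        }
        if (0 != sketch_append(out, out_len, &used, "]")) {
            return -1;
        }
        i = j;
    }
    return 0;
}
```

For the host list `a01,b00,a00` this sketch produces `a[2:1],b[2:0],a[2:0]`, because the second `a` host is not adjacent to the first and therefore starts a new group.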
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
treematch/km_partitioning.c includes "config.h",
but there is no such file when the embedded treematch is used.
In order to prevent the embedded treematch from incorrectly using
the config.h from the embedded hwloc, generate a dummy config.h.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Example:
For the nodelist `jjss,jjss0000001,jjss0000003,jjss0000002`, the regular
expression was `jjss[0:0],jjss[7:1,3,2]`, which led to the first host
being incorrectly unpacked as `jjs0`. This commit stops adding an empty
range for non-numeric hostnames. Here is the fixed regex for this example:
`jjss,jjss[7:1,3,2]`
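
The rule for a single host can be sketched as below: a name with no trailing digits is emitted verbatim, and only names with a numeric suffix get a `[digits:value]` range. The function names are hypothetical and this is not the actual regex code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Number of trailing decimal digits in host (0 for a non-numeric name). */
static size_t sketch_suffix_digits(const char *host)
{
    size_t len = strlen(host);
    size_t digits = 0;
    while (digits < len && isdigit((unsigned char) host[len - 1 - digits])) {
        ++digits;
    }
    return digits;
}

/* Write the regex term for one host into out.
 * Returns 0 on success, -1 if the term does not fit. */
static int sketch_host_term(const char *host, char *out, size_t out_len)
{
    size_t digits = sketch_suffix_digits(host);
    size_t prefix_len = strlen(host) - digits;
    int n;

    assert(NULL != out && out_len > 0);
    if (0 == digits) {
        /* Non-numeric hostname: no range at all, just the name. */
        n = snprintf(out, out_len, "%s", host);
    } else {
        n = snprintf(out, out_len, "%.*s[%zu:%ld]", (int) prefix_len, host,
                     digits, strtol(host + prefix_len, NULL, 10));
    }
    return (n < 0 || (size_t) n >= out_len) ? -1 : 0;
}
```

In this sketch `jjss` maps to `jjss` and `jjss0000001` maps to `jjss[7:1]`, so no `[0:0]` range is ever produced for a host without digits.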
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
When we exceed the threshold number of contexts created, print appropriate help
text
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
We missed an assert to check if ALLOW_OVERTAKE is set or not before
validating the sequence number and this will cause deadlock.
Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
This commit reverts PR #6199, as it introduced a deadlock in some cases.
It also removes the assert, as the condition is obsolete.
Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>