Commit Graph

1542 Commits

Author SHA1 Message Date
98c9b486d3 Build rules for ZOSRV "Neutrino", 2020 edition 2023-02-01 17:12:57 +03:00
Jeff Squyres
Merge pull request #8254 from gleon99/v4.1.x
Replace usage of the deprecated NB API of UCX with NBX
2020-12-07 15:33:05 -05:00
Leonid Genkin
0a819bff1a Replace usage of the deprecated NB API of UCX with NBX
Signed-off-by: Leonid Genkin <lgenkin@nvidia.com>
(cherry picked from commit 7f9a305a64)
2020-11-25 16:42:05 +02:00
Gilles Gouaillardet
33aa6394d9 configury: fix OPAL_GET_VERSION
- fix path to getdate.sh
- do not prepend "date" to the revision
- support git worktree

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 930d3c4695)
2020-11-22 21:07:42 +09:00
Nikola Dancejic
3f863aab8a v4.1.x: Using package_rank to select between NIC of equal distance from the process.
If PMIX_PACKAGE_RANK is available, uses this value to select between multiple
NIC of equal distance between the current process. If this value is not
available, try to calculate it by getting the locality string from each local
process and assign a package_rank. If everything fails, fall back to using
process_id.rank to select the NIC. This last case is not ideal, but has a small
chance of occurring, and causes an output to be displayed to notify that this is

Some of the information in master branch is not available for the multi-NIC
patch, such as myprocinfo.rank. This info is used to select between multiple
NIC of equal distance to the process. This adapts the previous commit to work
with the v4.1.x branch.

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit 8017f12801)
2020-11-10 13:05:16 -08:00
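The fallback order described in this commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the Open MPI C code: the function names, the shape of the locality data, and the modulo-based pick among equally distant NICs are all assumptions made for the example.

```python
from typing import Dict, Optional, Sequence


def compute_package_rank(my_rank: int,
                         local_ranks: Sequence[int],
                         localities: Dict[int, str]) -> Optional[int]:
    """Position of this process among local peers with the same locality.

    `localities` maps a local process rank to a locality string
    (hypothetical format). Returns None when it cannot be computed.
    """
    mine = localities.get(my_rank)
    if mine is None:
        return None
    peers = sorted(r for r in local_ranks if localities.get(r) == mine)
    if my_rank not in peers:
        return None
    return peers.index(my_rank)


def select_nic(nics: Sequence[str],
               my_rank: int,
               pmix_package_rank: Optional[int] = None,
               local_ranks: Sequence[int] = (),
               localities: Optional[Dict[int, str]] = None) -> str:
    """Pick one NIC out of several that are equally close to the process."""
    if not nics:
        raise ValueError("no candidate NICs")
    # 1. Prefer the package rank supplied by the runtime, if any.
    rank = pmix_package_rank
    # 2. Otherwise try to derive it from the local peers' locality strings.
    if rank is None and localities:
        rank = compute_package_rank(my_rank, local_ranks, localities)
    # 3. Last resort: fall back to the plain process rank.
    if rank is None:
        rank = my_rank
    # Assumption: spread processes over the NICs round-robin.
    return nics[rank % len(nics)]
```

With two equally distant NICs, `select_nic(["nic0", "nic1"], my_rank=5, pmix_package_rank=1)` picks the second NIC, while a call with no package rank and no locality data falls through to the process rank.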
Jeff Squyres
ab86c2793b opal_functions.m4: add comment
No code or logic changes.

Add comment about why it's ok to use $srcdir here

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit a6a0d511f9)
2020-11-05 12:16:11 -08:00
Jeff Squyres
c85d591b51 config/Makefile.am: ensure getdate.sh is in dist tarball
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 91a5af83cd)
2020-11-05 12:16:08 -08:00
Jeff Squyres
e7f829bbb0 getdate.sh: make the date(1) usage more portable
There are several different flavors of date(1) out there.  Try a few
different CLI options for date(1) to see which one works.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 89920bac4c)
2020-10-29 08:17:26 -07:00
Gilles Gouaillardet
35e7d86eb1 configury: make build Reproducible
If defined, use SOURCE_DATE_EPOCH environment variable; make the build
Reproducible by forcing timestamps.  See
https://reproducible-builds.org/docs/source-date-epoch/ for more

Thanks Bernhard M. Wiedemann for bringing this to our attention.

Fixes open-mpi/ompi#3759

**NOTE:** This was cherry-picked from master, and slightly modified /
  amended for the v4.1.x branch.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 7b4e8ba4aa)
2020-10-29 08:17:26 -07:00
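The idea behind SOURCE_DATE_EPOCH can be sketched in a few lines. This is a minimal Python illustration of the convention, not the shell/m4 logic used in the Open MPI configury; the function name and the output format are assumptions chosen for the example.

```python
import os
import time
from datetime import datetime, timezone
from typing import Mapping, Optional


def build_timestamp(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the timestamp to embed in build artifacts.

    If SOURCE_DATE_EPOCH is set (seconds since the Unix epoch), use it so
    that repeated builds of the same source embed the same date; otherwise
    fall back to the current time.
    """
    env = os.environ if environ is None else environ
    epoch = env.get("SOURCE_DATE_EPOCH")
    seconds = int(epoch) if epoch else int(time.time())
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Passing the environment as a parameter keeps the function easy to test: two calls with the same SOURCE_DATE_EPOCH value always return the same string, which is the property a reproducible build relies on.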
Jeff Squyres
fa3211a7a6 opal_functions.m4: remove redundant code
This code was invoked twice.  Leave it solely in OPAL_CONFIGURE_SETUP,
which is invoked before OPAL_BASIC_SETUP.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 7c36b45847)
2020-10-29 08:17:26 -07:00
Jeff Squyres
a4e8655fbe opal_get_version.m4: properly quote dir args
Make sure to surround directory variables with quotes so that they
function properly, even if there are spaces in the directory name.

While Open MPI doesn't generally support directory names with spaces,
this fix at least allows `autogen.pl` to complete successfully.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 8f13c3b587)
2020-10-24 10:32:40 -07:00
Brian Barrett
Merge pull request #8094 from dancejic/include-v4.1.x
v4.1.x: Adding ofi include to CPPFLAGS so that configure can check fabric.h
2020-10-21 07:41:54 -07:00
Gilles Gouaillardet
af54e86670 fortran.m4: reword error message when sizeof(int) != sizeof(INTEGER)
Reword the error message to suggest only the Fortran INTEGER size
can be changed via adhoc compiler flags.

This is a one-off commit for the release branches, master does
it differently (and breaks ABI).

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-10-19 10:06:28 +09:00
Nikola Dancejic
787f4f55b1 Adding ofi include to CPPFLAGS so that configure is able to check fabric.h
configure was previously failing to check for the fi_info.nic struct because
fabric.h relied on other header files in the ofi/include dir. This adds that
include to CPPFLAGS before running that check so that configure can check for
the struct.

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit 08e8205fb7)
2020-10-15 13:11:46 -07:00
Howard Pritchard
c4128b21a0 suppress icc long double message
improve configury to check whether icc is handling no long double.
This prevents seeing 100s of messages like this:

icc: command line warning #10148: option '-Wno-long-double' not supported

A similar patch will be needed for pmix.

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
(cherry picked from commit 6df0e53421)
2020-08-27 16:04:50 +00:00
Christoph Niethammer
e0bd64f843 Fix memory leak in configure, which prevents leak sanitizer usage
If building Open MPI with sanitizers, e.g.
$ configure CC=clang CFLAGS=-fsanitize=address ....
configure test programs are also built with the sanitizers and will
report errors, causing configure to fail.

Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
2020-07-22 14:49:10 +02:00
b4e04bbd8a Add support for MPI_OP using AVX512, AVX2 and MMX
Add logic to handle different architectural capabilities
Detect the compiler flags necessary to build specialized
versions of the MPI_OP. Once the different flavors (AVX512,
AVX2, AVX) are built, detect at runtime which is the best
match with the current processor capabilities.

Add validation checks for loadu 256 and 512 bits.
Add validation tests for MPI_Op.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: dongzhong <zhongdong0321@hotmail.com>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 14b3c70628)
2020-07-13 13:49:00 -07:00
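The build-time/run-time split described in this commit message can be sketched as follows. This is a hedged Python illustration, not the actual component code: the flavor names, the preference order, and the "generic" fallback are assumptions made for the example.

```python
from typing import Iterable

# Assumption: wider vector extensions are preferred when available.
PREFERENCE = ("avx512", "avx2", "avx")


def pick_op_flavor(built: Iterable[str], cpu_supported: Iterable[str]) -> str:
    """Choose the best MPI_Op flavor usable on the current processor.

    `built` is the set of flavors the compiler could produce at build time;
    `cpu_supported` is what the processor reports at run time. A flavor is
    usable only if it appears in both.
    """
    built_set = set(built)
    cpu_set = set(cpu_supported)
    for flavor in PREFERENCE:
        if flavor in built_set and flavor in cpu_set:
            return flavor
    # Hypothetical fallback: the plain, non-vectorized implementation.
    return "generic"
```

For example, a binary built with all three flavors but run on a processor without AVX512 would select "avx2" under these assumptions.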
Jeff Squyres
94922937c2 fortran.m4: disallow when sizeof(int) != sizeof(INTEGER)
NOTE: This is intentionally not a cherry pick from master.  Instead,
this is a cherry-pick from the equivalent commit on the v4.0.x branch.
See below.

There is a problem with the mpi_f08 module when sizeof(int) !=
sizeof(INTEGER): the size of TYPE(MPI_Status) is too small.  This
causes buffer overruns when Open MPI is configured with (for example)
sizeof(int)==4 and sizeof(INTEGER)==8, and then you call the mpi_f08
MPI_RECV subroutine.  This will end up copying the resulting C
MPI_Status to the buffer pointing to the Fortran status, but the code
does not know if the Fortran status is an mpif.h status or a
TYPE(MPI_Status) -- it just blindly copies over as if the Fortran
status is an INTEGER array of length MPI_STATUS_SIZE.  Unfortunately,
TYPE(MPI_Status) is actually smaller than this, so we overrun the
buffer.  Hilarity ensues.

The simple fix for this is to make TYPE(MPI_Status) the same size as
INTEGER(MPI_STATUS_SIZE), but we can't do that here on the release
branch because it will break ABI.

This commit does the following:

- checks to see if we're in a sizeof(int) != sizeof(INTEGER) scenario
- if so, if the user has not specifically excluded building the
  mpi_f08 module, display a Giant Error Message (GEM) and abort

This is unusual; we don't usually abort configure when feature XYZ
can't be built -- if the user didn't specifically ask for XYZ, we
just emit a notice that we won't build XYZ and continue.

This situation is a little different because we're on a release
branch: prior releases have built mpi_f08 by default -- even in this
"bad" scenario.  Hence, in this case, we explicitly tell the user that
this is now a known-bad scenario and abort.  In the GEM, we give the
user two options:

1. Change their compiler flags so that sizeof(int) == sizeof(INTEGER)
   and re-run configure, or
2. Explicitly disable the mpi_f08 module via --enable-mpi-fortran=usempi

Thanks to @ahaichen for reporting the issue.

Note: the proper fix has been implemented on master (i.e., what will
become v5.0.0), but since that breaks ABI, we can't cherry pick it
back here to an existing release branch series. Hence, we
cherry-picked this fix from the v4.0.x branch.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 27836a614b9c29d7636cdf1a9b838b1532281a8a)
2020-07-10 14:39:34 -07:00
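The decision this commit adds to configure can be summarized in a small sketch. This is an illustrative Python rendering of the logic described in the message, not the m4 code in fortran.m4; the function name, parameters, and return values are assumptions.

```python
def check_fortran_integer_size(sizeof_int: int,
                               sizeof_integer: int,
                               build_mpi_f08: bool) -> str:
    """Return "ok" to continue configure, or "abort" to stop with an error.

    `build_mpi_f08` is False only when the user explicitly excluded the
    mpi_f08 module (e.g. via --enable-mpi-fortran=usempi).
    """
    if sizeof_int == sizeof_integer:
        # Sizes agree: TYPE(MPI_Status) is large enough, nothing to do.
        return "ok"
    if build_mpi_f08:
        # Known-bad scenario: mpi_f08 would overrun the status buffer.
        return "abort"
    # Sizes differ, but mpi_f08 is not being built, so it is safe.
    return "ok"
```

Under these assumptions, a build with sizeof(int)==4 and sizeof(INTEGER)==8 aborts unless the mpi_f08 module was explicitly disabled.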
6e145188d9 fs/ime & fbtl/ime: Support of IME file system
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2020-06-26 12:26:51 -04:00
Brian Barrett
Merge pull request #7824 from hoopoepg/topic/ucx-test-external-events-v4.1
COMMON/UCX: improved missing events test - v4.1
2020-06-25 08:10:54 -07:00
Jeff Squyres
Merge pull request #7814 from raafatfeki/topic/gpfs_4.1.x
fs/gpfs: Bring over GPFS component from master to v4.1
2020-06-25 10:47:39 -04:00
Jeff Squyres
Merge pull request #7831 from hppritcha/backports/v4.1.x-psm2-updates
Backports/v4.1.x psm2 updates
2020-06-25 10:40:38 -04:00
Nikola Dancejic
c51917675c common/ofi: Fixing compilation issue with ofi versions that do not support fi_info.nic
Added the flag OPAL_OFI_PCI_DATA_AVAILABLE to remove accessing the nic
object in fi_info when the ofi version does not support that structure.

Signed-off-by: Nikola Dancejic dancejic@amazon.com
(cherry picked from commit ae2a447b0e)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Jeff Squyres
ac66f46dc7 mtl/ofi: check for FI_LOCAL_COMM+FI_REMOTE_COMM
Make sure to get an RDM provider that can provide both local and
remote communication.  We need this check because some providers could
be selected via RXD or RXM, but can't provide local communication, for

Add OPAL_CHECK_OFI_VERSION_GE() m4 macro to check that the Libfabric
we're building against is >= a target version.  Use this check in two

1. MTL/OFI: Make sure it is >= v1.5, because the FI_LOCAL_COMM /
   FI_REMOTE_COMM constants were introduced in Libfabric API v1.5.
2. BTL/usnic: It already had similar configury to check for Libfabric
   >= v1.1, but the usnic component was checking for >= v1.3.  So
   update the btl/usnic configury to use the new macro and check for
   >= v1.3.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 21bc9042e1)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
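A "greater than or equal to a target version" check of the kind this commit introduces can be sketched as below. This is a minimal Python illustration of the comparison only; it is not the OPAL_CHECK_OFI_VERSION_GE m4 macro, and the function name and tuple representation are assumptions.

```python
from typing import Tuple


def version_ge(actual: Tuple[int, int], target: Tuple[int, int]) -> bool:
    """True if `actual` (major, minor) is at least `target` (major, minor).

    Comparing integer tuples orders by major first, then minor, and avoids
    the pitfall of string comparison, where "1.10" would sort before "1.5".
    """
    return tuple(actual) >= tuple(target)
```

With this helper, the two checks named in the message would read `version_ge(found, (1, 5))` for MTL/OFI and `version_ge(found, (1, 3))` for BTL/usnic, where `found` is the version of the Libfabric being built against.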
Michael Heinz
b680893917 Add check for PSM2 reference counting to PSM2 MTL #7721
As discussed, a feature is being added to libpsm2 to correctly handle
the case where the library is opened by multiple OMPI transports in the same
process. (For example, the OFI BTL and the PSM2 MTL).

* Improved error message to indicate required libpsm2 version.

* Adds a test at autogen/configure time for the existence of

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
(cherry picked from commit f10305a49f)
2020-06-16 10:38:22 -06:00
Sergey Oblomov
d52b64c488 COMMON/UCX: improved missing events test
- there is a new API to detect missing memory events.
  Enabled use of the new UCX API to detect missing events

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit d6bff6ffbd)
2020-06-16 14:27:02 +03:00
0864b62e12 fs/gpfs: Support of GPFS file system
Creation of gpfs module under fs component.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2020-06-12 12:57:18 -04:00
Howard Pritchard
Merge pull request #7698 from jjhursey/v4-fix-lsf-libevent
Add checks for libevent.so conflict with LSF
2020-05-19 09:14:40 -06:00
Joshua Hursey
76500e6cf8 Fix LSF configure check for libevent conflict
* Want to make sure that the result from `wc` is trimmed of spaces,
   so the `0` check returns properly
 * Add a few more comments, and fix wording in the warning message.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-05-18 15:10:46 -04:00
Joshua Hursey
fc4199e3ba Add checks for libevent.so conflict with LSF
* LSF ships a `libevent.so` that is not related to the `libevent.so`
   shipped with Libevent.
 * Add some checks to the configure logic to detect scenarios where this
   conflict can be detected, and provide the user with a descriptive
   warning message.
   - When detected by `event/external` this is just a warning since
     the internal component may be able to be used instead.
     - This happens when the user supplies the LSF path via the
       `LDFLAGS` envar instead of via `--with-lsf-libdir`.
   - When detected by a LSF component and LSF was explicitly requested
     then this becomes an error. Otherwise it will just print the warning
     and that component will fail to build.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-05-09 14:55:39 -04:00
Sergey Oblomov
7c621acf1b OPAL/UCX: enabling new API provided by UCX
- added detection of new API into configuration
- added tag_send call implemented using new API
- added MPI_Send/MPI_Isend/MPI_Recv/MPI_Irecv implementations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 75bda25ddb)
2020-05-04 10:02:10 +03:00
Howard Pritchard
c43d8e54d6 rework check lustre config to avoid rpath lib64
The original configury check for lustre was ending up rpathing in /usr/lib64 in
the compiler wrapper scripts.  This commit fixes that issue.

related to #7580

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
(cherry picked from commit ea690d008b)
2020-04-24 11:19:14 -06:00
Gilles Gouaillardet
cd03754ac3 configury: fix include path in Lustre detection
use -I$ompi_check_lustre_dir/include in order to correctly support
configure --with-lustre

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 7783e5ad09)
2020-04-24 11:18:57 -06:00
Gilles Gouaillardet
7c435186c8 configury: do fail lustre detection when llapi_file_create() is not found
The result of this test was previously and incorrectly ignored.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 72d3e29084)
2020-04-24 11:18:39 -06:00
Jeff Squyres
bc654354fe Fortran: fix the F90 compiler preprocessor check
Only check if the Fortran compiler needs additional CLI flags for
preprocessing .F90 files if we actually have an F90 compiler.

Also fix the AC_MSG_* usage.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit a7e4ca4dc0)
2020-04-02 16:51:36 -07:00
Gilles Gouaillardet
7938c61752 configury: try if -fpp flag is needed to preprocess .F90 files
.F90 files are preprocessed by gfortran and other compilers.
NAG compilers only preprocess .{ff,ff90,ff95} files, and the -fpp
flag is required to process .F90 files.

Fixes open-mpi/ompi#7583

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit a2c711b54b)
2020-04-01 09:55:35 -07:00
Austen Lauria
ff6b068b93 Fix pgcc18 support.
- pgcc18 defines __GNUC__ similar to Intel compilers. So we must
  check for pgi higher up, or else configury will mistake
  it for gcc.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
(cherry picked from commit 14785deb3c)
2020-02-12 15:11:03 -05:00
Jeff Squyres
fbeebdb9a0 fortran: ensure not to use [AM_]CPPFLAGS
Automake's Fortran compilation rules inexplicably use CPPFLAGS and
AM_CPPFLAGS.  Unfortunately, this can cause problems in some cases
(e.g., picking up already-installed mpi.mod in a system-default
include search path).

So in relevant module-using Fortran compilation Makefile.am's, zero

This has a side-effect of requiring that we compile the one .c file in
the F08 library in a new, separate subdirectory (with its own
Makefile.am that does _not_ have CPPFLAGS/AM_CPPFLAGS zeroed out).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit ab398f4b9a)
2020-02-04 05:15:40 -08:00
Howard Pritchard
Merge pull request #7338 from hppritcha/topic/fix_6539_v4.0.x
Topic/fix 6539 v4.0.x
2020-01-26 12:39:38 -07:00
Howard Pritchard
297505592a Fix a problem with fortran configure test.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-01-24 15:42:00 -06:00
Howard Pritchard
d12e0fdf32 make mpifort obey disable-wrapper-runpath
related to #6539

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 37b3e2f3fa)
2020-01-24 10:47:22 -06:00
Howard Pritchard
9582b76168 cray ftn: modify fortran module loc checker
to support the Cray Fortran compiler.  The Cray Fortran compiler does not
contain all symbol info in the module file, so we have to link with the *.o
created as part of module file compilation.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 441bad9a75)
2020-01-08 13:20:02 -06:00
Howard Pritchard
Merge pull request #7116 from ggouaillardet/topic/v4.0.x/f08_bind_c_constants_revamp
v4.0.x: fortran/use-mpi-f08: revamp mpi_f08 constants
2019-12-13 08:07:42 -07:00
Jeff Squyres
87c0178ed4 opal_check_alps: fix configure output
There was a path where OPAL_CHECK_ALPS would exit its testing but
still leave `opal_check_cray_alps_happy` blank.  Fix that by setting
it to "no".

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 26705efad0)
2019-11-19 05:07:32 -07:00
Gilles Gouaillardet
0ab61c9b74 fortran/use-mpi-f08: remove unused references to OMPI_PROTECTED
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(back-ported from commit open-mpi/ompi@df6d763a53)
2019-11-06 10:10:22 +09:00
Sergey Oblomov
2fa112c0a6 UCX: added PPN hint for UCX context
- added PPN hint for UCX context init

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 43186e494b)

2019-08-09 11:51:30 +03:00
Howard Pritchard
71f240f078 btl/openib: fix issue 6785
Commit d7053a3 broke things for the case when Open MPI 4.0.x is built
without UCX support.  Problem was it was trying to partially initialize
the btl to try and delay printing of a help message till wireup.  Well
this sort of doesn't work in all cases.  Rather than keep piling on
changes to support a help message for a BTL that we are deprecating, take
a keep it simple stupid approach.

So, revert most of d7053a3 and instead put the help message back in the
original location, during scan of ports of the available HCAs to check
for whether or not link layer for that port is configured for ethernet or infiniband.
If Open MPI was built with UCX support, don't emit the help message; if
UCX was not linked in, emit the help message.

Verified on a system with connectX5 HCAs configured with two ports configured
for ethernet and two for infiniband.

relates to #6785

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2019-07-12 08:21:21 -06:00
James Clark
d8dc69feb5 Add a compilation flag that adds unwind info to all files that are present in the stack starting from MPI_Init.
This is so when a debugger attaches using MPIR, it can step out of this stack back into main.
This cannot be done with certain aggressive optimisations and missing debug information.

Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>

Co-authored-by: Jeff Squyres <jsquyres@cisco.com>

(cherry-picked from 20f5840)
2019-04-01 11:10:04 +01:00
Sergey Oblomov
14c271f993 PML/SPML/UCX: added evaluation of mmap events
- a set of UCX-related issues was reported, caused by
  mmap API hook conflicts. We added diagnostics for such
  problems to simplify the bug-resolving pipeline

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit d8e3562bae)
2019-03-14 16:48:25 +02:00
Ralph Castain
1675b8ee65 Ensure we push/pop local AC vars in the right place
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit c054d4d1cc)
2019-03-01 08:40:35 -08:00