71 Commits

Author SHA1 Message Date
dongzhong
b4e04bbd8a Add support for MPI_OP using AVX512, AVX2 and MMX
Add logic to handle different architectural capabilities.
Detect the compiler flags necessary to build specialized
versions of the MPI_OP. Once the different flavors (AVX512,
AVX2, AVX) are built, detect at runtime which is the best
match with the current processor capabilities.

Add validation checks for loadu 256 and 512 bits.
Add validation tests for MPI_Op.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: dongzhong <zhongdong0321@hotmail.com>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 14b3c70628)
2020-07-13 13:49:00 -07:00
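The runtime selection described in b4e04bbd8a can be illustrated with a minimal sketch. It is not the Open MPI implementation: `sum_fn`, `flavor_t`, `sum_generic`, `pick_flavor`, and `dispatch_demo` are hypothetical names, every flavor reuses the same scalar loop, and the CPU query relies on the GCC/clang builtin `__builtin_cpu_supports`, which is assumed to be available only on x86 targets.

```c
#include <assert.h>
#include <stddef.h>

/* Signature shared by every flavor of the (hypothetical) reduction kernel. */
typedef void (*sum_fn)(const int *in, int *inout, size_t n);

/* Portable fallback. In a real build the AVX flavors would live in separate
 * files compiled with their own flags; this sketch reuses one scalar loop. */
static void sum_generic(const int *in, int *inout, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        inout[i] += in[i];
    }
}

typedef struct {
    const char *name; /* label of the flavor that was selected */
    sum_fn fn;        /* kernel to call for that flavor */
} flavor_t;

/* Pick the widest flavor the running processor reports, else the fallback. */
static flavor_t pick_flavor(void)
{
    flavor_t f = { "generic", sum_generic };
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) {
        f.name = "avx512";
    } else if (__builtin_cpu_supports("avx2")) {
        f.name = "avx2";
    } else if (__builtin_cpu_supports("avx")) {
        f.name = "avx";
    }
#endif
    return f;
}

/* Self-check: whatever flavor is picked must compute the same sums. */
static int dispatch_demo(void)
{
    int in[3] = { 1, 2, 3 };
    int inout[3] = { 10, 20, 30 };
    flavor_t f = pick_flavor();

    assert(f.name != NULL && f.fn != NULL);
    f.fn(in, inout, 3);
    assert(inout[0] == 11 && inout[1] == 22 && inout[2] == 33);
    return 0;
}
```

Calling `dispatch_demo()` from a `main` function exercises the selection once. The point of the design is that the choice happens at run time, so one binary can carry several flavors and still run on older processors.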
Jeff Squyres
55fd437d0f opal_config_asm.m4: replace tabs with spaces
Whitespace change only; no code or logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 63560fe9c4)
2018-08-28 12:02:21 -07:00
Jeff Squyres
420ffe7588 opal_config_asm.m4: Fix the detection of 128 bits atomics.
Thanks to Stefan Teleman for identifying this issue and providing a
proof-of-concept patch.  We ended up revamping the detection of
128-bit atomics to reduce duplicated code and to take a slightly
simpler, albeit perhaps a bit more verbose, approach:

- Remove the --enable-cross-* options; they were confusing and
  unnecessary.
- Always try to compile / link the compiler-intrinsic 128-bit atomic
  functions.
  - Strengthen the C tests we use to be more robust.
  - Use m4 to avoid duplicating the C tests multiple times in the .m4
    source.
- If not cross-compiling, try to run a short test and ensure that they
  actually work (as of Aug 2018, there's at least one platform where
  they don't: clang 6 on ARM64).  If cross-compiling, just assume that
  they work.
- Add more comments about what is going on with all the tests; it's
  tricky stuff.  Our Future Selves will thank us.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit ff9df91887)
2018-08-28 12:02:21 -07:00
luz.paz
06b121eb70 Misc. trivial typos
Found via `codespell -q 3`

Signed-off-by: luz paz <luzpaz@users.noreply.github.com>
2018-04-09 11:45:58 -04:00
Nathan Hjelm
1c52d9dffe opal/asm: clean up no longer supported architectures
We no longer officially support MIPS or ARM before v6. This commit
updates the configury to check for sync builtins on these
architectures and removes the MIPS and IA64 assembly from
opal/include/opal/sys.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-10-11 13:09:29 -06:00
Jeff Squyres
0414c0c9d7 Merge pull request #3757 from ggouaillardet/topic/enable_builtin_atomics
configury: abort when builtin atomics cannot be built and configure'd…
2017-08-14 15:22:18 -04:00
Nathan Hjelm
ebce88b7ad opal: remove generated asm code
Every modern compiler supports either inline assembly or builtin atomic
operations. Because of this it is time to delete all the code associated
with pre-built atomics.

This commit also cleans out the DEC and XLC asm checks. Neither check
does anything and the XLC compiler supports GCC ASM.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-03 09:18:58 -06:00
Nathan Hjelm
35c9b93754 config: remove erroneous define
This removes a copy-and-paste error where we were setting the
OPAL_ASM_SYNC_HAVE_64BIT more than once.

References #3993. Close when on master and v3.0.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-01 14:53:35 -06:00
Gilles Gouaillardet
e77874bbaf configury: fix gcc builtin atomic detection
Test for both 32 and 64 bits.
clang only supports 32-bit builtin atomics when -m32 is used.

Thanks Paul Hargrove for reporting this.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-04 09:47:45 +09:00
Gilles Gouaillardet
409a3bfdbd configury: abort when builtin atomics cannot be built and configure'd with --enable-builtin-atomics
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-26 11:42:51 +09:00
Nathan Hjelm
bc54c99e12 configure: add builtin asm check for s390/s390x
We accepted a change that enabled CMA on s390 and s390x. This change
had the side-effect that we were no longer using the builtin atomics
for these systems. This is a problem since we do not have ASM for
s390 and s390x. This commit restores the atomics.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-23 08:27:48 -06:00
Nicolas Morey-Chaisemartin
b4d9d5ee0f opal: add support for s390 and s390x architectures
Signed-off-by: Nicolas Morey-Chaisemartin <NMoreyChaisemartin@suse.com>
2017-05-05 17:23:42 +02:00
Brian Barrett
ab3ac6d0ea build: Fix platform detection on FreeBSD
Look for amd64 in addition to x86_64 as the platform
type for x86_64 assembly.  The FreeBSD-packaged
Autoconf package has a patch to return
amd64-unknown-freebsd11.0 instead of the
x86_64-unknown-freebsd11.0 that a stock Autoconf
package would return.  Since we want to run Jenkins
builds on FreeBSD, working around the FreeBSD patch
is probably the easiest thing.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-04-06 20:27:22 -07:00
Howard Pritchard
db2e1298fb OSx: remove built-in atomics support
It was decided to remove support for os-x builtin atomics

Fixes #2668

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-15 12:45:33 -06:00
Gilles Gouaillardet
af0b5cffb4 asm: rename the AMD64 into X86_64
In this context, AMD64 really means amd64 or em64t, so let's
rename this to X86_64 in order to avoid any confusion.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-27 15:10:50 +09:00
Gilles Gouaillardet
2f4013ce33 configury: fix asm atomic detection
there is no need to look for an assembly file when BUILTIN_GCC is used

Fixes open-mpi/ompi#3032
Refs open-mpi/ompi#3036

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-27 10:42:50 +09:00
Gilles Gouaillardet
299a6f8d7c configury: auto-detect armhf and armel architectures on Debian
Thanks Alastair McKinstry for the patch

Fixes open-mpi/ompi#2514

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-06 14:49:54 +09:00
Gilles Gouaillardet
596613c0aa configury: add support for x32 architecture
Thanks Alastair McKinstry for the patch

Fixes open-mpi/ompi#2515

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-06 14:49:37 +09:00
Gilles Gouaillardet
c8b51a2d3b configury: remove some dead code
perl is now mandatory to build Open MPI,
so there is no need to check for it

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-06 14:49:37 +09:00
Gilles Gouaillardet
5bb3efdc74 configury: check the existence of perl
perl is required by ompi/mpi/man/make_manpage.pl, which is even used in opal,
so simply abort at configure time if perl is not available.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-05 10:31:51 +09:00
Nathan Hjelm
795833bfac config: re-enable GCC inline ASM check for PGI
We disabled this support a long time ago. Probably safe to assume
whatever bug we were working around no longer exists.

Closes open-mpi/ompi#2044

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-02 12:44:08 -06:00
Nathan Hjelm
109389dce2 Merge pull request #1634 from hjelmn/cma
cma: add support for MIPS and ARM
2016-06-11 09:20:28 -06:00
Nathan Hjelm
0084ad0d1b opal: add armv8 support
This commit adds assembly support for aarch64.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-06-03 10:32:21 -06:00
Nathan Hjelm
d86e41ea13 atomic/gcc: add check for 128-bit CAS being lock-free
Compiler implementations are free to include support for atomics that
use locks. Unfortunately lock-free and lock atomics do not mix. Older
versions of llvm on OS X use locks to provide
__atomic_compare_exchange on 128-bit values but are lock-free on
64-bit values. This screws up our lifo implementation which mixes
64-bit and 128-bit atomics on the same values to improve
performance. This commit adds a configure-time check if 128-bit
atomics are lock free. If they are not then the 128-bit __atomic CAS
is disabled and we check for the __sync version as a fallback.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-02 15:59:05 -06:00
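The lock-free question raised in d86e41ea13 can be sketched with the GCC/clang builtin `__atomic_always_lock_free`. This is only an illustration under stated assumptions, not the configure test from the commit: `lock_free_64`, `lock_free_128`, `can_mix_64_and_128`, and `lock_free_demo` are hypothetical names, and the builtin reports a compile-time guarantee, so it may answer 0 on a target where a run-time query would say otherwise.

```c
#include <assert.h>

/* 1 if 8-byte atomics are guaranteed lock-free for this target, else 0. */
static int lock_free_64(void)
{
    return __atomic_always_lock_free(8, 0) ? 1 : 0;
}

/* 1 if 16-byte atomics are guaranteed lock-free for this target, else 0.
 * Depending on compiler and flags this can be 0 even on 64-bit hardware. */
static int lock_free_128(void)
{
    return __atomic_always_lock_free(16, 0) ? 1 : 0;
}

/* Mixing 64-bit and 128-bit atomics on the same memory is only safe when
 * both widths are lock-free; a lock-based 128-bit CAS would not exclude a
 * concurrent lock-free 64-bit access. */
static int can_mix_64_and_128(void)
{
    return lock_free_64() && lock_free_128();
}

/* Self-check: each answer is a boolean and the combined one is consistent. */
static int lock_free_demo(void)
{
    int a = lock_free_64();
    int b = lock_free_128();

    assert(a == 0 || a == 1);
    assert(b == 0 || b == 1);
    assert(can_mix_64_and_128() == (a && b));
    return 0;
}
```

A configure-style probe would typically turn the answer of `can_mix_64_and_128()` into an exit status and fall back to another CAS implementation when it is 0.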
Nathan Hjelm
f33bbfd381 atomic: add support for __atomic builtins (#1735)
* atomic: add support for __atomic builtins

This commit adds support for the gcc __atomic builtins. The __sync
builtins are deprecated and have been replaced by these atomics. In
addition, the new atomics support atomic exchange which was not
supported by __sync.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

* atomic: add support for transactional memory

This commit adds support for using transactional memory when using
opal atomic locks. This feature is enabled if the __HLE__ feature is
available and the gcc builtin atomics are in use.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-01 21:23:47 -04:00
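For readers unfamiliar with the `__atomic` builtins mentioned in f33bbfd381, the following sketch shows an atomic exchange next to a compare-and-swap. The wrapper names `swap_value`, `cas_value`, and `atomic_demo` are hypothetical and are not the opal_atomic_* functions; sequentially consistent ordering is chosen here only for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Atomically store newval and return the value that was there before. */
static int32_t swap_value(int32_t *addr, int32_t newval)
{
    return __atomic_exchange_n(addr, newval, __ATOMIC_SEQ_CST);
}

/* Atomically replace *addr with desired if it equals expected.
 * Returns 1 on success and 0 if *addr held some other value. */
static int cas_value(int32_t *addr, int32_t expected, int32_t desired)
{
    return __atomic_compare_exchange_n(addr, &expected, desired,
                                       0 /* strong CAS */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Self-check covering the exchange, a successful CAS and a failed CAS. */
static int atomic_demo(void)
{
    int32_t v = 5;

    assert(swap_value(&v, 9) == 5); /* old value comes back */
    assert(v == 9);
    assert(cas_value(&v, 9, 1) == 1); /* matches, so the store happens */
    assert(v == 1);
    assert(cas_value(&v, 9, 2) == 0); /* no match, value is untouched */
    assert(v == 1);
    return 0;
}
```

The `__sync` family has no direct equivalent of `swap_value` with full ordering, which is the gap the commit message refers to.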
Nathan Hjelm
b2f33bc076 opal/asm: fall back on inline asm atomics in some cases
This commit changes the asm configure logic to fall back on inline asm
atomics on systems that 1) have __sync atomics, 2) do not have 64-bit
__sync atomics, and 3) support 64-bit asm.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-05-10 04:27:58 -06:00
Nathan Hjelm
d99a9786b6 sync_builtin: check for 64-bit atomic support
This commit adds an additional check for 64-bit atomic support for __sync
builtins. If 64-bit support is not available the opal_atomic_*_64 atomics
are disabled.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-05-09 03:17:51 -06:00
Nathan Hjelm
98ce659e0b config: check for more __sync builtins
This commit updates the check for __sync builtin atomics to see if the
compiler supports both __sync_bool_compare_and_swap and
__sync_add_and_fetch. If either of these functions is not available
then we can't use the __sync builtins.

Fixes #1487

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-12 10:02:39 -06:00
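A probe for the two builtins named in 98ce659e0b might look like the sketch below. It is an assumption about the shape of such a test, not the program embedded in opal_config_asm.m4: `probe_sync_32`, `probe_sync_64`, and `sync_demo` are hypothetical names. The program has to be linked as well as compiled, because on some targets the builtins turn into calls to helper functions that may be missing.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 0 if both __sync builtins behave as expected on 32-bit values. */
static int probe_sync_32(void)
{
    int32_t v = 0;

    if (!__sync_bool_compare_and_swap(&v, 0, 1)) {
        return 1;
    }
    if (__sync_add_and_fetch(&v, 1) != 2) {
        return 1;
    }
    return 0;
}

/* Same probe on 64-bit values; some 32-bit targets are assumed to fail
 * this one at link time rather than at run time. */
static int probe_sync_64(void)
{
    int64_t v = 0;

    if (!__sync_bool_compare_and_swap(&v, 0, 1)) {
        return 1;
    }
    if (__sync_add_and_fetch(&v, 1) != 2) {
        return 1;
    }
    return 0;
}

/* Self-check: both probes report success when the builtins are usable. */
static int sync_demo(void)
{
    assert(probe_sync_32() == 0);
    assert(probe_sync_64() == 0);
    return 0;
}
```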
Nathan Hjelm
e1cace0b02 configure: only enable sync builtin atomics if they link
This commit fixes the check for sync builtin atomics.
AC_COMPILE_IFELSE is insufficient to check for the builtins. Need to
use AC_LINK_IFELSE.

Fixes open-mpi/ompi#1487

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-29 16:57:15 -06:00
Nathan Hjelm
664ecc8f84 configure: re-enable built-in atomic support
This commit removes an erroneous else statement from the OSX built-in
atomics check. The else branch sets the built-in atomics support to
BUILTIN_NO if either opal_cv_asm_builtin is not BUILTIN_NO or OSX
atomics support is disabled.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-03-16 20:46:09 -06:00
Gilles Gouaillardet
fec973efda configury: test portability
replace test ... -o ... with test ... || test ...
and test ... -a ... with test ... && test ...
2015-12-28 13:58:45 +09:00
bosilca
984b35b860 Merge pull request #633 from bosilca/topic/enable_atomics
Enable by default the _sync version of atomic operations on OS X.
2015-09-28 12:03:08 -04:00
Nathan Hjelm
551c2ea480 opal/asm: remove alpha support
This commit removes alpha asm support. No current processor
manufacturer makes chips compatible with DEC alpha and no
participating organization has alpha processors. This makes it
difficult to support alpha via assembly.

This doesn't mean Open MPI will no longer build/work on alpha
processors. It should continue to work with gcc's builtin sync
atomics.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-18 09:11:38 -06:00
George Bosilca
277269b641 Make the __sync atomics the default. Remove spurious checking and
result messages.
2015-08-17 18:08:25 -04:00
George Bosilca
5aadf39545 Clean-up the OS X selection logic.
2015-08-17 18:08:16 -04:00
George Bosilca
3eaefd790c Enable by default the _sync version of atomic operations on OS X.
2015-08-17 18:08:03 -04:00
Ralph Castain
869041f770 Purge whitespace from the repo
2015-06-23 20:59:57 -07:00
Jeff Squyres
7d870c8b9e Merge pull request #550 from jsquyres/pr/turn-down-for-WHAT
RFC: opal_config_asm.m4: enable inline assembly on little-endian POWER
2015-04-25 06:25:11 -04:00
Jeff Squyres
0afda878a2 opal_config_asm: remove support for OS X Leopard
We haven't supported OS X Leopard (10.5) for a long time.  So remove
this dead code.
2015-04-24 03:43:56 -07:00
Jeff Squyres
f7b8c11027 opal_config_asm.m4: enable inline assembly on little endian POWER
Suggested by Paul Hargrove,
http://www.open-mpi.org/community/lists/devel/2015/04/17323.php,
amended by Nysal Jan for 32 bit, too.
2015-04-23 05:43:24 -07:00
Nathan Hjelm
81502fafa8 Merge pull request #379 from hjelmn/remove_enable_smp_locks
Per-RFC: remove the --disable-smp-locks configure option
2015-04-15 10:02:23 -06:00
Nathan Hjelm
ac82d1a6be Per-RFC: remove the --disable-smp-locks configure option
Use of this configuration option can cause crashing, hanging, and
(worse) incorrect results when btl/sm, btl/scif, or btl/vader are
in use. We discussed this at the January 2015 developers meeting
and it was decided to remove the option entirely. This commit does
just that. All usage of OPAL_WANT_SMP_LOCKS has been removed.
2015-03-04 11:31:43 -07:00
Gilles Gouaillardet
b42e344129 configury: force OPAL_HAVE_CMPXCHG16B=0 on buggy compilers
per several reports on the devel ML, the opal_lifo test hangs
with intel icc 14.0.0.080 (aka 2013sp1) and intel icc 14.0.1.106 (aka 2013sp1u1).
/* older and more recent compilers work fine
 * buggy compilers also work fine but only with -O0 */
2015-02-05 13:24:12 +09:00
Gilles Gouaillardet
b4c333fe9b config/opal_*: portability fixes
convert "test ... -o" to "test ... ||"
convert "test ... -a" to "test ... &&"
2015-02-03 15:19:22 +09:00
Nathan Hjelm
79d8f6e54d Check if the processor supports compare-and-exchange on 128-bit values.
Before this commit we checked if the compiler supported compare-and-exchange
on 128-bit values. This turned out to be insufficient. This commit strengthens
the check to see if the processor supports the instruction (or built-in). This
check will not work when cross-compiling (will always disable the 128-bit
atomic) so overrides have been added for this case.
2014-12-17 23:34:12 -07:00
Nathan Hjelm
ccbb869274 Use AC_TRY_LINK not AC_TRY_COMPILE when testing for __sync_bool_compare_and_swap on 128-bit values
2014-12-09 18:56:21 -07:00
Nathan Hjelm
0efe6baf64 Add check for -mcx16 flag for 128-bit compare and swap
Some versions of gcc require this flag to be set before the __sync
builtin atomic compare and swap will support 128-bit values. If the
flag is required this check adds the flag to the CFLAGS.
2014-12-04 14:25:53 -07:00
Nathan Hjelm
fe787512d8 Add support for __sync builtin compare and swap on 128-bit values
2014-12-04 09:23:51 -07:00
Nathan Hjelm
b2b58b31a2 Add support for 128-bit compare and swap on x86_64 when available.
A 128-bit compare-and-swap will enable a better atomic lifo implementation
that uses the pointer + counter method to avoid ABA issues. This commit
adds configury to check for the instruction (cmpxchg16b) and adds an
implementation that uses the __int128 type (a compiler extension, not part of C99).
2014-12-04 08:53:28 -07:00
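The pointer + counter idea from b2b58b31a2 can be shown with a scaled-down sketch. To stay portable it packs a 32-bit node index and a 32-bit counter into one 64-bit word and uses a 64-bit CAS, rather than the 128-bit pointer + counter pair the commit targets. All names (`node_t`, `pack`, `lifo_push`, `lifo_pop`, `lifo_demo`) are hypothetical and this is not the opal_lifo code.

```c
#include <assert.h>
#include <stdint.h>

#define NIL 0xFFFFFFFFu /* index value meaning "no node" */

typedef struct {
    uint32_t next; /* index of the node below this one, or NIL */
    int value;
} node_t;

/* Head word layout: counter in the high 32 bits, node index in the low 32. */
static uint64_t pack(uint32_t index, uint32_t counter)
{
    return ((uint64_t)counter << 32) | index;
}

static uint32_t head_index(uint64_t h)   { return (uint32_t)h; }
static uint32_t head_counter(uint64_t h) { return (uint32_t)(h >> 32); }

/* Push node idx. The counter changes on every update, so a head that was
 * popped and pushed back in between no longer compares equal (no ABA). */
static void lifo_push(uint64_t *head, node_t *nodes, uint32_t idx)
{
    uint64_t old = __atomic_load_n(head, __ATOMIC_ACQUIRE);
    uint64_t new_head;

    do {
        nodes[idx].next = head_index(old);
        new_head = pack(idx, head_counter(old) + 1);
    } while (!__atomic_compare_exchange_n(head, &old, new_head, 0,
                                          __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE));
}

/* Pop the top node and return its index, or NIL if the lifo is empty. */
static uint32_t lifo_pop(uint64_t *head, node_t *nodes)
{
    uint64_t old = __atomic_load_n(head, __ATOMIC_ACQUIRE);
    uint64_t new_head;

    do {
        uint32_t idx = head_index(old);
        if (idx == NIL) {
            return NIL;
        }
        new_head = pack(nodes[idx].next, head_counter(old) + 1);
    } while (!__atomic_compare_exchange_n(head, &old, new_head, 0,
                                          __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE));
    return head_index(old);
}

/* Single-threaded self-check of ordering and of the counter bookkeeping. */
static int lifo_demo(void)
{
    node_t nodes[2] = { { NIL, 10 }, { NIL, 20 } };
    uint64_t head = pack(NIL, 0);

    lifo_push(&head, nodes, 0);
    lifo_push(&head, nodes, 1);
    assert(lifo_pop(&head, nodes) == 1); /* last in, first out */
    assert(lifo_pop(&head, nodes) == 0);
    assert(lifo_pop(&head, nodes) == NIL);
    assert(head_counter(head) == 4); /* two pushes plus two successful pops */
    return 0;
}
```

With full pointers instead of indices the head no longer fits in 64 bits, which is why a 128-bit compare-and-swap such as cmpxchg16b is needed for the same technique.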
George Bosilca
6772b07792 Only use RDTSCP if supported by the processor.
Conflicts:
	opal/include/opal/sys/amd64/timer.h
2014-11-27 11:29:47 -05:00