30773 Commits

Author SHA1 Message Date
Joseph Schuchart
73a183408f UCX osc: add support for acc_single_intrinsic info key / mca param
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-06-23 12:41:52 +02:00
Jeff Squyres
e1e8b2a373
Merge pull request #7846 from jsquyres/pr/update-master-version
VERSION: Bump to 5.0.0a1
2020-06-20 09:49:22 -04:00
Jeff Squyres
3d7ab9368a VERSION: Bump to 5.0.0a1
Now that we have an actual 4.1.x branch, we shouldn't be calling
master "v4.1.0a1" any more.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-19 18:20:39 -04:00
Austen Lauria
d03a99c647
Merge pull request #7776 from simonbyrne/patch-1
Fix language in CUDA error
2020-06-18 09:49:50 -04:00
Geoff Paulsen
692f96e87a
Merge pull request #7799 from markalle/interception_early_toc_read
noinline to avoid compiler reading TOC before PATCHER_BEGIN
2020-06-17 14:26:24 -05:00
Jeff Squyres
1c80f191bb
Merge pull request #7834 from jsquyres/pr/fix-asm-test-run-tests-script
tests/asm/run_tests: fix basename usage
2020-06-17 14:10:58 -04:00
Jeff Squyres
e8277d9d06 tests/asm/run_tests: fix basename usage
Looks like this script was left over from quite a long time ago, and
was expecting CLI params from the "old"-style Automake test engine.
Update it to look for `--test-name` to get the test name, and update a
few other minor style things.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-17 10:23:13 -07:00
Jeff Squyres
fcdcb593bb
Merge pull request #7826 from jsquyres/pr/clang-6-and-7-hate-float16
More carefully test "alternate" short float type in configure
2020-06-17 12:55:37 -04:00
Jeff Squyres
2c171718ae check_alt_short_float: minor formatting tweak
Prevent spurious #-style comments from appearing in the generated
configure script.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-17 08:39:29 -07:00
Jeff Squyres
47df60717c check_alt_short_float: ensure compiler supports math
Even if the compiler supports an "alternate" short float type (e.g.,
_Float16), check to make sure that the compiler will correctly link
applications that perform mathematical operations on that type.

Carefully choose the mathematical test in the configure check to
ensure the mathematical operation is not removed by compiler
optimization (when setting CFLAGS=-O1 or higher).

Out of the box, clang 6.0.x and 7.0.x will fail to link applications
that try to perform addition (and other mathematical operations) on
_Float16 variables (an additional CLI flag is required to enable
software emulation of _Float16).  If we detect a situation where the
type is supported but a sample program fails to link and the basename
of $CC is "clang", emit a warning and point the user to a relevant
README.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@fujitsu.com>
2020-06-17 08:38:42 -07:00
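The idea of a math check that survives optimization can be sketched as follows. This is only an illustration, not the test program from the commit: the names `EXAMPLE_SHORT_FLOAT_T`, `example_short_float_t`, `example_add_short_floats` and `example_short_float_check` are hypothetical, and the use of `volatile` is one common way to keep the compiler from folding the addition away, assumed here rather than taken from the source. The type defaults to `float` so the sketch builds everywhere; a build with `-DEXAMPLE_SHORT_FLOAT_T=_Float16` would exercise the alternate type on compilers that support it.

```c
#include <assert.h>

/* Hypothetical knob: defaults to float so the sketch compiles anywhere.
 * Build with -DEXAMPLE_SHORT_FLOAT_T=_Float16 to try the alternate type. */
#ifndef EXAMPLE_SHORT_FLOAT_T
#  define EXAMPLE_SHORT_FLOAT_T float
#endif
typedef EXAMPLE_SHORT_FLOAT_T example_short_float_t;

/* Add two values in the short float type.  The volatile qualifiers force
 * real loads, a real addition, and a real store, so the arithmetic (and
 * any runtime support it needs at link time) is not optimized out. */
float example_add_short_floats(float a, float b)
{
    volatile example_short_float_t x = (example_short_float_t) a;
    volatile example_short_float_t y = (example_short_float_t) b;
    volatile example_short_float_t sum = x + y;
    return (float) sum;
}

/* Analog of the body of a link test: do the math and check the result.
 * 1 + 2 = 3 is exactly representable in both float and _Float16. */
void example_short_float_check(void)
{
    assert(example_add_short_floats(1.0f, 2.0f) == 3.0f);
}
```

If a program like this compiles but fails to link with the alternate type, that is the situation the commit message describes for clang 6.0.x and 7.0.x.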
Edgar Gabriel
9af19ab205
Merge pull request #7819 from edgargabriel/pr/avg-fview-size
common/ompio: use avg. file view size in the aggregator selection logic
2020-06-16 10:04:31 -05:00
Yossi Itigin
de6a52d620
Merge pull request #7789 from hoopoepg/topic/ucx-test-external-events
COMMON/UCX: improved missing events test
2020-06-16 14:00:33 +03:00
Sergey Oblomov
d6bff6ffbd COMMON/UCX: improved missing events test
- There is a new API to detect missing memory events.
  Enable use of the new UCX API to detect missing events.

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2020-06-16 12:36:44 +03:00
Jeff Squyres
283cfbf16e
Merge pull request #7817 from jsquyres/pr/protect-stdc-version
mpi.h.in: protect checking __STDC_VERSION__
2020-06-15 21:38:16 -04:00
Jeff Squyres
d6f5700a8a
Merge pull request #7820 from jsquyres/pr/allowlist
Rename the use of "whitelist"
2020-06-15 18:04:59 -04:00
Jeff Squyres
17acb775e9 Rename the use of "whitelist"
Use the term "allowlist" instead of "whitelist" in the script that
looks for common symbols.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-15 17:06:25 -04:00
Jeff Squyres
d522c27037 mpi.h.in: Remove //-style comments
Keep all comments in the user-facing mpi.h.in as "old style" C
comments: /* */.  This gives us maximum portability, just on the off
chance that a user's C compiler does not support //-style comments.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-15 12:56:51 -07:00
Jeff Squyres
835f8f1834 mpi.h.in: fixups for static assert messages
1. __STDC_VERSION__ isn't necessarily defined (e.g., by C++
   compilers).  So check to make sure it is defined before we actually
   check the value.
2. If we're in C++11 (or later), use static_assert().
3. Split the static assert macro in two macros:
   * THIS_SYMBOL_WAS_REMOVED_IN_MPI30(...): Insert a valid expression
     (i.e., 0, because it's only used with MPI_Datatype values, and
     since MPI_Datatype is a pointer, 0 is a valid RHS expression)
     before invoking the static assert so that we don't get a syntax
     error instead of the actual static assert error.
   * THIS_FUNCTION_WAS_REMOVED_IN_MPI30(...): No need for the valid
     expression; just invoke the assert functionality.

Also remove an errant "\".

Thanks to Constantine Khrulev and Martin Audet for identifying the
issue and suggesting to use C11's static_assert().

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-06-15 12:55:46 -07:00
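Points 1 and 2 of the commit above can be sketched roughly as below. This is a minimal illustration, not the macros from mpi.h.in: `EXAMPLE_STATIC_ASSERT`, `EXAMPLE_HAVE_STATIC_ASSERT` and `example_have_static_assert` are hypothetical names, and the pre-C11 fallback (a negative array size) is an assumption added so the sketch builds with older compilers.

```c
#include <assert.h>  /* in C11, provides static_assert as a macro */

/* Test that each version macro is defined before comparing its value:
 * C++ compilers need not define __STDC_VERSION__, and C compilers do
 * not define __cplusplus. */
#if (defined(__cplusplus) && (__cplusplus >= 201103L)) || \
    (defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L))
#  define EXAMPLE_STATIC_ASSERT(cond, msg) static_assert(cond, msg)
#  define EXAMPLE_HAVE_STATIC_ASSERT 1
#else
/* Assumed fallback for older compilers: a negative array size is a
 * compile error.  The array is only declared, never defined or used. */
#  define EXAMPLE_STATIC_ASSERT(cond, msg) \
       extern char example_static_assert_failed[(cond) ? 1 : -1]
#  define EXAMPLE_HAVE_STATIC_ASSERT 0
#endif

/* A check that holds on every conforming implementation. */
EXAMPLE_STATIC_ASSERT(sizeof(int) >= 2, "int must hold at least 16 bits");

/* Report which branch was selected at compile time. */
int example_have_static_assert(void)
{
    return EXAMPLE_HAVE_STATIC_ASSERT;
}
```

Changing the condition to something false, such as `sizeof(int) == 0`, should stop compilation with the message on C11 and C++11 compilers.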
Edgar Gabriel
4a8a330bba common/ompio: use avg. file view size in the aggregator selection logic
This is a fix based on a bug report on GitHub/mailing list from CGNS.
The core of the problem was that different processes entered different branches of
our aggregator selection logic, due to the fact that in some cases processes had
a matching file_view size and contiguous chunk size (thus assuming 1-D distribution),
and some processes did not (thus assuming 2-D distribution). The fix is to calculate
the avg. file view size across all processes and use this value, thus ensuring that
all processes enter the same branch.

Fixes issue #7809

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2020-06-15 09:17:44 -05:00
Jeff Squyres
4e59d97ac7
Merge pull request #7811 from zhngaj/remove-la
openmpi.spec: Remove libtool archive files in packaging
2020-06-12 11:02:32 -04:00
Jie Zhang
9f385a0c9f openmpi.spec: Fix Open MPI packaging issue
Libtool archive files (.la files) create an unnecessary dependency
between linked applications and the development versions of packages
upon which Open MPI depends (to get the .so.1 -> .so symlink).

Remove the .la libtool archive files to follow packaging best
practice.

Signed-off-by: Jie Zhang <zhngaj@amazon.com>
2020-06-12 02:30:46 +00:00
Ralph Castain
3accffcb6e
Merge pull request #7797 from rhc54/topic/syn2
Sync to PMIx and PRRTE master branches
2020-06-09 19:19:05 -07:00
Mark Allen
ddd1f578ec noinline to avoid compiler reading TOC before PATCHER_BEGIN
This bug was first seen in a different product that's using the same
interception code as OMPI.  But I think it's potentially in OMPI too.

In my vanilla build of OMPI master on RH8 if I "gdb libopen-pal.so" and
"disassemble intercept_brk", I'm seeing a suspicious extra instruction
in front of PATCHER_BEGIN:
   0x00000000000d6778 <+40>:    std     r2,24(r1) // something gcc put in front
   0x00000000000d677c <+44>:    std     r2,96(r1) // PATCHER_BEGIN's toc_save
   0x00000000000d6780 <+48>:    nop               // NOPs from PATCHER_BEGIN
   0x00000000000d6784 <+52>:    nop               // that get replaced
   0x00000000000d6788 <+56>:    nop               // by instructions that
   0x00000000000d678c <+60>:    nop               // change r2
   0x00000000000d6790 <+64>:    nop               //

Later there are loads from that location like
   0x000000000019e0e4 <+132>:   ld      r2,24(r1)
that make me nervous since that's the pre-updated value.

I believe this is the same thing Nathan is describing way back in a9bc692d
and his solution was to put a second call around each interception, where
the outer call is just
    intercept_brk():
        PATCHER_BEGIN
        _intercept_brk()
        PATCHER_END
and the inner call _intercept_brk() is where the bulk of the code goes.

What I'm seeing is that _intercept_brk() is being inlined and probably
negating Nathan's fix.  So I want to add __opal_attribute_noinline__ to
restore the fix.

With this commit in place, the disassembly of intercept_brk becomes tiny
because it's no longer inlining _intercept_brk() and the suspicious
early save of r2 is gone.  I made the same fix to all the intercept_*
functions, although intercept_brk was the only one that had a suspicious
save of r2.

As far as empirical failures though, we only have those from the non-OMPI
product that's using the same patcher code.  I'm not actually getting OMPI
to fail from the above suspicious data being saved in r1+24.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2020-06-09 19:25:59 -04:00
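The outer/inner call structure described in the commit above can be sketched as follows. This is a simplified stand-in, not the OMPI patcher code: `EXAMPLE_PATCHER_BEGIN`, `EXAMPLE_PATCHER_END`, `EXAMPLE_NOINLINE`, `example_intercept` and `example_inner_intercept` are hypothetical, the BEGIN/END macros only toggle a flag instead of saving and changing r2, and the attribute spelling assumes GCC or clang (the source uses `__opal_attribute_noinline__`).

```c
#include <assert.h>

/* Assumed spelling for GCC/clang; expands to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define EXAMPLE_NOINLINE __attribute__((noinline))
#else
#  define EXAMPLE_NOINLINE
#endif

/* Placeholders for PATCHER_BEGIN / PATCHER_END: here they only record
 * that we are inside the patched region. */
static int example_in_patcher = 0;
static int example_inner_calls = 0;
#define EXAMPLE_PATCHER_BEGIN do { example_in_patcher = 1; } while (0)
#define EXAMPLE_PATCHER_END   do { example_in_patcher = 0; } while (0)

/* Inner function: holds the bulk of the work.  Marked noinline so the
 * compiler cannot merge its body (and any early register saves) into
 * the wrapper ahead of EXAMPLE_PATCHER_BEGIN. */
EXAMPLE_NOINLINE static int example_inner_intercept(int value)
{
    assert(example_in_patcher);  /* must run between BEGIN and END */
    example_inner_calls++;
    return value * 2;
}

/* Outer function: nothing but BEGIN, the inner call, and END. */
int example_intercept(int value)
{
    int result;
    EXAMPLE_PATCHER_BEGIN;
    result = example_inner_intercept(value);
    EXAMPLE_PATCHER_END;
    return result;
}
```

Without the noinline attribute the compiler is free to inline the inner function, which is the situation the commit message reports for `_intercept_brk()`.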
Ralph Castain
9bdf1274c0
Sync to PMIx and PRRTE master branches
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-06-09 15:32:22 -07:00
Jeff Squyres
9b55419b40
Merge pull request #7777 from markalle/IPCOP_shmat
adding op-codes for syscall ipc for shmat/shmdt
2020-06-08 15:09:17 -04:00
Ralph Castain
a879a16df5
Merge pull request #7794 from rhc54/topic/sy
Sync to PMIx and PRRTE master branches
2020-06-08 12:05:33 -07:00
Howard Pritchard
46d834d674
Merge pull request #7781 from hkuno/john.l.byrne/mca_btl_ofi_rcache_init
mtl_btl_ofi_rcache_init() before creating domain
2020-06-08 13:01:45 -06:00
Ralph Castain
ad8a567212
Sync to PMIx and PRRTE master branches
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-06-08 10:50:10 -07:00
Brian Barrett
ffa50d837a
Merge pull request #7786 from bwbarrett/dist/master-NEWS
dist: Update NEWS
2020-06-05 15:05:23 -07:00
Brian Barrett
50765ae5a2 dist: Update NEWS from release branches
We have been bad about updating the NEWS file in master with all
the changes that have gone into the release branches.  Patch up
NEWS with the changes from v3.0, v3.1, and v4.0 branches.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-05 11:14:34 -07:00
Brian Barrett
2e23893f04 dist: Fix character encodings in NEWS
The NEWS file had a mix of ISO-8859-1 and UTF-8 encodings, which
was making a mess of decoding the non-ASCII characters in the
file.  This patch unifies the NEWS file as a UTF-8 encoded file
and changes many of the places where we had ASCII-ified a person's
name.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-05 11:01:56 -07:00
Mark Allen
e8fab058da adding op-codes for syscall ipc for shmat/shmdt
These op codes used to be in bits/ipc.h but were removed in glibc in 2015
with a comment saying they should be defined in internal headers:
https://sourceware.org/bugzilla/show_bug.cgi?id=18560
and when glibc uses that syscall it seems to do so from its own definitions:
https://github.com/bminor/glibc/search?q=IPCOP_shmat&unscoped_q=IPCOP_shmat

So I think using #ifndef and defining them if they're not already defined
using the values from glibc is the best option.

At IBM it was the testing on Red Hat 8 that found this as an issue
(with the opcodes undefined on the system, the ipc-syscall handling
for them was left undefined, so shmat/shmdt memory events went
unintercepted).

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2020-06-04 14:20:40 -04:00
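The `#ifndef` approach from the commit above might look like the sketch below. The values 21 and 22 are the ones glibc uses for these opcodes; the helper functions `example_ipcop_shmat`, `example_ipcop_shmdt` and `example_check_ipcops` are hypothetical and exist only to make the fragment checkable.

```c
#include <assert.h>

/* Define the opcodes only when no system header has done so already,
 * using the same values as glibc's internal definitions. */
#ifndef IPCOP_shmat
#  define IPCOP_shmat 21
#endif
#ifndef IPCOP_shmdt
#  define IPCOP_shmdt 22
#endif

/* Hypothetical accessors so callers (and tests) can see the values. */
int example_ipcop_shmat(void) { return IPCOP_shmat; }
int example_ipcop_shmdt(void) { return IPCOP_shmdt; }

/* Sanity check: the two opcodes must stay distinct. */
void example_check_ipcops(void)
{
    assert(IPCOP_shmat != IPCOP_shmdt);
}
```

Guarding each definition separately keeps the fragment harmless on systems whose headers still provide one or both macros.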
Jeff Squyres
68282a15f4
Merge pull request #7780 from mwheinz/mwheinz-7779
Add minimum library version needed to use PSM2 in OMPI #7779
2020-06-03 15:07:07 -04:00
Harumi Kuno
f1b21cb776 mtl_btl_ofi_rcache_init() before creating domain
mtl_btl_ofi_rcache_init() initializes patcher, which should only take
place while things are single threaded.  OFI providers may start
spawning threads, so initialize the rcache before creating OFI objects
to prevent races.

Authored-by: John L. Byrne <john.l.byrne@hpe.com>
Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
2020-06-03 09:56:29 -06:00
Michael Heinz
fcabd349e4 Add minimum library version needed to use PSM2 in OMPI #7779
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
2020-06-03 11:19:58 -04:00
Simon Byrne
27a2ed8cba Fix language in CUDA error
Removes a malapropism (passed should be past), and hopefully makes it a bit clearer.

Signed-off-by: Simon Byrne <simonbyrne@gmail.com>
2020-06-02 13:25:31 -07:00
Brian Barrett
0a21a58f08
Merge pull request #7771 from dancejic/multi
common/ofi: Fixing compilation issue with ofi versions that do not support fi_info.nic
2020-06-01 18:42:07 -07:00
Nikola Dancejic
ae2a447b0e common/ofi: Fixing compilation issue with ofi versions that do not support fi_info.nic
Added the flag OPAL_OFI_PCI_DATA_AVAILABLE to avoid accessing the nic
object in fi_info when the OFI version does not support that structure.

Signed-off-by: Nikola Dancejic dancejic@amazon.com
2020-06-01 23:14:41 +00:00
Howard Pritchard
c074a23e8f
Merge pull request #7675 from hppritcha/topic/fix_issue_7578
rework argobots configury to be smarter
2020-06-01 14:02:32 -06:00
Gilles Gouaillardet
1036eca117
Merge pull request #7773 from ggouaillardet/topic/opal_str_to_bool
opal/util: fix opal_str_to_bool()
2020-06-01 10:15:16 +09:00
Gilles Gouaillardet
c450b21405 opal/util: fix opal_str_to_bool()
correctly use strlen(char *) instead of sizeof(char *)

Thanks Georg Geiser for reporting this issue.

Refs. open-mpi/ompi#7772

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-05-30 20:47:41 +09:00
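The class of bug fixed above is easy to reproduce in isolation. The sketch below does not show the body of opal_str_to_bool(); `example_pointer_size` and `example_string_length` are hypothetical helpers that only contrast the two expressions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* sizeof applied to a char * yields the size of the pointer itself
 * (commonly 4 or 8 bytes), no matter what string it points to. */
size_t example_pointer_size(const char *str)
{
    (void) str;  /* only the static type of the expression matters */
    return sizeof(str);
}

/* strlen counts the characters before the terminating NUL, which is
 * what code walking over the string's contents needs. */
size_t example_string_length(const char *str)
{
    assert(str != NULL);
    return strlen(str);
}
```

For the input "true", `example_string_length` returns 4 while `example_pointer_size` returns `sizeof(char *)` regardless of the input, so any loop or trim bounded by the latter looks at the wrong number of characters.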
Austen Lauria
d5c4f6b92a
Merge pull request #7770 from hoopoepg/topic/fixed-typo-in-hcoll-var-desc
OMPI/HCOLL: fixed typo in vars description
2020-05-29 14:33:15 -04:00
Sergey Oblomov
df0f2ac026 OMPI/HCOLL: fixed typo in vars description
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2020-05-29 20:13:35 +03:00
Ralph Castain
7c9da91362
Merge pull request #7767 from rhc54/topic/syn
Sync to PMIx and PRRTE masters
2020-05-26 20:48:21 -07:00
Ralph Castain
b27db0e2a3
Sync to PMIx and PRRTE masters
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-26 20:11:14 -07:00
Howard Pritchard
e2948c96bc
Merge pull request #7761 from hppritcha/topic/fix_issue_7755
OFI common: set include list explicitly to NULL
2020-05-26 06:43:14 -06:00
Howard Pritchard
b9498ec31b rework argobots configury to be smarter
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-05-23 14:46:41 -07:00
Howard Pritchard
45b643d0cf OFI common: set include list explicitly to NULL
related to #7755

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2020-05-23 14:05:29 -06:00
Jeff Squyres
4d0c23c029
Merge pull request #7760 from jsquyres/pr/remove-stale-lt-init-options
configure.ac: remove stale LT_INIT options
2020-05-23 09:24:55 -04:00
Jeff Squyres
62c9a25bea configure.ac: remove stale LT_INIT options
1. We haven't used the -dlopen or -preopen options for years (if
   ever?); no need for the `dlopen` LT_INIT option.
2. We haven't supported Windows for years; no need for the `win32-dll`
   LT_INIT option.

Also, this commit includes a minor fix to a comment.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-05-22 12:42:59 -07:00