1
1

5560 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
9a2902c047
Force use of pmix/preg/native component
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-03 17:54:44 -07:00
Ralph Castain
2ebc1fa1b7
Update to current PMIx 4.0.0 (master)
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-03 14:46:55 -07:00
George Bosilca
41e6f55807
Small optimization on the datatype commit.
This patch fixes the merge of contiguous elements into larger but more
compact datatypes, and allows for contiguous elements to have thir
blocklen increasing instead of the count. The idea is to always maximize
the blocklen, aka. the contiguous part of the datatype.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-30 19:56:48 -04:00
Howard Pritchard
71e1fad4a9
Merge pull request #6855 from hkuno/hkuno/mmap_loop
Fix mmap infinite recurse in memory patcher
2019-08-26 14:28:53 -06:00
Howard Pritchard
5a3646fd1d
Merge pull request #6884 from guserav/reintroduce-common-ofi
Reintroduce common ofi
2019-08-23 13:12:25 -06:00
Sergey Oblomov
182023febb SPML/UCX: fixed hang in SHMEM_FINALIZE
- used MPI _Barrier to synchronize processes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-21 12:04:46 +03:00
guserav
56c3d9a238 common/ofi: Set HPE as owner of component
Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-20 10:13:02 -07:00
George Bosilca
904276bb44 Fix the variable names used for the datatype dump.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-14 10:59:50 -04:00
George Bosilca
daf4338c31 Fix the stack displacement.
Fixes the convertor iovec description on the MPI-IO reported by Edgar.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-08-14 01:16:30 -04:00
guserav
8a67a95c99 common/ofi: Fix open-mpi/ompi#2519
As discussed in open-mpi/ompi#2519 the common component does not depend
on libfabric yet. This commit introduces this dependency by just calling
fi_version().

Signed-off-by: guserav <erik.zeiske@hpe.com>
2019-08-12 16:17:59 -07:00
guserav
0e25c95eae common/ofi: Fix check for OFI in build files
The changes made in f5e1a672ccd5db127e85e1e8f6bcfeb8a8b04527
have been done after the common/ofi component was removed and thus the
component doesn't reflect the changes made their.

Namely f5e1a672ccd5db127e85e1e8f6bcfeb8a8b04527 changed:
- How to call OPAL_CHECK_OFI (It sets opal_ofi_happy to yes now)
- Dropped the common part in the build flags for ofi

Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-12 16:15:23 -07:00
guserav
4ad78aaa15 Revert "Remove opal/mca/common/ofi."
This reverts commit dd20174532928e0c9cdbe7b206868e6e4bea9d0b.

Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-09 10:33:21 -07:00
Yossi Itigin
ec9def1406
Merge pull request #6864 from hoopoepg/topic/ucx-ppn-hint
UCX: added PPN hint for UCX context
2019-08-07 13:45:38 +03:00
Brian Barrett
827a2bcc3d
Merge pull request #6852 from wckzhang/opalifnamesize
opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
2019-08-05 16:16:05 -07:00
Sergey Oblomov
43186e494b UCX: added PPN hint for UCX context
- added PPN hint for UCX context init

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-05 18:07:06 +03:00
Harumi Kuno
fca8436064 Fix mmap infinite recurse in memory patcher
This commit fixes issue #6853 by removing
MacOS/Darwin-specific logic from intercept_mmap.
It also opportunistically converts tabs to spaces.

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
2019-08-01 10:18:25 -07:00
William Zhang
4ebb37a26c opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
Due to IF_NAMESIZE being a reused and conditionally defined macro,
issues could arise from macro mismatches. In particular, in cases where
opal/util/if.h is included, but net/if.h is not, IF_NAMESIZE will be 32.
If net/if.h is included on Linux systems, IF_NAMESIZE will be 16. This
can cause a mismatch when using the same macro on a system. Thus
different parts of the code can have differring ideas on the size of a
structure containing a char name[IF_NAMESIZE]. To avoid this error case,
we avoid reusing the IF_NAMESIZE macro and instead define our own as
OPAL_IF_NAMESIZE.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-29 21:24:39 +00:00
Ralph Castain
c5c93e3391
Update to PMIx master
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-07-29 12:20:20 -07:00
bosilca
94f26f5a51
Merge pull request #6695 from bosilca/fix/vector_stride_0
A big refresh of the datatype engine
2019-07-23 15:20:14 -04:00
Ralph Castain
d202e10c14
Provide locality for all procs on node
Update PMIx to latest master to get supporting updates. For
connect/accept (part of comm_spawn as well), lookup locality for all
participating procs on the node and compute the relative locality so it
can be used for MPI operations.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-07-22 09:23:38 -07:00
Brian Barrett
41c2007af5
Merge pull request #6820 from wckzhang/cleanup
btl tcp: Fix error path memory leak
2019-07-16 15:50:32 -07:00
Gilles Gouaillardet
06c6325bc8
Merge pull request #6822 from ggouaillardet/topic/pmix_refresh
pmix/pmix4x: refresh to the latest PMIx master
2019-07-16 12:56:31 +09:00
Gilles Gouaillardet
4510711e95 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@03a8b5daab

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-07-16 11:31:20 +09:00
William Zhang
8c3b8a87c5 btl tcp: Fix error path memory leak
After the OPAL_MODEX_RECV call, remote_addrs was not freed in the error
path. Moved the free call into cleanup to ensure we always free this
memory before leaving the function.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-15 22:35:04 +00:00
William Zhang
c0c3e1b540 reachable: Update documentation on reachable function
Added information on the type of objects provided in the list as well as
the required fields for them.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-11 21:36:12 +00:00
William Zhang
c9214cc53e reachable: Change list name from *_if to *_ifs
The parameter names were misleading due to implying a single interface
instead of a list. This will provide more clarity in distinguishing the
list of interfaces from each individual interface.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-11 21:33:24 +00:00
George Bosilca
aa17392309
Optimize the pack/unpack.
Start optimizing the code.

This commit divides the operations in 2 parts, the first, outside the
critical part, deals with partial blocks of predefined elements, and the
second, inside the critical path, only deals with full blocks of
elements. This reduces the number of expensive operations in the
critical path and results in a decent performance increase.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-11 14:49:29 -04:00
Gilles Gouaillardet
d0dc6218ba
Merge pull request #6709 from ggouaillardet/topic/opal_output_verbose
opal/util: revamp opal_output_verbose()
2019-07-11 11:12:03 +09:00
George Bosilca
3562d70679
Get rid of the division in the critical path.
Amazing how a bad instruction scheduling can have such a drastic impact
on the code performance. With this change, the get a boost of at least
50% on the performance of data with a small blocklen and/or count.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-10 00:28:29 -04:00
George Bosilca
a80255235a
Rework the datatype commit.
Optimize contiguous loops by collapsing them into a single element.
During datatype optimization collapse similar elements into larger
blocks.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:09 -04:00
George Bosilca
9ff15efac8
Optimize the position placement.
Upon detecting a datatype loop representation skip the entire loop
according the the remaining space.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:09 -04:00
George Bosilca
0a24f0374e
Small improvements on the test.
Rework the to_self test to be able to be used as a benchmark.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:09 -04:00
George Bosilca
75a53976a3
Disable checksum.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:09 -04:00
George Bosilca
46ddf5460d
Clean and sync the pack and unpack functions.
- optimize handling of contiguous with gaps datatypes.
- fixes a performance issue for all datatypes with a count of 1.
- optimize the pack/unpack of contiguous with gaps datatype.
- optimize the case of blocklen == 1

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:09 -04:00
George Bosilca
d335eea18f
Optimize the raw representation.
Merge contiguous iov in order to minimize the number of returned iovec.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:08 -04:00
George Bosilca
f25674291b
Optimized datatype description.
Move toward a base type of vector (count, type, blocklen, extent, disp)
with disp and extent applying toward the count repertition and blocklen
being a contiguous memory of type type.
Implement 2 optimizations on this description used during type_commit:
- collapse: successive similar datatype descriptions are collapsed
together with an increased count.
- fusion: fuse successive datatype descriptions in order to minimize the
number of resulting memcpy during pack/unpack.

Fixes at the OMPI datatype level including:
 - Fix the create_hindexed and vector creation.
 - Fix the handling of [get|set]_elements and _count.
 - Correctly compute the dispacement for block indexed types.
 - Support the MPI_LB and MPI_UB deprecation, aka. OMPI_ENABLE_MPI1_COMPAT.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-07-09 14:50:08 -04:00
Jeff Squyres
650dd3e4cf opal/if/linux_ipv6: unify gethostname() call behavior
Unfortunately, https://github.com/open-mpi/ompi/pull/6797 was merged
before all feedback was received (39b799d936).  This PR is a minor
addendum to that commit.

This PR simply removes a meaningless `= {0}` operation.

The use of gethostname() here -- and many other places in the code
base -- is technically unsafe.  See
https://github.com/open-mpi/ompi/issues/6801 for a further description
of the issue and a suggested fix.  But the risk is quite low;
real-world hostnames are usually much shorter than
OPAL_MAXHOSTNAMELEN.  Hence, this PR just removes the meaningless
operation and leaves a real fix for gethostname() usage to a potential
future PR.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-07-09 10:38:24 -07:00
Orivej Desh
39b799d936 Fix if_linux_ipv6 verbose output of interface addresses
Previously the verbose output of if_linux_ipv6_open looked like this:

    found interface ab c: 0ab: a b: abc: 0 0: a 0🔡 0 0 scope 0

This changes the output to:

    found interface eth0 inet6 ab0c🆎a0b🔤0:a00:abcd:0 scope 0

Signed-off-by: Orivej Desh <orivej@gmx.fr>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-07-09 05:45:15 -07:00
Gilles Gouaillardet
d2876fa4fd opal/util: revamp opal_output_verbose()
A typical parameter of opal_output_verbose() is ORTE_NAME_PRINT(...),
which is an expensive macro.
Most of the time, this is unnecessary since the verbosity level is too high.

Make opal_output_verbose() a macro so such arguments are only evaluated if the
verbosity is low enough.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-07-09 11:24:31 +09:00
Gilles Gouaillardet
63aa156bb0 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@99971222ce

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-27 09:35:49 +09:00
Gilles Gouaillardet
5679a88867 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@f67efc835c

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-24 10:17:23 +09:00
Ralph Castain
d4070d5f58
Fix finalize of flux component
Per patches from @SteVwonder and @garlick

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-06-18 21:14:04 -07:00
Gilles Gouaillardet
d9326ff2ca pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@186dca196c

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-10 15:17:43 +09:00
markalle
008ab98946
Merge pull request #6531 from markalle/patcher_additions
shmat/shmdt additions for patcher
2019-05-30 12:16:05 -05:00
Nathan Hjelm
8961daae4a opal/atomic: work around memory barrier bug in older gcc
This commit fixes an issue seem with some older versions of gcc
(verified to occur in gcc 6.x) where on x86_64 systems the
acquire memory barrier in C11 atomics acts as a no-op. On these
systems the three memory barriers should all be equivalent.

This is related to the error fixed in open-mpi/ompi@30119ee.

References #6655.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2019-05-30 06:58:28 -07:00
Nathan Hjelm
b78066720c btl/uct: add support for UCX 1.6.x
This commit updates the uct btl to support the v1.6.x release of
UCX. This release breaks API.

Signed-off-by: Nathan Hjelm <hjelmn@cs.unm.edu>
2019-05-21 04:31:57 -06:00
Yossi Itigin
84ae05c7bc
Merge pull request #6675 from hoopoepg/topic/ucx-common-init-patcher-on-hooks-used-only
COMMON/UCX: init memhooks infra on external hooks only
2019-05-16 22:35:32 +03:00
Sergey Oblomov
ebc457baf5 COMMON/UCX: removed ucs stuff
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 20:56:30 +03:00
Sergey Oblomov
a0a9306066 COMMON/UCX: init memhooks infra on external hooks only
- initialize memory hooks infrastructure only in case
  if external memory hooks are requested

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 20:13:16 +03:00
Nathan Hjelm
3e1dd36241 btl/uct: check for support before disabling UCX memory hooks
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2019-05-15 13:49:10 -06:00