
5254 Commits

Author SHA1 Message Date
Jordan Cherry
d7e7e3acb7 tcp btl: Fix multiple-link connection establishment.
Fix the case where the btl_tcp_links MCA parameter is used to create multiple TCP connections between peers.
    Three issues were resulting in hangs during large message transfers:
      * The 2nd..btl_tcp_links connections were dropped during establishment because the per-process
        address check was binary rather than a count.
      * The accept handler would not skip a btl module that was already in use, resulting in all
        connections for a given address being vectored to a single btl.
      * Multiple addresses in the same subnet caused connections to
        stall, as the receiver would always use the same (first) address
        found. Binding the outgoing connection solves this issue.
      * Lastly, fix a race condition created by connections being started at the exact same time
        by accepting connections not in the closed state, allowing endpoint_accept to resolve
        the dispute.

    Signed-off-by: Jordan Cherry <cherryj@amazon.com>
2018-02-27 16:36:44 +00:00
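
The first issue in the commit above (a binary per-address check instead of a count) can be illustrated with a minimal sketch. This is not the code from the tcp btl: `peer_addr_t`, `links_in_use`, and `try_claim_link` are hypothetical names, and `max_links` merely stands in for the value of the btl_tcp_links parameter.

```c
#include <stdbool.h>

/* Hypothetical per-address bookkeeping; not the actual tcp btl structure. */
typedef struct {
    unsigned links_in_use;   /* connections already established to this address */
} peer_addr_t;

/*
 * A binary "address already used" flag would reject the second connection.
 * Counting lets up to max_links connections share one address.
 */
bool try_claim_link(peer_addr_t *addr, unsigned max_links)
{
    if (addr->links_in_use >= max_links) {
        return false;        /* all links for this address are taken */
    }
    addr->links_in_use++;
    return true;
}
```

With `max_links` set to 2, the first two calls for an address succeed and the third is refused, whereas a boolean flag would already refuse the second.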
Nathan Hjelm
5380d7cce5 mpool/hugepage: add missing header
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-26 13:35:56 -07:00
Nathan Hjelm
38d9b10db8 rcache/base: update VMA tree to use opal_interval_tree_t
This commit replaces the current VMA tree implementation with one that
uses the new opal_interval_tree_t class. Since the VMA tree lock is no
longer used, this commit also updates rcache/grdma and btl/vader to
take better care when searching for existing registrations.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-26 13:35:56 -07:00
Nathan Hjelm
7163fc98a0 opal/class: add a new class: opal_interval_tree_t
This commit adds a new class to opal: opal_interval_tree_t. This is a
thread-safe implementation of a 1-dimensional interval tree. The data
structure is intended to provide a faster implementation of the
registration cache VMA tree.

The thread safety is provided by a relativistic red-black tree
implementation. This structure provides support for multiple readers
and a single writer. There is one caveat: an item may appear in the tree
twice while the tree is being updated. Care needs to be taken to avoid
issues associated with this "feature". I don't anticipate a problem
with the current VMA tree usage.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-26 13:35:56 -07:00
Ralph Castain
60e6440603 Sync to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-02-19 09:20:13 -08:00
Jeff Squyres
b452991ad8 btl/usnic: missed a preprocessor check in d36648b
Missed updating one instance of `==` to `>=` in d36648b.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-02-19 07:03:05 -08:00
Jeff Squyres
d36648b547 btl/usnic: update BTL_VERSION handling
Follow-on to 8097d09858: now that BTL_VERSION is defined in btl.h, be
a little smarter about whether we define it or not.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-02-16 13:20:36 -08:00
Ralph Castain
8097d09858 Silence usnic warnings - BTL version has changed
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-02-16 10:00:18 -08:00
Nathan Hjelm
9aa21f4467
Merge pull request #4796 from hjelmn/btl_v3.1
opal/btl: add support for flushing RDMA/atomic operations
2018-02-15 12:44:18 -07:00
Nathan Hjelm
072a6a4850 opal/btl: add support for flushing RDMA/atomic operations
This commit adds a new optional function to the BTL module:
btl_flush. This function takes an optional BTL endpoint. When called,
this function completes all outstanding RDMA and atomic operations
started prior to the call to btl_flush.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-13 12:49:51 -07:00
Gilles Gouaillardet
9121eb4ff9 opal/lifo: fix an ABA problem in opal_lifo_pop_atomic
that was introduced in open-mpi/ompi@11bb8b09a0

Fixes open-mpi/ompi#4784

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-02-09 14:48:54 +09:00
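
For readers unfamiliar with the ABA problem named in the commit above, here is a minimal sketch of one common mitigation: pairing the head of a lock-free stack with a counter so that a compare-and-swap fails when the head was popped and re-pushed in between. It is not necessarily the approach taken in opal_lifo_pop_atomic. The index-based pool and all names (`tagged_lifo_t`, `tagged_lifo_push`, `tagged_lifo_pop`, `NIL`) are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

#define POOL_SIZE 16
#define NIL UINT32_MAX            /* marks an empty stack / end of list */

typedef struct {
    _Atomic uint32_t next;        /* index of the node below this one */
    int value;
} node_t;

typedef struct {
    _Atomic uint64_t head;        /* low 32 bits: top index, high 32 bits: pop counter */
    node_t pool[POOL_SIZE];
} tagged_lifo_t;

static uint64_t pack(uint32_t idx, uint32_t tag)
{
    return ((uint64_t)tag << 32) | idx;
}

void tagged_lifo_init(tagged_lifo_t *l)
{
    atomic_init(&l->head, pack(NIL, 0));
}

void tagged_lifo_push(tagged_lifo_t *l, uint32_t idx)
{
    uint64_t old = atomic_load(&l->head);
    do {
        /* Link the new node to the current top before publishing it. */
        atomic_store(&l->pool[idx].next, (uint32_t)old);
    } while (!atomic_compare_exchange_weak(&l->head, &old,
                                           pack(idx, (uint32_t)(old >> 32))));
}

uint32_t tagged_lifo_pop(tagged_lifo_t *l)
{
    uint64_t old = atomic_load(&l->head);
    uint32_t idx, next;
    do {
        idx = (uint32_t)old;
        if (idx == NIL) {
            return NIL;           /* stack is empty */
        }
        next = atomic_load(&l->pool[idx].next);
        /* Bumping the counter makes a stale CAS fail if another thread
         * popped in between, even when the same index is back on top. */
    } while (!atomic_compare_exchange_weak(&l->head, &old,
                                           pack(next, (uint32_t)(old >> 32) + 1)));
    return idx;
}
```

Without the counter, a thread that read `next` from a node could succeed in its CAS after the node had been popped and pushed again with a different successor, installing a stale `next` as the new head.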
Ralph Castain
1a7dfd7d54 Sync to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-02-07 12:16:51 -08:00
Ralph Castain
9fe8153d38 Sync to IOF branch and continue fix of request for job info from unknown nspace
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 02400d30d79ce3c7e7e28f9a08f7062a5b6f4c51)
2018-02-03 19:56:35 -08:00
Howard Pritchard
1adf0873f7
Merge pull request #4766 from hppritcha/topic/squash_grdma_comp_warning
rcache/grdma: squash a compiler warning
2018-02-02 18:58:32 -07:00
Gilles Gouaillardet
43700faba1 pmix/ext3x: remove autogenerated ext3x.h header file
This header file was meant to be autogenerated and, for
some reason, was never removed from the repository.
Update .gitignore as well.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 23:45:42 +09:00
Gilles Gouaillardet
8209fca842 pmix/ext3x: bring external component up-to-date with the embedded pmix3x
add the callback prototype for the upcoming PMIx_IOF_push() API

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:35:34 +09:00
Gilles Gouaillardet
0481277e93 pmix/ext3x: bring external component up-to-date with the embedded pmix3x
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:33:33 +09:00
Nathan Hjelm
bb212e0c94
Merge pull request #4767 from ggouaillardet/topic/vader_backing_file
btl/vader: make the backing file job specific
2018-01-30 21:27:02 -07:00
Gilles Gouaillardet
0285c63348 pmix/ext3x: generate component source when only static libraries are built
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:21:14 +09:00
Gilles Gouaillardet
611d7c2d27 btl/vader: make the backing file job specific
Since open-mpi/ompi@47fd2313ab
the backing file is now in /dev/shm by default. As a consequence,
the backing file name has to include the jobid so more than one job
can run at a time.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-30 16:52:51 +09:00
Howard Pritchard
c3cac6731f rcache/grdma: squash a compiler warning
tired of seeing this compiler warning

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2018-01-29 11:50:01 -07:00
Ralph Castain
a17df810ed Sync with PMIx iof rfc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 10:51:38 -08:00
Ralph Castain
e9cd7fd7e6 Update orte
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:53:43 -08:00
Ralph Castain
9fb80bd239 Update the opal/pmix base framework elements
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:37:52 -08:00
Ralph Castain
187352eb3d Update the PMIx external components
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:35:57 -08:00
Ralph Castain
a5679ef000 Update the PMIx 3.x component
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:34:44 -08:00
Yossi Itigin
7cee60346e opal_progress: check timer only once per 8 calls
Reading the system clock on every call to opal_progress() is an
expensive operation on most architectures, and it can hurt
performance, for example in message-rate benchmarks.

We change opal_progress() to read the clock once per 8 calls, unless
there are active users of the event mechanism.

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-01-16 19:18:53 +02:00
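
The idea in the commit above, reading the clock only on every 8th call unless the event mechanism has active users, can be sketched as follows. This is an illustrative sketch, not the opal_progress() source: `progress_state_t`, `active_event_users`, and `should_read_clock` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state; the real opal_progress() keeps its own globals. */
typedef struct {
    uint32_t calls;               /* number of progress calls so far */
    int      active_event_users;  /* > 0 when someone relies on timely events */
} progress_state_t;

/* Returns true when this call should pay for a clock read. */
bool should_read_clock(progress_state_t *st)
{
    /* The low three bits are zero on every 8th call. */
    bool periodic = ((st->calls++ & 0x7) == 0);
    return st->active_event_users > 0 || periodic;
}
```

Masking with `0x7` avoids a division, and the check for active users preserves the old behavior whenever someone depends on the event loop being polled promptly.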
Ralph Castain
6216225bda Ensure cleanup of registered files/dirs
Resolve a race condition between registering for a file to be removed upon termination and actual creation of that file by providing attributes that identify whether the path is a file or directory. This removes the need for PMIx to detect the difference.

Refs #4686

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-11 11:05:30 -08:00
Ralph Castain
6dacf40a8c Ensure the epilog gets executed in PMIx server
If we abnormally terminate, then we still want any cleanups to be
executed.

Remove debug

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-10 18:28:05 -08:00
Gilles Gouaillardet
1a17cb3b1c opal/datatype: add opal_datatype_is_monotonic()
Return true if the datatype's displacements are non-negative and
monotonically nondecreasing, and false otherwise.

Thanks George for the guidance.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-09 18:05:14 +09:00
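
The property checked by the commit above can be sketched on a plain array of displacements. This only illustrates the predicate; it does not walk an opal_datatype_t description, and `displacements_are_monotonic` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * True when every displacement is non-negative and no displacement is
 * smaller than the one before it. An empty list counts as monotonic.
 */
bool displacements_are_monotonic(const ptrdiff_t *disp, size_t count)
{
    ptrdiff_t prev = 0;   /* starting at 0 also rejects a negative first entry */
    for (size_t i = 0; i < count; i++) {
        if (disp[i] < prev) {
            return false;
        }
        prev = disp[i];
    }
    return true;
}
```

For example, the displacements 0, 4, 4, 16 pass the check, while 8, 4 (decreasing) and -4, 0 (negative) do not.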
Ralph Castain
d620070c77 Correct the comment in the default MCA param template - we do not support a param called "component_path". The correct syntax is "mca_base_component_path"
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-05 08:46:44 -08:00
Nathan Hjelm
8b8aae372d opal/asm: add atomic min/max convenience functions
This commit adds atomic functions for min/max.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-01-02 08:38:36 -07:00
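
An atomic min/max is typically built from a compare-and-swap loop. The sketch below uses C11 atomics rather than the opal atomic API, and the names `atomic_fetch_min_32` and `atomic_fetch_max_32` are hypothetical, not the functions added by the commit above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Store min(*addr, value) atomically; returns the value seen before the update. */
int32_t atomic_fetch_min_32(_Atomic int32_t *addr, int32_t value)
{
    int32_t old = atomic_load(addr);
    /* Retry only while our value would still lower the stored one. */
    while (value < old && !atomic_compare_exchange_weak(addr, &old, value)) {
        /* on failure, old has been refreshed with the current contents */
    }
    return old;
}

/* Store max(*addr, value) atomically; returns the value seen before the update. */
int32_t atomic_fetch_max_32(_Atomic int32_t *addr, int32_t value)
{
    int32_t old = atomic_load(addr);
    while (value > old && !atomic_compare_exchange_weak(addr, &old, value)) {
        /* on failure, old has been refreshed with the current contents */
    }
    return old;
}
```

The loop exits without writing when another thread has already stored a value that is at least as small (or as large), so no unnecessary store is performed.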
Gilles Gouaillardet
125169f057 opal/bitmap: fix opal_bitmap_set_bit()
Correctly reallocate the bitmap when needed

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-27 14:56:43 +09:00
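
The commit above does not say what was wrong with the reallocation, so the following is only a generic sketch of a set-bit routine that grows its storage on demand. `bitmap_t` and `bitmap_set_bit` are hypothetical and unrelated to the opal_bitmap_t layout.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint64_t *words;    /* bit storage, 64 bits per word */
    size_t    nwords;   /* number of allocated words */
} bitmap_t;

/* Sets the given bit, growing the storage if needed. Returns 0 or -1. */
int bitmap_set_bit(bitmap_t *bm, size_t bit)
{
    size_t word = bit / 64;

    if (word >= bm->nwords) {
        /* Double until the target word fits. */
        size_t new_n = bm->nwords ? bm->nwords : 1;
        while (new_n <= word) {
            new_n *= 2;
        }
        uint64_t *tmp = realloc(bm->words, new_n * sizeof(uint64_t));
        if (tmp == NULL) {
            return -1;  /* the old buffer is still valid and unchanged */
        }
        /* realloc does not zero the new tail, so clear it explicitly. */
        memset(tmp + bm->nwords, 0, (new_n - bm->nwords) * sizeof(uint64_t));
        bm->words  = tmp;
        bm->nwords = new_n;
    }

    bm->words[word] |= (uint64_t)1 << (bit % 64);
    return 0;
}
```

Two details are easy to get wrong in this kind of routine: sizing the new buffer from the requested bit rather than by a fixed increment, and zeroing only the newly added words.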
Nathan Hjelm
39d598899b rcache/grdma: fix crash when part of a registration is unmapped
This commit fixes an issue when a registration is created for a large
region and then invalidated while part of it is in use.

References #4509

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-22 10:36:35 -07:00
Ralph Castain
d5471d7898 Silence warnings in optimized build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-20 12:00:28 -08:00
Ralph Castain
db8ebd33ad Fix the optnone attribute, add extension attribute
See how the various compilers handle these

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 19:18:53 -08:00
Nathan Hjelm
47fd2313ab btl/vader: move backing files into /dev/shm on Linux
This commit moves the backing files to /dev/shm to avoid limitations
that may be set on /tmp. The files are registered with pmix to ensure
they are cleaned up after an erroneous exit.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 48101278160672317ade352365592f56ef3b8977)
2017-12-18 07:09:18 -08:00
Ralph Castain
07427c6d89 Update to PMIx v3.0 PR for cleanup registration
If available, have apps use the registration capability to clean up their session directories. Set up the capability for vader to register its shared memory file location - let someone familiar with that code do so.

Final cleanup to track uid/gid, update the opal/pmix API to pass flags for ignore and leave top directory alone

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 06:53:11 -08:00
Ralph Castain
5c4185abd8 Add the __optnone__ attribute to help avoid optimizing out MPIR_Breakpoint
Thanks to @kiranchandramohan for the suggestion

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-14 13:14:21 -08:00
Nathan Hjelm
d3fa1bbbb0 rcache/grdma: try to prevent erroneous free error messages
It is possible to have parts of an in-use registered region be passed
to munmap or madvise. This does not necessarily mean the user has made
an error but does mean the entire region should be invalidated. This
commit checks that the munmap or madvise base matches the beginning of
the cached region. If it does and the region is in-use then we print
an error. There will certainly be false negatives where a user
unmaps something that really is in use, but that is preferable to a
false positive.

References #4509

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-12 09:18:39 -07:00
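
The heuristic described in the commit above can be sketched as two small predicates: any overlap means the cached region must be invalidated, but an error is reported only when the released base matches the start of an in-use region. The structure and function names here (`cached_region_t`, `release_overlaps`, `should_report_in_use_free`) are hypothetical and not taken from rcache/grdma.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of a cached registration. */
typedef struct {
    uintptr_t base;        /* start address of the registered region */
    size_t    length;      /* size of the region in bytes */
    int       ref_count;   /* > 0 while the registration is in use */
} cached_region_t;

/* Any overlap with a munmap/madvise range means the region is stale. */
bool release_overlaps(const cached_region_t *r, uintptr_t base, size_t len)
{
    return base < r->base + r->length && r->base < base + len;
}

/* Report an error only when the release starts exactly at the region base. */
bool should_report_in_use_free(const cached_region_t *r, uintptr_t base)
{
    return base == r->base && r->ref_count > 0;
}
```

A release that begins in the middle of an in-use region still invalidates it, but stays silent, which is the false-negative trade-off the commit message accepts.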
Nathan Hjelm
a82f761a4a btl/vader: change the way fast boxes are used
There were multiple paths that could lead to a fast box
allocation. One of them made little sense (in-place send) so it has
been removed to allow a rework of the fast-box send function. This
should fix a number of issues with hanging/crashing when using the
vader btl.

References #4260

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-11 10:38:33 -07:00
Gilles Gouaillardet
11e5f86bf8 mpool/base: plug a memory leak
set the key of all mpool_tree_item objects, so they can be retrieved
in mpool_base_free and then returned back to the
mca_mpool_base_tree_item_free_list free list.

Refs. open-mpi/ompi#4567

Thanks Philip Blakely for the bug report.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-07 09:06:25 +09:00
Nathan Hjelm
2e74befa13
Merge pull request #4565 from benmenadue/master
Use malloc instead of posix_memalign for small (<= sizeof(void *)) alignments
2017-12-05 16:35:24 -07:00
Nathan Hjelm
ad59b93266
Merge pull request #4566 from kawashima-fj/pr/arm64-atomic
opal/asm/arm64: Fix `opal_atomic_compare_exchange_*` bug
2017-12-05 16:34:51 -07:00
Nathan Hjelm
8e0e184bc9 opal/asm: fix compilation of 128-bit compare-exchange with gcc7
This commit removes eax and edx from the clobber list. Older versions
of gcc handled these ok but gcc 7 does not. They are not required as
eax and edx are specified in output constraints.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-05 15:42:47 -07:00
Ben Menadue
90fa8af10b Use correct alignment request in mca_mpool_base_alloc.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2017-12-06 07:02:17 +11:00
Nathan Hjelm
641bdc4ab7 opal/asm: fix 32-bit build
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-12-05 11:49:13 -07:00
KAWASHIMA Takahiro
08254e8b12 opal/asm/arm64: Fix opal_atomic_compare_exchange_* bug
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-12-05 15:57:29 +09:00
Ben Menadue
db3e25edad Update mca_mpool_base_alloc to use malloc instead of posix_memalign for alignment requests of <= sizeof(void *). This works around issue #4564.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2017-12-05 09:51:31 +11:00
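
The workaround in the commit above follows from how posix_memalign is specified: the alignment must be a power of two and a multiple of sizeof(void *), so smaller requests fail with EINVAL. A minimal sketch of the fallback, with a hypothetical `alloc_aligned` helper rather than the real mca_mpool_base_alloc, might look like this:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign under strict -std modes */
#endif
#include <stdlib.h>

/*
 * Allocate size bytes with at least the requested alignment.
 * malloc already returns memory suitably aligned for any fundamental
 * type, which covers alignments up to sizeof(void *).
 */
void *alloc_aligned(size_t size, size_t align)
{
    void *ptr = NULL;

    if (align <= sizeof(void *)) {
        return malloc(size);
    }
    if (posix_memalign(&ptr, align, size) != 0) {
        return NULL;              /* EINVAL or ENOMEM */
    }
    return ptr;
}
```

Memory from either path is released with `free`, so callers do not need to know which branch was taken.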
Matias Cabral
2c86b8723d
Merge pull request #4510 from matcabral/mtl_psm2_shadow_vars
New flag for MCA parameters that allows them to behave as if they had a default value of "unset".
2017-12-04 12:25:37 -08:00