30358 Commits

Author SHA1 Message Date
Nathan Hjelm
f86f805be1 btl/vader: fix issues with xpmem registration invalidation
This commit fixes an issue discovered in the XPMEM registration cache. It
was possible for a registration to be invalidated by multiple threads
leading to a double-free situation or re-use of an invalidated registration.

This commit fixes the issue by setting the INVALID flag on a registration
when it will be deleted. The flag is set while iterating over the tree
to take advantage of the fact that a registration cannot be removed
from the VMA tree by a thread while another thread is traversing the VMA
tree.

References #6524
References #7030
Closes #6534

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-07 22:50:52 -07:00
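
The invalidation scheme described in f86f805be1 can be sketched roughly as follows. This is a minimal illustration, not the vader code: a linked list stands in for the VMA tree, and `struct reg`, `REG_FLAG_INVALID`, `reg_mark_invalid`, and `invalidate_range` are hypothetical names. It also assumes a C11 atomic fetch-or for setting the flag, which the commit message does not specify. The point shown is that only the thread that flips the flag "owns" the later cleanup, so a registration is never freed twice.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REG_FLAG_INVALID 0x1u   /* hypothetical flag value */

/* Hypothetical registration record; a list stands in for the VMA tree. */
struct reg {
    uintptr_t base;
    uintptr_t bound;
    atomic_uint flags;
    struct reg *next;
};

/* Set the INVALID flag. Returns true only for the one caller that
 * changed the flag from clear to set. */
static bool reg_mark_invalid(struct reg *r)
{
    unsigned old = atomic_fetch_or(&r->flags, REG_FLAG_INVALID);
    return (old & REG_FLAG_INVALID) == 0;
}

/* Mark every registration overlapping [base, bound] while traversing.
 * Returns how many registrations this caller claimed for cleanup. */
static size_t invalidate_range(struct reg *head, uintptr_t base, uintptr_t bound)
{
    size_t claimed = 0;

    assert(base <= bound);
    for (struct reg *r = head; r != NULL; r = r->next) {
        if (r->base <= bound && base <= r->bound && reg_mark_invalid(r)) {
            claimed++;
        }
    }
    return claimed;
}
```

A second call over the same range claims nothing, which is the property that rules out the double free in this sketch.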
Nathan Hjelm
1145abc0b7 opal: make interval tree resilient to similar intervals
There are cases where the same interval may be in the tree multiple
times. This generally isn't a problem when searching the tree but
may cause issues when attempting to delete a particular registration
from the tree. The issue is fixed by breaking a low value tie by
checking the high value then the interval data.

If the high, low, and data of a new insertion exactly matches an
existing interval then an assertion is raised.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-07 21:43:58 -07:00
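
The tie-breaking order described in 1145abc0b7 (low value, then high value, then interval data) can be sketched as a comparator. This is an illustrative sketch only: `struct interval`, `interval_compare`, and `interval_insert_direction` are hypothetical names rather than the OPAL interval tree API, and comparing the data by pointer value is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interval record; not the actual OPAL structure. */
struct interval {
    uintptr_t low;
    uintptr_t high;
    const void *data;
};

/* Order by low, then high, then data pointer.
 * Returns a negative value, zero, or a positive value. */
static int interval_compare(const struct interval *a, const struct interval *b)
{
    if (a->low != b->low) {
        return a->low < b->low ? -1 : 1;
    }
    if (a->high != b->high) {
        return a->high < b->high ? -1 : 1;
    }
    uintptr_t da = (uintptr_t) a->data;
    uintptr_t db = (uintptr_t) b->data;
    if (da != db) {
        return da < db ? -1 : 1;
    }
    return 0;
}

/* Direction to descend when inserting; an exact duplicate
 * (same low, high, and data) trips the assertion. */
static int interval_insert_direction(const struct interval *item,
                                     const struct interval *node)
{
    int cmp = interval_compare(item, node);
    assert(cmp != 0 && "duplicate interval");
    return cmp;
}
```

With a total order like this, deleting a particular registration can follow a single path through the tree even when several entries share the same low value.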
Jeff Squyres
a7e2bb44d5
Merge pull request #7264 from jsquyres/pr/linux-specfile-fixes
openmpi.spec: update modulefile_path behavior
2020-01-07 17:50:49 -05:00
bosilca
1b93a173ab
Merge pull request #7149 from bosilca/fix/datatype_overflow
Prevent overflow when dealing with datatype count.
2020-01-07 14:13:32 -05:00
Jeff Squyres
9caf43a9c2
Merge pull request #7271 from e-kwsm/update-man
Fix typo and update URLs (https, redirection) [skip ci]
2020-01-06 16:41:32 -05:00
Eisuke Kawashima
d26d4e1d63
Fix typo and update URLs (https, redirection) [skip ci]
Signed-off-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
2020-01-07 03:52:25 +09:00
Howard Pritchard
b17cd9049f
Merge pull request #7238 from rwespetal/ofi-prov-name-fix
mtl/ofi: ignore case when comparing provider names
2020-01-02 12:23:39 -07:00
Howard Pritchard
37b3e2f3fa make mpifort obey disable-wrapper-runpath
related to #6539

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-01-02 11:19:05 -08:00
Jeff Squyres
352e575e18 openmpi.spec: update modulefile_path behavior
Allow the user to override the modulefile_path (root directory to
install the Open MPI modulefile), even if install_in_opt==1.  For
example:

rpmbuild \
    --rebuild \
    --define 'install_in_opt 1' \
    --define 'modulefile_path /path/to/my/modulefiles/openmpi/%{version}' \
    openmpi-4.0.2-1.src.rpm

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-30 06:53:56 -08:00
Jeff Squyres
a2a9a9516b hwloc2: bump up to hwloc v2.1.0
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
c292e759da hwloc2: bump up to hwloc 2.0.4
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
e5722acc37 hwloc201: replace with "hwloc2" component+git submodule
Rename the component to be "hwloc2" (since it can now be any v2.x.y
version of hwloc), and make the embedded copy of hwloc be a git
submodule.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
c32eeb3eb9 autogen: add sanity checks for git submodules
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 13:15:45 -05:00
Artem Polyakov
3f11c8ef6c
Merge pull request #7249 from janjust/oshmem_atomic_set_fix
Oshmem atomic set fix
2019-12-20 06:47:50 -08:00
Jeff Squyres
569d63ce46
Merge pull request #7252 from jsquyres/pr/clarify-with-hwloc-functionality
hwloc: clarify --with-hwloc behavior
2019-12-19 15:29:57 -05:00
Tomislav Janjusic
2d8f9b1d09 oshmem/extended: Fix shmem_atomic_set for float and double.
Co-authored with: Artem Polyakov <artemp@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-12-19 21:15:41 +02:00
Tomislav Janjusic
cb5ff55b27 oshmem/ucx: fixed a build issue
Co-authored with: Artem Polyakov <artemp@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-12-19 21:14:54 +02:00
Jeff Squyres
18c3e1af5e hwloc: clarify --with-hwloc behavior
Clarify in README what --with-hwloc does in its different use cases.

Also, ensure that the behavior when specifying `--with-hwloc` is the
same as if that option is not specified at all.  This is what we did
in Open MPI <= v3.x; looks like we inadvertently caused `--with-hwloc`
to be synonymous with `--with-hwloc=external` in v4.0.0.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-19 08:38:57 -08:00
Jeff Squyres
8b424c3863
Merge pull request #7232 from bosilca/hjelmn_neighbor_alltoall_fix
Neighbor alltoall fix
2019-12-17 17:24:05 -05:00
Robert Wespetal
9b72e9465d mtl/ofi: ignore case when comparing provider names
Change the provider include and exclude list name comparison check to
ignore case. The UDP provider's name is uppercase and was being selected
despite being in the exclude list.

Signed-off-by: Robert Wespetal <wesper@amazon.com>
2019-12-16 13:05:00 -08:00
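
The case-insensitive check described in 9b72e9465d could look roughly like the sketch below. `name_in_list` is a hypothetical helper, not the function used in mtl/ofi, and the use of POSIX `strcasecmp` is an assumption about how such a comparison might be written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Return true if name matches any entry of a NULL-terminated list,
 * ignoring case. Hypothetical helper for illustration only. */
static bool name_in_list(const char *name, const char *const *list)
{
    assert(name != NULL && list != NULL);
    for (size_t i = 0; list[i] != NULL; i++) {
        if (strcasecmp(name, list[i]) == 0) {
            return true;
        }
    }
    return false;
}
```

With a case-sensitive `strcmp`, an exclude list containing `udp` would not match a provider that reports its name as `UDP`, which is the failure the commit describes.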
George Bosilca
97eb5d0cf2 Remove unused variable.
It got left over during the Fortran rework.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-12-11 16:59:50 -05:00
George Bosilca
86acdee460
Fix the communication ordering for all cartesian neighbor collectives.
This work is rooted in the [MPI Forum issue
153](https://github.com/mpi-forum/mpi-issues/issues/153).

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2019-12-11 12:40:38 -05:00
Todd Kordenbrock
1af6dbe277
Merge pull request #7066 from tkordenbrock/topic/master/portals4.fix.flowcontrol.bugs
portals4: fix flow control bugs
2019-12-11 06:31:26 -06:00
Jeff Squyres
cf9a5fb06c
Merge pull request #7226 from mcoil1/pr/fix-memory_patcher
memory/patcher: fix compiler warning
2019-12-09 07:26:31 -05:00
Jeff Squyres
73ecece55d
Merge pull request #7211 from mcoil1/pr/fixing-compiler-warnings
Fix a few compiler warnings
2019-12-09 07:26:15 -05:00
Jeff Squyres
fb04596892
Merge pull request #7225 from wbailey2/pr/fix-fcoll_two_phase_support
fcoll/two_phase_support: warning stomp
2019-12-08 14:09:10 -05:00
Maxwell Coil
52241dbbcd libnbc: fixed uninitialized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-08 14:03:48 -05:00
Maxwell Coil
3ced33c2eb ompi/dpm/dpm.c: Fix uninitialized variable
Squash compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-08 14:03:48 -05:00
Maxwell Coil
52a9cce6f3 memory/patcher: fix compiler warning
syscall() returns a long, but we are invoking shmat(), which returns
a void*.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-08 13:56:00 -05:00
William Bailey
e2718e0196 fcoll/two_phase: Compiler warning for wrong variable type used
Squash compiler warning. Changed output specifier to match variable type (long int -> long long int).

Signed-off-by: William Bailey <wbailey2@nd.edu>
2019-12-08 13:15:30 -05:00
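
The kind of fix described in e2718e0196 (matching the output specifier to the variable type) can be illustrated with a small sketch. `format_offset` and the message text are hypothetical and not taken from fcoll/two_phase; the sketch only shows `%lld` paired with a `long long int` argument.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a long long value with the matching %lld specifier.
 * Using %ld here would trigger a -Wformat warning and is undefined
 * behavior on platforms where long and long long differ in size. */
static int format_offset(char *buf, size_t len, long long int offset)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "offset=%lld", offset);
}
```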
Jeff Squyres
cdf46e6682
Merge pull request #7222 from jsquyres/pr/mpool-basic-pointer-fix
mpool/base: fix basic mpool_base() function
2019-12-06 13:22:49 -05:00
Jeff Squyres
37f5079c12
Merge pull request #7219 from mcoil1/pr/romio_fix
romio: Update ADIOI_R_Exchange_data function
2019-12-05 19:30:55 -05:00
Jeff Squyres
53ebea12aa mpool/base: fix basic mpool_base() function
The prior implementation was simply wrong.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-05 18:57:37 -05:00
Maxwell Coil
8c237e2684 romio: Update ADIOI_R_Exchange_data function
Squash compiler warning due to whitespace/brace problems.

The code block from lines 829-839 was improperly indented, which led to
both the code being confusing and a compiler warning. Comparing this code to
the current version in the MPICH repo made it clear that the code was simply
improperly indented. Fixing the indentation both makes the code readable and
squashes the compiler warning.

Signed-off-by: Maxwell Coil <mcoil@nd.edu>
2019-12-05 18:24:02 -05:00
Jeff Squyres
9dd34294db
Merge pull request #7172 from wbailey2/master
Nidmap.c and onesided_aggregation.c Compiler Warnings fix
2019-12-03 11:01:12 -05:00
William Bailey
30bda56bce romio: fix uninitialized variable
Squash compiler warning.

ROMIO is third-party software but has an annoying compiler warning;
this is the minimum distance fix.

Signed-off-by: William Bailey <wbailey2@nd.edu>
2019-12-02 17:35:05 -05:00
William Bailey
f03e2f5e0c orte/util/nidmap.c: fix uninitialized variable
Squash compiler warning.

Signed-off-by: William Bailey <wbailey2@nd.edu>
2019-12-02 17:29:58 -05:00
William Zhang
a471f8749f reachable: Update documentation on reachable function
We have decided to show an interface as reachable to itself. This is
consistent with the previous netmask logic when
reachable. This is consistent with the previous netmask logic when
determining reachability.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-12-02 18:04:40 +00:00
William Zhang
ce40436895 reachable/netlink: Show an interface as reachable to itself
Due to the way netlinks detects reachability, it will not show an
interface as reachable to itself, even if it can pass through a loopback
interface. To maintain similar behavior with netmasks, we display an
interface as reachable to itself.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-12-02 18:04:40 +00:00
Artem Polyakov
ff480706ca
Merge pull request #7065 from janjust/master
oshmem: fix race condition on new contexts
2019-11-30 11:07:10 -08:00
Joseph Schuchart
ee80babe5c Remove unused opal_shmem_seg_hdr_t to retain alignment
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-11-29 08:40:14 +01:00
Gilles Gouaillardet
40f2ec9abb
Merge pull request #7205 from ggouaillardet/topic/configury_shortfloat
configury: fix a typo in mpiext/shortfloat
2019-11-28 19:59:33 -07:00
Gilles Gouaillardet
967cf68027 configury: fix a typo in mpiext/shortfloat
remove an extra and useless comma

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-11-29 11:09:54 +09:00
Howard Pritchard
032a3961bc
Merge pull request #7184 from hppritcha/topic/support_for_cray_fortran
cray ftn: modify fortran module loc checker
2019-11-27 16:09:35 -07:00
Jeff Squyres
37d70a6213
Merge pull request #7153 from mcoil1/pr/fix-misleading-error-message
fix misleading error message with missing #! interpreter
2019-11-27 06:53:08 -05:00
Brice Goglin
ea80a20e10 hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 2.0
Both opal_hwloc_base_get_relative_locality() and _get_locality_string()
iterate over hwloc levels to build the proc locality information.
Unfortunately, NUMA nodes are not in those normal levels anymore since 2.0.
We have to explicitly look at the special NUMA level to get that locality info.

I am factorizing the core of the iterations inside dedicated "_by_depth"
functions and calling them again for the NUMA level at the end of the loops.

Thanks to Hatem Elshazly for reporting the NUMA communicator split failure
at https://www.mail-archive.com/users@lists.open-mpi.org/msg33589.html

It looks like only the opal_hwloc_base_get_locality_string() part is needed
to fix that split, but there's no reason not to fix get_relative_locality()
as well.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2019-11-27 12:41:33 +01:00
Josh Hursey
b50d568004
Merge pull request #7191 from gvallee/returncode_check
Add the missing code to check a return code
2019-11-26 09:02:23 -06:00
Josh Hursey
f828990bcf
Merge pull request #7192 from gvallee/contiaing_typo
Fix typo in comment: contiaing -> containing
2019-11-25 14:40:08 -06:00
Edgar Gabriel
eaf0828643
Merge pull request #7104 from edgargabriel/topic/gpfs
Topic/gpfs
2019-11-25 12:32:11 -06:00
raafatfeki
7b2d83c898 mca/fs: Remove unused functions and prototypes & reduce recurrent code through all components
I removed the implementation and/or prototypes of all unused functions defined for all components.
To reduce recurrent code, I created functions under base for managing error codes and for setting file permissions and amode.
Then, I replaced the recurrent code with calls to those functions in all components.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>

add a missing header file

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2019-11-25 09:01:38 -06:00