Commit graph

30340 commits

Author SHA1 Message Date
Tomislav Janjusic
3d6bf9fd8e oshmem/ucx: improves spml ucx performance for multi-threaded
applications.

Improves multi-threaded performance by adding the option to create
multiple ucx workers in threaded applications.

Co-authored with:
Artem Y. Polyakov <artemp@mellanox.com>,
Manjunath Gorentla Venkata <manjunath@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-01-22 21:41:09 +02:00
Artem Polyakov
611a2bdf68
Merge pull request #7273 from janjust/master-oshmem-perf-progress
oshmem/ucx: Fix progress in iput/iget: periodically poke progress to prevent hardware stalls when using DCT transport.
2020-01-22 11:39:57 -08:00
Sylvain Didelot
01ae0c22b8 fs/ime: fix compilation errors due to missing header inclusion
OpenMPI doesn't compile anymore with IME because the header
file "ompi/mca/fs/base/base.h" needs to be included in every
file where mca_fs_base_get_mpi_err() is used.

Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>
2020-01-22 17:56:55 +01:00
Tomislav Janjusic
1b58e3d073 oshmem/ucx: Improves performance for non-blocking put/get operations.
Improves the performance when excess non-blocking operations are posted
by periodically calling progress on ucx workers.

Co-authored with:
Artem Y. Polyakov <artemp@mellanox.com>,
Manjunath Gorentla Venkata <manjunath@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-01-22 00:59:26 +02:00
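The idea in commit 1b58e3d073 (poke progress periodically while non-blocking operations pile up) can be sketched in C as below. This is only an illustration: `sketch_worker_t`, `sketch_post_nb()` and the threshold of 4 are hypothetical names and values, not taken from the oshmem/ucx code, and the real progress call on a UCX worker is replaced by a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a UCX worker; it only counts progress calls. */
typedef struct {
    unsigned posted_since_progress; /* non-blocking ops posted since last progress */
    unsigned progress_calls;        /* how many times progress was invoked */
} sketch_worker_t;

/* Assumed threshold; the commit message does not state the real value. */
#define SKETCH_PROGRESS_THRESHOLD 4u

/* Stand-in for calling progress on the worker. */
static void sketch_progress(sketch_worker_t *w)
{
    w->progress_calls++;
    w->posted_since_progress = 0;
}

/* Post one non-blocking operation; poke progress once enough have piled up. */
void sketch_post_nb(sketch_worker_t *w)
{
    assert(w != NULL);
    /* ... the actual non-blocking put/get would be issued here ... */
    if (++w->posted_since_progress >= SKETCH_PROGRESS_THRESHOLD) {
        sketch_progress(w);
    }
}
```

With this threshold, ten consecutive posts trigger progress twice and leave two operations counted toward the next call.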
Jeff Squyres
d14d8ad30d
Merge pull request #7326 from jsquyres/pr/make-opal-output-safe-before-opal-init
opal: ensure opal_gethostname() always returns a value
2020-01-21 16:14:14 -05:00
Jeff Squyres
f84a692af0
Merge pull request #7324 from jsquyres/pr/advance-hwloc-submodule
hwloc2: advance hwloc git submodule
2020-01-21 14:03:08 -05:00
William Zhang
e958f3cf22 btl tcp: Use reachability and graph solving for global interface matching
Previously we used a fairly simple algorithm in
mca_btl_tcp_proc_insert() to pair local and remote modules. This was a
point in time solution rather than a global optimization problem (where
global means all modules between two peers). The selection logic would
often fail due to pairing interfaces that are not routable for traffic.
The complexity of the selection logic was Θ(n^n), which was expensive.
Due to poor scalability, this logic was only used when the number of
interfaces was less than MAX_PERMUTATION_INTERFACES (default 8). More
details can be found in this ticket:
https://svn.open-mpi.org/trac/ompi/ticket/2031 (The complexity estimates
in the ticket do not match what I calculated from the function)
As a fallback, when interfaces surpassed this threshold, a brute force
O(n^2) double for loop was used to match interfaces.

This commit solves two problems. First, the point-in-time solution is
turned into a global optimization solution. Second, the reachability
framework was used to create a more realistic reachability map. We
switched from using IP/netmask to using the reachability framework,
which supports route lookup. This will help many corner cases as well as
utilize any future development of the reachability framework.

The solution implemented in this commit has a complexity mainly derived
from the bipartite assignment solver. If the local and remote peer both
have the same number of interfaces (n), the complexity of matching will
be O(n^5).

With the decrease in complexity to O(n^5), I calculated and tested
that initialization costs would be 5000 microseconds with 30 interfaces
per node (Likely close to the maximum realistic number of interfaces we
will encounter). For additional datapoints, up to 300 interfaces (a very
unrealistic number) were simulated. Up until 150
interfaces, the matching costs will be less than 1 second, climbing to
10 seconds with 300 interfaces. Reflecting on these results, I removed
the suboptimal O(n^2) fallback logic, as it no longer seems necessary.

Data was gathered comparing the scaling of initialization costs with
ranks. For a low number of interfaces, the impact of initialization is
negligible. At an interface count of 7-8, the new code has slightly
faster initialization costs. At an interface count of 15, the new code
has slower initialization costs. However, all initialization costs
scale linearly with the number of ranks.

In order to use the reachable function, we populate local and remote
lists of interfaces. We then convert the interface matching problem
into a graph problem. We create a bipartite graph with the local and
remote interfaces as vertices and use negative reachability weights as
costs. Using the bipartite assignment solver, we generate the matches
for the graph. To ensure that both the local and remote processes produce
the same output, we mirror their respective inputs for the
graphs. Finally, we store the endpoint matches that we created earlier
in a hash table. This is stored with the btl_index as the key and a
struct mca_btl_tcp_addr_t* as the value. This is then retrieved during
insertion time to set the endpoint address.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-01-21 18:24:08 +00:00
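The matching step described in commit e958f3cf22 (interfaces as vertices of a bipartite graph, negative reachability weights as costs, an assignment solver producing the pairs) can be illustrated with a small C sketch. The commit message does not say which algorithm the bipartite assignment solver uses, so the sketch swaps in the textbook Hungarian method with potentials, which runs in O(n^3) and is not claimed to be the solver behind the O(n^5) figure above. All names (`sketch_assign`, `sketch_match_example`) and the weight values are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Minimum-cost perfect assignment on an n x n row-major cost matrix
 * (Hungarian method with potentials). On return, match[local] holds the
 * remote index assigned to that local index. Returns the total cost, or
 * INT_MAX if memory allocation fails.
 */
int sketch_assign(int n, const int *cost, int *match)
{
    assert(n > 0 && cost != NULL && match != NULL);
    size_t len = (size_t)n + 1;          /* arrays are 1-based; index 0 is a sentinel */
    int *u = calloc(len, sizeof *u);     /* row potentials */
    int *v = calloc(len, sizeof *v);     /* column potentials */
    int *p = calloc(len, sizeof *p);     /* p[j] = row currently matched to column j */
    int *way = calloc(len, sizeof *way); /* back-pointers along the augmenting path */
    int *minv = calloc(len, sizeof *minv);
    char *used = calloc(len, 1);
    int total = INT_MAX;

    if (u && v && p && way && minv && used) {
        for (int i = 1; i <= n; i++) {
            int j0 = 0;
            p[0] = i;
            for (int j = 0; j <= n; j++) { minv[j] = INT_MAX; used[j] = 0; }
            do {
                int i0 = p[j0], delta = INT_MAX, j1 = 0;
                used[j0] = 1;
                for (int j = 1; j <= n; j++) {
                    if (used[j]) continue;
                    int cur = cost[(i0 - 1) * n + (j - 1)] - u[i0] - v[j];
                    if (cur < minv[j]) { minv[j] = cur; way[j] = j0; }
                    if (minv[j] < delta) { delta = minv[j]; j1 = j; }
                }
                for (int j = 0; j <= n; j++) {
                    if (used[j]) { u[p[j]] += delta; v[j] -= delta; }
                    else { minv[j] -= delta; }
                }
                j0 = j1;
            } while (p[j0] != 0);
            /* Flip the matching along the augmenting path. */
            do { int j1 = way[j0]; p[j0] = p[j1]; j0 = j1; } while (j0 != 0);
        }
        total = 0;
        for (int j = 1; j <= n; j++) {
            match[p[j] - 1] = j - 1;
            total += cost[(p[j] - 1) * n + (j - 1)];
        }
    }
    free(u); free(v); free(p); free(way); free(minv); free(used);
    return total;
}

/* Example: made-up reachability weights (higher is better) for 3 local
 * (rows) and 3 remote (columns) interfaces. */
int sketch_match_example(int match[3])
{
    static const int weight[9] = { 10, 9, 0,
                                   10, 0, 0,
                                    0, 5, 4 };
    int cost[9];
    for (int i = 0; i < 9; i++) cost[i] = -weight[i]; /* negate: the solver minimizes */
    return sketch_assign(3, cost, match);
}
```

In the example, pairing each local interface with its individually best remote would send locals 0 and 1 to the same remote; the solver instead picks local 0 to remote 1, local 1 to remote 0, and local 2 to remote 2, for a total weight of 23.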
Jeff Squyres
8c819e2a85 opal: ensure opal_gethostname() always returns a value
We initially thought it was a safe bet that opal_gethostname() would
never be called before opal_init().  However, it turns out that there
are some cases -- e.g., developer debugging -- where it is useful to
call opal_output() (which calls opal_gethostname()) before
opal_init().

Hence, we need to guarantee that opal_gethostname() always returns a
valid value.  If opal_gethostname() finds NULL in
opal_process_info.nodename, simply call the internal function to
initialize opal_process_info.nodename.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-01-21 09:56:52 -08:00
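The fix in commit 8c819e2a85 follows a lazy-initialization pattern: if the cached name is still NULL, fill it in before returning. A minimal C sketch of that pattern is below. `sketch_nodename` stands in for opal_process_info.nodename and `sketch_init_nodename()` for the internal initializer; both names, the buffer size, and the "localhost" fallback are assumptions, and the sketch makes no attempt at thread safety.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for opal_process_info.nodename. */
static char sketch_nodename_buf[256];
static const char *sketch_nodename = NULL;

/* Stand-in for the internal initializer mentioned in the commit. */
static void sketch_init_nodename(void)
{
    if (gethostname(sketch_nodename_buf, sizeof(sketch_nodename_buf)) != 0) {
        /* Assumed fallback so callers never see NULL. */
        strcpy(sketch_nodename_buf, "localhost");
    }
    sketch_nodename_buf[sizeof(sketch_nodename_buf) - 1] = '\0';
    sketch_nodename = sketch_nodename_buf;
}

/* Always returns a valid string, even if called before any init routine. */
const char *sketch_gethostname(void)
{
    if (NULL == sketch_nodename) {
        sketch_init_nodename();
    }
    assert(sketch_nodename != NULL);
    return sketch_nodename;
}
```

Repeated calls return the same cached pointer, so an early caller such as a debug-output routine pays the lookup cost only once.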
Jeff Squyres
ed753afbc0 hwloc2: advance hwloc git submodule
Advance to hwloc-2.1.0rc2-33-g38433c0f, which includes a .gitignore
update that we want here in Open MPI.

Be warned; this is actually 33 commits beyond the hwloc v2.1.0 tag.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-01-21 09:42:39 -08:00
Ralph Castain
1d0b87e170
Merge pull request #7317 from rhc54/topic/agen
Check the return from "chdir" to avoid infinite loop
2020-01-18 15:04:01 -07:00
Ralph Castain
7de5f76b18
Check the return from "chdir" to avoid infinite loop
When autogen attempts to change to a new directory while processing a
subdirectory, it can get into an infinite loop if that directory
doesn't exist as it will remain in the top-level directory, see itself
there (as "autogen.pl"), and re-execute itself. Check the return code on
"chdir" and error out if it fails.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-01-18 11:35:08 -08:00
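The pattern behind commit 7de5f76b18 is to treat a failed directory change as fatal rather than carrying on in the old directory. autogen.pl itself is a Perl script; the C sketch below only illustrates the same check using the POSIX `chdir()` call, and `sketch_enter_dir()` is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Returns 0 on success, -1 if the directory could not be entered.
 * The caller should stop on -1 instead of continuing in the old directory. */
int sketch_enter_dir(const char *dir)
{
    assert(dir != NULL);
    if (chdir(dir) != 0) {
        fprintf(stderr, "cannot chdir to %s: %s\n", dir, strerror(errno));
        return -1;
    }
    return 0;
}
```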
Austen Lauria
df9745a251
Merge pull request #7299 from awlauria/fix_warnings
Fix some compiler warnings.
2020-01-17 11:33:41 -05:00
Jeff Squyres
69bd54c83b
Merge pull request #7310 from itemko/artemry/azure_ci_updates_for_review
Fixed several Mellanox Open MPI CI issues.
2020-01-16 09:25:43 -05:00
Artem Ryabov
fd7e94022b Fixed several Mellanox Open MPI CI issues.
The following issues have been fixed:
- Corrected the link to CI status badge in README after some renaming.
- Updated agent capabilities.
- Switched to jenkins_scripts from master branch.
- Corrected support e-mail.

Signed-off-by: Artem Ryabov <artemry@mellanox.com>
2020-01-16 13:43:28 +03:00
Nathan Hjelm
037b0bd9ee
Merge pull request #7304 from hjelmn/btl_vader_fix_max_address_on_aarch64
btl/vader: modify how the max attachment address is determined
2020-01-15 17:02:33 -07:00
Nathan Hjelm
728d51f9f3 btl/vader: modify how the max attachment address is determined
This PR removes the constant defining the max attachment address and
replaces it with the largest address that shows up in /proc/self/maps.
This should address issues found on AARCH64 where the max address
may differ based on the configuration.

Since the calculated max address may differ between processes the
max address is sent as part of the modex and stored in the endpoint
data.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-14 15:15:36 -08:00
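Commit 728d51f9f3 derives the max attachment address from /proc/self/maps. One way to read the largest address out of maps-formatted text is sketched below in C. The function names are hypothetical, and whether btl/vader uses the end value exactly as printed or adjusts it is not stated in the commit message; the sketch simply returns the largest end value it sees.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan maps-formatted text ("start-end perms ...", hex addresses) and
 * return the largest end address seen, or 0 if no line parses. */
unsigned long long sketch_max_end(FILE *fp)
{
    char line[512];
    unsigned long long start, end, max_end = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len = strlen(line);
        if (sscanf(line, "%llx-%llx", &start, &end) == 2 && end > max_end) {
            max_end = end;
        }
        /* If the line was longer than the buffer, discard the rest of it. */
        if (len > 0 && line[len - 1] != '\n') {
            int c;
            while ((c = fgetc(fp)) != EOF && c != '\n') { }
        }
    }
    return max_end;
}

/* Linux-only convenience wrapper; returns 0 if the file is unavailable. */
unsigned long long sketch_max_mapped_address(void)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    unsigned long long max_end = 0;

    if (fp != NULL) {
        max_end = sketch_max_end(fp);
        fclose(fp);
    }
    return max_end;
}
```

Because the result can differ from process to process, a value like this would have to be exchanged between peers, which is what the commit does through the modex.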
Nathan Hjelm
61f96b3d6d
Merge pull request #7283 from hjelmn/fix_issues_in_both_vader_and_opal_interval_tree_t_that_were_causing_issue_6524
Fix issues in both vader and opal interval tree t that were causing issue 6524
2020-01-14 14:54:51 -07:00
Jeff Squyres
b3fe523150
Merge pull request #7124 from devreal/fix-opal-align-min
Fix OPAL_ALIGN_MIN to work on 32-bit systems
2020-01-14 15:18:55 -05:00
Jeff Squyres
02f02e7c1e
Merge pull request #7300 from itemko/artemry/mellanox-ci-with-azure-pipelines-review
Reworked Mellanox Open MPI CI with Azure Pipelines
2020-01-14 14:11:34 -05:00
Artem Ryabov
98bfe87ee5 Reworked Mellanox OpenMPI CI with Azure Pipelines.
Signed-off-by: Artem Ryabov <artemry@mellanox.com>
2020-01-14 20:20:51 +03:00
Jeff Squyres
25931ea8bf
Merge pull request #7200 from cpshereda/master-opal_gethostname-change
Fix unsafe use of gethostname()
2020-01-13 16:07:43 -05:00
Jeff Squyres
887400c878
Merge pull request #7174 from jsquyres/pr/ofi-mtl-fi-version-bump
mtl/ofi: increase the FI_VERSION requested to 1.5 and make sure to check for OFI_LOCAL_COMM
2020-01-13 13:22:22 -05:00
Charles Shereda
cbc6feaab2 Created opal_gethostname() as safer gethostname substitute.
The opal_gethostname() function provides a more robust mechanism
to retrieve the hostname than gethostname(), which can return
results that are not null-terminated, and which can vary in its
behavior from system to system.

opal_gethostname() just returns the value in opal_process_info.nodename;
this is populated in opal_init_gethostname() inside opal_init.c.

-Changed all gethostname calls in opal subtree to opal_gethostname
-Changed all gethostname calls in orte subtree to opal_gethostname
-Changed all gethostname calls in ompi subdir to opal_gethostname
-Changed all gethostname calls in oshmem subdir to opal_gethostname
-Changed opal_if.c in test subdir to use opal_gethostname
-Changed opal_init.c to include opal_init_gethostname. This function
 returns an int and directly sets opal_process_info.nodename per
 jsquyres' modifications.

Relates to open-mpi#6801

Signed-off-by: Charles Shereda <cpshereda@lanl.gov>
2020-01-13 08:52:17 -08:00
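The null-termination hazard mentioned in commit cbc6feaab2 comes from POSIX leaving it unspecified whether the buffer is terminated when the name is truncated. A defensive wrapper can be sketched in C as follows; `sketch_safe_gethostname()` is a hypothetical helper and is not the opal_gethostname() implementation, which returns a cached value instead.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Copy the hostname into buf and guarantee a terminating NUL.
 * Returns 0 on success, -1 on error (buf is then an empty string). */
int sketch_safe_gethostname(char *buf, size_t len)
{
    assert(buf != NULL);
    if (len == 0) {
        return -1;
    }
    if (gethostname(buf, len) != 0) {
        buf[0] = '\0';
        return -1;
    }
    /* Termination is unspecified on truncation, so force it. */
    buf[len - 1] = '\0';
    return 0;
}
```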
Robert Wespetal
49128a7adb mtl/ofi: Add workaround for EFA local/remote capabilities bug
Some versions of Libfabric contain a bug in EFA where FI_REMOTE_COMM and
FI_LOCAL_COMM are not advertised. To work around this, we need to call
fi_getinfo() without those capability bits to see if EFA is available first.

Also move around some of the provider include/exclude list logic so we can skip
this workaround if applicable.

Signed-off-by: Robert Wespetal <wesper@amazon.com>
2020-01-13 08:26:01 -08:00
Jeff Squyres
21bc9042e1 mtl/ofi: check for FI_LOCAL_COMM+FI_REMOTE_COMM
Make sure to get an RDM provider that can provide both local and
remote communication.  We need this check because some providers could
be selected via RXD or RXM, but can't provide local communication, for
example.

Add OPAL_CHECK_OFI_VERSION_GE() m4 macro to check that the Libfabric
we're building against is >= a target version.  Use this check in two
places:

1. MTL/OFI: Make sure it is >= v1.5, because the FI_LOCAL_COMM /
   FI_REMOTE_COMM constants were introduced in Libfabric API v1.5.
2. BTL/usnic: It already had similar configury to check for Libfabric
   >= v1.1, but the usnic component was checking for >= v1.3.  So
   update the btl/usnic configury to use the new macro and check for
   >= v1.3.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-01-13 08:19:53 -08:00
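The version checks in commit 21bc9042e1 are done at configure time by an m4 macro. The comparison itself is simple, and the C sketch below illustrates it by packing major and minor into a single integer. The helper names are hypothetical and this is not the OPAL_CHECK_OFI_VERSION_GE() implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Pack major/minor into one comparable integer (major in the high bits).
 * Hypothetical helper; assumes both parts fit in 16 bits. */
static unsigned sketch_version(unsigned major, unsigned minor)
{
    assert(major < 65536u && minor < 65536u);
    return (major << 16) | minor;
}

/* True if the version being built against is >= the required one. */
bool sketch_version_ge(unsigned have_major, unsigned have_minor,
                       unsigned want_major, unsigned want_minor)
{
    return sketch_version(have_major, have_minor) >=
           sketch_version(want_major, want_minor);
}
```

The btl/usnic mismatch described above shows up directly: a v1.1 library passes a ">= v1.1" check but fails the ">= v1.3" check that the component actually needs.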
George Bosilca
05093f9cb1
Minor cleanup in the monitoring PML.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-01-13 09:24:00 -05:00
Austen Lauria
b65ec27307 Fix some compiler warnings.
Silence unused variables, incompatible pointer types,
un-initialized variables, and signed/unsigned comparisons.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-01-10 13:10:53 -05:00
Jeff Squyres
dd2d7d2866
Merge pull request #7189 from michaellass/fix-dims_create
dims_create: fix calculation of factors for odd squares
2020-01-10 09:47:09 -05:00
Jeff Squyres
c471f1cb0b
Merge pull request #7286 from jsquyres/pr/add-git-submodule-status-to-github-issue-template
Github issue template: updates
2020-01-08 12:53:57 -05:00
Jeff Squyres
a26a6c8e42 Github issue template: updates
1. Add more recent release version numbers in the examples
2. Add request for output from `git submodule status` when
   building/installing from a git clone

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-01-08 11:47:35 -05:00
Brian Barrett
f853971bc1
Merge pull request #6821 from jsquyres/pr/make-hwloc201-tarball-a-submodule
hwloc v2.1.0: use a git submodule
2020-01-08 07:41:37 -08:00
bosilca
2d659323b9
Merge pull request #7233 from bosilca/fix/unused_fortran_protected
Remove unused variable.
2020-01-08 08:04:06 -05:00
Nathan Hjelm
f86f805be1 btl/vader: fix issues with xpmem registration invalidation
This commit fixes an issue discovered in the XPMEM registration cache. It
was possible for a registration to be invalidated by multiple threads
leading to a double-free situation or re-use of an invalidated registration.

This commit fixes the issue by setting the INVALID flag on a registration
when it will be deleted. The flag is set while iterating over the tree
to take advantage of the fact that a registration cannot be removed
from the VMA tree by a thread while another thread is traversing the VMA
tree.

References #6524
References #7030
Closes #6534

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-07 22:50:52 -07:00
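The double-free described in commit f86f805be1 goes away once exactly one thread can "win" the invalidation of a registration. The C sketch below shows one way to get that property with a C11 atomic flag. This is an assumption made for illustration: the commit relies on setting the flag during the VMA tree traversal, and the message does not say whether an atomic operation is involved. `sketch_reg_t` and the flag value are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_REG_FLAG_INVALID 0x1u /* hypothetical flag bit */

typedef struct {
    atomic_uint flags;
} sketch_reg_t;

/* Mark a registration invalid. Returns true only for the single caller
 * that flipped the flag, so only that caller goes on to release it. */
bool sketch_mark_invalid(sketch_reg_t *reg)
{
    assert(reg != NULL);
    unsigned old = atomic_fetch_or(&reg->flags, SKETCH_REG_FLAG_INVALID);
    return (old & SKETCH_REG_FLAG_INVALID) == 0;
}

/* Lookups skip registrations that are already on their way out. */
bool sketch_is_usable(sketch_reg_t *reg)
{
    assert(reg != NULL);
    return (atomic_load(&reg->flags) & SKETCH_REG_FLAG_INVALID) == 0;
}
```

A second call to `sketch_mark_invalid()` on the same registration returns false, which is the signal not to free it again.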
Nathan Hjelm
1145abc0b7 opal: make interval tree resilient to similar intervals
There are cases where the same interval may be in the tree multiple
times. This generally isn't a problem when searching the tree but
may cause issues when attempting to delete a particular registration
from the tree. The issue is fixed by breaking a low value tie by
checking the high value then the interval data.

If the high, low, and data of a new insertion exactly match an
existing interval then an assertion is raised.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2020-01-07 21:43:58 -07:00
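The tie-breaking order from commit 1145abc0b7 (low value first, then high value, then the interval data) can be written as a three-level comparator. The C sketch below uses a hypothetical `sketch_interval_t`; the real opal interval tree node layout is not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t low;
    uintptr_t high;
    const void *data;
} sketch_interval_t;

/* Three-way compare without overflow: returns -1, 0, or 1. */
static int sketch_cmp_u(uintptr_t a, uintptr_t b)
{
    return (a > b) - (a < b);
}

/* Order by low, then high, then data pointer. Returns <0, 0, or >0.
 * A result of 0 means the two intervals are exact duplicates. */
int sketch_interval_cmp(const sketch_interval_t *a, const sketch_interval_t *b)
{
    int rc;

    assert(a != NULL && b != NULL);
    if ((rc = sketch_cmp_u(a->low, b->low)) != 0) return rc;
    if ((rc = sketch_cmp_u(a->high, b->high)) != 0) return rc;
    return sketch_cmp_u((uintptr_t)a->data, (uintptr_t)b->data);
}
```

With a total order like this, deleting a particular registration can locate exactly that node even when several intervals share the same low value; a comparison result of 0 on insertion corresponds to the exact-match case for which the commit raises an assertion.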
Jeff Squyres
a7e2bb44d5
Merge pull request #7264 from jsquyres/pr/linux-specfile-fixes
openmpi.spec: update modulefile_path behavior
2020-01-07 17:50:49 -05:00
bosilca
1b93a173ab
Merge pull request #7149 from bosilca/fix/datatype_overflow
Prevent overflow when dealing with datatype count.
2020-01-07 14:13:32 -05:00
Jeff Squyres
9caf43a9c2
Merge pull request #7271 from e-kwsm/update-man
Fix typo and update URLs (https, redirection) [skip ci]
2020-01-06 16:41:32 -05:00
Eisuke Kawashima
d26d4e1d63
Fix typo and update URLs (https, redirection) [skip ci]
Signed-off-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
2020-01-07 03:52:25 +09:00
Howard Pritchard
b17cd9049f
Merge pull request #7238 from rwespetal/ofi-prov-name-fix
mtl/ofi: ignore case when comparing provider names
2020-01-02 12:23:39 -07:00
Howard Pritchard
37b3e2f3fa make mpifort obey disable-wrapper-runpath
related to #6539

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-01-02 11:19:05 -08:00
Jeff Squyres
352e575e18 openmpi.spec: update modulefile_path behavior
Allow the user to override the modulefile_path (root directory to
install the Open MPI modulefile), even if install_in_opt==1.  For
example:

rpmbuild \
    --rebuild \
    --define 'install_in_opt 1' \
    --define 'modulefile_path /path/to/my/modulefiles/openmpi/%{version}' \
    openmpi-4.0.2-1.src.rpm

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-30 06:53:56 -08:00
Jeff Squyres
a2a9a9516b hwloc2: bump up to hwloc v2.1.0
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
c292e759da hwloc2: bump up to hwloc 2.0.4
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
e5722acc37 hwloc201: replace with "hwloc2" component+git submodule
Rename the component to be "hwloc2" (since it can now be any v2.x.y
version of hwloc), and make the embedded copy of hwloc be a git
submodule.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 16:01:03 -08:00
Jeff Squyres
c32eeb3eb9 autogen: add sanity checks for git submodules
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-24 13:15:45 -05:00
Artem Polyakov
3f11c8ef6c
Merge pull request #7249 from janjust/oshmem_atomic_set_fix
Oshmem atomic set fix
2019-12-20 06:47:50 -08:00
Jeff Squyres
569d63ce46
Merge pull request #7252 from jsquyres/pr/clarify-with-hwloc-functionality
hwloc: clarify --with-hwloc behavior
2019-12-19 15:29:57 -05:00
Tomislav Janjusic
2d8f9b1d09 oshmem/extended: Fix shmem_atomic_set for float and double.
Co-authored with: Artem Polyakov <artemp@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-12-19 21:15:41 +02:00
Tomislav Janjusic
cb5ff55b27 oshmem/ucx: fixed a build issue
Co-authored with: Artem Polyakov <artemp@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-12-19 21:14:54 +02:00
Jeff Squyres
18c3e1af5e hwloc: clarify --with-hwloc behavior
Clarify in README what --with-hwloc does in its different use cases.

Also, ensure that the behavior when specifying `--with-hwloc` is the
same as if that option is not specified at all.  This is what we did
in Open MPI <= v3.x; looks like we inadvertently caused `--with-hwloc`
to be synonymous with `--with-hwloc=external` in v4.0.0.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-12-19 08:38:57 -08:00