
31121 Commits

Author SHA1 Message Date
Joseph Schuchart
d11f625ed5 SPC: allow counters to be attached solely through MPI_T and reduce overhead
- only make MCA parameters available if SPC is enabled

- do not compile SPC code if SPC is disabled

- move includes into ompi_spc.c

- allow counters to be enabled through MPI_T without setting MCA parameter

- inline counter update calls that are likely in the critical path

- fix test to succeed even if encountering invalid pvars

- move timer_[start|stop] to header and move attachment info into ompi_spc_t

There is no need to store the name in the ompi_spc_t struct as well; we can use
that space for the attachment info instead to avoid accessing another cache line.

- make timer/watermark flags a property of the spc description

This is meant to make adding counters easier in the future by
centralizing the necessary information. Storing a copy of these flags
in the ompi_spc_t structure (without adding to its size) reduces
cache pollution for timer/watermark events.

- allocate ompi_spc_t objects with cache-alignment

This prevents objects from spanning multiple cache lines and thus
ensures that only one cache line is loaded per update.

- fix handling of timer and timer conversion

- only call opal_timer_base_get_cycles if necessary to reduce overhead

- Remove use of OPAL_UNLIKELY to improve code generated by GCC

It appears that GCC makes less effort in optimizing the unlikely path
and generates bloated code.

- Allocate ompi_spc_events statically to reduce loads in critical path

- duplicate comm_world only when dumping is requested

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-11-12 21:17:56 +01:00
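The cache-alignment item in d11f625ed5 can be illustrated with a minimal C11 sketch. Everything here is an assumption made for illustration: the 64-byte line size, the `example_counter_t` layout (not the real `ompi_spc_t`), and the use of the standard `aligned_alloc` rather than whatever allocator the commit actually uses. The point is only that a record whose size divides the line size, placed in a line-aligned array, never straddles two cache lines.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cache-line size; real code would detect or configure this. */
#define CACHE_LINE_SIZE 64

/* Hypothetical counter record, NOT the real ompi_spc_t layout.
 * _Alignas(32) pads the struct to 32 bytes so two records fit in one line. */
typedef struct {
    _Alignas(32) long long value; /* current counter value */
    unsigned is_timer : 1;        /* copy of the per-counter timer flag */
    unsigned is_watermark : 1;    /* copy of the per-counter watermark flag */
    void *attachment;             /* placeholder for attachment info */
} example_counter_t;

_Static_assert(CACHE_LINE_SIZE % sizeof(example_counter_t) == 0,
               "record size must divide the cache-line size");

/* Allocate `count` zeroed counters starting on a cache-line boundary.
 * Returns NULL if count is 0 or the allocation fails; release with free(). */
static example_counter_t *alloc_counters(size_t count)
{
    if (count == 0) {
        return NULL;
    }
    /* aligned_alloc expects a size that is a multiple of the alignment. */
    size_t bytes = count * sizeof(example_counter_t);
    bytes = (bytes + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE * CACHE_LINE_SIZE;

    void *mem = aligned_alloc(CACHE_LINE_SIZE, bytes);
    if (mem != NULL) {
        memset(mem, 0, bytes);
    }
    return mem;
}
```

With this layout an update to any single counter touches exactly one cache line, which is the property the commit message describes.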
Raghu Raja
30831fb7f0
Merge pull request #8186 from devreal/fix-tuned-dynamic
Fix some issues with dynamic algorithm selection in coll/tuned
2020-11-12 11:20:57 -08:00
Ralph Castain
d489030925
Merge pull request #8199 from rhc54/topic/locality
Fix confusion between cpuset and locality
2020-11-11 10:22:03 -08:00
Joseph Schuchart
a15e5dc7f0 COLL TUNED: remove stray selection of linear algs for allreduce and allgather
These selections seem harmful in my measurements and don't seem to be
motivated by previous measurement data.

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-11-11 18:40:24 +01:00
Ralph Castain
2f7f1feca5
Fix confusion between cpuset and locality
Ensure we correctly collect and save the cpuset of the process
separately from its locality string. Ensure we use the correct one when
computing things like relative locality between processes.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-11-10 16:41:00 -08:00
Jeff Squyres
57ccb830c2
Merge pull request #8197 from Webcretaire/typo
Typo in error message for ompi_proc_world
2020-11-10 16:28:59 -05:00
Jeff Squyres
fd59b46a0b
Merge pull request #8191 from jsquyres/pr/markdown-ftw
Convert README files to Markdown
2020-11-10 15:09:24 -05:00
Jeff Squyres
c960d292ec Convert all README files to Markdown
A mindless task for a lazy weekend: convert all the README and
README.txt files to Markdown.  Paired with the slow conversion of all
of our man pages to Markdown, this gives a uniform language to the
Open MPI docs.

This commit moved a bunch of copyright headers out of the top-level
README.txt file, so I updated the relevant copyright header years in
the top-level LICENSE file to match what was removed from README.txt.

Additionally, this commit did (very) little to update the actual
content of the README files.  A very small number of updates were made
for topics that I found blatantly obvious while Markdown-izing the
content, but in general, I did not update content during this commit.
For example, there's still quite a bit of text about ORTE that was not
meaningfully updated.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Co-authored-by: Josh Hursey <jhursey@us.ibm.com>
2020-11-10 13:52:29 -05:00
Jeff Squyres
686c2142e2 ompi/mca/common/monitoring: add x perms to Perl scripts
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-11-10 13:52:28 -05:00
Jeff Squyres
76a3f43459 Remove some stale contrib scripts
All infrastructure code has long since moved to the ompi-scripts git
repo.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-11-10 13:52:28 -05:00
Julien EMMANUEL
46ce4ad226 Typo in error message for ompi_proc_world
It seems like a copy / paste error from the "else"

Signed-off-by: Julien EMMANUEL <julien.emmanuel@inria.fr>
2020-11-10 17:43:14 +01:00
Jeff Squyres
cb3d275ac0
Merge pull request #8116 from ggouaillardet/topic/fortran_real128
configury: enhance the check for ISO_FORTRAN_ENV module
2020-11-08 14:25:52 -05:00
Raghu Raja
7a922c8774
Merge pull request #8177 from rajachan/coverity-fixes
Coverity fixes for recent OFI changes
2020-11-06 08:48:29 -08:00
Jeff Squyres
3ea0658f4d
Merge pull request #8185 from jsquyres/pr/fix-getdate-woes
config/Makefile.am: ensure getdate.sh is in dist tarball
2020-11-05 15:10:03 -05:00
Jeff Squyres
a784a8431f PRRTE / OpenPMIx: update git submodule pointers
Update to the latest master HEAD for both PRRTE and OpenPMIx to fix
some getdate.sh issues.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-11-05 11:33:47 -08:00
Joseph Schuchart
22e289b742 coll/tuned: fix minor errors in comments
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-11-05 18:32:47 +01:00
Joseph Schuchart
04d198fc9f coll/tuned: don't select algorithms when it's clear they would fall back to linear
Bcast: scatter_allgather and scatter_allgather_ring expect N_elem >= N_procs
Allreduce: rabenseifner expects N_elem >= pow2 nearest to N_procs

In all cases, the implementations will fall back to a linear implementation,
which will most likely yield the worst performance (noted for 4B bcast on 128 ranks)

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-11-05 18:32:12 +01:00
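The preconditions listed in 04d198fc9f can be written as small predicates. This is a sketch under stated assumptions, not the coll/tuned decision code: the function names are hypothetical, and "pow2 nearest to N_procs" is read here as the largest power of two that does not exceed the process count.

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest power of two that does not exceed n (requires n >= 1).
 * Assumption: this is what the commit message means by "pow2 nearest". */
static int pow2_floor(int n)
{
    int p = 1;
    while (p <= n / 2) {
        p *= 2;
    }
    return p;
}

/* Bcast scatter_allgather / scatter_allgather_ring: N_elem >= N_procs. */
static bool bcast_scatter_allgather_usable(size_t count, int nprocs)
{
    return count >= (size_t)nprocs;
}

/* Allreduce rabenseifner: N_elem >= power of two nearest to N_procs. */
static bool allreduce_rabenseifner_usable(size_t count, int nprocs)
{
    return count >= (size_t)pow2_floor(nprocs);
}
```

A selection routine could consult such a predicate before picking the algorithm, so that a single-element broadcast on 128 ranks (the case noted in the commit) is routed to a different algorithm instead of silently falling back to linear.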
Joseph Schuchart
7261255b8d coll/tuned: Mark global static algorithm as const
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-11-05 18:25:59 +01:00
Joseph Schuchart
06f605c1e1 coll/tuned: add hint about dynamic rules to mca parameters
The mca parameters coll_tuned_*_algorithm are ignored unless coll_tuned_use_dynamic_rules is true, so mention that in the description.

Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
2020-11-05 18:20:24 +01:00
Jeff Squyres
a6a0d511f9 opal_functions.m4: add comment
No code or logic changes.

Add comment about why it's ok to use $srcdir here
(vs. $OMPI_TOP_SRCDIR).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-11-05 08:32:39 -08:00
Jeff Squyres
91a5af83cd config/Makefile.am: ensure getdate.sh is in dist tarball
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-11-05 07:54:22 -08:00
Ralph Castain
62a76c9e7d
Merge pull request #8145 from rhc54/topic/up
Update the PMIx and PRRTE pointers
2020-11-04 07:11:11 -08:00
Ralph Castain
cb908348cb
Update the PMIx and PRRTE pointers
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-11-04 06:24:45 -08:00
Gilles Gouaillardet
38be947f8b
Merge pull request #8180 from ggouaillardet/topic/avx512_pgi
op/avx: check for _mm512_mullo_epi64() AVX512 intrinsic
2020-11-04 15:27:33 +09:00
Gilles Gouaillardet
26e42f9a0c op/avx: check for _mm512_mullo_epi64() AVX512 intrinsic
The PGI (20.4) compiler does not define this intrinsic, so only build
AVX512 support if the _mm512_mullo_epi64() intrinsic is defined.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-11-04 14:45:03 +09:00
Raghu Raja
917269b699 Coverity fixes for recent OFI changes
8017f12 introduced a new function to get the package rank of a process.
It took its opal_process_info_t argument by value, and Coverity
was not happy about that. This commit changes the signature to take a
reference to opal_process_info_t instead.

Signed-off-by: Raghu Raja <craghun@amazon.com>
2020-11-04 00:18:54 +00:00
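The signature change described in 917269b699 follows a common pattern: a large struct passed by value is copied on every call, which static analyzers tend to flag. The sketch below shows the before and after shapes with a hypothetical `example_process_info_t`; neither the struct fields nor the function names are taken from the Open MPI sources.

```c
#include <stdint.h>

/* Hypothetical stand-in for a large process-info struct. */
typedef struct {
    char     hostname[256];
    uint32_t local_rank;
    uint32_t package_rank;
} example_process_info_t;

/* Before: the whole struct (over 256 bytes here) is copied per call. */
static uint32_t package_rank_by_value(example_process_info_t info)
{
    return info.package_rank;
}

/* After: only a pointer is passed; const documents read-only access. */
static uint32_t package_rank_by_pointer(const example_process_info_t *info)
{
    return info->package_rank;
}
```

Both functions return the same result; only the calling cost and the analyzer warning differ.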
Yossi Itigin
1f3e33441c
Merge pull request #8140 from hoopoepg/topic/pml-ucx-recv-improved-errhandling
PML/UCX: improved error processing in MPI_Recv
2020-11-03 13:08:42 +02:00
Gilles Gouaillardet
41a3850efb
Merge pull request #8167 from jsquyres/pr/gitignore-sym-linked-use-mpi-f08-files
.gitignore: ignore sym linked F08 profile bindings
2020-11-03 12:40:12 +09:00
Jeff Squyres
19f4fe95e8
Merge pull request #8168 from jsquyres/pr/different-compiler-default-search-paths
README: Provide example of differing linker search paths
2020-11-02 18:27:53 -05:00
Jeff Squyres
e9e5dab8b9
Merge pull request #8153 from dancejic/multi
Using package_rank to select between NIC of equal distance from the process.
2020-11-02 15:27:37 -05:00
Sergey Oblomov
eb9405d53f PML/UCX: improved error processing in MPI_Recv
- improved error processing in MPI_Recv implementation
  of pml UCX
- added error handling for pml_ucx_mrecv call

Signed-off-by: Sergey Oblomov <sergeyo@nvidia.com>
2020-11-02 11:25:28 +02:00
Nikola Dancejic
8017f12801 Using package_rank to select between NIC of equal distance from the process.
If PMIX_PACKAGE_RANK is available, use this value to select between multiple
NICs at equal distance from the current process. If this value is not
available, try to calculate it by getting the locality string from each local
process and assigning a package_rank. If everything fails, fall back to using
process_id.rank to select the NIC. This last case is not ideal, but has a small
chance of occurring, and causes an output to be displayed to notify that this is
occurring.

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
2020-11-02 00:32:03 -08:00
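The fallback chain in 8017f12801 can be sketched as follows. The struct, the `RANK_UNAVAILABLE` marker, and the function name are hypothetical, and the modulo mapping from rank to NIC index is an assumption: the commit message says the rank is used to select between NICs but does not say how.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker meaning "this rank could not be obtained". */
#define RANK_UNAVAILABLE UINT32_MAX

/* Hypothetical view of the three rank sources named in the commit. */
typedef struct {
    uint32_t pmix_package_rank;     /* from PMIX_PACKAGE_RANK, if provided */
    uint32_t computed_package_rank; /* derived from local locality strings */
    uint32_t process_rank;          /* process_id.rank, always available */
} example_proc_t;

/* Pick one of `num_nics` equally distant NICs (num_nics must be > 0).
 * *used_fallback is set when the last-resort process rank was used,
 * so the caller can print the notification the commit mentions. */
static uint32_t select_nic(const example_proc_t *proc, uint32_t num_nics,
                           bool *used_fallback)
{
    uint32_t rank;

    *used_fallback = false;
    if (proc->pmix_package_rank != RANK_UNAVAILABLE) {
        rank = proc->pmix_package_rank;
    } else if (proc->computed_package_rank != RANK_UNAVAILABLE) {
        rank = proc->computed_package_rank;
    } else {
        rank = proc->process_rank;
        *used_fallback = true;
    }
    /* Assumed mapping: spread ranks round-robin over the candidate NICs. */
    return rank % num_nics;
}
```

Using the package rank instead of the global rank means processes that share a package are spread across that package's equally close NICs, regardless of how global ranks happen to be laid out.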
Jeff Squyres
5d9a3c2839 README: Provide example of differing linker search paths
I ran into this exact case on macOS (the C and Fortran compilers have
different default linker search paths).  Technically, we've always had
this problem, but it has just become a bit more likely for real people
to run into because we're now preferring the system-installed
Libevent.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-11-01 13:29:59 -05:00
Jeff Squyres
bd9d7f815f .gitignore: ignore sym linked F08 profile bindings
A recent commit made the use_mpi_f08 bindings sym link into their
profile directory (just like we do for C and other bindings) instead
of having standalone PMPI-ized copies of the bindings.  Make sure to
.gitignore the sym linked files.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-11-01 13:21:31 -05:00
Jeff Squyres
5b25a06c7d
Merge pull request #8165 from Fangcong-Yin/latest_pr
Convert 12 .3in files to md
2020-10-31 22:41:11 -04:00
Fangcong-Yin
7ee34e9c20 Convert 12 .3in files to md
Signed-off-by: Fangcong Yin <fyin2@nd.edu>
Convert MPI_Graph_create.3in - MPI_Group_intersection.3in to md files

Update ompi/mpi/man/man3/MPI_Grequest_complete.md

Signed-off-by: Fangcong Yin <fyin2@nd.edu>

Co-authored-by: Jeff Squyres <jsquyres@users.noreply.github.com>

Update ompi/mpi/man/man3/MPI_Group_excl.md

Signed-off-by: Fangcong Yin <fyin2@nd.edu>

Co-authored-by: Jeff Squyres <jsquyres@users.noreply.github.com>
2020-10-31 15:09:26 -04:00
Jeff Squyres
6f2a5b91cd
Merge pull request #8162 from Colton-K/pr/MPI_C_batch2
Converted batch 2 of MPI_C* (MPI_Comm_accept - MPI_Comm_dup_with_info)
2020-10-31 13:20:14 -04:00
Colton Kammes
ee3bd5859e Converted batch 2 of MPI_C* (MPI_Comm_accept - MPI_Comm_dup_with_info)
Signed-off-by: Colton Kammes <ckammes@nd.edu>
2020-10-31 10:02:37 -04:00
Jeff Squyres
fe03602d1f
Merge pull request #8163 from jsquyres/pr/keyval-parse-tweaks
Minor fix to keyval_parse
2020-10-31 09:42:12 -04:00
Jeff Squyres
c7b968ef20
Merge pull request #6903 from ggouaillardet/topic/use-mpi-f08-profile
fortran/use-mpi-f08: generates PMPI bindings from the MPI bindings
2020-10-31 09:41:49 -04:00
Jeff Squyres
8ed1d28fb4 keyval_parse.c: update whitespace/comments
Slightly improve comments and update some whitespace.

No code or logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-10-31 04:16:00 -07:00
Jeff Squyres
eac0ab5c3a keyval_parse.c: ensure to init values
Coverity complained about uninitialized variables; ensure that they
are initialized to 0 in all cases.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-10-31 04:11:49 -07:00
Jeff Squyres
400c005d04
Merge pull request #8161 from Fangcong-Yin/latest_pr
Convert MPI_Gather.3in - MPI_Get_version.3in to md
2020-10-30 21:13:54 -04:00
Jeff Squyres
b239715f93
Merge pull request #8158 from Colton-K/pr/MPI_C_first_half
Converted first half of MPI_C* (from MPI_Cancel to MPI_Close_port)
2020-10-30 21:13:42 -04:00
Fangcong-Yin
d85bf3ae1a Convert MPI_Gather.3in - MPI_Get_version.3in to md
Signed-off-by: Fangcong Yin (fyin2@nd.edu)
2020-10-30 20:03:59 -04:00
Colton Kammes
a08f8d30dd Converted first half of MPI_C* (from MPI_Cancel to MPI_Close_port)
Signed-off-by: Colton Kammes <ckammes@nd.edu>
2020-10-30 18:25:22 -04:00
Jeff Squyres
dca2058e2f
Merge pull request #8144 from devreal/fix_opal_add_to_env_str_alloc
OPAL: fix string buffer allocation for large env variables
2020-10-30 14:39:01 -04:00
Jeff Squyres
25e47411e0
Merge pull request #8159 from jsquyres/pr/fix-coll-and-adapt-warnings
coll/adapt and coll/han: fix trivial compiler warnings
2020-10-30 14:34:16 -04:00
Jeff Squyres
f813656d24
Merge pull request #8154 from Fangcong-Yin/latest_pr
Convert MPI_File_write_ordered.3in - MPI_Free_mem.3in to md
2020-10-30 10:44:23 -04:00
Jeff Squyres
ee405ccaa5 coll/adapt and coll/han: fix trivial compiler warnings
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-10-30 10:41:14 -04:00