Commit graph

30514 commits

Author SHA1 Message Date
Noah Evans
ee3517427e Add threads framework
Add a framework to support different types of threading models including
user space thread packages such as Qthreads and Argobots:

https://github.com/pmodels/argobots

https://github.com/Qthreads/qthreads

The default threading model is pthreads.  Alternate thread models are
specified at configure time using the --with-threads=X option.

The framework is static.  The threading model to use is selected at
Open MPI configure/build time.

mca/threads: implement Argobots threading layer

config: fix thread configury

- Add double quotations
- Change Argobot to Argobots

config: implement Argobots check

If the poll time is too long, MPI hangs.

This quick fix just sets it to 0, but it is not good for the
Pthreads version. Need to find a good way to abstract it.

Note that even 1 (= 1 millisecond) causes disastrous performance
degradation.

rework threads MCA framework configury

It now works more like the ompi/mca/rte configury,
modulo some edge items that are special for threading package
linking, etc.

qthreads module
some argobots cleanup

Signed-off-by: Noah Evans <noah.evans@gmail.com>
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-03-27 10:15:45 -06:00
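The build-time selection described in ee3517427e can be pictured with a small sketch. It is written in Python for illustration only and is not the project's configure logic. The function name and the exact accepted spellings of the option values ("pthreads", "argobots", "qthreads") are assumptions; only the --with-threads=X option and the pthreads default come from the commit message.

```python
# Illustrative only: mimics choosing one threading model at configure time.
# The accepted value spellings below are assumed, not taken from the commit.
ASSUMED_MODELS = ("pthreads", "argobots", "qthreads")


def select_thread_model(configure_args, supported=ASSUMED_MODELS):
    """Return the threading model named by --with-threads=X, else pthreads."""
    model = "pthreads"  # default stated in the commit message
    for arg in configure_args:
        if arg.startswith("--with-threads="):
            model = arg.split("=", 1)[1]
    if model not in supported:
        raise ValueError("unsupported threading model: " + model)
    # The framework is static: the choice is fixed for the whole build.
    return model
```

Because the framework is static, a choice like this is made once per build and cannot be changed at run time.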
Ralph Castain
b3f0bc5490
Merge pull request #7544 from rhc54/topic/buf
Re-enable stream buffering option
2020-03-18 18:25:02 -07:00
Ralph Castain
757621e199
Re-enable stream buffering option
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-18 16:20:57 -07:00
Ralph Castain
4bb13bc733
Merge pull request #7543 from rhc54/topic/up
Update PMIx and PRRTE
2020-03-18 14:03:27 -07:00
Ralph Castain
0dccd3378b
Update PMIx and PRRTE
PMIx
- fix several race conditions

PRRTE
- fix race condition
- extend prun-to-prte connection tries
- pass correct nspace to job ctrl in response to ctrl-c

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-18 11:46:38 -07:00
Ralph Castain
c3ff5958a5
Merge pull request #7540 from jsquyres/pr/fix-pmix-resetting-top-level-flags
opal_check_pmi.m4: properly save top-level flags
2020-03-17 20:22:42 -07:00
Jeff Squyres
cb1e424359 opal_check_pmi.m4: properly save top-level flags
CPPFLAGS, LDFLAGS, and LIBS were only being saved conditionally, but
restored unconditionally.  This could result in wiping out
CPPFLAGS/LDFLAGS/LIBS.

Make sure to *always* save these flags so that when they are restored,
they are restored to their proper value.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-03-17 22:15:41 -04:00
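The failure mode fixed in cb1e424359 can be illustrated with a short sketch. It is Python rather than the m4 used in opal_check_pmi.m4, and every name in it (check_buggy, check_fixed, the flags dict, the -lpmi value) is hypothetical. It only shows why a conditional save paired with an unconditional restore wipes out the flags.

```python
def check_buggy(flags, do_check):
    """Conditional save, unconditional restore: the pattern the commit fixes."""
    saved = {}
    if do_check:
        saved = dict(flags)          # flags are saved only when the check runs
        flags["LIBS"] += " -lpmi"    # temporary change made for the check
    # The restore always runs; with nothing saved, every flag becomes empty.
    for name in flags:
        flags[name] = saved.get(name, "")
    return flags


def check_fixed(flags, do_check):
    """Always save first, so the unconditional restore has correct values."""
    saved = dict(flags)
    if do_check:
        flags["LIBS"] += " -lpmi"
    for name in flags:
        flags[name] = saved[name]
    return flags
```

With do_check set to False, check_buggy returns empty strings for all three flags, while check_fixed returns them unchanged.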
Ralph Castain
b35d714131
Merge pull request #7537 from rhc54/topic/pxup
Update PMIx
2020-03-17 10:34:54 -07:00
Ralph Castain
972f6aea7f
Update PMIx
- Silence a few (valid) warnings

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-17 08:53:43 -07:00
Ralph Castain
ddc19559af
Merge pull request #7535 from rhc54/topic/rte
Cleanup singleton detection and data retrieval
2020-03-16 15:49:20 -07:00
Ralph Castain
6b4fb509e9
Cleanup singleton detection and data retrieval
Extend the PMIx modex recv macros to cover the full set of
immediate/optional combinations. If PMIx_Init cannot reach a server,
then declare the MPI proc to be a singleton.

Provide full support for info values via PMIx

Catch all the values used in the "info" area of OMPI using data
available from PMIx instead of via envars. Update PMIx and PRRTE to sync
with their capabilities.

PMIx
- ensure cleanup of fork/exec children
- fix bug in gds/hash that left app info off of list

PRRTE
- fix multi-app bugs
- port setup_child logic from orte
- OMPI env changes
- set app->first_rank
- ensure common hostname across prun, prte, and pmix
- Fix "nolocal" support

Silence a warning from btl/vader

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-16 12:25:28 -07:00
Artem Polyakov
9ffee9859f
Merge pull request #7512 from artpol84/topic/master/timings_update
Topic/master/timings update
2020-03-12 17:37:42 -07:00
Austen Lauria
f973381a46
Merge pull request #7524 from awlauria/update_info_for_prrte
Update two orte specific env's to be more generic
2020-03-12 20:18:42 -04:00
Ralph Castain
c296dada2c
Merge pull request #7525 from rhc54/topic/fence
Correct fence logic in MPI_Init
2020-03-11 16:55:01 -07:00
Austen Lauria
7c31586c6d
Merge pull request #7501 from awlauria/finalize_leaks_ggouaillardet_awlauria
Finalize memchecker calls and one memory leak
2020-03-11 13:04:50 -04:00
Ralph Castain
dd623cec34
Correct fence logic in MPI_Init
The fence logic in MPI_Init got messed up somehow such that we were
always executing a fence, which is not desirable. The logic is supposed
to be:

* if async fence is requested and we are not collecting data, then do
not fence at all

* if async fence is requested and we are collecting data, then execute
the fence in the background - wait for completion at the end of MPI_Init.

* if async fence is not requested, then execute a blocking fence at that
point, collecting data as directed. Note that we cannot actually do a
blocking fence as we need to cycle the event library via opal_progress
as the PMIx progress thread is tied to the OMPI event base.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-11 09:25:07 -07:00
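The three cases listed in dd623cec34 reduce to a small decision function. The sketch below is Python for illustration, not the code in MPI_Init. The names fence_plan, blocking_fence, is_complete and progress are hypothetical; progress stands in for opal_progress.

```python
def fence_plan(async_fence, collect_data):
    """Map the two settings to the fence behaviour described in the commit."""
    if async_fence and not collect_data:
        return "none"        # do not fence at all
    if async_fence:
        return "background"  # start now, wait at the end of MPI_Init
    return "blocking"        # fence at this point, collecting data as directed


def blocking_fence(is_complete, progress):
    """Emulate a blocking fence by cycling the progress engine until done.

    A true blocking call is not possible because the event library must
    keep being cycled while the fence is outstanding.
    """
    while not is_complete():
        progress()
```

The bug described above corresponds to always taking a fence path, even when fence_plan would return "none".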
Austen Lauria
b675a76361 Update two orte specific env's to be more generic.
orte is gone, and we don't want to require other
RM's to either use a prrte specific env, or to
set their own.

OMPI_MCA_orte_ess_num_procs -> OMPI_MCA_num_procs
OMPI_MCA_orte_cpu_type -> OMPI_MCA_cpu_type

See PRRTE PR's:

https://github.com/openpmix/prrte/pull/443
https://github.com/openpmix/prrte/pull/440

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-11 10:09:35 -04:00
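For launch scripts that still export the old names, the two renames in b675a76361 could be applied with a helper along these lines. This is a hypothetical Python sketch, not part of Open MPI or PRRTE. Only the two variable-name pairs come from the commit message; the rule that an already-set generic name wins is an assumption.

```python
# Old orte-specific names mapped to the generic names from the commit.
RENAMES = {
    "OMPI_MCA_orte_ess_num_procs": "OMPI_MCA_num_procs",
    "OMPI_MCA_orte_cpu_type": "OMPI_MCA_cpu_type",
}


def migrate_env(env):
    """Return a copy of env with the old variable names replaced."""
    # Translate the old names first.
    out = {RENAMES[name]: value for name, value in env.items() if name in RENAMES}
    # Copy everything else; an explicitly set generic name overrides
    # a value carried over from an old name (assumed policy).
    out.update({name: value for name, value in env.items() if name not in RENAMES})
    return out
```

For example, migrate_env(dict(os.environ)) would give a resource manager an environment that uses only the generic names.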
Ralph Castain
11028d0322
Merge pull request #7518 from rhc54/topic/prup
Update PRRTE and PMIx
2020-03-10 05:44:45 -07:00
Ralph Castain
18b06430d3
Update PRRTE and PMIx
- Avoid modifying single-dash options of applications
- Fix fetch of node/app-level info
- Return correct status code

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-09 18:23:43 -07:00
Ralph Castain
4c7890dec4
Merge pull request #7517 from rhc54/topic/prte
Report PRRTE build status if autogen'd --no-prrte
2020-03-09 14:39:38 -07:00
Ralph Castain
727bd8a60d
Report PRRTE build status if autogen'd --no-prrte
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-08 11:02:42 -07:00
Artem Polyakov
0f51ea3fe5 timings: Update/extend OSHMEM timings
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2020-03-07 09:36:43 -08:00
Artem Polyakov
7c17a38c96 timings: Fix timings when 'prefix' is used
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2020-03-07 09:36:43 -08:00
Ralph Castain
836cc5b6a0
Merge pull request #7498 from rhc54/topic/again
Update PRRTE and PMIx
2020-03-06 11:31:33 -08:00
Ralph Castain
d454bf1f20
Update PRRTE and PMIx
PMIx:
- Ensure that launchers open all required frameworks
- Pass back the tool's ID
- Fix race condition in IOF

PRRTE:
- Begin conversion to use of nspace in place of numeric jobid
- Restore support:
    --report-bindings
    --display-map
    --display-devel-map
    --display-topo
    --do-not-launch
    --xml-output
    --display-allocation

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-06 10:04:41 -08:00
Howard Pritchard
8d59512a9e
Merge pull request #7506 from hppritcha/topic/address_issue7458
check for external libevent and hwloc
2020-03-06 09:49:15 -07:00
Austen Lauria
04a3a28a74 Some memchecker cleanup and others.
- Port memchecker call from a1d502c.
- Remove unused memcheck macro variables.
- Some code readability improvements.
- Remove some stray +1's in dynamic comm cleanup.
- Re-add OPAL_ENABLE_DEBUG macro to osc header.
- Cleanup some printf's, and includes.
- Refactor cleanup of dpm_disconnect_objs.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-05 16:44:18 -05:00
Howard Pritchard
2990d8d98b check for external libevent and hwloc
when building with external PMIx.

Related #7458

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-03-05 14:30:57 -07:00
Gilles Gouaillardet
e2ad184db5 pml/ob1: silence valgrind errors
always define and initialize padding in various structs
when OPAL_ENABLE_DEBUG is set

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:43 -05:00
Gilles Gouaillardet
5751dfe91a mpi/c: fix memchecker invocation
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Gilles Gouaillardet
fc2516457b osc/pt2pt: silence valgrind warnings
explicitly add and initialize padding to keep valgrind happy

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Gilles Gouaillardet
ff746153d7 mpool/base: silence a valgrind warning
by adding a constructor to mca_mpool_base_tree_item_t

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Gilles Gouaillardet
4d92b5fcd8 memchecker: fix memchecker_call
- fix handling of contiguous datatypes with a non-zero true lower bound
- fix handling of datatypes using block of non contiguous predefined datatypes

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Austen Lauria
6c91d991ab
Merge pull request #7497 from awlauria/fix_rdma_shmem
Fix segv in btl/vader.
2020-03-05 07:45:48 -05:00
Gilles Gouaillardet
0db5a15696 ompi/dpm: plug misc memory leaks
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-04 14:54:40 -05:00
Austen Lauria
f69c8d6819 Fix segv in btl/vader.
Keep track of the connected procs in vader_add_procs().
Otherwise, the same rank will reconnect the same shmem
segment (rank 0+...) multiple times instead of the next
one as intended.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-04 09:32:58 -05:00
Jeff Squyres
77bf3f08f5
Merge pull request #7496 from jsquyres/pr/remove-remnants-of-zlib-from-core
Remove a few more remnants of zlib
2020-03-03 15:13:11 -05:00
Jeff Squyres
2ffdea6f52 Remove a few more remnants of zlib
The compress framework was removed in 66da0c63, but these bits escaped
removal.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-03-03 08:48:47 -08:00
bosilca
96559fee77
Merge pull request #7495 from hkuno/hkuno/mca_pml_base_pml_check_selected
Fix typo in mca_pml_base_pml_check_selected
2020-03-03 09:11:20 -05:00
Harumi Kuno
397cc44aa4 Fix typo in mca_pml_base_pml_check_selected
Addresses issue https://github.com/open-mpi/ompi/issues/7494

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
2020-03-02 22:49:40 -08:00
Jeff Squyres
0f41608651
Merge pull request #7493 from rhc54/topic/pr
Update PRRTE pointer
2020-03-01 09:42:28 -05:00
Ralph Castain
0e3d17c7c6
Update PRRTE pointer
- fix hwloc compile
- change rules to make man pages to "prrte-rules"

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-29 18:32:50 -08:00
Ralph Castain
23303eaee3
Merge pull request #7492 from rhc54/topic/sing
Fix singleton operation
2020-02-29 15:15:04 -08:00
Ralph Castain
338dd782ed
Merge pull request #7491 from rhc54/topic/tweak
Tweak the C++ binding deprecation check
2020-02-29 15:11:27 -08:00
Ralph Castain
674134430c
Fix singleton operation
OpenPMIx fills in a variety of info when it detects that we are in
singleton mode. The best way of detecting it is to look for "singleton"
at the beginning of the returned nspace.

Make the modex recvs optional so we don't bounce up to the server and
then to the host trying to retrieve job-level info that must be given to
us at job start.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-29 12:18:30 -08:00
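The two ideas in 674134430c, detecting a singleton by the nspace prefix and making modex receives optional, can be sketched as follows. This is illustrative Python, not the PMIx API. The names is_singleton, modex_recv, local and ask_server are hypothetical, and the example nspace strings in the usage note are made up.

```python
def is_singleton(nspace):
    """A singleton is recognised by "singleton" at the start of the nspace."""
    return nspace.startswith("singleton")


def modex_recv(key, local, ask_server, optional):
    """Look up key locally; go to the server only for non-optional requests.

    An optional request that misses locally returns None instead of
    bouncing up to the server and then to the host.
    """
    if key in local:
        return local[key]
    if optional:
        return None
    return ask_server(key)
```

Under these assumptions, is_singleton("singleton.node01.4242") is true, and an optional lookup of a missing job-level key never calls ask_server.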
Ralph Castain
4fe9ae329c
Add missing include and remove stale PML
The "yalla" pml no longer exists

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-29 11:54:38 -08:00
Ralph Castain
458b1563e2
Treat PMI-1/2 options the same
For consistency, allow the --without-pmi option

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-29 11:44:11 -08:00
Ralph Castain
9b05b1c4a7
Tweak the C++ binding deprecation check
Some of us have platform files that expressly disabled C++ support.
While it is true that v5 no longer supports C++ and thus no longer needs
us to disable it, there seems no reason to make us create platform files
that differentiate based on OMPI version just for that reason.

So if someone asks to "disable" the no-longer-existing support, just
ignore it.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-02-29 10:57:45 -08:00
Jeff Squyres
277097430c
Merge pull request #7490 from jsquyres/pr/fix-debugger-flags
Re-add debugger flag configury
2020-02-29 11:25:44 -05:00
Jeff Squyres
7daf664b18
Merge pull request #7488 from rhc54/topic/update
Update the PRRTE and PMIx pointers
2020-02-29 11:25:32 -05:00