30531 Commits

Author SHA1 Message Date
Ralph Castain
33ab928e1b ompi_proc_t size reduction: part 1
We currently save the hostname of a proc when we create the ompi_proc_t for it. This was originally done because the only method we had for discovering the host of a proc was to include that info in the modex, and we had to therefore store it somewhere proc-local. Obviously, this carried a memory penalty for storing all those strings, and so we added a "cutoff" parameter so that we wouldn't collect hostnames above a certain number of procs.

Unfortunately, this still results in an 8-byte/proc memory cost as we have a char* pointer in the opal_proc_t that is contained in the ompi_proc_t so that we can store the hostname of the other procs if we fall below the cutoff. At scale, this can consume a fair amount of memory.

With the switch to relying on PMIx, there is no longer a need to cache the proc hostnames. Using the "optional" feature of PMIx_Get, we restrict the retrieval to be purely proc-local - i.e., we retrieve the info either via shared memory or from within the proc-internal hash storage (depending upon the active PMIx components). Thus, the retrieval of a hostname is purely a local operation involving no communication.

All RM's are required to provide a complete hostname map of all procs at startup. Thus, we have full access to all hostnames without including them in a modex or having to cache them on each proc. This allows us to remove the char* pointer from the opal_proc_t, saving us 8-bytes/proc.

Unfortunately, PMIx_Get does not currently support the return of a static pointer to memory. Thus, even though PMIx has the hostname in its memory, it can only return a malloc'd version of it. I have therefore ensured that the return from opal_get_proc_hostname is consistently malloc'd and free'd wherever used. This shouldn't be a burden as the hostname is only used in one of two circumstances:

(a) in an error message
(b) in a verbose output for debugging purposes

Thus, there should be no performance penalty associated with the malloc/free requirement. PMIx will eventually be returning static pointers, and so we can eventually simplify this method and return a "const char*" - but as noted, this really isn't an issue even today.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-23 12:49:44 -07:00
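The malloc/free convention described in the commit above can be sketched as follows. This is a minimal illustration, not the Open MPI code: `lookup_hostname_copy`, `format_peer_error`, and the two-entry host table are hypothetical stand-ins, and the only property carried over from the commit message is that the lookup always returns a malloc'd copy that the caller must free.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a hostname lookup that, like the
 * opal_get_proc_hostname described above, always returns a malloc'd
 * copy owned by the caller. The table is illustrative data only. */
static char *lookup_hostname_copy(int rank)
{
    static const char *table[] = { "node0", "node1" };
    const char *name = (rank >= 0 && rank < 2) ? table[rank] : "unknown";
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, name, len);
    }
    return copy;
}

/* Typical consumer: build an error message, then release the copy.
 * Returns the value reported by snprintf. */
static int format_peer_error(char *buf, size_t buflen, int rank)
{
    char *host = lookup_hostname_copy(rank);
    int n = snprintf(buf, buflen, "peer %d on host %s is unreachable",
                     rank, host != NULL ? host : "(unavailable)");
    free(host);  /* free(NULL) is a no-op, so this is safe on failure */
    return n;
}
```

Because the hostname is only needed on error or verbose paths, the allocation stays off the fast path, which matches the reasoning in the commit message.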
Austen Lauria
48b52478ef
Merge pull request #7558 from awlauria/revert_hwloc
Revert "Ensure we get our local topology"
2020-03-23 11:54:11 -04:00
Austen Lauria
ecd20ddcac Revert "Ensure we get our local topology"
Per the devel mailing list, we discussed the need/desirability of this change. Here is the logic behind not including it:

If you call "hwloc_topology_load", then hwloc merrily does its discovery and slams many-core systems. If you call "opal_hwloc_get_topology", then that is fine - it checks if we already have it, tries to get it from PMIx (using shared mem for hwloc 2.x), and only does the discovery if no other method is available.

We previously decided to let those who need the topology call "opal_hwloc_get_topology" to ensure the topo was available so that we don't load it unless someone actually needs it - in the case where it isn't available via PMIx, this avoids paying the startup time and memory footprint penalties for no reason. The function is protected so it will simply return SUCCESS if the topology is already defined.

After discussion, it was decided to stick with that "only set up the topology if someone actually needs it" approach. Hence, we will not blanket init the topology, and the mtl/ofi component will call opal_hwloc_get_topology to ensure the topo has been defined prior to using it.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-23 11:15:47 -04:00
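The "load the topology only when someone asks for it" pattern from the revert message above can be sketched as below. All names here (`topo_t`, `fetch_from_server`, `discover_locally`, `get_topology`) are hypothetical and only mirror the order of checks the message describes: reuse what is cached, then try a cheap retrieval, and run the expensive local discovery last.

```c
#include <stddef.h>

/* Hypothetical topology handle; the real code uses hwloc types. */
typedef struct { int ncores; } topo_t;

static topo_t *cached_topo = NULL;  /* set once, reused afterwards */
static int discovery_runs = 0;      /* counts expensive discoveries */

/* Stand-in for retrieving the topology from the runtime (for example
 * via shared memory). Here it pretends nothing is available. */
static topo_t *fetch_from_server(void)
{
    return NULL;
}

/* Stand-in for the expensive local discovery step. */
static topo_t *discover_locally(void)
{
    static topo_t local = { 4 };
    discovery_runs++;
    return &local;
}

/* Idempotent getter: repeated calls return success without redoing
 * any work once the topology has been defined. */
static int get_topology(void)
{
    if (cached_topo != NULL) {
        return 0;
    }
    cached_topo = fetch_from_server();
    if (cached_topo == NULL) {
        cached_topo = discover_locally();
    }
    return cached_topo != NULL ? 0 : -1;
}
```

A component that needs the topology calls `get_topology()` before first use; processes that never call it pay neither the startup time nor the memory cost.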
Austen Lauria
cf5ca14f7a
Merge pull request #7547 from rhc54/topic/hwloc
Ensure we get our local topology
2020-03-23 10:13:02 -04:00
Austen Lauria
391370a05c
Merge pull request #7546 from devreal/egrequest-obj-retain
grequestx: fix racy initialization and premature destruction
2020-03-23 10:12:32 -04:00
Austen Lauria
b560fc5fae
Merge pull request #7505 from hkuno/john.l.byrne/btl_ofi
Fix btl ofi clean-up logic
2020-03-23 10:10:33 -04:00
Jeff Squyres
758c3f8d05
Merge pull request #7534 from artemry-mlnx/artemry-mlnx/cleanup_workspace_wa
Implemented a W/A for workspace cleanup issue
2020-03-21 08:35:11 -04:00
Ralph Castain
973d10159a
Merge pull request #7548 from jsquyres/pr/usnic-typo
usnic: remove typo
2020-03-20 14:55:46 -07:00
Ralph Castain
a523e9b74e
Merge pull request #7549 from rhc54/topic/mpirun
Update PMIx and PRRTE to reduce mpirun complexity
2020-03-20 14:55:25 -07:00
Ralph Castain
2979bb2ce8
Update PMIx and PRRTE to reduce mpirun complexity
Use "prte" instead of "prun" for proxy execution of cmds like mpirun.
This avoids the fork/exec-rendezvous complexities and should result in
more reliable operation.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-20 13:49:12 -07:00
Jeff Squyres
1870b04017 usnic: remove typo
Remove an amusing -- but harmless -- typo.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-03-20 11:16:52 -07:00
Ralph Castain
98893a530b
Ensure we get our local topology
Restore missing call to get_topology - others were doing it in their
components as repeated calls just return success, but let's ensure it is
always present.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-20 09:28:20 -07:00
Joseph Schuchart
dabdfe7153 grequestx: fix race condition in initialization
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-03-20 14:53:28 +01:00
Joseph Schuchart
4a39a34bab grequestx: retain request object until it is removed from the list
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-03-20 14:52:42 +01:00
Ralph Castain
b3f0bc5490
Merge pull request #7544 from rhc54/topic/buf
Re-enable stream buffering option
2020-03-18 18:25:02 -07:00
Ralph Castain
757621e199
Re-enable stream buffering option
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-18 16:20:57 -07:00
Ralph Castain
4bb13bc733
Merge pull request #7543 from rhc54/topic/up
Update PMIx and PRRTE
2020-03-18 14:03:27 -07:00
Ralph Castain
0dccd3378b
Update PMIx and PRRTE
PMIx
- fix several race conditions

PRRTE
- fix race condition
- extend prun-to-prte connection tries
- pass correct nspace to job ctrl in response to ctrl-c

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-18 11:46:38 -07:00
Ralph Castain
c3ff5958a5
Merge pull request #7540 from jsquyres/pr/fix-pmix-resetting-top-level-flags
opal_check_pmi.m4: properly save top-level flags
2020-03-17 20:22:42 -07:00
Jeff Squyres
cb1e424359 opal_check_pmi.m4: properly save top-level flags
CPPFLAGS, LDFLAGS, and LIBS were only being saved conditionally, but
restored unconditionally.  This could result in wiping out
CPPFLAGS/LDFLAGS/LIBS.

Make sure to *always* save these flags so that when they are restored,
they are restored to their proper value.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-03-17 22:15:41 -04:00
Ralph Castain
b35d714131
Merge pull request #7537 from rhc54/topic/pxup
Update PMIx
2020-03-17 10:34:54 -07:00
Ralph Castain
972f6aea7f
Update PMIx
- Silence a few (valid) warnings

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-17 08:53:43 -07:00
Ralph Castain
ddc19559af
Merge pull request #7535 from rhc54/topic/rte
Cleanup singleton detection and data retrieval
2020-03-16 15:49:20 -07:00
Ralph Castain
6b4fb509e9
Cleanup singleton detection and data retrieval
Extend the PMIx modex recv macros to cover the full set of
immediate/optional combinations. If PMIx_Init cannot reach a server,
then declare the MPI proc to be a singleton.

Provide full support for info values via PMIx

Catch all the values used in the "info" area of OMPI using data
available from PMIx instead of via envars. Update PMIx and PRRTE to sync
with their capabilities.

PMIx
- ensure cleanup of fork/exec children
- fix bug in gds/hash that left app info off of list

PRRTE
- fix multi-app bugs
- port setup_child logic from orte
- OMPI env changes
- set app->first_rank
- ensure common hostname across prun, prte, and pmix
- Fix "nolocal" support

Silence a warning from btl/vader

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-16 12:25:28 -07:00
Artem Ryabov
15e05ebe56 Implemented a W/A for workspace cleanup issue
Signed-off-by: Artem Ryabov <artemry@mellanox.com>
2020-03-15 00:36:13 +03:00
Artem Polyakov
9ffee9859f
Merge pull request #7512 from artpol84/topic/master/timings_update
Topic/master/timings update
2020-03-12 17:37:42 -07:00
Austen Lauria
f973381a46
Merge pull request #7524 from awlauria/update_info_for_prrte
Update two orte specific env's to be more generic
2020-03-12 20:18:42 -04:00
Ralph Castain
c296dada2c
Merge pull request #7525 from rhc54/topic/fence
Correct fence logic in MPI_Init
2020-03-11 16:55:01 -07:00
Austen Lauria
7c31586c6d
Merge pull request #7501 from awlauria/finalize_leaks_ggouaillardet_awlauria
Finalize memchecker calls and one memory leak
2020-03-11 13:04:50 -04:00
Ralph Castain
dd623cec34
Correct fence logic in MPI_Init
The fence logic in MPI_Init got messed up somehow such that we were
always executing a fence, which is not desirable. The logic is supposed
to be:

* if async fence is requested and we are not collecting data, then do
not fence at all

* if async fence is requested and we are collecting data, then execute
the fence in the background - wait for completion at the end of MPI_Init.

* if async fence is not requested, then execute a blocking fence at that
point, collecting data as directed. Note that we cannot actually do a
blocking fence as we need to cycle the event library via opal_progress
as the PMIx progress thread is tied to the OMPI event base.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-11 09:25:07 -07:00
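The three fence cases listed in the commit above reduce to a small decision function. The sketch below is an illustration under assumptions, not the MPI_Init code: `fence_mode_t`, `choose_fence`, `fake_progress`, and `wait_for_fence` are hypothetical names, and the progress function merely simulates a fence that completes after a few event-loop cycles.

```c
#include <stdbool.h>

/* Hypothetical labels for the three behaviours described above. */
typedef enum { FENCE_NONE, FENCE_BACKGROUND, FENCE_BLOCKING } fence_mode_t;

/* Map the two settings onto the intended fence behaviour. */
static fence_mode_t choose_fence(bool async_requested, bool collecting_data)
{
    if (async_requested) {
        /* Async without data collection needs no fence at all. */
        return collecting_data ? FENCE_BACKGROUND : FENCE_NONE;
    }
    return FENCE_BLOCKING;
}

/* Simulated event loop: the fence completes after three cycles. */
static int ticks_left = 3;
static bool fence_done = false;

static void fake_progress(void)
{
    if (--ticks_left == 0) {
        fence_done = true;
    }
}

/* The "blocking" case cannot simply block: it keeps cycling the
 * progress function until the completion flag is set. Returns the
 * number of cycles it took. */
static int wait_for_fence(void)
{
    int spins = 0;
    while (!fence_done) {
        fake_progress();
        spins++;
    }
    return spins;
}
```

In the background case the same wait would run at the end of initialization instead of immediately after the fence is started.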
Austen Lauria
b675a76361 Update two orte specific env's to be more generic.
orte is gone, and we don't want to require other
RM's to either use a prrte specific env, or to
set their own.

OMPI_MCA_orte_ess_num_procs -> OMPI_MCA_num_procs
OMPI_MCA_orte_cpu_type -> OMPI_MCA_cpu_type

See PRRTE PR's:

https://github.com/openpmix/prrte/pull/443
https://github.com/openpmix/prrte/pull/440

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-11 10:09:35 -04:00
Harumi Kuno
ab4875ddc2 set ep to NULL to avoid double close
Per suggestion of @awlauria

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
2020-03-10 17:39:59 -06:00
Ralph Castain
11028d0322
Merge pull request #7518 from rhc54/topic/prup
Update PRRTE and PMIx
2020-03-10 05:44:45 -07:00
Ralph Castain
18b06430d3
Update PRRTE and PMIx
- Avoid modifying single-dash options of applications
- Fix fetch of node/app-level info
- Return correct status code

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-09 18:23:43 -07:00
Ralph Castain
4c7890dec4
Merge pull request #7517 from rhc54/topic/prte
Report PRRTE build status if autogen'd --no-prrte
2020-03-09 14:39:38 -07:00
Ralph Castain
727bd8a60d
Report PRRTE build status if autogen'd --no-prrte
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-08 11:02:42 -07:00
Harumi Kuno
1bc3dab118 Add comments about order of close ops
Per suggestion of @awlauria, added some comments about
the need to free ep before resources it points to.

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
2020-03-07 14:08:39 -07:00
Artem Polyakov
0f51ea3fe5 timings: Update/extend OSHMEM timings
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2020-03-07 09:36:43 -08:00
Artem Polyakov
7c17a38c96 timings: Fix timings when 'prefix' is used
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2020-03-07 09:36:43 -08:00
Ralph Castain
836cc5b6a0
Merge pull request #7498 from rhc54/topic/again
Update PRRTE and PMIx
2020-03-06 11:31:33 -08:00
Ralph Castain
d454bf1f20
Update PRRTE and PMIx
PMIx:
- Ensure that launchers open all required frameworks
- Pass back the tool's ID
- Fix race condition in IOF

PRRTE:
- Begin conversion to use of nspace in place of numeric jobid
- Restore support:
    --report-bindings
    --display-map
    --display-devel-map
    --display-topo
    --do-not-launch
    --xml-output
    --display-allocation

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-06 10:04:41 -08:00
Howard Pritchard
8d59512a9e
Merge pull request #7506 from hppritcha/topic/address_issue7458
check for external libevent and hwloc
2020-03-06 09:49:15 -07:00
Austen Lauria
04a3a28a74 Some memchecker cleanup and others.
- Port memchecker call from a1d502c.
- Remove unused memcheck macro variables.
- Some code readability improvements.
- Remove some stray +1's in dynamic comm cleanup.
- Re-add OPAL_ENABLE_DEBUG macro to osc header.
- Cleanup some printf's, and includes.
- Refactor cleanup of dpm_disconnect_objs.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-05 16:44:18 -05:00
Howard Pritchard
2990d8d98b check for external libevent and hwloc
when building with external PMIx.

Related #7458

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-03-05 14:30:57 -07:00
Gilles Gouaillardet
e2ad184db5 pml/ob1: silence valgrind errors
always define and initialize padding in various structs
when OPAL_ENABLE_DEBUG is set

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:43 -05:00
Gilles Gouaillardet
5751dfe91a mpi/c: fix memchecker invocation
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Gilles Gouaillardet
fc2516457b osc/pt2pt: silence valgrind warnings
explicitly add and initialize padding to keep valgrind happy

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Gilles Gouaillardet
ff746153d7 mpool/base: silence a valgrind warning
by adding a constructor to mca_mpool_base_tree_item_t

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Gilles Gouaillardet
4d92b5fcd8 memchecker: fix memchecker_call
- fix handling of contiguous datatypes with a non-zero true lower bound
- fix handling of datatypes using block of non-contiguous predefined datatypes

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-03-05 16:10:42 -05:00
Austen Lauria
6c91d991ab
Merge pull request #7497 from awlauria/fix_rdma_shmem
Fix segv in btl/vader.
2020-03-05 07:45:48 -05:00