Commit graph

610 Commits

Author SHA1 Message Date
Gilles Gouaillardet
aeddd2f249 pmix/pmix3x: plug a memory leak in external_register()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-08-30 10:07:16 +09:00
Gilles Gouaillardet
6e47c5708e pmix/base: plug a memory leak in opal_pmix_base_select()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-08-30 10:07:16 +09:00
Ralph Castain
5cfa2a7fca Complete integration of job_control
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-08-20 16:10:50 -07:00
Ralph Castain
9948084130 Update to PMIx v3.0.1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-08-20 15:05:24 -07:00
Ralph Castain
f7a537cf04 Fix typo
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-25 20:05:33 -07:00
Ralph Castain
bcdb1f45ac Fix the multiple pe/proc option
Things got a little out of whack and we weren't actually processing the map-by modifiers, plus an error crept into the display of the binding report. So clean those up.

Thanks to @tonyreina for the error report

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-25 18:47:39 -07:00
Ralph Castain
55cefedf9b Cleanup pmix selection check
Allow for versions > 3

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-25 16:11:32 -07:00
Ralph Castain
1e6aaf7f22 Default to internal PMIx if newer than external
Per https://github.com/open-mpi/ompi/issues/5031, if the user didn't specify a particular PMIx installation, then default back to the internal version if it is newer than the discovered external one. PMIx doesn't yet provide a full signature so we have to just get as close as possible for now.
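The decision described above can be sketched in C as follows. This is only an illustration of the rule, not the actual configure logic: the `version_t` type and the `version_cmp` and `use_internal` helpers are hypothetical names, and the sketch assumes both versions are known as full major/minor/release triples, which the commit message says is not yet the case for PMIx.

```c
#include <stdbool.h>

/* Hypothetical version triple; not an Open MPI or PMIx type. */
typedef struct {
    int major;
    int minor;
    int release;
} version_t;

/* Negative if a < b, zero if equal, positive if a > b. */
static int version_cmp(version_t a, version_t b)
{
    if (a.major != b.major) return a.major - b.major;
    if (a.minor != b.minor) return a.minor - b.minor;
    return a.release - b.release;
}

/* Fall back to the internal copy only when the user did not name an
 * installation and the internal copy is strictly newer. */
static bool use_internal(bool user_specified, version_t internal,
                         version_t external)
{
    if (user_specified) {
        return false;
    }
    return version_cmp(internal, external) > 0;
}
```

With equal versions the sketch keeps the external installation, which is one possible reading of "newer than"; the real check may differ.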

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-19 11:02:03 -07:00
Ralph Castain
4a596d35f7 Remove the PMIx ext4x component
Update configury to redirect anything at or above v3 to the ext3x component
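As a rough illustration of the redirect, the mapping from a discovered external PMIx major version to a component name might look like the C sketch below. The real logic lives in the configure scripts; `ext_component_for` is a hypothetical helper, and the only names taken from this log are the `ext1x`, `ext2x`, and `ext3x` components.

```c
#include <stddef.h>

/* Hypothetical helper: pick the external component for a PMIx major
 * version. Anything at or above v3 is handled by ext3x. */
static const char *ext_component_for(int major)
{
    if (major >= 3) {
        return "ext3x";
    }
    if (major == 2) {
        return "ext2x";
    }
    if (major == 1) {
        return "ext1x";
    }
    return NULL; /* unknown or unsupported version */
}
```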

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-13 19:51:50 -07:00
Ralph Castain
fdca304268 Default to external PMIx installation
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-10 16:12:52 -07:00
Ralph Castain
17c4cf0db8 Install PMIx v3.0.0 release
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-06 06:38:02 -07:00
Jeff Squyres
e3d6c5ce3a pmix3/pmix_server.c: minor compiler warning stomp
Submitted upstream https://github.com/pmix/pmix/pull/776.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 06:35:09 -07:00
Ralph Castain
5ac2ce6346 Cover all the PMIx data types
Cover all data types for OPAL-to-PMIx conversion, generating error logs when we hit something we don't support

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-20 09:06:19 -07:00
Boris Karasev
39c9cb12bb pmix/ext2x: fixed detection of PMIx v2.0 by the pmix component
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-06-20 13:23:51 +03:00
Ralph Castain
08707c9762 Sync to updated PMIx v3.0.0rc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-19 21:25:43 -07:00
Ralph Castain
f0a0d606a0 Correct accounting for tools
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 1be080f7b92bad39745f42628a8cb6afefad2d2a)
2018-06-18 13:24:25 -07:00
Ralph Castain
7981818b84 Update PMIx atomics
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-17 10:03:49 -07:00
Ralph Castain
fa18ba395d Sync to latest PMIx v3.0rc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-17 02:41:46 -07:00
Ralph Castain
ac7bb15505 Fix other typo in help message
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-16 16:30:52 -07:00
Ralph Castain
8cfce583c0 Correct typo to properly check for PMIx v4
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-16 16:29:05 -07:00
Ralph Castain
48f27655a6 Sync to PMIx v3.0rc and add ext4x
Sync to the draft rc for PMIx v3.0. Add an external component for PMIx master, which is at v4.0

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-11 05:54:23 -07:00
Ralph Castain
840fb42f93 PMIx rte component does support dynamics
Minor cleanups

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-05 21:55:19 -07:00
Ralph Castain
55ac526a67 Enable the PMIx ompi/rte component
Get the OMPI rte/pmix component working. This was tested using PRRTE as the RM, configuring OMPI using:

* autogen --no-orte

* with external libevent, external hwloc, and external PMIx master

* configuring PMIx master with the same libevent and hwloc

* execute the application using PRRTE's "prun" launcher, which has the same cmd line as ORTE's mpirun

Note that PMIx master appears to have a bug in the event notification system that caches job termination events. Thus, the first execution runs fine, but subsequent executions cause an "abort" when the OMPI default error handler is invoked upon notification of the prior job's termination. Will work that separately.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 134cca9ac0de092d767999357573a31703f72292)
2018-06-03 07:25:12 -07:00
Jeff Squyres
fb0473acb5 pmix3x: compiler warning stomp
This fix was already included in pmix upstream (https://github.com/pmix/pmix/commit/fb7af8af2).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-05-30 10:14:37 -07:00
Ralph Castain
4ff61450a4 Ensure pmix_cleanup finalizes the class system
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-05-04 06:22:36 -07:00
Gilles Gouaillardet
edb8fe8e4b pmix/ext1x: fix index handling when populating an info array
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-04-26 11:06:43 +09:00
Ralph Castain
f424aa367e Fix external PMIx v1.2.5 support
As @hjelmn and I discussed, this is a little hacky. However, it is the only solution that can be done solely from the OMPI side.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-04-25 13:42:36 -07:00
Gilles Gouaillardet
37e7bca867 pmix/ext1x: fix misc build time errors
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-04-12 14:58:55 +09:00
Jeff Squyres
45922c4e81 pmix/base: set PMIx to follow OPAL's mca_component_show_load_errors
Have Open MPI's PMIx component set PMIx's "show_load_errors" to do
the same thing that Open MPI's "show_load_errors" does.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-04-10 10:24:35 -07:00
Jeff Squyres
a2fc1ace09
Merge pull request #4992 from jsquyres/pr/pmix-version-info-mca-vars
pmix: add "pmix*_library_version" info MCA var
2018-04-04 17:29:06 -04:00
Ralph Castain
cd52ccdb68 Move past the '.' when getting jobstepid
The strtoul function sets its end pointer to the first non-digit character, which is a '.' in this case. Calling strtoul at that point will always yield zero; you have to move past the '.' to get the remaining number.
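A minimal sketch of the pattern, assuming the input is a string of the form `<jobid>.<stepid>`. The `parse_jobstepid` helper is hypothetical and is not the code changed by this commit; it only shows why the second `strtoul` call has to start one character past the separator.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: split "<jobid>.<stepid>" into two numbers.
 * Returns 0 on success, -1 if the string is malformed. */
static int parse_jobstepid(const char *str, unsigned long *jobid,
                           unsigned long *stepid)
{
    char *end = NULL;

    errno = 0;
    *jobid = strtoul(str, &end, 10);
    if (end == str || errno != 0 || *end != '.') {
        return -1;
    }

    /* 'end' points at the '.', so parsing from there would consume
     * nothing and yield 0; step past the separator first. */
    const char *step = end + 1;
    errno = 0;
    *stepid = strtoul(step, &end, 10);
    if (end == step || errno != 0 || *end != '\0') {
        return -1;
    }
    return 0;
}
```

Checking that the end pointer actually advanced is what distinguishes a genuine zero from "no digits were parsed".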

Thanks to Greg Lee for the detailed analysis of the problem.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-04-04 11:22:38 -07:00
Jeff Squyres
9f472d8a7b pmix: add "pmix*_library_version" info MCA var
Simple MCA vars for ext1, ext2, and pmix3 components to reflect what
the underlying PMIx library version is.  For example:

```
$ ompi_info --param pmix pmix3x --parsable --level 9 | grep library_version
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:value:PMIx library version 3.0.0 (embedded in Open MPI)
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:source:default
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:status:writeable
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:level:4
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:help:Version of the underlying PMIx library
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:deprecated:no
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:type:string
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:disabled:false
```

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-29 14:21:07 -07:00
Ralph Castain
e443adc7a1 Reset OMPI master to PMIx master
Track PMIx master instead of the reference server - fixes problem of external PMIx master builds.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 08:36:46 -07:00
Boris Karasev
dca3dd2ea4 pmix: dstore returned for direct modex
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-20 04:56:48 +02:00
Boris Karasev
36a0c6a794 pmix: fixed the direct modex request
This commit fixes the case where a local client asks for a key from a
process on a remote node. The local server does not have the commit count
for remote ranks, since it is maintained by another PMIx server, so the
commit count should be ignored for remote requests.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-19 11:51:03 +02:00
Ralph Castain
7241043809 Modify the internal logic for resolve nodes/peers
The current code path for PMIx_Resolve_peers and PMIx_Resolve_nodes executes a threadshift in the preg components themselves. This is done to ensure thread safety when called from the user level. However, it causes thread-stall when someone attempts to call the regex functions from _inside_ the PMIx code base should the call occur from within an event.

Accordingly, move the threadshift to the client-level functions and make the preg components just execute their algorithms. Create a new pnet/test component to verify that the preg code can be safely accessed - set that component to be selected only when the user directly specifies it. The new component will be used to validate various logical extensions during development, and can then be discarded.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 456ac7f7af3d9ba09888e3c899eb001daaa24aef)
2018-03-02 02:00:31 -08:00
Ralph Castain
17c40f4cea Implement support for proctable queries
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-02 02:00:31 -08:00
Ralph Castain
0434b615b5 Update ORTE to support PMIx v3
This is a point-in-time update that includes support for several new PMIx features, mostly focused on debuggers and "instant on":

* initial prototype support for PMIx-based debuggers. For the moment, this is restricted to using the DVM. Supports direct launch of apps under debugger control, and indirect launch using prun as the intermediate launcher. Includes ability for debuggers to control the environment of both the launcher and the spawned app procs. Work continues on completing support for indirect launch

* IO forwarding for tools. Output of apps launched under tool control is directed to the tool and output there - includes support for XML formatting and output to files. Stdin can be forwarded from the tool to apps, but this hasn't been implemented in ORTE yet.

* Fabric integration for "instant on". Enable collection of network "blobs" to be delivered to network libraries on compute nodes prior to local proc spawn. Infrastructure is in place - implementation will come later.

* Harvesting and forwarding of envars. Enable network plugins to harvest envars and include them in the launch msg for setting the environment prior to local proc spawn. Currently, only OmniPath is supported. PMIx MCA params control which envars are included, and also allows envars to be excluded.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-02 02:00:31 -08:00
Ralph Castain
60e6440603 Sync to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-02-19 09:20:13 -08:00
Ralph Castain
1a7dfd7d54 Sync to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-02-07 12:16:51 -08:00
Ralph Castain
9fe8153d38 Sync to IOF branch and continue fix of request for job info from unknown nspace
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 02400d30d79ce3c7e7e28f9a08f7062a5b6f4c51)
2018-02-03 19:56:35 -08:00
Gilles Gouaillardet
43700faba1 pmix/ext3x: remove autogenerated ext3x.h header file
This header file was meant to be autogenerated and, for
some reason, was never removed from the repository.
Update .gitignore as well.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 23:45:42 +09:00
Gilles Gouaillardet
8209fca842 pmix/ext3x: bring external component up-to-date with the embedded pmix3x
add the callback prototype for the upcoming PMIx_IOF_push() API

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:35:34 +09:00
Gilles Gouaillardet
0481277e93 pmix/ext3x: bring external component up-to-date with the embedded pmix3x
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:33:33 +09:00
Gilles Gouaillardet
0285c63348 pmix/ext3x: generate component source when only static libraries are built
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:21:14 +09:00
Ralph Castain
a17df810ed Sync with PMIx iof rfc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 10:51:38 -08:00
Ralph Castain
e9cd7fd7e6 Update orte
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:53:43 -08:00
Ralph Castain
9fb80bd239 Update the opal/pmix base framework elements
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:37:52 -08:00
Ralph Castain
187352eb3d Update the PMIx external components
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:35:57 -08:00
Ralph Castain
a5679ef000 Update the PMIx 3.x component
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:34:44 -08:00