
104 Commits

Author SHA1 Message Date
Geoffrey Paulsen
11d79d1f6e Updating OMPI v4.0.x to PMIx v3.1.5 released
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2020-02-24 12:13:20 -05:00
Geoffrey Paulsen
81ad9bfdb6 Adding PMIx v3.1.5rc2
Adding PMIx v3.1.5rc2 from:
  https://github.com/openpmix/openpmix/releases/tag/v3.1.5rc2

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2020-02-10 17:05:53 -06:00
Ralph Castain
8efc6e1dc1 Remove unnecessary error log
Refs https://github.com/pmix/pmix/pull/1413

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-08-26 23:48:34 -07:00
Ralph Castain
167ca31a31 Update PMIx to official v3.1.4 release
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-08-09 13:14:48 -07:00
Ralph Castain
1d0e0557b9 v4.0.x: Update PMIx to official v3.1.3 release
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-07-02 08:56:49 -07:00
Ralph Castain
9d0adbc6bc Update to track 32-bit support commit
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-06-26 09:31:43 -07:00
Ralph Castain
b353639573 Update to PMIx v3.1.3rc4
Will provide PR to update VERSION to final release once passes MTT

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-06-25 13:45:19 -07:00
Joshua Hursey
45526fadee Do not force 'hash' gds on direct modex
* Forcing the 'hash' gds component should not be necessary any more.

Port of PR #6498 (component names changed so a cherry-pick would not work)

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2019-03-19 10:52:17 -05:00
Aurelien Bouteiller
cf34de33eb Avoid a double lock interlock when calling pmix_finalize
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2019-03-08 15:33:17 -05:00
Ralph Castain
335f8c5100 Update to PMIx 3.1.2
Update the OPAL glue configure code to correctly link the opal/pmix3
component to the hwloc used by OMPI instead of defaulting to the
system-level hwloc. Required a corresponding update to the PMIx hwloc
configure code so we treat hwloc the same way we handle libevent in
embedded scenarios. Roll to PMIx v3.1.2 for plugging of memory leaks and
addition of faster PMIx_Get response

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-01-25 15:58:53 +09:00
Gilles Gouaillardet
61108b6228 pmix/ext3x: fix support for external PMIx v3.1
The PMIX_MODEX and PMIX_INFO_ARRAY macros were removed from the PMIx 3.1 standard.
Open MPI does not really need them (they are only used to be reported as not supported),
so simply #ifdef protect them to support an external PMIx v3.1.

The change only needs to be made in ext3x/ext3x.c.
But since this file is automatically generated from pmix3x/pmix3x.c, we have to update
the latter file.
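
As a rough illustration of the guard described above, the following minimal sketch wraps the cases for the removed macros in `#ifdef`, so the same source compiles whether or not the header defines them. All `EXAMPLE_*` names and `example_type_name` are hypothetical stand-ins invented for this sketch; they are not the actual symbols or code in pmix3x.c.

```c
#include <assert.h>

/* Hypothetical stand-in for a type code that a PMIx header would define.
 * EXAMPLE_PMIX_MODEX and EXAMPLE_PMIX_INFO_ARRAY are deliberately left
 * undefined here, mimicking a header from which they were removed. */
#define EXAMPLE_PMIX_STRING 3

/* Map a type code to a printable name. The guarded cases simply vanish
 * when the corresponding macro does not exist. */
const char *example_type_name(int type)
{
    assert(type >= 0); /* this sketch only uses non-negative codes */
    switch (type) {
    case EXAMPLE_PMIX_STRING:
        return "string";
#ifdef EXAMPLE_PMIX_MODEX
    case EXAMPLE_PMIX_MODEX:
        return "modex (not supported)";
#endif
#ifdef EXAMPLE_PMIX_INFO_ARRAY
    case EXAMPLE_PMIX_INFO_ARRAY:
        return "info array (not supported)";
#endif
    default:
        return "unknown";
    }
}
```

With the two macros undefined, any other code falls through to the default branch, which is acceptable here because the guarded cases only existed to report the types as unsupported.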

Refs. open-mpi/ompi#6247

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(back-ported from commit open-mpi/ompi@950ba16aa1)
2019-01-07 20:27:34 +09:00
Gilles Gouaillardet
195a07d03d pmix/pmix3x: fix macros usage in embedded pmix3x
Use PMIX_* macros instead of OPAL_* macros
master does things differently, so this is a one-off commit

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-12-07 16:07:07 +09:00
Nathan Hjelm
5efc76ef44 pmix3x: fix potential memory barrier bug with __atomic builtin atomics
See open-mpi/ompi#6014 for more information.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-11-06 10:37:14 -07:00
Ralph Castain
12790e8ec6 Protect PMIx from bad configure entry
Ignore with-hwloc=internal or external as those are meaningless to pmix
(will upstream)

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit c498a7e77a377ddc3a7bcc26ea072627a33cb470)
2018-10-09 03:45:58 -07:00
Ralph Castain
3e2cc6f46a Fail configure if pmix won't build
If we are using the internal PMIx component and the embedded library fails to configure, then fail - don't silently fail to build and then fail in execution

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit f379ba9c8e5ce17641937c351ab46e4b4a82446c)
2018-10-09 03:45:37 -07:00
Ralph Castain
131ea01320 Update to PMIx v3.0.2
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-09-18 13:03:18 -07:00
Gilles Gouaillardet
baf41aceed pmix/pmix3x: plug a memory leak in external_register()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@aeddd2f249)
2018-09-10 09:17:31 +09:00
Geoff Paulsen
b2daa0001f Merge pull request #5565 from rhc54/cmr40/pmix301
Update to PMIx 3.0.1
2018-08-31 13:58:41 -05:00
Ralph Castain
e27e945d9a Complete job control integration
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-08-20 16:08:54 -07:00
Ralph Castain
3eef3d1d8f Update to PMIx 3.0.1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-08-20 14:00:41 -07:00
Ralph Castain
511319c316 Fix the multiple pe/proc option
Things got a little out of whack and we weren't actually processing the map-by modifiers, plus an error crept into the display of the binding report. So clean those up.

Thanks to @tonyreina for the error report

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit bcdb1f45aca3f6dfab2646bfdba99775f728ca3b)
2018-07-25 19:55:28 -07:00
Ralph Castain
04054d63eb Cleanup pmix selection check
Allow for versions > 3

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 55cefedf9b8d97c056f7b027d759f32190c9c23e)
2018-07-25 19:55:18 -07:00
Ralph Castain
fdca304268 Default to external PMIx installation
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-10 16:12:52 -07:00
Ralph Castain
17c4cf0db8 Install PMIx v3.0.0 release
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-06 06:38:02 -07:00
Jeff Squyres
e3d6c5ce3a pmix3/pmix_server.c: minor compiler warning stomp
Submitted upstream https://github.com/pmix/pmix/pull/776.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 06:35:09 -07:00
Ralph Castain
5ac2ce6346 Cover all the PMIx data types
Cover all data types for OPAL-to-PMIx conversion, generating error logs when we hit something we don't support

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-20 09:06:19 -07:00
Ralph Castain
08707c9762 Sync to updated PMIx v3.0.0rc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-19 21:25:43 -07:00
Ralph Castain
f0a0d606a0 Correct accounting for tools
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 1be080f7b92bad39745f42628a8cb6afefad2d2a)
2018-06-18 13:24:25 -07:00
Ralph Castain
7981818b84 Update PMIx atomics
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-17 10:03:49 -07:00
Ralph Castain
fa18ba395d Sync to latest PMIx v3.0rc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-17 02:41:46 -07:00
Ralph Castain
48f27655a6 Sync to PMIx v3.0rc and add ext4x
Sync to the draft rc for PMIx v3.0. Add an external component for PMIx master, which is at v4.0

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-11 05:54:23 -07:00
Ralph Castain
55ac526a67 Enable the PMIx ompi/rte component
Get the OMPI rte/pmix component working. This was tested using PRRTE as the RM, configuring OMPI using:

* autogen --no-orte

* with external libevent, external hwloc, and external PMIx master

* configuring PMIx master with the same libevent and hwloc

* execute the application using PRRTE's "prun" launcher, which has the same cmd line as ORTE's mpirun

Note that PMIx master appears to have a bug in the event notification system that caches job termination events. Thus, the first execution runs fine, but subsequent executions cause an "abort" when the OMPI default error handler is invoked upon notification of the prior job's termination. Will work that separately.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 134cca9ac0de092d767999357573a31703f72292)
2018-06-03 07:25:12 -07:00
Jeff Squyres
fb0473acb5 pmix3x: compiler warning stomp
This fix was already included in pmix upstream (https://github.com/pmix/pmix/commit/fb7af8af2).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-05-30 10:14:37 -07:00
Ralph Castain
4ff61450a4 Ensure pmix_cleanup finalizes the class system
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-05-04 06:22:36 -07:00
Jeff Squyres
9f472d8a7b pmix: add "pmix*_library_version" info MCA var
Simple MCA vars for ext1, ext2, and pmix3 components to reflect what
the underlying PMIx library version is.  For example:

```
$ ompi_info --param pmix pmix3x --parsable --level 9 | grep library_version
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:value:PMIx library version 3.0.0 (embedded in Open MPI)
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:source:default
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:status:writeable
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:level:4
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:help:Version of the underlying PMIx library
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:deprecated:no
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:type:string
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:disabled:false
```
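
A script or tool consuming this output might pull the value out of the `:value:` line. The sketch below shows one way to do that in C, assuming the colon-delimited layout shown above; `parsable_value` is a hypothetical helper written for this example and is not part of Open MPI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Given one line of parsable output such as
 *   mca:pmix:pmix3x:param:<name>:value:<text>
 * return a pointer to <text> inside the line, or NULL if the line is not
 * a ":value:" record. The text itself may contain colons or spaces. */
const char *parsable_value(const char *line)
{
    static const char tag[] = ":value:";
    const char *hit;

    assert(line != NULL);
    hit = strstr(line, tag);          /* first occurrence of the field tag */
    return hit ? hit + strlen(tag) : NULL;
}
```

Called on the `:value:` line above, this would return the string starting at "PMIx library version"; lines for other fields such as `:level:` return NULL.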

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-29 14:21:07 -07:00
Ralph Castain
e443adc7a1 Reset OMPI master to PMIx master
Track PMIx master instead of the reference server - fixes problem of external PMIx master builds.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 08:36:46 -07:00
Boris Karasev
dca3dd2ea4 pmix: dstore returned for direct modex
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-20 04:56:48 +02:00
Boris Karasev
36a0c6a794 pmix: fixed the direct modex request
This commit fixes the case where a local client asks for a key from a
process on a remote node. The local server does not have the commit count
for remote ranks, since it is maintained by another PMIx server, so the
commit count should be ignored for remote requests.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-19 11:51:03 +02:00
Ralph Castain
7241043809 Modify the internal logic for resolve nodes/peers
The current code path for PMIx_Resolve_peers and PMIx_Resolve_nodes executes a threadshift in the preg components themselves. This is done to ensure thread safety when called from the user level. However, it causes thread-stall when someone attempts to call the regex functions from _inside_ the PMIx code base should the call occur from within an event.

Accordingly, move the threadshift to the client-level functions and make the preg components just execute their algorithms. Create a new pnet/test component to verify that the preg code can be safely accessed - set that component to be selected only when the user directly specifies it. The new component will be used to validate various logical extensions during development, and can then be discarded.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 456ac7f7af3d9ba09888e3c899eb001daaa24aef)
2018-03-02 02:00:31 -08:00
Ralph Castain
17c40f4cea Implement support for proctable queries
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-02 02:00:31 -08:00
Ralph Castain
0434b615b5 Update ORTE to support PMIx v3
This is a point-in-time update that includes support for several new PMIx features, mostly focused on debuggers and "instant on":

* initial prototype support for PMIx-based debuggers. For the moment, this is restricted to using the DVM. Supports direct launch of apps under debugger control, and indirect launch using prun as the intermediate launcher. Includes ability for debuggers to control the environment of both the launcher and the spawned app procs. Work continues on completing support for indirect launch

* IO forwarding for tools. Output of apps launched under tool control is directed to the tool and output there - includes support for XML formatting and output to files. Stdin can be forwarded from the tool to apps, but this hasn't been implemented in ORTE yet.

* Fabric integration for "instant on". Enable collection of network "blobs" to be delivered to network libraries on compute nodes prior to local proc spawn. Infrastructure is in place - implementation will come later.

* Harvesting and forwarding of envars. Enable network plugins to harvest envars and include them in the launch msg for setting the environment prior to local proc spawn. Currently, only OmniPath is supported. PMIx MCA params control which envars are included, and also allow envars to be excluded.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-02 02:00:31 -08:00
Ralph Castain
60e6440603 Sync to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-02-19 09:20:13 -08:00
Ralph Castain
1a7dfd7d54 Sync to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-02-07 12:16:51 -08:00
Ralph Castain
9fe8153d38 Sync to IOF branch and continue fix of request for job info from unknown nspace
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 02400d30d79ce3c7e7e28f9a08f7062a5b6f4c51)
2018-02-03 19:56:35 -08:00
Ralph Castain
a17df810ed Sync with PMIx iof rfc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 10:51:38 -08:00
Ralph Castain
a5679ef000 Update the PMIx 3.x component
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:34:44 -08:00
Ralph Castain
6216225bda Ensure cleanup of registered files/dirs
Resolve a race condition between registering for a file to be removed upon termination and actual creation of that file by providing attributes that identify whether the path is a file or directory. This removes the need for PMIx to detect the difference.
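
The idea can be sketched as follows: if the registration itself records whether the path is a file or a directory, the cleanup code never has to inspect the filesystem, so registering before the path exists is harmless. This is a minimal sketch under that assumption; the `example_*` types and function are hypothetical and do not reflect the actual PMIx attributes or implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of a cleanup registration: the caller states the
 * kind of path up front instead of having the server detect it later. */
typedef enum { EXAMPLE_PATH_FILE, EXAMPLE_PATH_DIR } example_path_kind_t;

typedef struct {
    const char *path;
    example_path_kind_t kind;
} example_cleanup_t;

/* Removal strategy to apply at termination. */
typedef enum { EXAMPLE_DO_UNLINK, EXAMPLE_DO_RMDIR } example_action_t;

/* Pick the removal strategy from the registration alone. No stat() call
 * is made, so the path does not need to exist at registration time. */
example_action_t example_cleanup_action(const example_cleanup_t *req)
{
    assert(req != NULL && req->path != NULL);
    return (EXAMPLE_PATH_DIR == req->kind) ? EXAMPLE_DO_RMDIR
                                           : EXAMPLE_DO_UNLINK;
}
```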

Refs #4686

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-11 11:05:30 -08:00
Ralph Castain
6dacf40a8c Ensure the epilog gets executed in PMIx server
If we abnormally terminate, then we still want any cleanups to be
executed.

Remove debug

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-10 18:28:05 -08:00
Ralph Castain
d5471d7898 Silence warnings in optimized build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-20 12:00:28 -08:00
Ralph Castain
07427c6d89 Update to PMIx v3.0 PR for cleanup registration
If available, have apps use the registration capability to clean up their session directories. Set up the capability for vader to register its shared memory file location - let someone familiar with that code do so.

Final cleanup to track uid/gid, update the opal/pmix API to pass flags for ignore and leave top directory alone

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 06:53:11 -08:00