Things got a little out of whack and we weren't actually processing the map-by modifiers, plus an error crept into the display of the binding report. So clean those up.
Thanks to @tonyreina for the error report
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Per https://github.com/open-mpi/ompi/issues/5031, if the user didn't specify a particular PMIx installation, then default back to the internal version if it is newer than the discovered external one. PMIx doesn't yet provide a full signature so we have to just get as close as possible for now.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Cover all data types for OPAL-to-PMIx conversion, generating error logs when we hit something we don't support
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Get the OMPI rte/pmix component working. This was tested using PRRTE as the RM, configuring OMPI using:
* autogen --no-orte
* with external libevent, external hwloc, and external PMIx master
* configuring PMIx master with the same libevent and hwloc
* execute the application using PRRTE's "prun" launcher, which has the same cmd line as ORTE's mpirun
Note that PMIx master appears to have a bug in the event notification system that caches job termination events. Thus, the first execution runs fine, but subsequent executions cause an "abort" when the OMPI default error handler is invoked upon notification of the prior job's termination. Will work that separately.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 134cca9ac0de092d767999357573a31703f72292)
As @hjelmn and I discussed, this is a little hacky. However, it is the only solution that can be done solely from the OMPI side.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Have Open MPI's PMIx component to set PMIx's "show_load_errors" to do
the same thing that Open MPI's "show_load_errors" does.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The strtoul function returns the pointer to the first non-digit character, which is a '.' in this case. Calling strtoul at that point will always yield a zero - you have to move past it to get the remaining number
Thanks to Greg Lee for the detailed analysis of the problem.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Simple MCA vars for ext1, ext2, and pmix3 components to reflect what
the underlying PMIx library version is. For example:
```
$ ompi_info --param pmix pmix3x --parsable --level 9 | grep
library_version
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:value:PMIx library version 3.0.0 (embedded in Open MPI)
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:source:default
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:status:writeable
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:level:4
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:help:Version of the underlying PMIx library
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:deprecated:no
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:type:string
mca:pmix:pmix3x:param:pmix_pmix3x_library_version:disabled:false
```
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit fixes the case when local client asks for the key from the
process on the remote node. The local server don't have commit count for
remote ranks, it is maintained by another PMIx server, so commit count
should be ignored for remote requests.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
The current code path for PMIx_Resolve_peers and PMIx_Resolve_nodes executes a threadshift in the preg components themselves. This is done to ensure thread safety when called from the user level. However, it causes thread-stall when someone attempts to call the regex functions from _inside_ the PMIx code base should the call occur from within an event.
Accordingly, move the threadshift to the client-level functions and make the preg components just execute their algorithms. Create a new pnet/test component to verify that the prge code can be safely accessed - set that component to be selected only when the user directly specifies it. The new component will be used to validate various logical extensions during development, and can then be discarded.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 456ac7f7af3d9ba09888e3c899eb001daaa24aef)
This is a point-in-time update that includes support for several new PMIx features, mostly focused on debuggers and "instant on":
* initial prototype support for PMIx-based debuggers. For the moment, this is restricted to using the DVM. Supports direct launch of apps under debugger control, and indirect launch using prun as the intermediate launcher. Includes ability for debuggers to control the environment of both the launcher and the spawned app procs. Work continues on completing support for indirect launch
* IO forwarding for tools. Output of apps launched under tool control is directed to the tool and output there - includes support for XML formatting and output to files. Stdin can be forwarded from the tool to apps, but this hasn't been implemented in ORTE yet.
* Fabric integration for "instant on". Enable collection of network "blobs" to be delivered to network libraries on compute nodes prior to local proc spawn. Infrastructure is in place - implementation will come later.
* Harvesting and forwarding of envars. Enable network plugins to harvest envars and include them in the launch msg for setting the environment prior to local proc spawn. Currently, only OmniPath is supported. PMIx MCA params control which envars are included, and also allows envars to be excluded.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
This header file was meant to be autogenerated, and for
some reasons, was never removed from the repository.
Update .gitignore as well
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>