Also add a common verbose variable.
Note that handling verbosity is a little tricky owing to the way the MCA frameworks and components are registered
and initialized. The BTLs are registered/initialized before the MTL components are even registered.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Now that the old sm btl has been gone for some time there was a request
to rename vader to sm. This commit does just that (reluctantly).
An alias has been generated so specifying vader in the btl selection
variable or specifying vader parameters will continue to work.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit adds support for aliasing component names. A component
name alias is created by calling: mca_base_alias_register. The name
of the project and framework are optional. The component name and
component alias are required. Once an alias is registered all
variables registered after the alias creation will have synonyms
also registered. For example:
```c
mca_base_alias_register("opal", "btl", "vader", "sm", false);
```
would cause all of the variables registered by btl/vader to have
aliases that start with btl_sm. Ex: btl_vader_single_copy_mechanism
would have the synonym: btl_sm_single_copy_mechanism.
If aliases are registered before component filtering, the alias
can also be used for component selection. For example, if sm is
registered as an alias to vader in the btl framework register
function then `--mca btl self,sm` would be equivalent to
`--mca btl self,vader`.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit adds two additional helpers to opal/class:
- OPAL_HASH_TABLE_FOREACH_PTR: Same as OPAL_HASH_TABLE_FOREACH but
operating on ptr hash tables. This is needed because the _ptr
iterator functions take an additional argument.
- OPAL_LIST_FOREACH_DECL: Same as OPAL_LIST_FOREACH but declares
the variable specified in the first argument.
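As a rough illustration of the difference between a plain foreach macro and
a declaring one, here is a minimal sketch on a hypothetical singly linked
list. The names node_t, LIST_FOREACH, LIST_FOREACH_DECL and sum_list are
made up for this example and are not the OPAL definitions; only the idea
(the _DECL form declares the loop variable itself) is taken from the text above.
```c
#include <stddef.h>

/* Hypothetical list node, used only for this illustration. */
typedef struct node {
    struct node *next;
    int value;
} node_t;

/* Plain form: the caller must declare 'item' before the loop. */
#define LIST_FOREACH(item, head) \
    for ((item) = (head); NULL != (item); (item) = (item)->next)

/* Declaring form: the macro declares 'item' with the given type,
 * so the variable is scoped to the loop (requires C99 or later). */
#define LIST_FOREACH_DECL(item, head, type) \
    for (type *item = (head); NULL != item; item = item->next)

/* Sum all values in the list using the declaring form. */
static int sum_list(node_t *head)
{
    int total = 0;
    LIST_FOREACH_DECL(n, head, node_t) {
        total += n->value;
    }
    return total;
}
```
The declaring form keeps the iterator out of the enclosing scope, which
avoids a separate declaration at the top of the function.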
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
Update PMIx/PRRTE to ensure we pick up the default system and user MCA
param definitions during PMIx_server_setup_application so they get
propagated. Protect OPAL's MCA var processing so it doesn't try to
process a NULL filename when PMIx provides the params for it.
Signed-off-by: Ralph Castain <rhc@pmix.org>
Adds the capability to select a NIC based on hardware locality.
Creates a list of NICs that share the same cpuset as the process,
then selects the NIC based on the (local rank) % (number of NICs).
If no NICs share the same cpuset as the process, the selection process
instead creates a list of all available NICs and makes a selection based on
(local rank) % (number of NICs).
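The selection rule described above can be sketched as follows. This is a
minimal, self-contained illustration and not the code from this commit:
nic_t, its shares_cpuset flag, and select_nic are hypothetical names, and
the locality test is assumed to have been done already.
```c
#include <stddef.h>

/* Hypothetical NIC descriptor: shares_cpuset is nonzero when the NIC's
 * locality overlaps the cpuset of the calling process. */
typedef struct {
    const char *name;
    int shares_cpuset;
} nic_t;

/* Return the index into nics[] of the selected NIC, or -1 if there are
 * no NICs at all. */
static int select_nic(const nic_t *nics, size_t num_nics, unsigned local_rank)
{
    size_t num_local = 0;
    size_t target;
    size_t i;

    if (NULL == nics || 0 == num_nics) {
        return -1;
    }

    /* Count the NICs that share the process cpuset. */
    for (i = 0; i < num_nics; ++i) {
        if (nics[i].shares_cpuset) {
            ++num_local;
        }
    }

    /* Fallback: no local NICs, so choose among all available NICs. */
    if (0 == num_local) {
        return (int) (local_rank % num_nics);
    }

    /* Pick the (local_rank % num_local)-th NIC among the local ones. */
    target = local_rank % num_local;
    for (i = 0; i < num_nics; ++i) {
        if (nics[i].shares_cpuset) {
            if (0 == target) {
                return (int) i;
            }
            --target;
        }
    }
    return -1; /* not reached when num_local > 0 */
}
```
Using the local rank as the index spreads processes on the same node
round-robin across the candidate NICs.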
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
The configure script for the btl/uct component reports an error for
the new UCX 1.8.0 release because its version check only accepted
releases up to UCX 1.7.
This fixes #7612
Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
Deprecate the current OMPI-specific MPI_Info key definitions for
MPI_Comm_spawn and replace them with their PMIx equivalents. Issue a
deprecation/conversion warning as this is done. Also issue deprecation
warnings for options such as "ompi_non_mpi" that are no longer used.
Handle both cases where the user might pass either the PMIx attribute
name itself (e.g., "PMIX_MAPBY") or the string value of the attribute
(e.g., PMIX_MAPBY, which translates to "pmix.mapby"). This can only be
done for PMIx v4 and above, so protect that code.
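One way to accept both spellings is a small lookup that matches either
the attribute name or its string value. The sketch below is only an
illustration: attr_entry_t, attr_table and resolve_info_key are
hypothetical names, and the table holds just the one pair mentioned above.
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: attribute name and its string value. */
typedef struct {
    const char *name;   /* e.g. "PMIX_MAPBY" */
    const char *value;  /* e.g. "pmix.mapby" */
} attr_entry_t;

/* Only the pair named in the commit message; a real table would be longer. */
static const attr_entry_t attr_table[] = {
    { "PMIX_MAPBY", "pmix.mapby" },
};

/* Return the attribute's string value if 'key' matches either the
 * attribute name or the string value; return NULL if it is unknown. */
static const char *resolve_info_key(const char *key)
{
    size_t n = sizeof(attr_table) / sizeof(attr_table[0]);
    size_t i;

    if (NULL == key) {
        return NULL;
    }
    for (i = 0; i < n; ++i) {
        if (0 == strcmp(key, attr_table[i].name) ||
            0 == strcmp(key, attr_table[i].value)) {
            return attr_table[i].value;
        }
    }
    return NULL;
}
```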
Silence a couple of Coverity warnings and add a test along the way.
Signed-off-by: Ralph Castain <rhc@pmix.org>
Remove pmix_config.h from the tarball. Deal with the case of no local
procs when register_nspace is called.
Signed-off-by: Ralph Castain <rhc@pmix.org>
Remove a set of functions that were only used by ORTE as they are no
longer required. We can probably remove more of them with a little
cleanup in the rest of the code.
Signed-off-by: Ralph Castain <rhc@pmix.org>
Consolidate the ompi_process_info and opal_process_info structs to
remove duplicate storage and conversion issues. Unwind some interweaving
of include files using opal.h. Silence a couple of warnings.
For now, set the arch to local if PMIX_ARCH is not found.
Signed-off-by: Ralph Castain <rhc@pmix.org>
Add PMIX_NUMA_RANK info to process metadata so that the local NUMA
rank can be accessed through the opal_process_info object.
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
PMIx:
- restore OPA support
PRRTE:
Restore support for several options
* -N for ppr:N:node
* INHERIT modifier for --map-by option, indicating that
the spawned job should inherit the placement options
of its parent. Only applicable to dynamically spawned
jobs
Signed-off-by: Ralph Castain <rhc@pmix.org>
Do some code cleanup in the connect/accept code. Ensure that the OMPI
layer has access to the PMIx identifier for the process. Add macros for
converting PMIx names to/from strings. Clean up a few of the simple test
programs. Add a little more info to a btl/tcp error message.
Signed-off-by: Ralph Castain <rhc@pmix.org>
Found a handful of other URLs that weren't https-ized, so I updated
them, too (after verifying that they support https, of course).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This reverts commit 1aabbe456d5de8fc3c3d6f91eabb8851db7d63eb.
Update PMIx and PRRTE, plus PRRTE config integration
Clean up how we pass the extra libs and LDFLAGS for linking against
external libevent, hwloc, and pmix installs.
Catch the flag indicating that PMIx provided the user-level default MCA
params so we don't go looking for them ourselves.
Clean up misc config warnings
Signed-off-by: Ralph Castain <rhc@pmix.org>
Properly mark/detect that a daemon sourced the event broadcast to avoid
reinjecting it into the PMIx server library. Correct the source field
for the event notify call on launcher ready.
Update event notification for tool support
Deal with a variety of race conditions related to tool reconnection to a
different server.
Signed-off-by: Ralph Castain <rhc@pmix.org>
To suppress Valgrind warnings, opal_tsd_keys_destruct() needs to explicitly
release TSD values of the main thread. However, they were not freed if the keys
were created by non-main threads. This patch fixes that.
This patch also optimizes allocation of opal_tsd_key_values by doubling its size
when count >= length instead of increasing the size by one.
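The growth policy can be sketched as below. This is a minimal
illustration rather than the patched code: value_array_t and
value_array_append are hypothetical names, and only the "double when
count >= length" rule comes from the description above.
```c
#include <stdlib.h>

/* Hypothetical growable array of pointers. */
typedef struct {
    void **values;
    size_t count;   /* slots in use */
    size_t length;  /* slots allocated */
} value_array_t;

/* Append 'value', doubling the allocation when the array is full.
 * Returns 0 on success and -1 if realloc fails (array left unchanged). */
static int value_array_append(value_array_t *arr, void *value)
{
    if (arr->count >= arr->length) {
        size_t new_length = (0 == arr->length) ? 1 : arr->length * 2;
        void **tmp = realloc(arr->values, new_length * sizeof(*tmp));
        if (NULL == tmp) {
            return -1;
        }
        arr->values = tmp;
        arr->length = new_length;
    }
    arr->values[arr->count++] = value;
    return 0;
}
```
Doubling makes the number of reallocations logarithmic in the final
size, whereas growing by one slot reallocates on every append.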
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>