Fine tuning of flux component
Fix a few minor issues with the initial cut:
* Job id could be obtained from the PMI kvsname as is done for SLURM,
but it is simpler to call getenv ("FLUX_JOB_ID")
* Flux pmi-1 doesn't define PMI_BOOL, PMI_TRUE, PMI_FALSE
* Flux pmi-1 maps the deprecated PMI_Get_kvs_domain_id() to
PMI_KVS_Get_my_name() internally, so just call that instead.
* Drop residual slurm references.
Add wrappers for PMI functions so that if HAVE_FLUX_PMI_LIBRARY
is not defined, the component can dlopen libpmi.so at location
specified by the FLUX_PMI_LIBRARY_PATH env variable, which adds
flexibility. If HAVE_FLUX_PMI_LIBRARY is defined, link with
libpmi.so at build time in the usual way.
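
The wrapper approach described above can be sketched roughly as follows.
This is a minimal illustration, not the component's actual code: the
names demo_pmi_init and demo_pmi_lookup are hypothetical, only PMI_Init
is wrapped, and -1 is used as a generic failure value. Only
FLUX_PMI_LIBRARY_PATH, HAVE_FLUX_PMI_LIBRARY and the PMI-1 PMI_Init()
entry point come from the text or the PMI-1 API.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef HAVE_FLUX_PMI_LIBRARY
/* Build-time linking: call the library directly. */
#include <pmi.h>
static int demo_pmi_init(int *spawned)
{
    return PMI_Init(spawned);
}
#else
/* Runtime loading: resolve symbols from the library named by
 * FLUX_PMI_LIBRARY_PATH the first time they are needed. */
typedef int (*demo_pmi_init_fn)(int *spawned);

static void *demo_pmi_dso = NULL;   /* dlopen() handle, NULL until loaded */

/* Hypothetical helper: returns the symbol address, or NULL if the
 * variable is unset, the library cannot be opened, or the symbol
 * is missing. */
static void *demo_pmi_lookup(const char *symbol)
{
    if (demo_pmi_dso == NULL) {
        const char *path = getenv("FLUX_PMI_LIBRARY_PATH");
        if (path == NULL)
            return NULL;
        demo_pmi_dso = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
        if (demo_pmi_dso == NULL) {
            fprintf(stderr, "dlopen %s: %s\n", path, dlerror());
            return NULL;
        }
    }
    return dlsym(demo_pmi_dso, symbol);
}

/* Hypothetical wrapper: same signature as PMI_Init(), returns -1
 * when the library or symbol is unavailable. */
static int demo_pmi_init(int *spawned)
{
    demo_pmi_init_fn fn = (demo_pmi_init_fn) demo_pmi_lookup("PMI_Init");
    return fn != NULL ? fn(spawned) : -1;
}
#endif
```

Callers use demo_pmi_init() the same way in both builds; the only
difference is whether the symbol is bound at link time or at first use.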
Update configury for flux component
Update m4 so the configure options work as follows:
--with-flux-pmi
Build Flux PMI support (default: yes)
--with-flux-pmi-library
Link Flux PMI support with PMI library at build
time. Otherwise the library is opened at runtime at
location specified by FLUX_PMI_LIBRARY_PATH environment
variable. Use this option to enable Flux support when
building statically or without dlopen support (default: no)
If the latter option is provided, the library/header is located at
build time using the pkg-config module 'flux-pmi'. Otherwise there
is no library/header dependency.
Handle the case where ompi is configured with --disable-dlopen
or --enable-static. In those cases, don't build the component
unless --with-flux-pmi-library is provided.
It is fatal if the user explicitly requests --with-flux-pmi but
it cannot be built (e.g. due to --disable-dlopen).
Add a schizo/flux component
Update schizo/flux component
Eliminate slurm-specific usage cases.
Since the module is only loaded if FLUX_JOB_ID is set, there are
only two cases to handle:
1) App was launched indirectly through mpirun. This is not yet
supported with Flux, but the hook remains in case this mode is
supported in the future.
2) App was launched directly by Flux, with Flux providing
CPU binding, if any.
Fix up white space in pmix/flux component
Drop non-blocking fence from pmix:flux component
The flux PMI-1 library is not thread safe, therefore
register a regular blocking fence callback instead of the
thread-shifting fencenb().
pmix/flux component avoids extra PMI_KVS_Gets
Keys stored into the base cache under the wildcard
rank are not intended to be part of the global key namespace.
These keys therefore should not trigger a PMI_KVS_Get() if they
are not found in the cache.
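
A rough sketch of the lookup rule described above, using a toy cache
and a counter in place of the real base cache and PMI_KVS_Get(). All
demo_* names, the wildcard value, and the stored keys are hypothetical
and only illustrate the control flow.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_RANK_WILDCARD (-1)   /* stand-in for the wildcard rank */

static int demo_remote_gets = 0;  /* counts simulated PMI_KVS_Get() calls */

/* Toy local cache holding one wildcard-rank entry. */
static bool demo_cache_lookup(int rank, const char *key, int *value)
{
    if (rank == DEMO_RANK_WILDCARD && strcmp(key, "local.key") == 0) {
        *value = 7;
        return true;
    }
    return false;
}

/* Simulated remote fetch; always succeeds with a fixed value. */
static bool demo_kvs_get(int rank, const char *key, int *value)
{
    (void) rank;
    (void) key;
    demo_remote_gets++;
    *value = 42;
    return true;
}

/* Look in the cache first. A miss under the wildcard rank is final,
 * because such keys never exist in the global key namespace. */
static bool demo_get(int rank, const char *key, int *value)
{
    if (demo_cache_lookup(rank, key, value))
        return true;
    if (rank == DEMO_RANK_WILDCARD)
        return false;             /* skip the remote get */
    return demo_kvs_get(rank, key, value);
}
```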
Minor pmix/flux component cleanup
pmix/flux: drop code for fetching unused pmix_id
pmix/flux: err_exit must return error
Problem: in flux_init(), although 'ret' (variable holding
err_exit return code) is initialized to OPAL_ERROR, the
variable is reused as a temporary result code, so if there are
some successes followed by a failure that doesn't set 'ret',
flux_init() could return success with PMI not initialized.
Ensure that a "goto err_exit" returns OPAL_ERROR if 'ret'
is not set to some other error code.
pmix/flux: don't mix OPAL_ and PMI_ return codes
Problem: flux_init() can return both PMI_ and OPAL_ return
codes. Although OPAL_SUCCESS and PMI_SUCCESS are both defined
as 0, other codes are not compatible.
Ensure that flux_init() consistently uses 'rc' for PMI_
return codes and 'ret' for OPAL_ return codes.
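
The two fixes above amount to the following pattern, shown here as a
self-contained sketch. The DEMO_* constants and demo_* functions are
hypothetical stand-ins for the OPAL_/PMI_ codes and the PMI calls made
by flux_init(); the real function has more steps.

```c
#include <stdio.h>

/* Stand-in return codes; the real ones come from the OPAL and PMI headers. */
enum { DEMO_OPAL_SUCCESS = 0, DEMO_OPAL_ERROR = -1 };
enum { DEMO_PMI_SUCCESS = 0, DEMO_PMI_FAIL = -1 };

/* Hypothetical PMI-like call: fails when step == fail_at. */
static int demo_pmi_step(int step, int fail_at)
{
    return step == fail_at ? DEMO_PMI_FAIL : DEMO_PMI_SUCCESS;
}

static int demo_init(int fail_at)
{
    int ret = DEMO_OPAL_ERROR;  /* OPAL-style code, the only one returned */
    int rc;                     /* PMI-style code, never returned */

    /* Each step stores its result in rc, so ret is never overwritten
     * with an intermediate success value. */
    if ((rc = demo_pmi_step(1, fail_at)) != DEMO_PMI_SUCCESS)
        goto err_exit;
    if ((rc = demo_pmi_step(2, fail_at)) != DEMO_PMI_SUCCESS)
        goto err_exit;
    if ((rc = demo_pmi_step(3, fail_at)) != DEMO_PMI_SUCCESS)
        goto err_exit;

    return DEMO_OPAL_SUCCESS;

err_exit:
    fprintf(stderr, "demo_init: PMI step failed, rc=%d\n", rc);
    return ret;                 /* DEMO_OPAL_ERROR unless set otherwise */
}
```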
pmix/flux: factor out repeated code for cache put
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Update ORTE support for dynamic PMIx operations e.g., PMIx_Spawn
Update to track master
Ensure that --disable-pmix-dstore actually disables the dstore. Sync to a few debugger updates
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
It turns out that there is an incompatibility between the Cray PMI
library and the default configuration for building Open MPI (master).
To work around this, we now disable use of aprun for direct launch
of Open MPI jobs except under specific conditions.
The problem is that there are now (on master) packages getting
initialized that do not work properly across a fork operation.
As part of a constructor in the Cray PMI library, a fork operation
is done to simplify use of shared memory between the
processes in a job on the same node. This ends up thoroughly
messing up the Open MPI initialization process in the case
that dlopen support is enabled. The initialization process gets
about half-way through when the PMIX framework is opened and
components are loaded, which triggers the Cray PMI constructor
and hence the fork operation.
There are two workarounds for this:
1) configure Open MPI for Cray XE/XC systems using aprun with the
--disable-dlopen option
2) set the PMI_NO_FORK environment variable in the shell in which
the aprun command is run.
Without taking these measures, an Open MPI job will just hang at
job startup in the first attempt to "thread-shift" the PMIx
fence_nb operation. Additional hangs occur at shutdown if this
problem is worked around, again due to the insertion of a fork
operation halfway through the Open MPI initialization procedure.
This commit detects if the conditions that bring out the hang
situation are present, and if so, prints out a message and
aborts the job launch.
Note that on systems using Slurm, the PMI_NO_FORK environment variable
is set as part of the srun job launch, hence this issue is avoided
on those systems.
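
A simplified sketch of the kind of check described above. It assumes
the hang conditions reduce to "dlopen support enabled and PMI_NO_FORK
unset"; the actual test in the commit may consider more than this, and
demo_direct_launch_ok is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if direct launch may proceed, 0 if the job should abort.
 * dlopen_enabled reflects how the library was configured (assumed to
 * be known to the caller). */
static int demo_direct_launch_ok(int dlopen_enabled)
{
    if (!dlopen_enabled)
        return 1;   /* workaround 1: built with --disable-dlopen */
    if (getenv("PMI_NO_FORK") != NULL)
        return 1;   /* workaround 2: Cray PMI will not fork */

    fprintf(stderr,
            "Direct launch with aprun requires either a build with "
            "--disable-dlopen or PMI_NO_FORK set in the environment.\n");
    return 0;
}
```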
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
PR open-mpi/ompi#2432 introduced a regression where configuring
and building with --disable-dlopen caused a build failure owing
to unresolved alps lli symbols in the libopal-pal shared library.
This commit fixes this problem.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Enhance the cray pmix component to set some OMPI internal
env. variables used to set some key/value pairs
on the MPI_INFO_ENV object. This allows more of the
ompi-tests ibm unit tests to pass when using aprun/srun
direct launch and Cray PMI.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Still not completely done as we need a better way of tracking the routed module being used down in the OOB - e.g., when a peer drops connection, we want to remove that route from all conduits that (a) use the OOB and (b) are routed, but we don't want to remove it from an OFI conduit.
- replace MAXHOSTNAMELEN with a hard-coded 1024.
unlike Linux, Solaris defines MAXHOSTNAMELEN in <netdb.h>,
so use a hard-coded value to keep the test simple
- stdout cannot be assigned on Solaris, so use freopen instead
(back-ported from upstream commit pmix/master@a63f6e53f4)
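
The two portability changes above can be illustrated as follows. This
is a sketch, not the test's actual code: DEMO_HOSTNAME_MAX,
demo_get_hostname and demo_redirect_stream are hypothetical names.

```c
#include <stdio.h>
#include <unistd.h>

/* Fixed buffer size instead of MAXHOSTNAMELEN, whose defining header
 * differs between platforms. */
#define DEMO_HOSTNAME_MAX 1024

/* Fill buf with the host name; returns 0 on success, -1 on failure. */
static int demo_get_hostname(char *buf, size_t len)
{
    if (len == 0 || gethostname(buf, len) != 0)
        return -1;
    buf[len - 1] = '\0';   /* gethostname() may not terminate on truncation */
    return 0;
}

/* Redirect an already-open stream to a file. Writing
 * "stdout = fopen(path, "w")" is not portable because stdout need not
 * be an assignable lvalue; freopen() works everywhere. */
static int demo_redirect_stream(FILE *stream, const char *path)
{
    return freopen(path, "w", stream) != NULL ? 0 : -1;
}
```

For example, demo_redirect_stream(stdout, "out.log") sends subsequent
printf() output to out.log.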
Add a convenience macro, similar to the PMIX_LIST_FOREACH macro,
that can be used to iterate over all the key/value pairs of a pmix_hash_table_t
(back-ported from upstream commit pmix/master@349971c68c)
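
To illustrate the idea of such a foreach macro, here is a toy version
over a fixed-size table. Everything in it (demo_table_t, demo_next,
DEMO_TABLE_FOREACH and its argument order) is hypothetical and does not
reflect the real pmix_hash_table_t API or the macro added by the commit.

```c
#include <stddef.h>

#define DEMO_TABLE_SIZE 8

typedef struct {
    const char *key;
    int value;
    int used;              /* non-zero if the slot holds a pair */
} demo_slot_t;

typedef struct {
    demo_slot_t slots[DEMO_TABLE_SIZE];
} demo_table_t;

/* Index of the first used slot at or after i, or DEMO_TABLE_SIZE. */
static size_t demo_next(const demo_table_t *t, size_t i)
{
    while (i < DEMO_TABLE_SIZE && !t->slots[i].used)
        i++;
    return i;
}

/* Visit every key/value pair: k and v receive the current pair,
 * i is the caller-provided iteration cursor. */
#define DEMO_TABLE_FOREACH(k, v, i, t)                                   \
    for ((i) = demo_next((t), 0);                                        \
         (i) < DEMO_TABLE_SIZE &&                                        \
         ((k) = (t)->slots[(i)].key, (v) = (t)->slots[(i)].value, 1);    \
         (i) = demo_next((t), (i) + 1))
```

The macro hides the first/next bookkeeping, so a caller writes a single
loop header followed by a body that uses k and v.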