If Open MPI is configured with CUDA, then the user should also be using a
CUDA build of PSM2 and therefore setting the PSM2_CUDA environment variable
to 1 when using CUDA buffers for transfers. If we detect that this setting
is missing, we force-set it. If the user wants to use this build for
regular (host buffer) transfers, we allow the option of setting
PSM2_CUDA=0, but print a warning message telling the user that this is not
a recommended usage scenario.
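A minimal sketch of the environment handling described above, assuming a
build-time CUDA guard such as OPAL_CUDA_SUPPORT; this is illustrative, not
the actual mtl/psm2 source:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Illustrative only: force PSM2_CUDA=1 under a CUDA build, and
     * warn when the user explicitly opts out with PSM2_CUDA=0. */
    static void psm2_check_cuda_env(void)
    {
    #if OPAL_CUDA_SUPPORT   /* assumed build-time guard */
        const char *val = getenv("PSM2_CUDA");
        if (NULL == val) {
            /* Setting is missing: force-set it so CUDA buffer
             * transfers work with the CUDA build of PSM2. */
            setenv("PSM2_CUDA", "1", 1);
        } else if (0 == strcmp(val, "0")) {
            /* Allowed for host-buffer transfers, but discouraged. */
            fprintf(stderr, "Warning: running a CUDA build of PSM2 "
                    "with PSM2_CUDA=0 is not a recommended usage "
                    "scenario.\n");
        }
    #endif
    }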
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
PSM2 provides support for GPU buffers and CUDA managed memory: it can
directly recognize GPU buffers and handle copies between HFIs and GPUs.
Therefore, OMPI does not need to handle GPU buffers for pt2pt cases.
In this patch, we allow the PSM2 MTL to specify that it does not require
CUDA convertor support. This lets us skip the CUDA convertor init phases
and lets PSM2 handle the memory transfers itself, which translates to
improvements in latency.
The patch enables blocking collectives and workloads with GPU-contiguous
and GPU non-contiguous memory.
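A hypothetical sketch of the mechanism: the MTL advertises a capability
flag saying it does not need CUDA convertor support, and the pt2pt path
consults that flag before running convertor init. The flag and struct
names below are illustrative assumptions, not the actual OMPI symbols:

    #include <stdint.h>

    /* Hypothetical capability flag; the real OMPI name may differ. */
    #define MTL_FLAG_CUDA_INIT_DISABLE  0x00000008

    struct mtl_module {
        uint32_t mtl_flags;
    };

    /* Skip the CUDA convertor init phases when the MTL (here, PSM2)
     * recognizes GPU buffers itself and handles HFI<->GPU copies. */
    static inline int mtl_needs_cuda_convertor(const struct mtl_module *mtl)
    {
        return 0 == (mtl->mtl_flags & MTL_FLAG_CUDA_INIT_DISABLE);
    }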
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
PSM2 can accelerate message queue lookups if the lookups have
the proper tag&mask layout. Open MPI should follow
PSM2's preferred tag&mask spec, so that PSM2 can provide
this performance benefit.
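For illustration, a sketch of packing an MPI (context, source, tag)
triple into a three-word tag with a matching mask, where a wildcarded
field gets a zero mask word. The word assignment here is an assumption
made for the example, not PSM2's documented spec:

    #include <stdint.h>

    typedef struct { uint32_t tag0, tag1, tag2; } mq_tag_t;

    /* Illustrative layout: user tag in tag0, source rank in tag1,
     * context id in tag2; a zero mask word wildcards that field. */
    static inline void make_recv_tag(mq_tag_t *tag, mq_tag_t *mask,
                                     uint32_t ctxt, int src, int utag,
                                     int any_src, int any_tag)
    {
        tag->tag0 = (uint32_t)utag;  mask->tag0 = any_tag ? 0 : ~0u;
        tag->tag1 = (uint32_t)src;   mask->tag1 = any_src ? 0 : ~0u;
        tag->tag2 = ctxt;            mask->tag2 = ~0u; /* always match */
    }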
Bring Slurm PMI-1 component online
Bring the s2 component online
Minor cleanup: let the various PMIx modules set the process name during init, and then raise it up to the ORTE level. This is required because the different PMI environments all pass the jobid in different ways.
Bring the OMPI pubsub/pmi component online
Get comm_spawn working again
Ensure we always provide a cpuset, even if it is NULL
pmix/cray: adjust cray pmix component for pmix
Make changes so cray pmix can work within the integrated
ompi/pmix framework.
Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet
Cleanup comm_spawn - procs now starting, error in connect_accept
Complete integration
This new MTL runs over PSM2 for Omni Path. PSM2 is a descendant of PSM
with changes to support more ranks and some MPI-3 features like mprobe.
PSM2 supports only Omni Path networks, while PSM supports only True
Scale. Accordingly, the existing PSM MTL will continue to be maintained
for True Scale, while the PSM2 MTL is developed and maintained for Omni
Path.