before this fix, mca_spml_ucx_component_open was using
oshmem_num_procs() to set the value of params.estimated_num_eps for UCX.
The oshmem_num_procs() function uses oshmem_group_all which will be
initialized after the call to mca_spml_ucx_component_open and therefore,
cannot be used there.
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
MPI_IN_PLACE is not a valid send buffer for neighborhood collectives, so do not
invoke memchecker in this case.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
PSM2 enables support for GPU buffers and CUDA managed memory and it can
directly recognize GPU buffers, handle copies between HFIs and GPUs.
Therefore, it is not required for OMPI to handle GPU buffers for pt2pt cases.
In this patch, we allow the PSM2 MTL to specify when
it does not require CUDA convertor support. This allows us to skip CUDA
convertor init phases and lets PSM2 handle the memory transfers.
This translates to improvements in latency.
The patch enables blocking collectives and workloads with GPU contiguous,
GPU non-contiguous memory.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
Some OSes have hardcoded limits to prevent overflowing over an int32_t.
We can either detect this at configure (which might be a nicer but
incomplete solution), or always force the pipelined protocol over TCP.
As it only covers data larger than 1GB, no performance penalty is to be
expected.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
they are supposed to be unsigned, casting them to a signed
value for all atomic operations is as errorprone as handling
them as signed entities.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
as the writev and readv support a sum larger than a uint32_t
this version will work. For the other OSes a different patch
is required. This patch is a slight modification of the one
proposed by @ggouaillardet.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
This reverts commit b5ea5e0994a827915107e03d6744e73156534a04
This commit reverts a change that is hopefully not necessary. If this
is the case this will fix#4146.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
though there are no C++ bindings for oshmem, we need C++ wrappers
since a C compiler might not be able to compile a C++ source.
the C++ wrappers are :
- shmemc++ / oshc++
- shmemcxx / oshcxx
- shmemCC / oshCC (on case sensitive filesystems)
also add the examples/hello_oshmem_cxx.cc example
Thanks Bert Wesarg for bringing this to our attention
Fixesopen-mpi/ompi#2097
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
The clang 4.0 compiler that ships with FreeBSD 11.1 doesn't
work well with OpenMPI. Workaround is to use a GNU compiler.
Related to #3992.
[skip ci]
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This test has proven to produce too many false positives so far. I hope
to re-enable it in the future, but until it has a longer history of not
producing false postivies it doesn't need to produce false nuisance
failures for everybody.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
* Resolves#3705
* Components should link against the project level library to better
support `dlopen` with `RTLD_LOCAL`.
* Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
with the appropriate project level library:
```
MCA components in ompi/
$(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
$(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
$(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
$(top_builddir)/oshmem/liboshmem.la"
```
Note: The changes in this commit were automated by the script in
the commit that proceeds it with the `libadd_mca_comp_update.py`
script. Some components were not included in this change because
they are statically built only.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* This script will search for all of the `Makefile.am` files in each
of the project-level components. Then it adds the project-level
library to `mca_FRAMEWORK_COMPONENT_la_LIBADD`.
- If the library is already in the LIBADD list then it's skipped.
So it is safe to run multiple times on the same codebase.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>