This was done after discussions with core developers about any
potential ABI breakage for any of the libs the user directly
links against. Compatibility tests were also done using the
IBM test suite, building with v3.1.x and running with v4.0.x.
See: https://github.com/open-mpi/ompi/issues/5447
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
Since openib is on its long, slow way out the door, don't let it
complain about not being able to find any NICs at run time.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 098ec55e37261aee4b9672ec393726c2510247a1)
Clean up some old cruft for configure options that
we should have cleaned up a while ago.
[skip ci]
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
As agreed on #4574, were removed in past release branches
to avoid performance impacts in the default values for
some parameters.
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
This commit fixes a bug when using the UCT btl with the UCX memory
hooks disabled. We were missing a call to
opal_mem_hooks_unregister_release to remove the btl memory hook
callback.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 36c206d2d616578c97853d1d69727a1d6e165c1e)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This option isn't needed on modern distros; add a note to README about
it.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 9a8b0d0e18b3df4a0da8ed2367115f5e45e521c7)
If someone specifies --with-verbs-usnic, actually do a configury check
to ensure that it will compile (vs. assuming that it will compile if
someone asks for it).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 05e5f61fe1c961927eae5bb5c0eb2021ac99afa6)
This commit removes some code that protected the odls/alps component
from closing alps file descriptors. For some unknown reason, leaving
these file descriptors open can cause an orted to hang when
launching apps.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 98172163e6af7ae1bf510b8f31ec97fdb497eaf1)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
- added synonym to common ucx variables to allow
printing them in opal_info -a
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit e00f7a68ba0b1012f954910e39b26f6075f3d006)
Since the progress entries array is globally allocated, it is susceptible
to race conditions in multi-threaded applications. Allocating it on the
stack resolves any potential races, as each thread has its own stack.
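The difference between the two allocation strategies can be sketched as follows. This is only an illustration under stated assumptions: `progress_entry_t`, `MAX_ENTRIES`, and `progress_once` are hypothetical names, not the identifiers used in the actual component.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 8   /* hypothetical batch size */

/* Hypothetical completion record; not the real structure. */
typedef struct {
    int id;
    int done;
} progress_entry_t;

/* Racy variant, shown for contrast only: a file-scope scratch array is
 * shared by every thread that enters the progress function.
 *
 *     static progress_entry_t entries[MAX_ENTRIES];
 */

/* Collect up to MAX_ENTRIES completed items from 'queue' into a scratch
 * array with automatic storage, then count them. Every call (and thus
 * every thread) gets its own 'entries', so nothing is shared. */
static int progress_once(const progress_entry_t *queue, size_t queue_len)
{
    progress_entry_t entries[MAX_ENTRIES];
    size_t n = 0;
    int completed = 0;

    assert(queue != NULL || queue_len == 0);

    for (size_t i = 0; i < queue_len && n < MAX_ENTRIES; i++) {
        if (queue[i].done) {
            entries[n++] = queue[i];
        }
    }
    for (size_t i = 0; i < n; i++) {
        if (entries[i].done) {
            completed++;   /* real code would handle entries[i] here */
        }
    }
    return completed;
}
```

The sketch only covers the scratch array; the queue passed in would still need its own synchronization if several threads touch it.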
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit ed2343034d09b33eb44a0a727bef97a108edc8aa)
Thanks to George for finding/fixing these.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 9194dbbe7b0924f855b3f3e975225fd28fc196f2)
Whitespace change only; no code or logic changes.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 63560fe9c4af95894edaa095f59493a3c887fabc)
Thanks to Stefan Teleman for identifying this issue and providing a
proof-of-concept patch. We ended up revamping the detection of
128-bit atomics to reduce duplicated code and use a slightly simpler --
albeit perhaps a bit more verbose -- approach:
- Remove the --enable-cross-* options; they were confusing and
unnecessary.
- Always try to compile / link the compiler-intrinsic 128-bit atomic
functions.
- Strengthen the C tests we use to be more robust.
- Use m4 to avoid duplicating the C tests multiple times in the .m4
source.
- If not cross-compiling, try to run a short test and ensure that they
actually work (as of Aug 2018, there's at least one platform where
they don't: clang 6 on ARM64). If cross-compiling, just assume that
they work.
- Add more comments about what is going on with all the tests; it's
tricky stuff. Our Future Selves will thank us.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit ff9df9188707f5331cb21aff66371e133e2a810f)
file_delete triggers the full component selection logic under the
hood, since we do not have a file handle, just a file name.
As part of the selection logic, however, we have to initiate the
framework-open of the fs component in the case of ompio, since ompio
will call the delete function of the selected fs component, which
is based on the file system where the file is located.
This was not handled correctly so far. The problem only shows up,
however, if the first I/O operation to be executed is a file_delete;
otherwise the file_open will lead to the correct opening and initialization
of the fs framework. This commit ensures that we do the right thing
even if file_delete is the first file I/O operation in the application.
Fixes issue #5611
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Since version 2.0.0, hwloc has a new organization of NUMA nodes in the
topology tree. This commit adds detection of the local NUMA object for
hwloc >= 2.0.0, which fixes the process binding policy for the rmaps
mindist component.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
(cherry picked from commit e5291ccc34a621295a9d3fc3b1d470a4e4c790e2)
In some scenarios, we can have a daemon sharing the node with mpirun. In
those cases, we need to avoid race conditions in cleanup.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 8d1be27a1e98940e21e87b75251f6996668490bc)
If an application opens a file for reading from multiple processes
using MPI_COMM_SELF (or another communicator that has distinct
process groups but the same comm-id, as can happen as the result
of comm_split), the naming chosen for the lockedfile or the mmapped
file used by the sharedfp/sm component would collide. This patch
ensures that the filename is different by integrating the process id
of rank 0 for each sub-communicator.
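A minimal sketch of the naming idea, assuming a made-up name format: `make_sm_filename` and the `"%s_cid-%u-%ld.sm"` pattern are hypothetical and are not taken from the sharedfp/sm source. In an MPI program, rank 0 of the communicator would obtain its process id and broadcast it to the other ranks before the name is built.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a name of the form "<base>_cid-<comm_id>-<root_pid>.sm".
 * 'comm_id' alone is not unique across sub-communicators that share
 * the same id, so the pid of that communicator's rank 0 is appended.
 * Returns 0 on success, -1 if the buffer is too small. */
static int make_sm_filename(char *buf, size_t len, const char *base,
                            unsigned comm_id, long root_pid)
{
    assert(buf != NULL && base != NULL);

    int n = snprintf(buf, len, "%s_cid-%u-%ld.sm", base, comm_id, root_pid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Two processes that each open the same file on MPI_COMM_SELF would pass the same `comm_id` but different `root_pid` values, so the resulting names differ.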
This fixes one aspect of the problem reported in GitHub issue 5593.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Thanks Zoltan Mizsei for bringing this to our attention.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@a02be5e91a)
Per suggestion by @bangerth, allow mpirun to execute as root if two
envars are set to specific values.
Per conversation with @jsquyres, name the envars OMPI_ALLOW_RUN_AS_ROOT
and OMPI_ALLOW_RUN_AS_ROOT_CONFIRM.
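The two-variable check can be sketched roughly as follows. The message above does not state the required values; the sketch assumes both must equal "1", and the helper names (`value_matches`, `root_allowed`, `root_allowed_from_env`) are hypothetical rather than the functions used in mpirun.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* True only if 'value' is set and equals 'expected'. */
static int value_matches(const char *value, const char *expected)
{
    assert(expected != NULL);
    return value != NULL && strcmp(value, expected) == 0;
}

/* Both settings must be present and match; one alone is not enough.
 * The expected value "1" is an assumption made for this sketch. */
static int root_allowed(const char *allow, const char *confirm)
{
    return value_matches(allow, "1") && value_matches(confirm, "1");
}

/* Read the two environment variables named in the commit message. */
static int root_allowed_from_env(void)
{
    return root_allowed(getenv("OMPI_ALLOW_RUN_AS_ROOT"),
                        getenv("OMPI_ALLOW_RUN_AS_ROOT_CONFIRM"));
}
```

Requiring a second, separately named confirmation variable makes it harder to enable root execution by accident.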
Fixes #4451
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 7f1444d5f9e504ff50392a0f73e81787c01b7a0e)
- updated compilation of C11 compiler for API macro
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit be0ea1d7647e8c16cb8f30f4dd916454c2bdd746)