This commit fixes a hang that occurs with debug builds of Open MPI on
aarch64 and power/powerpc systems. When the ll/sc atomics are inline
functions the compiler emits load/store instructions for the function
arguments with -O0. These extra load/store arguments can cause the ll
reservation to be cancelled causing live-lock.
Note that we did attempt to fix this with always_inline but the extra
instructions are stil emitted by the compiler (gcc). There may be
another fix but this has been tested and is working well.
References #3697. Close when applied to v3.0.x and v3.1.x.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
appends MPI1 compatible source files instead of redefining all the source files
fix a typo from open-mpi/ompi@89da9651bb
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This commit adds a new configure option: --enable-mpi1-compat. Without
this option we will no longer provide APIs, typedefs, and defines that
were removed from the standard in MPI-3.0. This option will exist for
one major release (Open MPI v4.x.x) and then the option and associated
code will be removed in Open MPI v5.x.x. Open MPI has already
internally prepared for this change. Please prepare your codes
accordingly.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds a new MCA variable to set the location of the backing
store: osc_sm_backing_directory. The default on Linux has been
changed to use /dev/shm to improve performance in cases where /tmp is
not a tmpfs.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds a new MCA variable to set the location of the backing
store: osc_rdma_backing_directory. The default on Linux has been
changed to use /dev/shm to improve performance in cases where /tmp is
not a tmpfs.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Per discussion at
https://github.com/open-mpi/ompi/issues/2614#issuecomment-392815654,
do not allow for selection of the OSC PT2PT when creating an MPI RMA
window when THREAD_MULTIPLE is active. Print a helpful message and
return a not-supported error.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit d0ffd660841623c02d1dfa3151e7f7afd3327698)
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
- added opal_progress call to wait function to avoid
possible [dead]lock issues
- wait call is declared as inline
- minor fixes
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
Implements butterfly algorithm for MPI_Reduce_scatter_block.
The algorithm can be used both by commutative and non-commutative
operations, for power-of-two and non-power-of-two number of processes.
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
+ Add quiet method to SPML, so it can have different implementation with
fence.
+ Use ucp_worker_fence for spml_fence method of UCX SPML
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
This patch disables the oshmem layer if there are no SPMLs that
will build. With the limited set of SPMLs available to support
oshmem, many builds end up installing an oshmem library that we
know will not work. There has been a bit of customer confusion
over oshmem, hopefully this will lead customers in the right
direction.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Two related changes to allow projects to not build based on
configure test results, as opposed to only reacting to
user configure options today. Use case is disabling a project
like oshmem because no communication channels can be built.
First, Move PROJECT_* AM_CONDITIONALs from the top of configure to
the bottom, so that we can change the results during configure.
Second, add a DIST_SUBDIRS to Makefile.am (and populate it in
opal_mca) so that "make dist" will work even when a project is
disabled.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
enable_oshmem holds the result of a customer decision and, like
most user options, can have the values "yes" (user wants us to
build feature), "no" (user wants us not to build feature),
"" (user wants us to figure it out), and "<something>" (user
wants us to build feature, with <something> turned on).
This change updates oshmem to not lose this data by not overwriting
enable_oshmem with a yes/no and leaving the original customer
intent in place. Aside from fixing one bug (below) there are no
customer visible changes in this patch, but it makes it possible
to do the right thing in the upcoming work to allow oshmem to be
disabled based on test results.
There was a cosmetic bug in the existing code where specifying
a feature argument (like --enable-oshmem=awesome) would result
in the "checking if want oshmem" test reporting no, but oshmem
being built anyway. With this cleanup, the "checking if want
oshmem" test, the final output summary, and what actually happens
will all match.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Don't do a recursive search (hence no need for *idx anymore).
Find the level depth, to hide cache-issues first.
Then iterate over that level to find the objects we want.
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
df_search_min_bound() would need to be fixed for hwloc 2.0,
but it's only used in opal_hwloc_base_find_min_bound_target_under_obj()
which isn't used anymore. So just remove all of them.
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Don't bother doing a lookup upwards or downwards for the target object type.
Just use the target depth, iterate over the level until we find the min_bound
object that intersects the locale cpuset.
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>