This patch fix a race condition, which caused the main thread
to sometimes write to a closed pipe.
The following are details:
Currently, during shut down, the main thread will do the following:
1. set listen_thread_action to false.
2. write to stop_thread pipe to tell the listener thread to exit.
The listener thread do the following:
1. call select() to listen to a set of file descriptors with
a maximum wait time.
2. check listen_thread_action. If it is false, close the stop_thread
pipe.
The main thread will write to closed pipe, when
1. listener's call to select() finished because maximum wait time reached.
2. main thread set listen_thread_action to false
3. listener thread check listen_thread_action and closed the pipe
4. main thread write to the closed pipe.
This patch address the issue by having the main thread close the pipe
after the listener thread has been joined. This way, main thread
will both write and close the thread, so there is no conflict.
Note This patch was opened directly against v4.1.x branch because
the orte/mca/oob/tcp directory has been removed from master branch.
Signed-off-by: Wei Zhang <wzam@amazon.com>
This is a meta commit, that encapsulate all the ADAPT commits in the master
into a single PR for 4.1. The master commits included here are:
fe73586, a4be3bb, d712645, c2970a3, e59bde9, ee592f3 and c98e387.
Here is a detailed list of added capabilities:
* coll/adapt: Fix naming conventions and C11 atomic use
* coll/adapt: Remove unused component field in module
* Consistent handling of zero counts in the MPI API.
* Correctly handle non-blocking collectives tags
* As it is possible to have multiple outstanding non-blocking collectives
provided by different collective modules, we need a consistent
mechanism to allow them to select unique tags for each instance of a
collective.
* Add support for fallback to previous coll module on non-commutative operations (#30)
* Replace mutexes by atomic operations.
* Use the correct nbc request type (for both ibcast and ireduce)
* coll/base: document type casts in ompi_coll_base_retain_*
* add module-wide topology cache
* use standard instead of synchronous send and add mca parameter to control mode of initial send in ireduce/ibcast
* reduce number of memory allocations
* call the default request completion.
* Remove the requests from the Fortran lookup conversion tables before completing
and free it.
* piggybacking Bull functionalities
Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Marc Sergent <marc.sergent@atos.net>
Co-authored-by: Joseph Schuchart <schuchart@hlrs.de>
Co-authored-by: Lemarinier, Pierre <pierre.lemarinier@atos.net>
Co-authored-by: pierrele <31764860+pierrele@users.noreply.github.com>
The gather and scatter operations did not use the correct message size
(Only did datatype size * com size). This did not correctly reflect the
total message size and prevents fine tuning within a com size. This
patch multiplies the value by the number of elements sent.
Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit 50823fe9a9ef4f93e55ee2087b311303d49f90a8)
Uncovered a problem using the GNI provider with the OFI MTL.
See https://github.com/ofiwg/libfabric/issues/6194.
Related to #8001
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
(cherry picked from commit d6ac41cbbd8dbfdbf1a87533e2e292f2508a85ff)
improve configury to check whether icc is handling no long double.
This prevents seeing 100s of messages like this:
icc: command line warning #10148: option '-Wno-long-double' not supported
A similar patch will be needed for pmix.
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
(cherry picked from commit 6df0e53421b285eef2dc0b442d818026301d29f1)
Reduce scatter block and reduce scatter algorithms were hitting
correctness issues for non commutative strided tests. We will revert to
the original default algorithms for those two collectives (basic linear
and non overlapping respectively) in the non commutative op case.
See #8010
Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit 57b95bcb45d5ce3ae1a1e00bd17ceeaa206526fe)
the ofi mtl mrecv was not properly setting the message in/out
arg to MPI_MRECV to MPI_MESSAGE_NULL.
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
(cherry picked from commit e6f81ed6d6f4940c95732451d6ebf4d39cde1591)
The ofi_rxm provider is dependent upon the underlying hardware for its
implementation of FI_DELIVERY_COMPLETE. Since this can lead to early
completions, we disable the provider to avoid correctness issues.
This is not an issue in the mtl/ofi as it does not require
FI_DELIVERY_COMPLETE.
Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit 41acfee2bbfc5495aeeeae4b72f385ca8d1d8cee)
EFA incorrectly implements FI_DELIVERY_COMPLETE in earlier libfabric
versions. While FI_DELIVERY_COMPLETE would be advertised by the
provider, completions would return too early by not accounting for
bounce buffers on the receive side. This would cause the BTL
to receive early completions that lead to correctness issues.
This is not an issue in the mtl/ofi as it does not require
FI_DELIVERY_COMPLETE.
Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit a7dcfd98742d7f44d96d74a60a38529894d4099c)
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit c4e88a43a3a1f6a13f8eaa8b90d2a0a8caba5adf)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Alter the test to validate misaligned data.
Fixes#7954.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit b6d71aa893db51c393df9c993fde597eff5457e8)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
The btl/ofi does not currently utilize the common ofi include/exclude
list. Added verification code similar to the mtl/ofi that will check if
the info object is in the include or exclude list. If it isn't in the
include list or is in the exclude list, validate_info will return
OPAL_ERROR. The btl/ofi will no longer pass a provider name as a hint
when calling getinfo, instead filtering the provider during
validate_info.
This patch also moves the is_in_list MTL function into common code and
adds additional debugging output to the BTL to match the MTL standard.
Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit 9b8f463a768206a26e5b1dcea0612a403462b1d0)
The default algorithm selections were out of date and not performing
well. After gathering data from OMPI developers, new default algorithm
decisions were selected for:
allgather
allgatherv
allreduce
alltoall
alltoallv
barrier
bcast
gather
reduce
reduce_scatter_block
reduce_scatter
scatter
These results were gathered using the ompi-collectives-tuning package
and then averaged amongst the results gathered from multiple OMPI
developers on their clusters.
You can access the graphs and averaged data here:
https://drive.google.com/drive/folders/1MV5E9gN-5tootoWoh62aoXmN0jiWiqh3
Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit ce40cfbaa53406be71319041e13e893b0def7ad9)
If building Open MPI with sanitizers, e.g
$ configure CC=clang CFLAGS=-fsanitize=address ....
configure test programs are also build with the sanitizers and will
report errors resulting in configure to fail.
Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
The missing include file causes an error when using an external version of LibEvent.
Signed-off-by: tomhers <tom.herschberg@gmail.com>
(cherry picked from commit 88f9d2c90f1730f2e0b5bc4951893f60ff5a1332)
bugfix: provider selection would not differentiate between ipv4
and ipv6 addresses which would cause some nodes to be unable
to communicate between each other. Adding a check for address
format to provider selection to ensure that all nodes use the
same address format.
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit 7e463713014ae58f7e78d7a5b49e9e63d62e374e)
Add logic to handle different architectural capabilities
Detect the compiler flags necessary to build specialized
versions of the MPI_OP. Once the different flavors (AVX512,
AVX2, AVX) are built, detect at runtime which is the best
match with the current processor capabilities.
Add validation checks for loadu 256 and 512 bits.
Add validation tests for MPI_Op.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: dongzhong <zhongdong0321@hotmail.com>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 14b3c706289cfc26e53b13efd3eb85641636e459)