It seems in some cases (gcc older than v6.0.0) the __atomic_thread_fence is a
no-op with __ATOMIC_ACQUIRE. This appears to be the case with X86_64 so go
ahead and use __ATOMIC_SEQ_CST for the x86_64 read memory barrier. This should
not cause any performance issues as it is equivalent to the memory barrier
in the hand-written atomics.
References #6014
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 30119ee339eea086f43e3392352899187a4a73c7)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>