The xlc compiler seems to behave in a different way that gcc when it
comes the inline asm. There were two problems with the code with xlc:
- The TOC read in mca_patcher_base_patch_hook used the syntax
register unsigned long toc asm("r2") to read $r2 (the TOC
pointer). With gcc this seems to behave as expected but with xlc
the result in toc is not the same as $r2. I updated the code to use
asm volatile ("std 2, %0" : "=m" (toc)) to load the TOC pointer.
- The OPAL_PATCHER_BEGIN macro is meant to be the first thing in a
hook. On PPC64 it loads the correct TOC pointer (thanks to
mca_patcher_base_patch_hook) and saves the old one. The
OPAL_PATCHER_END macro restores the TOC pointer. Because we *need*
the TOC to be correct before it is accessed in the hook the
OPAL_PATCHER_BEGIN macro MUST come first. We did this and all was
well with gcc. With xlc on the other hand there was a TOC access
before the assembly inserted by OPAL_PATCHER_BEGIN. To fix this
quickly I broke each hook into a pair of function with the
OPAL_PATCHER_* macros on the top level functions. This works around
the issue but is not a clean way to fix this. In the future we
should 1) either update overwrite to not need this, or 2) figure
out why xlc is not inserting the asm before the first TOC read.
This fixesopen-mpi/ompi#1854
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
With libfabric v1.4, the usnic provider changed the values of its
fabric and domain name strings (compared to libfabric <v1.4). Update
the Open MPI usNIC BTL to handle both pre-v1.4 and v1.4 fabric/domain
names.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
and fail with a user friendly message if no method is available:
"sec: native cannot validate_cred on this system"
(back-ported from upstream pmix/master@c474a1fc60)
This commit fixes a race that can occur when two threads are in the
ugni progress function at the same time. This race occurs when one
thread calls GNI_PostDataProbeById then goes to sleep then another
thread calls GNI_PostDataProbeById then GNI_EpPostDataWaitById before
the other thread wakes up. If this happens the first thread will print
a warning on GNI_EpPostDataWaitById about no matching post.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes two issues that can occur during a connection:
- Re-entry to connection progress from modex lookup. Added an
additional endpoint state that will keep the code from re-entering
the common endpoint create.
- Fixed a race between a process posting a directed datagram through
a send and a connection being progressed through opal_progress().
The progress code was not obtaining the endpoint lock before
attempting to update the endpoint. To limit the amount of code
changed for 2.0.1 this commit makes the endpoint lock recursive. In
a future update this may be changed.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>