From ddd1f578ecfc443d05250e09bf5e5077c6d6f304 Mon Sep 17 00:00:00 2001 From: Mark Allen Date: Tue, 9 Jun 2020 15:44:27 -0400 Subject: [PATCH] noinline to avoid compiler reading TOC before PATCHER_BEGIN This bug was first seen in a different product that's using the same interception code as OMPI. But I think it's potentially in OMPI too. In my vanilla build of OMPI master on RH8 if I "gdb libopen-pal.so" and "disassemble intercept_brk", I'm seeing a suspicious extra instruction in front of PATCHER_BEGIN: 0x00000000000d6778 <+40>: std r2,24(r1) // something gcc put in front 0x00000000000d677c <+44>: std r2,96(r1) // PATCHER_BEGIN's toc_save 0x00000000000d6780 <+48>: nop // NOPs from PATCHER_BEGIN 0x00000000000d6784 <+52>: nop // that get replaced 0x00000000000d6788 <+56>: nop // by instructions that 0x00000000000d678c <+60>: nop // change r2 0x00000000000d6790 <+64>: nop // Later there are loads from that location like 0x000000000019e0e4 <+132>: ld r2,24(r1) that make me nervous since that's the pre-updated value. I believe this is the same thing Nathan is describing way back in a9bc692d and his solution was to put a second call around each interception, where the outer call is just intercept_brk(): PATCHER_BEGIN _intercept_brk() PATCHER_END and the inner call _intercept_brk() is where the bulk of the code goes. What I'm seeing is that _intercept_brk() is being inlined and probably negating Nathan's fix. So I want to add __opal_attribute_noinline__ to restore the fix. With this commit in place, the disassembly of intercept_brk becomes tiny because it's no longer inlining _intercept_brk() and the susipicious early save of r2 is gone. I made the same fix to all the intercept_* functions, although intercept_brk was the only one that had a suspicious save of r2. As far as empirical failures though, we only have those from the non-OMPI product that's using the same patcher code. I'm not actually getting OMPI to fail from the above suspicious data being saved in r1+24. Signed-off-by: Mark Allen --- .../memory/patcher/memory_patcher_component.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/opal/mca/memory/patcher/memory_patcher_component.c b/opal/mca/memory/patcher/memory_patcher_component.c index 5566dc12c9..98bf6bb87e 100644 --- a/opal/mca/memory/patcher/memory_patcher_component.c +++ b/opal/mca/memory/patcher/memory_patcher_component.c @@ -107,6 +107,22 @@ opal_memory_patcher_component_t mca_memory_patcher_component = { * data. If this can be resolved the two levels can be joined. */ +/* + * Nathan's original fix described above can have the same problem reappear if the + * interception functions inline themselves. + */ +static void *_intercept_mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset) __opal_attribute_noinline__; +static int _intercept_munmap(void *start, size_t length) __opal_attribute_noinline__; +#if defined(__linux__) +static void *_intercept_mremap (void *start, size_t oldlen, size_t newlen, int flags, void *new_address) __opal_attribute_noinline__; +#else +static void *_intercept_mremap (void *start, size_t oldlen, void *new_address, size_t newlen, int flags) __opal_attribute_noinline__; +#endif +static int _intercept_madvise (void *start, size_t length, int advice) __opal_attribute_noinline__; +static int _intercept_brk (void *addr) __opal_attribute_noinline__; +static void *_intercept_shmat(int shmid, const void *shmaddr, int shmflg) __opal_attribute_noinline__; +static int _intercept_shmdt (const void *shmaddr) __opal_attribute_noinline__; + #if defined (SYS_mmap) #if defined(HAVE___MMAP) && !HAVE_DECL___MMAP