Libtool archive files (.la files) create an unnecessary dependency
between linked applications and the development versions of packages
upon which Open MPI depends (to get the .so.1 -> .so symlink).
Remove the .la files from the package, in line with best practice
for package builders.
Signed-off-by: Jie Zhang <zhngaj@amazon.com>
Once any number of events are read, return immediately, rather than
waiting for fi_cq_read() to return FI_EAGAIN or an error. This can
improve observed latency if the user application is in a blocking call
waiting for us to return. Deleting the while loop here also means
ofi_progress_event_count serves as an upper bound for the total number
of events read in a single call (with the while loop we might read far
more, as long as new events continue to arrive).
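
The single-read behavior described above can be sketched as follows. This
is only an illustration under stated assumptions: mock_cq, mock_cq_read(),
MOCK_EAGAIN, and progress_once() are hypothetical stand-ins, not the
libfabric API or the actual OMPI progress function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion queue holding `pending` events. */
struct mock_cq {
    size_t pending;
};

/* Hypothetical "nothing available right now" return code. */
#define MOCK_EAGAIN (-11L)

/* Stand-in for a CQ read: returns how many events were read (at most
 * `count`), or MOCK_EAGAIN when the queue is empty. */
static long mock_cq_read(struct mock_cq *cq, size_t count)
{
    if (cq->pending == 0) {
        return MOCK_EAGAIN;
    }
    size_t n = cq->pending < count ? cq->pending : count;
    cq->pending -= n;
    return (long) n;
}

/* One read per call, then return immediately. Without a surrounding
 * while loop, event_count is an upper bound on the events handled. */
static int progress_once(struct mock_cq *cq, size_t event_count)
{
    long ret = mock_cq_read(cq, event_count);
    if (ret > 0) {
        return (int) ret;   /* events read: hand control back at once */
    }
    return 0;               /* empty queue; a real caller would also check errors */
}
```

With 250 pending events and a bound of 100, successive calls in this sketch
handle 100, 100, and 50 events rather than draining everything in one call.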
Signed-off-by: Eric Badger <eric@badgerio.us>
mpi-next)
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Ordering must match the Fortran definition index for errhandlers, and we
don't want to change the old ones.
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
This bug was first seen in a different product that's using the same
interception code as OMPI. But I think it's potentially in OMPI too.
In my vanilla build of OMPI master on RH8 if I "gdb libopen-pal.so" and
"disassemble intercept_brk", I'm seeing a suspicious extra instruction
in front of PATCHER_BEGIN:
    0x00000000000d6778 <+40>: std r2,24(r1)  // something gcc put in front
    0x00000000000d677c <+44>: std r2,96(r1)  // PATCHER_BEGIN's toc_save
    0x00000000000d6780 <+48>: nop            // NOPs from PATCHER_BEGIN
    0x00000000000d6784 <+52>: nop            // that get replaced
    0x00000000000d6788 <+56>: nop            // by instructions that
    0x00000000000d678c <+60>: nop            // change r2
    0x00000000000d6790 <+64>: nop            //
Later there are loads from that location like
    0x000000000019e0e4 <+132>: ld r2,24(r1)
that make me nervous since that's the pre-updated value.
I believe this is the same thing Nathan is describing way back in a9bc692d
and his solution was to put a second call around each interception, where
the outer call is just
    intercept_brk():
        PATCHER_BEGIN
        _intercept_brk()
        PATCHER_END
and the inner call _intercept_brk() is where the bulk of the code goes.
What I'm seeing is that _intercept_brk() is being inlined and probably
negating Nathan's fix. So I want to add __opal_attribute_noinline__ to
restore the fix.
With this commit in place, the disassembly of intercept_brk becomes tiny
because it's no longer inlining _intercept_brk() and the suspicious
early save of r2 is gone. I made the same fix to all the intercept_*
functions, although intercept_brk was the only one that had a suspicious
save of r2.
As for empirical failures, though, we only have those from the non-OMPI
product that's using the same patcher code. I'm not actually getting OMPI
to fail from the above suspicious value being saved at r1+24.
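
The outer/inner split with a noinline inner function can be sketched
roughly as below. This is a minimal illustration, not the OMPI source:
NOINLINE is assumed to expand to the GCC/Clang attribute, the names
inner_intercept() and outer_intercept() are hypothetical, and the
PATCHER_BEGIN / PATCHER_END macros are only indicated by comments.

```c
#include <assert.h>

/* Assumption: GCC/Clang attribute spelling, standing in for
 * __opal_attribute_noinline__. */
#define NOINLINE __attribute__((noinline))

static int inner_calls = 0;

/* Inner function holding the bulk of the work. Marking it noinline
 * keeps it a real call, so the compiler cannot merge its body (and any
 * register saves it needs) into the outer function. */
static NOINLINE int inner_intercept(int increment)
{
    inner_calls++;
    return increment * 2;
}

/* Outer function: kept tiny so that nothing but the call sits between
 * the two patcher macros. */
static int outer_intercept(int increment)
{
    /* PATCHER_BEGIN would go here in the real code */
    int result = inner_intercept(increment);
    /* PATCHER_END would go here in the real code */
    return result;
}
```

Whether the attribute actually removes the early save of r2 on a given
compiler is something to confirm from the disassembly, as done above.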
Signed-off-by: Mark Allen <markalle@us.ibm.com>
We have been bad about updating the NEWS file in master with all
the changes that have gone into the release branches. Patch up
NEWS with the changes from v3.0, v3.1, and v4.0 branches.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
The NEWS file had a mix of ISO-8859-1 and UTF-8 encodings, which
was making a mess of decoding the non-ASCII characters in the
file. This patch unifies the NEWS file as a UTF-8 encoded file
and changes many of the places where we had ASCII-ified a person's
name.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
These op codes used to be in bits/ipc.h but were removed in glibc in 2015
with a comment saying they should be defined in internal headers:
https://sourceware.org/bugzilla/show_bug.cgi?id=18560
and when glibc uses that syscall it seems to do so from its own definitions:
https://github.com/bminor/glibc/search?q=IPCOP_shmat&unscoped_q=IPCOP_shmat
So I think using #ifndef and defining them if they're not already defined
using the values from glibc is the best option.
At IBM it was the testing on Red Hat 8 that found this as an issue
(with the opcodes undefined on the system, the shmat/shmdt interception
was left undefined, so shmat/shmdt memory events went unintercepted).
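
A minimal sketch of the #ifndef approach follows. The numeric values are
the ones I recall from glibc's internal definitions (21 for shmat, 22 for
shmdt) and should be checked against the glibc source before being relied
on.

```c
#include <assert.h>

/* Define the ipc() syscall opcodes only when the system headers have
 * not already provided them. Values assumed to match glibc's internal
 * definitions. */
#ifndef IPCOP_shmat
#define IPCOP_shmat 21
#endif

#ifndef IPCOP_shmdt
#define IPCOP_shmdt 22
#endif
```

Guarding each opcode separately means a system header that defines only
one of them still compiles cleanly.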
Signed-off-by: Mark Allen <markalle@us.ibm.com>
mtl_btl_ofi_rcache_init() initializes patcher, which should only take
place while things are single threaded. OFI providers may spawn threads,
so initialize the rcache before creating OFI objects to prevent races.
Authored-by: John L. Byrne <john.l.byrne@hpe.com>
Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
Added the flag OPAL_OFI_PCI_DATA_AVAILABLE to avoid accessing the nic
object in fi_info when the OFI version does not support that structure.
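
A rough sketch of how such a flag can guard the access is shown below.
Everything except the flag name is hypothetical: mock_info and mock_nic
stand in for the real libfabric structures, and the flag is assumed to be
a 0/1 macro that the build system would normally define.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: normally set to 0 or 1 by configure; defaulted here so
 * the sketch compiles on its own. */
#ifndef OPAL_OFI_PCI_DATA_AVAILABLE
#define OPAL_OFI_PCI_DATA_AVAILABLE 1
#endif

/* Hypothetical stand-ins for the provider info and its nic object. */
struct mock_nic {
    int pci_bus_id;
};

struct mock_info {
    const char *provider_name;
#if OPAL_OFI_PCI_DATA_AVAILABLE
    struct mock_nic *nic;   /* only exists in newer versions */
#endif
};

/* Returns the PCI bus id, or -1 when the data is unavailable. The nic
 * member is never named unless the flag says it exists. */
static int mock_pci_bus_id(const struct mock_info *info)
{
#if OPAL_OFI_PCI_DATA_AVAILABLE
    if (info->nic != NULL) {
        return info->nic->pci_bus_id;
    }
#else
    (void) info;
#endif
    return -1;
}
```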
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
Correctly use strlen(char *) instead of sizeof(char *), which yields the
size of the pointer rather than the length of the string.
Thanks Georg Geiser for reporting this issue.
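
For illustration only, the difference can be seen in a small helper like
the one below; copy_string() is a hypothetical example and not the code
touched by this fix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of src, or NULL on
 * allocation failure. The caller frees the result. */
static char *copy_string(const char *src)
{
    /* strlen() counts the characters in the string. sizeof(src) would
     * instead give the size of the pointer itself (commonly 4 or 8),
     * no matter how long the string is. */
    size_t len = strlen(src);
    char *dst = malloc(len + 1);
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, len + 1);   /* include the terminating NUL */
    return dst;
}
```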
Refs. open-mpi/ompi#7772
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
1. We haven't used the -dlopen or -preopen options for years (if
ever?); no need for the `dlopen` LT_INIT option.
2. We haven't supported Windows for years; no need for the `win32-dll`
LT_INIT option.
Also, this commit includes a minor fix to a comment.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
As discussed, a feature is being added to libpsm2 to correctly handle
the case where the library is opened by multiple OMPI transports in the same
process. (For example, the OFI BTL and the PSM2 MTL).
* Improves the error message to indicate the required libpsm2 version.
* Adds a test at autogen/configure time for the existence of
PSM2_LIB_REFCOUNT_CAP.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>