There was a race condition in the eager get protocol where the RDMA complete message could be received before the local completion of the SMSG message that started the eager get protocol.
cmr:v1.7
This commit was SVN r27740.
- configure: fixed warnings from automake 1.12.x
Changes to VT:
- general:
- incremented version number to 5.14.2
- configure:
- fixed warnings from automake 1.12.x
- VT libs / MPI wrappers:
- do initialize VT and enter the dummy main function ("user") if MPI_Initialized is the very first event to be recorded (fixed assertion error)
- leave the dummy main function on the same thread where it is entered (fixes potential stack underflow)
This commit was SVN r27734.
* btl sendi(): if message can be send inline try to avoid signal
* signal is requested one per 64 or when
there are no send wqes
when message can not be send inline
any other btl method then sendi()
This commit was SVN r27724.
to compile. With recent Autoconf versions, AC_CHECK_HEADERS only succeeds
if the header compiles, so need to check these two with sys/param.h
explicitly included in the search list...
This commit was SVN r27715.
which is the same size as a Fortran or C integer. This resulted in configure
coming up with Fortran's MAX_INT as -2^31, which obviously isn't a positive
number. Since we found the MAX_INT using the same broken loop in a couple
places and doing it right is complicated, added a new macro that is much
more careful about sign roll-over.
During the Fortran rework between v1.6 and v1.7, the variable which
indicates whether or not Fortran is being compiled changed, so on platforms
without Fortran compilers, we were trying to determine the max value for
Fortran INTEGERS where we previously didn't. I believe this is why
bug #3374 appeared as a regression.
Finally, since the OMPI code doesn't cope with OMPI_FORTRAN_HANDLE_MAX
being negative (which was the root cause of the segfault in $3374),
add a check at the end of the OMPI_FORTRAN_GET_HANDLE_MAX macro to
ensure that OMPI_FORTRAN_HANDLE_MAX is always non-negative.
This commit was SVN r27714.
- configure: pass the MPI configure options (e.g. --with-mpi-lib, --with-mpi-inc-dir) to the OTF configure, even if MPI compiler wrappers were found
This commit was SVN r27710.
config/ directory. We split them apart a while ago in the hopes that
it would simplify things, but it didn't really (e.g., because there
were still some ompi/opal .m4 files in the top-level config/
directory, resulting in developer confusion where any given m4 macro
was defined).
So this commit consolidates them back into the top-level directory for
simplicity.
There's still (at least) two changes that would be nice to make:
1. Split any generated .m4 file (e.g., autogen-generated .m4 files)
into a separate directory somewhere so that a top-level -Iconfig/
will only get our explicitly defined macros, not the autogen stuff
(e.g., with libevent2019 needing to get the visibility macro, but
NOT all the autogen-generated inclusion of component configure.m4
files).
1. Change configure to be of the form:
{{{
# ...a small amount of preamble/setup...
OPAL_SETUP
m4_ifdef([project_orte], [ORTE_SETUP])
m4_ifdef([project_ompi], [OMPI_SETUP])
# ...a small amount of finishing stuff...
}}}
I doubt we'll ever get anything as clean as that, but that would be
the goal to shoot for.
This commit was SVN r27704.
* We always build OPAL, so always do the filesystem case-sensitivity
check
* Protect the extra ORTE/OMPI macros with "if we're building this
project..." tests
This commit was SVN r27699.
- compiler wrappers: removed invocation of pdbcomment; it removes essential information from the PDB file for instrumenting functions
This commit was SVN r27687.
- otfprofile: fixed build error when using the IBM XL C++ compiler
Changes to VT:
- configure:
- use AC_CHECK_TYPES instead of AC_CHECK_DECLS to check for PAPI's long_long type
- use AC_C_INLINE to check whether C 'inline' is present
- VT libs:
- set CUPTI tracing as default, if CUDA runtime wrapper has not been built
- added a common interface for all NVIDIA CUPTI interfaces (events, activity, callbacks)
- added support for concurrent kernel tracing (since CUDA 5.0)
- removed almost unused VT_DEBUG env. variable - replaced calls to vt_debug_msg(DBG_LEVEL,...) by vt_cntl_msg (DBG_LEVEL+10,...)
- added some more ifdefs to new CUDA 5 features
- added several guards for internal malloc() and free() calls in CUDA related source files
- revised memory allocation tracing:
- intercept memory (de)alloaction functions by library wrapping (replaces deprecated hook technique from the GNU C library)
- added support for multi-threaded applications
- added wrapper functions for memalign, posix_memalign, and valloc
- revised exec,system,fork tracing
- retitled to "Child Process Execution Tracing"
- introduced env. variable VT_EXECTRACE (marked VT_LIBCTRACE as deprecated)
- added wrapper functions for execvpe, fexecve, waitid, wait3, and wait4
- changes default function group name for
- memory (de)allocation functions: "MEM" -> "LIBC-MALLOC"
- I/O functions: "I/O" -> "LIBC-I/O"
- child process execution: "LIBC" -> "LIBC-EXEC"
- plugin counter interface:
- added check for initialized vt_plugin_cntr_info.info
- vtdyn:
- added missing header includes for Dyninst 8
- vtrun:
- do not preload the Dyninst Runtime library; it is loaded by Dyninst itself
This commit was SVN r27679.