- general:
- incremented version number to 1.11.2openmpi
- lib OTFAUX:
- sped up message matching when no snapshots are to be generated
Changes to VT:
- general:
- incremented version number to 5.13.1openmpi
- compiler wrappers:
- vtnvcc:
- add path to cuda.h to the PDT parser command
- exclude *.cu source files from instrumentation with PDT/TAU; the PDT parser is not (yet) able to handle CUDA statements and kernels
- vtunify:
- fixed timestamp boundary check for merging asynchronous plugin counters (i.e. the async. timestamp must be >= the process's start timestamp)
- fixed timestamp conversion from local to global
- print percentage of message matching bumps (unmatched or reversed messages)
- inlined key-value list "record handler"
- minor optimizations in hook for merging async. events:
- search async. source manager only once per stream
- don't call writeRecHook_Event() if no async. source key is defined
This commit was SVN r26925.
- return MPI_ERR_OTHER instead of MPI_SUCCESS for the functions that are not
yet implemented
- add another field to the mca_io_ompio_file_t structure to point back to the
ompi_file_t structure.
This commit was SVN r26908.
- For now we'll use 8192 as a base value
- We leave the adjust_cq() as is
- For the long term we can work on an appropriate setting to expose through the INI file.
8K CQEs take 512K per process, which is 8MB per node for ppn=16
This commit was SVN r26877.
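The memory figures in the r26877 note above can be checked with a short calculation. This is only a sketch: the 64-byte CQE size is an assumption inferred from 512K / 8192 and is not stated in the log, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the completion queue (CQ) memory figures.
# ASSUMPTION: 64 bytes per CQE, inferred from 512K / 8192; the log does not state it.
CQE_SIZE_BYTES = 64


def cq_bytes_per_process(num_cqes, cqe_size=CQE_SIZE_BYTES):
    """Memory taken by one CQ holding num_cqes entries."""
    return num_cqes * cqe_size


def cq_bytes_per_node(num_cqes, ppn, cqe_size=CQE_SIZE_BYTES):
    """Memory taken on a node running ppn processes, one such CQ each."""
    return cq_bytes_per_process(num_cqes, cqe_size) * ppn


if __name__ == "__main__":
    per_process = cq_bytes_per_process(8192)
    per_node = cq_bytes_per_node(8192, ppn=16)
    print(per_process // 1024, "KiB per process")
    print(per_node // (1024 * 1024), "MiB per node at ppn=16")
```

Under that assumption, doubling the base value doubles both figures, which is the trade-off an INI-file setting would have to weigh.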
ibv_get_device_list_compat() and not finding it, I finally realized
that it was a function in OMPI. So let's name it with a proper ompi_
prefix, not an ibv_ prefix.
This commit was SVN r26867.
it to a negative number). Get rid of the multiplication in the critical
path, and keep the functions as simple as possible.
This commit was SVN r26864.
old version of the code tried to use the MPI_UB marker, but this
failed if the old marker (the one set in the cyclic function) had
a larger value. Replace the hardcore markers MPI_LB and MPI_UB by
their softer counterparts (using the _resize function).
This commit was SVN r26862.
move). Extended common sm API with: mca_common_sm_module_create and
mca_common_sm_module_attach. Please note that the new routines aren't currently
used -- but will be...
This commit was SVN r26845.
- configure:
- added option --with[out]-liberty to enable/disable symbol demangling with libiberty; default: disabled, because many systems don't provide a PIC version of libiberty
- fixed compiler flags for building Fortran MPI wrapper library
This commit was SVN r26839.
- added CUDA stream reuse for both CUDA tracing with CUPTI and the CUDART wrapper
- removed CUDA stream number from thread name when CUDA stream reuse is enabled
- disable tracking of MPI communicators, requests, windows, etc. if MPI is initialized with MPI_THREAD_SERIALIZED or MPI_THREAD_MULTIPLE (only MPI function enter/leave events will be recorded)
- configure:
- fixed detection of compiler instrumentation type on Cray platforms using the cc compiler wrapper
- compiler wrappers:
- fixed preprocessing source files to be parsed by OPARI (add path to empty omp.h to the preprocessor flags to avoid multiple declarations of OpenMP functions, types, etc.)
- vtnvcc: remove the 'compinst' instrumentation type if VT is configured with non-GNU compiler instrumentation support (fixes the "unrecognized option" error)
- vtdyn:
- added support for instrumenting outer and inner loops and their iterations (outer=loops within a function, inner=loops within outer loops)
- try to get the full prototype of functions to be instrumented
- consider default filter rules also if no filter file is given
- fixed potential segfault if adding a filter rule w/o stack bounds
- print verbose messages on stdout if vtdyn is started from the Dyninst attach library (libvt-dynatt)
- vtunify:
- print verbose messages on stderr if vtunify is started automatically from the VT library
This commit was SVN r26836.