- only make MCA parameters available if SPC is enabled
- do not compile SPC code if SPC is disabled
- move includes into ompi_spc.c
- allow counters to be enabled through MPI_T without setting MCA parameter
- inline counter update calls that are likely in the critical path
- fix test to succeed even if encountering invalid pvars
- move timer_[start|stop] to header and move attachment info into ompi_spc_t
There is no need to store the name in the ompi_spc_t struct as well; that space
can hold the attachment info instead, which avoids accessing another cache line.
- make timer/watermark flags a property of the spc description
This is meant to make adding counters easier in the future by
centralizing the necessary information. Storing a copy of these flags
in the ompi_spc_t structure (without adding to its size) reduces
cache pollution for timer/watermark events.
- allocate ompi_spc_t objects with cache-alignment
This prevents objects from spanning multiple cache lines and thus
ensures that only one cache line is loaded per update.
- fix handling of timer and timer conversion
- only call opal_timer_base_get_cycles if necessary to reduce overhead
- remove use of OPAL_UNLIKELY to improve code generated by GCC
It appears that GCC makes less effort in optimizing the unlikely path
and generates bloated code.
- allocate ompi_spc_events statically to reduce loads in critical path
- duplicate comm_world only when dumping is requested
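
For illustration only, the cache-aligned allocation item above can be sketched
roughly as follows. This is not the code in this change: sketch_counter_t, its
fields, SKETCH_CACHE_LINE (assumed to be 64 bytes) and sketch_counters_alloc
are hypothetical stand-ins, and C11 aligned_alloc is used here in place of
whatever allocator the real code uses. The idea is that the array starts on a
cache-line boundary and the element size divides the line size, so no element
spans two lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cache line size; real code would take this from configuration. */
#define SKETCH_CACHE_LINE 64

/* Hypothetical counter record, NOT the actual ompi_spc_t layout.
 * _Alignas(32) pads the struct to 32 bytes so that two records fit
 * exactly into one assumed 64-byte cache line. */
typedef struct sketch_counter {
    _Alignas(32) long long value; /* counter value updated on the hot path */
    void    *attachment;          /* stand-in for attachment info */
    uint32_t flags;               /* stand-in for timer/watermark flags */
} sketch_counter_t;

/* The element size must divide the line size, otherwise some element
 * of the array would straddle a cache-line boundary. */
_Static_assert(SKETCH_CACHE_LINE % sizeof(sketch_counter_t) == 0,
               "counter size must divide the cache line size");

/* Allocate `count` zero-initialized counters starting on a cache-line
 * boundary. Returns NULL on failure or if count is 0; release with free(). */
static sketch_counter_t *sketch_counters_alloc(size_t count)
{
    if (0 == count) {
        return NULL;
    }
    size_t bytes = count * sizeof(sketch_counter_t);
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    bytes = (bytes + SKETCH_CACHE_LINE - 1) / SKETCH_CACHE_LINE * SKETCH_CACHE_LINE;
    void *mem = aligned_alloc(SKETCH_CACHE_LINE, bytes);
    if (NULL != mem) {
        memset(mem, 0, bytes);
    }
    return mem;
}
```

With this layout, an update to any single counter touches exactly one cache
line, under the stated 64-byte assumption.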
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>