* Support optional, user-directed collection of performance counters

  The patch lets an engineer drill into the root causes of a regression, for example.

  Currently, only single-threaded runs are supported.

  The feature is a build-time opt-in, and then a runtime opt-in. The engineer may run the benchmark executable, passing a list of performance counter names (using libpfm's naming scheme) at the command line. The counter values will then be collected and reported back as UserCounters.

  This is different from #240 in that it is a benchmark user opt-in, and the counter collection is transparent to the benchmark.

  Currently, this is only supported on platforms where libpfm is supported.

  libpfm: http://perfmon2.sourceforge.net/

* 'Use' values param in Snapshot when BENCHMARK_OS_WINDOWS

  This is to avoid an unused-parameter warning-as-error.

* Added missing include for <vector> in perf_counters.cc
* Moved doc to docs
* Added license blurbs
User-Requested Performance Counters
When running benchmarks, the user may choose to request collection of performance counters. This may be useful in investigation scenarios, such as narrowing down the cause of a regression or verifying that the underlying cause of a performance improvement matches expectations.
This feature is available if:
- The benchmark is run on an architecture featuring a Performance Monitoring Unit (PMU),
- The benchmark is compiled with support for collecting counters. Currently, this requires libpfm be available at build time, and
- The benchmark is run on a single thread (a current limitation).
The feature does not require modifying benchmark code. Counter collection is handled at the boundaries where timer collection is also handled.
To opt in:
- Install `libpfm4-dev`, e.g. `apt-get install libpfm4-dev`.
- Enable the cmake flag `BENCHMARK_ENABLE_LIBPFM`.
To use, pass a comma-separated list of counter names through the `--benchmark_perf_counters` flag. The names are decoded through libpfm, so they are platform specific. Some generic names (e.g. `CYCLES` or `INSTRUCTIONS`) are mapped by libpfm to the corresponding platform-specific events. See the libpfm documentation for more details.
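As a minimal sketch of driving such a run from a script, the helper below assembles the command line with the comma-separated counter list. The binary path `./my_benchmark` and the helper name `perf_counter_args` are hypothetical; only the `--benchmark_perf_counters` flag and the `CYCLES` and `INSTRUCTIONS` names come from the text above.

```python
def perf_counter_args(binary, counters, extra_args=()):
    """Build the argv for a benchmark run that requests the given counters.

    `binary` is a hypothetical path to a benchmark executable built with
    libpfm support; `counters` is an iterable of libpfm counter names.
    """
    # Drop surrounding whitespace and empty entries before joining.
    names = [c.strip() for c in counters if c.strip()]
    if not names:
        raise ValueError("at least one counter name is required")
    # The flag takes a single comma-separated list.
    return [binary, "--benchmark_perf_counters=" + ",".join(names), *extra_args]


argv = perf_counter_args("./my_benchmark", ["CYCLES", "INSTRUCTIONS"])
print(" ".join(argv))

# To actually launch the run (requires a suitably built binary):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Building the argument list rather than a single shell string avoids quoting issues if a counter name contains characters the shell would interpret.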
The counter values are reported back through the User Counters mechanism, so they are available in all the formats (e.g. JSON) supported by User Counters.
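To illustrate consuming the counters from JSON output, here is a small sketch that extracts them per benchmark. It assumes each requested counter appears as a key named after the counter on the benchmark's JSON entry, which is how User Counters are usually emitted; check the output of your own build. The sample document, the benchmark name `BM_Example`, and all numbers are hand-written placeholders, not measurements.

```python
import json

# Hypothetical sample shaped like JSON benchmark output; values are placeholders.
SAMPLE = """
{
  "benchmarks": [
    {"name": "BM_Example", "iterations": 1000,
     "real_time": 10.0, "cpu_time": 10.0, "time_unit": "ns",
     "CYCLES": 30.0, "INSTRUCTIONS": 60.0}
  ]
}
"""


def counters_by_benchmark(report, names):
    """Map each benchmark name to the requested counters found on its entry."""
    out = {}
    for entry in report.get("benchmarks", []):
        # Assumption: counters are top-level keys on the entry.
        out[entry["name"]] = {n: entry[n] for n in names if n in entry}
    return out


report = json.loads(SAMPLE)
result = counters_by_benchmark(report, ["CYCLES", "INSTRUCTIONS"])
for name, values in result.items():
    # Instructions per cycle, derived from the two counters.
    ipc = values["INSTRUCTIONS"] / values["CYCLES"]
    print(name, values, "IPC:", ipc)
```

Counters missing from an entry are skipped rather than raising, since counter availability is platform specific.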