This API is akin to the MemoryManager API and lets tools provide
their own profiler which is wrapped in the same way MemoryManager is
wrapped. Namely, the profiler provides Start/Stop methods that are called
at the start/end of running the benchmark in a separate pass.
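
A minimal sketch of the shape described above. The names `ProfilerHook`, `RecordingProfiler` and `RunProfilerPass` are hypothetical and chosen for illustration; they are not the library's actual declarations.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical interface: a tool-provided profiler exposing Start/Stop.
class ProfilerHook {
 public:
  virtual ~ProfilerHook() = default;
  virtual void Start() = 0;
  virtual void Stop() = 0;
};

// Example profiler that only records the order of events.
class RecordingProfiler : public ProfilerHook {
 public:
  explicit RecordingProfiler(std::vector<std::string>* log) : log_(log) {}
  void Start() override { log_->push_back("start"); }
  void Stop() override { log_->push_back("stop"); }

 private:
  std::vector<std::string>* log_;
};

// Hypothetical driver: runs the benchmark body once more, in a separate
// pass, bracketed by Start/Stop. Without a profiler the pass is skipped.
inline void RunProfilerPass(ProfilerHook* profiler,
                            const std::function<void()>& body) {
  if (profiler == nullptr) return;
  profiler->Start();
  body();
  profiler->Stop();
}
```

Keeping the profiled run in its own pass means the profiler's overhead does not leak into the timed measurements.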
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
* Add -lkstat to the .pc for Solaris
This fixes linking for projects that rely on pkg-config to generate the
link line on Solaris.
Test plan: Built the project locally on Solaris and verified that -lkstat
appears in the .pc file:
```
$ cat lib/pkgconfig/benchmark.pc | grep Libs.private
Libs.private: -lpthread -lkstat
```
* Use BENCHMARK_PRIVATE_LINK_LIBRARIES
The customization done via BENCHMARK_OS_QURT works just fine with the Hexagon simulator, but on at least some Hexagon hardware, both `qurt_timer_get_ticks()` and `std::chrono::now()` are broken and always return 0. This fixes the former by using the better-supported (and essentially identical) `qurt_sysclock_get_hw_ticks()` call, and the latter by reading a 19.2MHz hardware counter (per suggestion from Qualcomm). Local testing seems to indicate these changes are just as robust under the simulator as before.
Adjusted the GetSysctl call in sysinfo.cc to ensure the frequency
value is returned as a double rather than an integer. This helps
maintain consistency and clarity in the codebase.
COMPILER_IBMXL identifies the Clang based IBM XL compiler (xlclang) on z/OS. This compiler is obsolete and replaced by the Open XL compiler, so the macro is no longer needed and the existing code would lead to incorrect asm syntax for Open XL.
* Rewrite complexity_test to use (hardcoded) manual time
This test is fundamentally flaky because it tries to read tea leaves,
and it inherently misbehaves in CI environments,
since there are unmitigated sources of noise.
That being said, the computed Big-O also depends on the `--benchmark_min_time=`
Fixes https://github.com/google/benchmark/issues/272
* Correctly compute Big-O for manual timings. Fixes #1758.
* complexity_test: do more stuff in empty loop
* Make all empty loops be a bit longer empty
Looks like on Windows, some of these tests still fail;
I guess the clock precision is too coarse.
Defines a wrapper function, CheckNumCPUs, which enforces that GetNumCPUs
never returns fewer than one CPU. There is no reasonable way to
continue if we are unable to identify the number of CPUs.
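
A minimal sketch of such a guard. The signature is an assumption, and throwing is a stand-in for however the library actually terminates:

```cpp
#include <stdexcept>

// Hypothetical guard: passes a detected CPU count through unchanged, but
// refuses to continue when detection produced fewer than one CPU.
inline int CheckNumCPUs(int detected_cpus) {
  if (detected_cpus < 1) {
    throw std::runtime_error("Unable to determine the number of CPUs");
  }
  return detected_cpus;
}
```

Callers can then rely on the result being at least 1, for example when it is used as a divisor or a thread count.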
Signed-off-by: Sam James <sam@gentoo.org>
* Add support for Alpha architecture
As documented, the real cycle counter is unsafe to use here, because it
is a 32-bit integer which wraps every ~4s. Use gettimeofday instead,
which has a limitation of a low-precision real-time-clock (~1ms), but no
wrapping. Passes test suite.
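
A sketch of the trade-off, assuming a POSIX system. The helper names are hypothetical, and the 1 GHz figure in the comment is an illustrative assumption that happens to match the "~4s" wrap period:

```cpp
#include <sys/time.h>

#include <cstdint>

// Seconds until a 32-bit counter wraps at the given tick frequency.
// At an assumed 1 GHz this is 2^32 / 1e9, i.e. roughly 4.29 seconds.
inline double WrapPeriodSeconds(double ticks_per_second) {
  return 4294967296.0 / ticks_per_second;
}

// Fallback "cycle counter": wall-clock microseconds as a 64-bit value.
// Resolution is limited by the real-time clock, but it does not wrap in
// any practical time span.
inline int64_t TicksFromGettimeofday() {
  struct timeval tv;
  gettimeofday(&tv, nullptr);
  return static_cast<int64_t>(tv.tv_sec) * 1000000 + tv.tv_usec;
}
```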
Support parsing /proc/cpuinfo on Alpha
tabular_test: add a missing DoNotOptimize call
* CMake: `get_git_version()`: just use `--dirty` flag of `git describe`
* CMake: move version normalization out of `get_git_version()`
Mainly, I want `get_git_version()` to return the true version,
not something sanitized.
* JSON reporter: store library version and schema version in `context`
* Tools: discard inputs with unexpected `json_schema_version`
* Extract version string into `GetBenchmarkVersion()`
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
* Instead of directly comparing std::cout and GetOutputStream(), the underlying buffers are retrieved via rdbuf(), and then compared.
* Instead of fflush(stdout), call out.flush().
Use out << FormatString() instead of vprintf
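
A minimal sketch of the buffer comparison and stream-based output. `WritesToStdout` and `PrintLine` are hypothetical helper names, not functions from the library:

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Two streams target the same destination when they share a stream buffer,
// so compare rdbuf() pointers rather than the stream objects themselves.
inline bool WritesToStdout(const std::ostream& out) {
  return out.rdbuf() == std::cout.rdbuf();
}

// Write through the stream and flush the stream itself, instead of
// printing via vprintf and calling fflush(stdout).
inline void PrintLine(std::ostream& out, const std::string& text) {
  out << text << '\n';
  out.flush();
}
```

Going through `out` keeps the output in whichever stream the caller configured, even when that is not stdout.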
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Starting with Linux 6.6 [1], RDCYCLE is a privileged instruction on
RISC-V and can't be used directly from userland. There is a sysctl
option to change that as a transition period, but it will eventually
disappear.
Use RDTIME instead, which, while less accurate, has the advantage of being
synchronized between CPUs (and thus monotonic) and of constant frequency.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc4c07c89aada16229084eeb93895c95b7eabaa3
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
There is no bug here, but it gave me a scare the other day.
It is not incorrect to use `IterationCount` here,
since it's just an `int64_t` either way,
but it's wildly confusing. Let's not do that.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
* Increase the kMaxIterations limit
This fixes #1663. Note that as a result of this change, the columns in the console output can become misaligned if the actual iteration count is too high. This will be dealt with in a separate commit.
* Fix failing test on Windows
* Fix formatting
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
* Make json and csv output consistent.
Currently, the --benchmark_format=csv option does not output the correct value for the cv statistics. Also, the json output should not contain a time unit for the cv statistics.
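
For context, a sketch of why cv carries no time unit: it is a ratio of standard deviation to mean, so the unit cancels. The use of the sample (n - 1) standard deviation here is an assumption, and the function name is hypothetical:

```cpp
#include <cmath>
#include <vector>

// Coefficient of variation: stddev / mean. Returns 0 when it is undefined
// (fewer than two samples, or a zero mean).
inline double CoefficientOfVariation(const std::vector<double>& v) {
  if (v.size() < 2) return 0.0;
  double sum = 0.0;
  for (double x : v) sum += x;
  const double mean = sum / v.size();
  if (mean == 0.0) return 0.0;
  double squared_deviations = 0.0;
  for (double x : v) squared_deviations += (x - mean) * (x - mean);
  const double stddev = std::sqrt(squared_deviations / (v.size() - 1));
  return stddev / mean;
}
```

Scaling every sample by the same factor, such as converting nanoseconds to microseconds, leaves the result unchanged.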
* fix formatting
* undo json change
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Previously, this could return the wrong result when there
was an even number of elements.
There were two `nth_element` calls. The second call could
change elements in `[center2, end)`, which was where
`center` pointed. Therefore, `*center` sometimes had the
wrong value after the second `nth_element` call.
Rewrite to use `max_element` instead of the second call to
`nth_element`. This avoids modifying the vector.
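
A sketch of the approach, written as a free function for illustration; the name and signature are assumptions, not the library's code:

```cpp
#include <algorithm>
#include <vector>

// Median of the values. Takes the vector by value because nth_element
// reorders its input.
inline double Median(std::vector<double> v) {
  if (v.empty()) return 0.0;
  auto center = v.begin() + v.size() / 2;
  std::nth_element(v.begin(), center, v.end());
  if (v.size() % 2 == 1) return *center;
  // Even count: after nth_element, every element in [begin, center) is
  // <= *center, so the lower middle is simply the maximum of that range.
  // max_element only reads, so *center stays valid.
  auto center2 = std::max_element(v.begin(), center);
  return (*center + *center2) / 2.0;
}
```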
* perf_counters: Initialize once only when needed
This works around some performance problems running Android under QEMU.
Calling `pfm_initialize` was very slow, and was called during dynamic
initialization (before `main` or when loaded as a shared library).
This happened whenever benchmark was linked, even if no benchmarks
were run.
Instead, call `pfm_initialize` at most once, and only when one of:
1. `PerfCounters::Initialize` is called
2. `PerfCounters::Create` is called with a non-empty counter list
3. `PerfCounters::IsCounterSupported` is called
The return value of the first `pfm_initialize()` is saved and
returned from all subsequent `PerfCounters::Initialize` calls.
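
A sketch of the initialize-once pattern using a function-local static. `ExpensiveInit` is a hypothetical stand-in for the slow library call, and the call counter exists only to make the behavior observable:

```cpp
// Counts how often the stand-in initializer has run.
inline int& InitCallCount() {
  static int count = 0;
  return count;
}

// Hypothetical stand-in for a slow initialization call; 0 means success.
inline int ExpensiveInit() {
  ++InitCallCount();
  return 0;
}

// Runs ExpensiveInit() on the first call only. The function-local static is
// initialized lazily (and thread-safely since C++11), and the saved result
// is returned on every later call.
inline bool InitOnce() {
  static const bool success = (ExpensiveInit() == 0);
  return success;
}
```

Because nothing runs during dynamic initialization, programs that link the code but never call `InitOnce()` pay no cost.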
* perf_counters: Make success var const
* InitLibPfmOnce: Inline function
* State: Initialize counters with kAvgIterations in constructor
Previously, `counters` was updated in `PauseTiming()` with
`counters[name] += Counter(measurement, kAvgIterations)`.
The first `counters[name]` call inserts a counter with no flags.
There is no `operator+=` for `Counter`, so the insertion is done
by converting the `Counter` to a `double`, then constructing a
`Counter` to insert from the `double`, which drops the flags.
Pre-insert the `Counter` with the correct flags, then only
update `Counter::value`.
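
A sketch that reproduces the effect with a simplified stand-in for `Counter`. The struct below is an assumption modeled on the description, not the library's definition, and the two helper functions are hypothetical:

```cpp
#include <map>
#include <string>

// Simplified stand-in: a value plus flags, implicitly convertible to and
// from double, with no operator+= of its own.
struct Counter {
  enum Flags { kDefaults = 0, kAvgIterations = 1 };
  double value;
  Flags flags;
  Counter(double v = 0.0, Flags f = kDefaults) : value(v), flags(f) {}
  operator double const&() const { return value; }
  operator double&() { return value; }
};

// Old pattern: operator[] inserts a flag-less Counter, then += works on the
// doubles behind both operands, so the right-hand flags never arrive.
inline void AccumulateDroppingFlags(std::map<std::string, Counter>& counters,
                                    const std::string& name,
                                    double measurement) {
  counters[name] += Counter(measurement, Counter::kAvgIterations);
}

// New pattern: the entry is pre-inserted with the right flags elsewhere;
// here only the value is updated.
inline void AccumulateKeepingFlags(std::map<std::string, Counter>& counters,
                                   const std::string& name,
                                   double measurement) {
  counters[name].value += measurement;
}
```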
Introduced in 1c64a36 ([perf-counters] Fix pause/resume (#1643)).
* perf_counters_test.cc: Don't divide by iterations
Perf counters are now divided by iterations, so dividing again
in the test is wrong.
* State: Fix shadowed param error
* benchmark.cc: Fix clang-tidy error
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Change condition for `benchmarks_with_threads` from `benchmark.threads() > 0` to `> 1`. `threads()` appears to always be `>= 1`.
Introduced in fbc6efa (Refactoring of PerfCounters infrastructure (#1559))
* [perf-counters] Fix pause/resume
Using `state.PauseTiming() / state.ResumeTiming()` was broken.
Thanks [@virajbshah] for the repro testcase.
* ran clang-format over the whole perf_counters_test.cc
* Remove check that perf counters are 0 on `Pause`, since `Pause`/`Resume`
sequences would cause a non-0 counter value
* both upper and lower bound for the with/without resume counters
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
BENCHMARK_HAVE_STD_REGEX is not used; HAVE_STD_REGEX is, like the other two choices, i.e. HAVE_GNU_POSIX_REGEX and HAVE_POSIX_REGEX.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>