Commit Graph

1288 Commits

Author SHA1 Message Date
Roman Lebedev
45b194e4d4
Introduce Coefficient of variation aggregate (#1220)
* Introduce Coefficient of variation aggregate

I believe, it is much more useful / use to understand,
because it is already normalized by the mean,
so it is not affected by the duration of the benchmark,
unlike the standard deviation.

Example of real-world output:
```
raw.pixls.us-unique/GoPro/HERO6 Black$ ~/rawspeed/build-old/src/utilities/rsbench/rsbench GOPR9172.GPR --benchmark_repetitions=27 --benchmark_display_aggregates_only=true --benchmark_counters_tabular=true
2021-09-03T18:05:56+03:00
Running /home/lebedevri/rawspeed/build-old/src/utilities/rsbench/rsbench
Run on (32 X 3596.16 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x16)
  L1 Instruction 32 KiB (x16)
  L2 Unified 512 KiB (x16)
  L3 Unified 32768 KiB (x2)
Load Average: 7.00, 2.99, 1.85
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations  CPUTime,s CPUTime/WallTime     Pixels Pixels/CPUTime Pixels/WallTime Raws/CPUTime Raws/WallTime WallTime,s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GOPR9172.GPR/threads:32/process_time/real_time_mean         11.1 ms          353 ms           27   0.353122          31.9473        12M       33.9879M        1085.84M      2.83232       90.4864  0.0110535
GOPR9172.GPR/threads:32/process_time/real_time_median       11.0 ms          352 ms           27   0.351696          31.9599        12M       34.1203M        1090.11M      2.84336       90.8425  0.0110081
GOPR9172.GPR/threads:32/process_time/real_time_stddev      0.159 ms         4.60 ms           27   4.59539m        0.0462064          0       426.371k        14.9631M    0.0355309       1.24692   158.944u
GOPR9172.GPR/threads:32/process_time/real_time_cv           1.44 %          1.30 %            27  0.0130136         1.44633m          0      0.0125448       0.0137802    0.0125448     0.0137802  0.0143795
```

Fixes https://github.com/google/benchmark/issues/1146

* Be consistent, it's CV, not 'rel std dev'
2021-09-03 18:44:10 +01:00
Roman Lebedev
12dc5eeafc
Statistics: add support for percentage unit in addition to time (#1219)
* Statistics: add support for percentage unit in addition to time

I think, `stddev` statistic is useful, but confusing.

What does it mean if `stddev` of `1ms` is reported?
Is that good or bad? If the `median` is `1s`,
then that means that the measurements are pretty noise-less.

And what about `stddev` of `100ms` is reported?
If the `median` is `1s` - awful, if the `median` is `10s` - good.

And hurray, there is just the statistic that we need:
https://en.wikipedia.org/wiki/Coefficient_of_variation

But, naturally, that produces a value in percents,
but the statistics are currently hardcoded to produce time.

So this refactors thinkgs a bit, and allows a percentage unit for statistics.

I'm not sure whether or not `benchmark` would be okay
with adding this `RSD` statistic by default,
but regales, that is a separate patch.

Refs. https://github.com/google/benchmark/issues/1146

* Address review notes
2021-09-03 15:36:56 +01:00
Roman Lebedev
67b77da3c0
report.py: adjust expected u-test values for tests
The test used to work with scipy 1.6, but does not work with 1.7.1
I believe this is because of https://github.com/scipy/scipy/pull/4933
Since there doesn't appear that there is anything wrong with how
we call the `mannwhitneyu()`, i guess we should just expect the new values.

Notably, the tests now fail with earlier scipy versions.
2021-09-03 14:54:06 +03:00
Roman Lebedev
e7a8415876
CMake: add forgotten include(FeatureSummary) into FindPFM.cmake to fix build
Sorry!
2021-09-03 01:15:09 +03:00
Dominic Hamon
2b093325e1
replace #warning with #pragma message (#1216) 2021-08-24 15:08:15 +01:00
Dominic Hamon
04c466603f
force cmake version to 3.5.1 2021-08-24 14:30:36 +01:00
Roman Lebedev
e7fa637cbe
[NFC] PFM: actually report package as found, and advertise description
This slightly prettifies the CMake's `feature_summary()` output,
should the library be built as part of some project that then prints
the CMake summary.
2021-08-23 19:36:34 +03:00
Marcel Jacobse
0a447f8a75
Fix links to further doc in user_guide.md (#1215)
Refactoring in 201b981a moved most of the documentation from `README.md` to `docs/user_guide.md`. Some links from `README.md` to other `docs/*.md` files ended up unchanged in `docs/user_guide.md`. Those links were now broken as they did not link from outside the `docs` directory anymore, but from inside it. Removing the leading `docs/` for these links fixes this.
2021-08-23 14:16:03 +01:00
Vy Nguyen
dc1a97174d
Introduce accessors for currently public data members (threads and thread_index) (#1208)
* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate the direct access to these fields.

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate the direct access to these fields.

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.

* [benchmark] Introduce accessors for currently public data members `threads` and `thread_index`

Also deprecate direct access to `.thread_index` and make threads a private field

Motivations:

Our internal library provides accessors for those fields because the styleguide disalows accessing classes' data members directly (even if they're const).
There has been a discussion to simply move internal library to make its fields public similarly to the OSS version here, however, the concern is that these kinds of direct access would prevent many types of future design changes (eg how/whether the values would be stored in the data member)

I think the concensus in the end is that we'd change the external library for this case.
AFAIK, there are three important third_party users that we'd need to migrate: tcmalloc, abseil and tensorflow.
Please let me know if I'm missing anyone else.
2021-08-23 09:06:57 +01:00
Nico Weber
8fd49d6671
Fix a -Wunreachable-code-aggressive warning (#1214) 2021-08-19 16:09:49 +03:00
Dominic Hamon
c4b06e5b79 Set theme jekyll-theme-minimal 2021-08-18 10:07:38 +01:00
Dominic Hamon
0fb4b75182
wrap things that look like tags but aren't with {% raw %} 2021-08-18 09:26:29 +01:00
Dominic Hamon
990299fff8
install docs folder when installing library (#1212)
fixes #470
2021-08-17 22:16:40 +01:00
Dominic Hamon
91ce110bbf
add .DS_Store to .gitignore 2021-08-17 21:46:11 +01:00
Dominic Hamon
201b981abd
refactor the documentation to minimise README.md (#1211) 2021-08-17 21:45:33 +01:00
Dominic Hamon
2d054b683f Merge branch 'main' of github.com:google/benchmark 2021-08-11 15:26:35 +01:00
Dominic Hamon
ddc76e513e preparing v1.5.6 release 2021-08-11 15:22:36 +01:00
Dominic Hamon
cb9afbbac4 Set theme jekyll-theme-modernist 2021-08-11 15:12:32 +01:00
Dominic Hamon
07f833d6b2 so much for googletest not failing any more 2021-08-11 15:10:02 +01:00
Dominic Hamon
d0db4e01c1 turn back on strict mode for googletest as it no longer breaks 2021-08-11 15:01:17 +01:00
Vy Nguyen
4124223bf5
Change the default value of --benchmark_filter from "." to <empty> (#1207)
Both `.` and `<empty>` already means "run all benchmarks" here, as commented on this flag's declaration (and below around line 448-449).
So this is a NFC.

On the other hand, this help internally because internally, if the flag is empty (or if it's not a specified by a binary), we don't call the RunSpecifiedBenchmarks.
There is still a difference in what <empty> means internally (runs no benchmarks) and externally (runs all benchmarks).
But we can work around this.
2021-08-03 17:11:47 +01:00
Braedy
1067dfc91e
Remove dead code from PredictNumItersNeeded (#1206)
* Remove `min` with dead path.

When `isSignificant` is false, the smallest value `multiplier` can
approach is 14. Thus, min(10, multiplier) will always return 10.

Addresses part one of #1205.

* Remove always false condition.

   1. `multiplier <= 1.0` implies `i.seconds >= min_time * 1.4`
   2. By (1), `isSignficant` is true because `i.seconds > min_time * 1.4` implies `i.seconds > min_time` implies that `i.seconds / minTime > 1 > 0.1`. Thus, the ternary maintains the same multiplier value.
   3. `ShouldReportResults` is always called before `PredictNumItersNeeded`, if `i.seconds >= min_time` then the loop is broken and `PredictNumItersNeeded` is never called.
   4. 1 and 3 together imply that `multiplier <= 1.0` is never true.

Addresses part 2 of #1205.
2021-07-29 08:59:46 +01:00
Dominic Hamon
ab74ae5e10
downgrade warnings for googletest (#1203)
fixes #1202. sort of.
2021-07-19 12:11:13 +01:00
Nicholas Junge
9433793f3e
Add wheel and sdist building action to GH Workflow (#1180) 2021-07-19 09:45:16 +01:00
Dominic Hamon
e451e50e9b
add g++ to sanitizer buildbots (#1197)
* add g++ to sanitizer buildbots

* add compiler to sanitizer build name

* spell g++ correctly. look, it's early, ok?

* only set libcxx if we're using clang
2021-07-01 10:02:54 +01:00
Dominic Hamon
1fcb5c23d8 Don't return a reference when the callers all expect pointers.
Fixes #1196
2021-07-01 09:39:09 +01:00
Dominic Hamon
19026e232c
fix clang-tidy warnings (#1195) 2021-06-29 11:06:53 +01:00
Mircea Trofin
94f845ec4f
Fix typos (#1194) 2021-06-28 17:07:54 +01:00
Mircea Trofin
05a2ace713
Fix type warning on certain compilers (#1193)
repetition_indices is populated with size_t values, so typing it
accordingly.
2021-06-28 17:06:22 +01:00
Mircea Trofin
40d2069d84
Use C++11 atomic_signal_fence for ClobberMemory (#1190)
* Use C++11 atomic_signal_fence for ClobberMemory

* include

* Conditionally using std::atomic_signal_fence when C++11 is available
2021-06-28 17:03:26 +01:00
Manuel Binna
38b767e58a
Bazel qnx (#1192)
* Don't link pthread on QNX

On QNX, pthread is part of libc [1]. There's no separate pthread library
to link.

[1] https://www.qnx.com/developers/docs/7.1/index.html#com.qnx.doc.neutrino.lib_ref/topic/p/pthread_create.html

* Explain that QNX doesn't have pthread lib
2021-06-28 13:46:08 +01:00
Mircea Trofin
d6778aebbe
Deduplicate test function name in python bindings example (#1189)
This appears to be the source of unclean termination of the test on some
versions of python related to object dereferencing.
2021-06-28 10:28:04 +01:00
Dominic Hamon
1799e1b9ec
prefix VLOG (#1187) 2021-06-24 18:55:37 +01:00
Dominic Hamon
6a5bf081d3
prefix macros to avoid clashes (#1186) 2021-06-24 18:21:59 +01:00
Dominic Hamon
5da5660429
Move flags inside the benchmark namespace (#1185)
This avoids clashes with other libraries that might define the same flags.
2021-06-24 16:50:19 +01:00
Dominic Hamon
62937f91b5
Add missing trailing commas (#1182)
* Add missing trailing commas

Fixes #1181

* Better trailing commas
2021-06-18 17:31:47 +01:00
PCMan
c932169e76
Provide helpers to create integer lists for the given ranges. (#1179)
This can be used together with ArgsProduct() to allow multiple ranges
with different multipliers and mixing dense and sparse ranges.

Example:
BENCHMARK(MyTest)->ArgsProduct({
  CreateRange(0, 1024, /*multi=*/32),
  CreateRange(0, 100, /*multi=*/4),
  CreateDenseRange(0, 4, /*step=*/1)
});

Co-authored-by: Jen-yee Hong <pcmantw@google.com>
2021-06-16 12:56:24 +01:00
Michael Lippautz
5b7518482c
benchmark_runner.h: Remove superfluous semi colon (#1178)
Some downstream projects (e.g. V8) treat warnings as errors and cannot roll
the latest changes.
2021-06-15 13:28:55 +01:00
Roman Lebedev
e991355c02
[NFCI] Drop warning to satisfy clang's -Wunused-but-set-variable diag (#1174)
Fixes https://github.com/google/benchmark/issues/1172
2021-06-09 11:52:12 +03:00
huajingyun
f90215f1cc
Add support for new architecture loongarch (#1173) 2021-06-08 10:26:24 +01:00
Dominic Hamon
342409126b
Use modern clang/libc++ for sanitizers (#1171)
* Use modern clang/libc++ for sanitizers

* update ubuntu

* new llvm builds differently

* clang, not clang-3.8

* just build what we need
2021-06-04 11:06:38 +01:00
Dominic Hamon
bdd6c44787
Enable various sanitizer builds in github actions (#1167)
* Enable various sanitizer builds in github actions

* try with off the shelf versions

* nope

* specific version?

* rats

* oops

* remove msan for now

* reorder so env is set before building libc++
2021-06-03 19:45:02 +01:00
Roman Lebedev
fbc31405b2
Random interleaving of benchmark repetitions - the sequel (fixes #1051) (#1163)
Inspired by the original implementation by Hai Huang @haih-g
from https://github.com/google/benchmark/pull/1105.

The original implementation had design deficiencies that
weren't really addressable without redesign, so it was reverted.

In essence, the original implementation consisted of two separateable parts:
* reducing the amount time each repetition is run for, and symmetrically increasing repetition count
* running the repetitions in random order

While it worked fine for the usual case, it broke down when user would specify repetitions
(it would completely ignore that request), or specified per-repetition min time (while it would
still adjust the repetition count, it would not adjust the per-repetition time,
leading to much greater run times)

Here, like i was originally suggesting in the original review, i'm separating the features,
and only dealing with a single one - running repetitions in random order.

Now that the runs/repetitions are no longer in-order, the tooling may wish to sort the output,
and indeed `compare.py` has been updated to do that: #1168.
2021-06-03 21:16:54 +03:00
Dominic Hamon
d17ea66551
Fix leak in test, and provide path to remove leak from library (#1169)
* Fix leak in test, and provide path to remove leak from library

* make doc change
2021-06-03 16:08:00 +01:00
Roman Lebedev
32cc607107
[NFCI] Make BenchmarkRunner non-internal to it's .cpp file
Currently the lifetime of a single BenchmarkRunner is constrained
to a RunBenchmark(), but that will have to change for interleaved
benchmark execution, because we'll need to keep it around to not
forget how much repetitions of an instance we've done.
2021-06-03 16:56:15 +03:00
Roman Lebedev
520573fecb
[NFCI] RunBenchmarks(): extract FlushStreams()/Report() functions
Based on original implementation by Hai Huang @haih-g in
https://github.com/google/benchmark/pull/1105
2021-06-03 16:44:20 +03:00
Roman Lebedev
6e32352c79
compare.py: sort the results (#1168)
Currently, the tooling just keeps the whatever benchmark order
that was present, and this is fine nowadays, but once the benchmarks
will be optionally run interleaved, that will be rather suboptimal.

So, now that i have introduced family index and per-family instance index,
we can define an order for the benchmarks, and sort them accordingly.

There is a caveat with aggregates, we assume that they are in-order,
and hopefully we won't mess that order up..
2021-06-03 16:22:52 +03:00
Roman Lebedev
0c1da0a713
Make 'complexity reports' cache per-family, not global (#1166)
While the current variant works, it assumes that all the instances of
a single family will be run together, with nothing inbetween them.
Naturally, that won't work once the runs may be interleaved.
2021-06-03 11:46:34 +03:00
Roman Lebedev
80a62618e8
Introduce per-family instance index (#1165)
Much like it makes sense to enumerate all the families,
it makes sense to enumerate stuff within families.
Alternatively, we could have a global instance index,
but i'm not sure why that would be better.

This will be useful when the benchmarks are run not in order,
for the tools to sort the results properly.
2021-06-02 23:45:41 +03:00
Roman Lebedev
4c2e32f1d0
Introduce "family index" field into JSON output (#1164)
It may be useful for those wishing to further post-process JSON results,
but it is mainly geared towards better support for run interleaving,
where results from the same family may not be close-by in the JSON.

While we won't be able to do much about that for outputs,
the tools can and perhaps should reorder the results to that
at least in their output they are in proper order, not run order.

Note that this only counts the families that were filtered-in,
so if e.g. there were three families, and we filtered-out
the second one, the two families (which were first and third)
will have family indexes 0 and 1.
2021-06-02 18:06:45 +03:00