Drop Stat1, refactor statistics to be user-providable, add median. (#428)

* Drop Stat1, refactor statistics to be user-providable, add median.

My main goal was to add a median statistic. Since Stat1
calculated the stats incrementally and did not store
the values themselves, that was not possible. Thus,
I have replaced Stat1 with a simple std::vector<double>
containing all the values.

Then, I have refactored the current mean/stddev into
functions that take the vector of values and return
the statistic. While there, it seemed to make sense to
deduplicate the code by storing all the statistics
functions in a map and simply iterating over it. The
interface to add new statistics is intentionally
exposed, so new ones may be added easily.
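
Roughly, the new shape looks like this (a minimal self-contained sketch, not
the library code itself; the names below are illustrative only):

```c++
#include <cstdio>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// A statistic is just a function over all the measured values of one field.
typedef double(StatFn)(const std::vector<double>&);

double Mean(const std::vector<double>& v) {
  if (v.empty()) return 0.0;
  return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

int main() {
  // All registered statistics live in one container and are applied
  // uniformly, instead of hand-written mean/stddev code paths.
  const std::map<std::string, StatFn*> stats = {{"mean", Mean}};
  const std::vector<double> values = {1.0, 2.0, 3.0};  // one value per repetition
  for (const auto& kv : stats)
    std::printf("%s = %f\n", kv.first.c_str(), kv.second(values));
}
```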

The notable change is that Iterations are no longer
displayed as 0 for stddev. It could be changed back, but
I'm not sure how to fit that nicely into the API.

Similarly, the dance of sometimes (for some fields,
for some statistics) dividing by run.iterations and
then multiplying the calculated statistic back is also
dropped; if you do the math, I fail to see why it was
needed there in the first place.
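
Concretely: all repetitions are run with the same iteration count n, and both
the mean and the standard deviation scale linearly in their inputs, so

    n * mean(x_1/n, ..., x_k/n)   = mean(x_1, ..., x_k)
    n * stddev(x_1/n, ..., x_k/n) = stddev(x_1, ..., x_k)

i.e. the divide-then-multiply round trip changes nothing.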

Since that was the only use of stat.h, it is removed.

* complexity.h: attempt to fix MSVC build

* Update README.md

* Store statistics to compute in a vector, ensures ordering.

* Add a bit more tests for repetitions.

* Partially address review notes.

* Fix gcc build: drop extra ';'

clang, why didn't you warn me?

* Address review comments.

* double() -> 0.0
* early return
Roman Lebedev 2017-08-24 02:44:29 +03:00 committed by Dominic Hamon
parent d70417994a
commit a271c36af9
12 changed files with 374 additions and 435 deletions


@ -223,8 +223,7 @@ scope, the `RegisterBenchmark` can be called anywhere. This allows for
benchmark tests to be registered programmatically.
Additionally `RegisterBenchmark` allows any callable object to be registered
as a benchmark. Including capturing lambdas and function objects. This
allows the creation
as a benchmark. Including capturing lambdas and function objects.
For Example:
```c++
@ -241,7 +240,7 @@ int main(int argc, char** argv) {
### Multithreaded benchmarks
In a multithreaded test (benchmark invoked by multiple threads simultaneously),
it is guaranteed that none of the threads will start until all have called
`KeepRunning`, and all will have finished before KeepRunning returns false. As
`KeepRunning`, and all will have finished before `KeepRunning` returns `false`. As
such, any global setup or teardown can be wrapped in a check against the thread
index:
@ -274,7 +273,7 @@ Without `UseRealTime`, CPU time is used by default.
## Manual timing
For benchmarking something for which neither CPU time nor real-time are
correct or accurate enough, completely manual timing is supported using
the `UseManualTime` function.
When `UseManualTime` is used, the benchmarked code must call
`SetIterationTime` once per iteration of the `KeepRunning` loop to
@ -384,7 +383,7 @@ the minimum time, or the wallclock time is 5x minimum time. The minimum time is
set as a flag `--benchmark_min_time` or per benchmark by calling `MinTime` on
the registered benchmark object.
## Reporting the mean and standard devation by repeated benchmarks
## Reporting the mean, median and standard deviation by repeated benchmarks
By default each benchmark is run once and that single result is reported.
However benchmarks are often noisy and a single result may not be representative
of the overall behavior. For this reason it's possible to repeatedly rerun the
@ -392,19 +391,42 @@ benchmark.
The number of runs of each benchmark is specified globally by the
`--benchmark_repetitions` flag or on a per benchmark basis by calling
`Repetitions` on the registered benchmark object. When a benchmark is run
more than once the mean and standard deviation of the runs will be reported.
`Repetitions` on the registered benchmark object. When a benchmark is run more
than once the mean, median and standard deviation of the runs will be reported.
Additionally the `--benchmark_report_aggregates_only={true|false}` flag or
`ReportAggregatesOnly(bool)` function can be used to change how repeated tests
are reported. By default the result of each repeated run is reported. When this
option is 'true' only the mean and standard deviation of the runs is reported.
option is `true` only the mean, median and standard deviation of the runs is reported.
Calling `ReportAggregatesOnly(bool)` on a registered benchmark object overrides
the value of the flag for that benchmark.
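For example, the following registers a benchmark that is repeated ten times and
reports only the aggregates (`BM_Example` is a made-up benchmark used purely for
illustration):
```c++
static void BM_Example(benchmark::State& state) {
  while (state.KeepRunning()) {
    // code to benchmark
  }
}
// Run 10 repetitions; report only the _mean, _median and _stddev rows.
BENCHMARK(BM_Example)->Repetitions(10)->ReportAggregatesOnly(true);
```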
## User-defined statistics for repeated benchmarks
While having the mean, median and standard deviation is nice, this may not be
enough for everyone. For example, you may want to know what the largest
observation is, e.g. because you have some real-time constraints. This is easy:
the following code specifies a custom statistic to be computed, defined
by a lambda function.
```c++
void BM_spin_empty(benchmark::State& state) {
while (state.KeepRunning()) {
for (int x = 0; x < state.range(0); ++x) {
benchmark::DoNotOptimize(x);
}
}
}
BENCHMARK(BM_spin_empty)
->ComputeStatistics("max", [](const std::vector<double>& v) -> double {
return *(std::max_element(std::begin(v), std::end(v)));
})
->Arg(512);
```
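When such a benchmark is run with repetitions, the extra statistic is reported
with its name appended to the benchmark name (a `_max` suffix in this example),
alongside the built-in `_mean`, `_median` and `_stddev` rows.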
## Fixtures
Fixture tests are created by
first defining a type that derives from ::benchmark::Fixture and then
first defining a type that derives from `::benchmark::Fixture` and then
creating/registering the tests using the following macros:
* `BENCHMARK_F(ClassName, Method)`
@ -614,7 +636,7 @@ The library supports multiple output formats. Use the
is the default format.
The Console format is intended to be a human readable format. By default
the format generates color output. Context is output on stderr and the
tabular data on stdout. Example tabular output looks like:
```
Benchmark Time(ns) CPU(ns) Iterations


@ -378,6 +378,18 @@ enum BigO { oNone, o1, oN, oNSquared, oNCubed, oLogN, oNLogN, oAuto, oLambda };
// computational complexity for the benchmark.
typedef double(BigOFunc)(int);
// StatisticsFunc is passed to a benchmark in order to compute some descriptive
// statistics over all the measurements of some type
typedef double(StatisticsFunc)(const std::vector<double>&);
struct Statistics {
std::string name_;
StatisticsFunc* compute_;
Statistics(std::string name, StatisticsFunc* compute)
: name_(name), compute_(compute) {}
};
namespace internal {
class ThreadTimer;
class ThreadManager;
@ -698,6 +710,9 @@ class Benchmark {
// the asymptotic computational complexity will be shown on the output.
Benchmark* Complexity(BigOFunc* complexity);
// Add this statistic to be computed over all the values of the benchmark run
Benchmark* ComputeStatistics(std::string name, StatisticsFunc* statistics);
// Support for running multiple copies of the same benchmark concurrently
// in multiple threads. This may be useful when measuring the scaling
// of some piece of code.
@ -758,6 +773,7 @@ class Benchmark {
bool use_manual_time_;
BigO complexity_;
BigOFunc* complexity_lambda_;
std::vector<Statistics> statistics_;
std::vector<int> thread_counts_;
Benchmark& operator=(Benchmark const&);
@ -1065,6 +1081,9 @@ class BenchmarkReporter {
BigOFunc* complexity_lambda;
int complexity_n;
// what statistics to compute from the measurements
const std::vector<Statistics>* statistics;
// Inform print function whether the current run is a complexity report
bool report_big_o;
bool report_rms;


@ -37,11 +37,11 @@
#include "colorprint.h"
#include "commandlineflags.h"
#include "complexity.h"
#include "statistics.h"
#include "counter.h"
#include "log.h"
#include "mutex.h"
#include "re.h"
#include "stat.h"
#include "string_util.h"
#include "sysinfo.h"
#include "timers.h"
@ -256,6 +256,7 @@ BenchmarkReporter::Run CreateRunReport(
report.complexity_n = results.complexity_n;
report.complexity = b.complexity;
report.complexity_lambda = b.complexity_lambda;
report.statistics = b.statistics;
report.counters = results.counters;
internal::Finish(&report.counters, seconds, b.threads);
}
@ -481,12 +482,16 @@ void RunBenchmarks(const std::vector<Benchmark::Instance>& benchmarks,
// Determine the width of the name field using a minimum width of 10.
bool has_repetitions = FLAGS_benchmark_repetitions > 1;
size_t name_field_width = 10;
size_t stat_field_width = 0;
for (const Benchmark::Instance& benchmark : benchmarks) {
name_field_width =
std::max<size_t>(name_field_width, benchmark.name.size());
has_repetitions |= benchmark.repetitions > 1;
for(const auto& Stat : *benchmark.statistics)
stat_field_width = std::max<size_t>(stat_field_width, Stat.name_.size());
}
if (has_repetitions) name_field_width += std::strlen("_stddev");
if (has_repetitions) name_field_width += 1 + stat_field_width;
// Print header here
BenchmarkReporter::Context context;


@ -25,6 +25,7 @@ struct Benchmark::Instance {
BigO complexity;
BigOFunc* complexity_lambda;
UserCounters counters;
const std::vector<Statistics>* statistics;
bool last_benchmark_instance;
int repetitions;
double min_time;


@ -37,10 +37,10 @@
#include "check.h"
#include "commandlineflags.h"
#include "complexity.h"
#include "statistics.h"
#include "log.h"
#include "mutex.h"
#include "re.h"
#include "stat.h"
#include "string_util.h"
#include "sysinfo.h"
#include "timers.h"
@ -159,6 +159,7 @@ bool BenchmarkFamilies::FindBenchmarks(
instance.use_manual_time = family->use_manual_time_;
instance.complexity = family->complexity_;
instance.complexity_lambda = family->complexity_lambda_;
instance.statistics = &family->statistics_;
instance.threads = num_threads;
// Add arguments to instance name
@ -236,7 +237,11 @@ Benchmark::Benchmark(const char* name)
use_real_time_(false),
use_manual_time_(false),
complexity_(oNone),
complexity_lambda_(nullptr) {}
complexity_lambda_(nullptr) {
ComputeStatistics("mean", StatisticsMean);
ComputeStatistics("median", StatisticsMedian);
ComputeStatistics("stddev", StatisticsStdDev);
}
Benchmark::~Benchmark() {}
@ -409,6 +414,12 @@ Benchmark* Benchmark::Complexity(BigOFunc* complexity) {
return this;
}
Benchmark* Benchmark::ComputeStatistics(std::string name,
StatisticsFunc* statistics) {
statistics_.emplace_back(name, statistics);
return this;
}
Benchmark* Benchmark::Threads(int t) {
CHECK_GT(t, 0);
thread_counts_.push_back(t);


@ -21,7 +21,6 @@
#include <cmath>
#include "check.h"
#include "complexity.h"
#include "stat.h"
namespace benchmark {
@ -150,109 +149,6 @@ LeastSq MinimalLeastSq(const std::vector<int>& n,
return best_fit;
}
std::vector<BenchmarkReporter::Run> ComputeStats(
const std::vector<BenchmarkReporter::Run>& reports) {
typedef BenchmarkReporter::Run Run;
std::vector<Run> results;
auto error_count =
std::count_if(reports.begin(), reports.end(),
[](Run const& run) { return run.error_occurred; });
if (reports.size() - error_count < 2) {
// We don't report aggregated data if there was a single run.
return results;
}
// Accumulators.
Stat1_d real_accumulated_time_stat;
Stat1_d cpu_accumulated_time_stat;
Stat1_d bytes_per_second_stat;
Stat1_d items_per_second_stat;
// All repetitions should be run with the same number of iterations so we
// can take this information from the first benchmark.
int64_t const run_iterations = reports.front().iterations;
// create stats for user counters
struct CounterStat {
Counter c;
Stat1_d s;
};
std::map< std::string, CounterStat > counter_stats;
for(Run const& r : reports) {
for(auto const& cnt : r.counters) {
auto it = counter_stats.find(cnt.first);
if(it == counter_stats.end()) {
counter_stats.insert({cnt.first, {cnt.second, Stat1_d{}}});
} else {
CHECK_EQ(counter_stats[cnt.first].c.flags, cnt.second.flags);
}
}
}
// Populate the accumulators.
for (Run const& run : reports) {
CHECK_EQ(reports[0].benchmark_name, run.benchmark_name);
CHECK_EQ(run_iterations, run.iterations);
if (run.error_occurred) continue;
real_accumulated_time_stat +=
Stat1_d(run.real_accumulated_time / run.iterations);
cpu_accumulated_time_stat +=
Stat1_d(run.cpu_accumulated_time / run.iterations);
items_per_second_stat += Stat1_d(run.items_per_second);
bytes_per_second_stat += Stat1_d(run.bytes_per_second);
// user counters
for(auto const& cnt : run.counters) {
auto it = counter_stats.find(cnt.first);
CHECK_NE(it, counter_stats.end());
it->second.s += Stat1_d(cnt.second);
}
}
// Get the data from the accumulator to BenchmarkReporter::Run's.
Run mean_data;
mean_data.benchmark_name = reports[0].benchmark_name + "_mean";
mean_data.iterations = run_iterations;
mean_data.real_accumulated_time =
real_accumulated_time_stat.Mean() * run_iterations;
mean_data.cpu_accumulated_time =
cpu_accumulated_time_stat.Mean() * run_iterations;
mean_data.bytes_per_second = bytes_per_second_stat.Mean();
mean_data.items_per_second = items_per_second_stat.Mean();
mean_data.time_unit = reports[0].time_unit;
// user counters
for(auto const& kv : counter_stats) {
auto c = Counter(kv.second.s.Mean(), counter_stats[kv.first].c.flags);
mean_data.counters[kv.first] = c;
}
// Only add label to mean/stddev if it is same for all runs
mean_data.report_label = reports[0].report_label;
for (std::size_t i = 1; i < reports.size(); i++) {
if (reports[i].report_label != reports[0].report_label) {
mean_data.report_label = "";
break;
}
}
Run stddev_data;
stddev_data.benchmark_name = reports[0].benchmark_name + "_stddev";
stddev_data.report_label = mean_data.report_label;
stddev_data.iterations = 0;
stddev_data.real_accumulated_time = real_accumulated_time_stat.StdDev();
stddev_data.cpu_accumulated_time = cpu_accumulated_time_stat.StdDev();
stddev_data.bytes_per_second = bytes_per_second_stat.StdDev();
stddev_data.items_per_second = items_per_second_stat.StdDev();
stddev_data.time_unit = reports[0].time_unit;
// user counters
for(auto const& kv : counter_stats) {
auto c = Counter(kv.second.s.StdDev(), counter_stats[kv.first].c.flags);
stddev_data.counters[kv.first] = c;
}
results.push_back(mean_data);
results.push_back(stddev_data);
return results;
}
std::vector<BenchmarkReporter::Run> ComputeBigO(
const std::vector<BenchmarkReporter::Run>& reports) {
typedef BenchmarkReporter::Run Run;


@ -25,12 +25,6 @@
namespace benchmark {
// Return a vector containing the mean and standard devation information for
// the specified list of reports. If 'reports' contains less than two
// non-errored runs an empty vector is returned
std::vector<BenchmarkReporter::Run> ComputeStats(
const std::vector<BenchmarkReporter::Run>& reports);
// Return a vector containing the bigO and RMS information for the specified
// list of reports. If 'reports.size() < 2' an empty vector is returned.
std::vector<BenchmarkReporter::Run> ComputeBigO(
@ -57,4 +51,5 @@ struct LeastSq {
std::string GetBigOString(BigO complexity);
} // end namespace benchmark
#endif // COMPLEXITY_H_


@ -22,7 +22,6 @@
#include <vector>
#include "check.h"
#include "stat.h"
namespace benchmark {


@ -1,310 +0,0 @@
#ifndef BENCHMARK_STAT_H_
#define BENCHMARK_STAT_H_
#include <cmath>
#include <limits>
#include <ostream>
#include <type_traits>
namespace benchmark {
template <typename VType, typename NumType>
class Stat1;
template <typename VType, typename NumType>
class Stat1MinMax;
typedef Stat1<float, int64_t> Stat1_f;
typedef Stat1<double, int64_t> Stat1_d;
typedef Stat1MinMax<float, int64_t> Stat1MinMax_f;
typedef Stat1MinMax<double, int64_t> Stat1MinMax_d;
template <typename VType>
class Vector2;
template <typename VType>
class Vector3;
template <typename VType>
class Vector4;
template <typename VType, typename NumType>
class Stat1 {
public:
typedef Stat1<VType, NumType> Self;
Stat1() { Clear(); }
// Create a sample of value dat and weight 1
explicit Stat1(const VType &dat) {
sum_ = dat;
sum_squares_ = Sqr(dat);
numsamples_ = 1;
}
// Create statistics for all the samples between begin (included)
// and end(excluded)
explicit Stat1(const VType *begin, const VType *end) {
Clear();
for (const VType *item = begin; item < end; ++item) {
(*this) += Stat1(*item);
}
}
// Create a sample of value dat and weight w
Stat1(const VType &dat, const NumType &w) {
sum_ = w * dat;
sum_squares_ = w * Sqr(dat);
numsamples_ = w;
}
// Copy operator
Stat1(const Self &stat) {
sum_ = stat.sum_;
sum_squares_ = stat.sum_squares_;
numsamples_ = stat.numsamples_;
}
void Clear() {
numsamples_ = NumType();
sum_squares_ = sum_ = VType();
}
Self &operator=(const Self &stat) {
sum_ = stat.sum_;
sum_squares_ = stat.sum_squares_;
numsamples_ = stat.numsamples_;
return (*this);
}
// Merge statistics from two sample sets.
Self &operator+=(const Self &stat) {
sum_ += stat.sum_;
sum_squares_ += stat.sum_squares_;
numsamples_ += stat.numsamples_;
return (*this);
}
// The operation opposite to +=
Self &operator-=(const Self &stat) {
sum_ -= stat.sum_;
sum_squares_ -= stat.sum_squares_;
numsamples_ -= stat.numsamples_;
return (*this);
}
// Multiply the weight of the set of samples by a factor k
Self &operator*=(const VType &k) {
sum_ *= k;
sum_squares_ *= k;
numsamples_ *= k;
return (*this);
}
// Merge statistics from two sample sets.
Self operator+(const Self &stat) const { return Self(*this) += stat; }
// The operation opposite to +
Self operator-(const Self &stat) const { return Self(*this) -= stat; }
// Multiply the weight of the set of samples by a factor k
Self operator*(const VType &k) const { return Self(*this) *= k; }
// Return the total weight of this sample set
NumType numSamples() const { return numsamples_; }
// Return the sum of this sample set
VType Sum() const { return sum_; }
// Return the mean of this sample set
VType Mean() const {
if (numsamples_ == 0) return VType();
return sum_ * (1.0 / numsamples_);
}
// Return the mean of this sample set and compute the standard deviation at
// the same time.
VType Mean(VType *stddev) const {
if (numsamples_ == 0) return VType();
VType mean = sum_ * (1.0 / numsamples_);
if (stddev) {
// Sample standard deviation is undefined for n = 1
if (numsamples_ == 1) {
*stddev = VType();
} else {
VType avg_squares = sum_squares_ * (1.0 / numsamples_);
*stddev = Sqrt(numsamples_ / (numsamples_ - 1.0) * (avg_squares - Sqr(mean)));
}
}
return mean;
}
// Return the standard deviation of the sample set
VType StdDev() const {
VType stddev = VType();
Mean(&stddev);
return stddev;
}
private:
static_assert(std::is_integral<NumType>::value &&
!std::is_same<NumType, bool>::value,
"NumType must be an integral type that is not bool.");
// Let i be the index of the samples provided (using +=)
// and weight[i],value[i] be the data of sample #i
// then the variables have the following meaning:
NumType numsamples_; // sum of weight[i];
VType sum_; // sum of weight[i]*value[i];
VType sum_squares_; // sum of weight[i]*value[i]^2;
// Template function used to square a number.
// For a vector we square all components
template <typename SType>
static inline SType Sqr(const SType &dat) {
return dat * dat;
}
template <typename SType>
static inline Vector2<SType> Sqr(const Vector2<SType> &dat) {
return dat.MulComponents(dat);
}
template <typename SType>
static inline Vector3<SType> Sqr(const Vector3<SType> &dat) {
return dat.MulComponents(dat);
}
template <typename SType>
static inline Vector4<SType> Sqr(const Vector4<SType> &dat) {
return dat.MulComponents(dat);
}
// Template function used to take the square root of a number.
// For a vector we square all components
template <typename SType>
static inline SType Sqrt(const SType &dat) {
// Avoid NaN due to imprecision in the calculations
if (dat < 0) return 0;
return sqrt(dat);
}
template <typename SType>
static inline Vector2<SType> Sqrt(const Vector2<SType> &dat) {
// Avoid NaN due to imprecision in the calculations
return Max(dat, Vector2<SType>()).Sqrt();
}
template <typename SType>
static inline Vector3<SType> Sqrt(const Vector3<SType> &dat) {
// Avoid NaN due to imprecision in the calculations
return Max(dat, Vector3<SType>()).Sqrt();
}
template <typename SType>
static inline Vector4<SType> Sqrt(const Vector4<SType> &dat) {
// Avoid NaN due to imprecision in the calculations
return Max(dat, Vector4<SType>()).Sqrt();
}
};
// Useful printing function
template <typename VType, typename NumType>
std::ostream &operator<<(std::ostream &out, const Stat1<VType, NumType> &s) {
out << "{ avg = " << s.Mean() << " std = " << s.StdDev()
<< " nsamples = " << s.NumSamples() << "}";
return out;
}
// Stat1MinMax: same as Stat1, but it also
// keeps the Min and Max values; the "-"
// operator is disabled because it cannot be implemented
// efficiently
template <typename VType, typename NumType>
class Stat1MinMax : public Stat1<VType, NumType> {
public:
typedef Stat1MinMax<VType, NumType> Self;
Stat1MinMax() { Clear(); }
// Create a sample of value dat and weight 1
explicit Stat1MinMax(const VType &dat) : Stat1<VType, NumType>(dat) {
max_ = dat;
min_ = dat;
}
// Create statistics for all the samples between begin (included)
// and end(excluded)
explicit Stat1MinMax(const VType *begin, const VType *end) {
Clear();
for (const VType *item = begin; item < end; ++item) {
(*this) += Stat1MinMax(*item);
}
}
// Create a sample of value dat and weight w
Stat1MinMax(const VType &dat, const NumType &w)
: Stat1<VType, NumType>(dat, w) {
max_ = dat;
min_ = dat;
}
// Copy operator
Stat1MinMax(const Self &stat) : Stat1<VType, NumType>(stat) {
max_ = stat.max_;
min_ = stat.min_;
}
void Clear() {
Stat1<VType, NumType>::Clear();
if (std::numeric_limits<VType>::has_infinity) {
min_ = std::numeric_limits<VType>::infinity();
max_ = -std::numeric_limits<VType>::infinity();
} else {
min_ = std::numeric_limits<VType>::max();
max_ = std::numeric_limits<VType>::min();
}
}
Self &operator=(const Self &stat) {
this->Stat1<VType, NumType>::operator=(stat);
max_ = stat.max_;
min_ = stat.min_;
return (*this);
}
// Merge statistics from two sample sets.
Self &operator+=(const Self &stat) {
this->Stat1<VType, NumType>::operator+=(stat);
if (stat.max_ > max_) max_ = stat.max_;
if (stat.min_ < min_) min_ = stat.min_;
return (*this);
}
// Multiply the weight of the set of samples by a factor k
Self &operator*=(const VType &stat) {
this->Stat1<VType, NumType>::operator*=(stat);
return (*this);
}
// Merge statistics from two sample sets.
Self operator+(const Self &stat) const { return Self(*this) += stat; }
// Multiply the weight of the set of samples by a factor k
Self operator*(const VType &k) const { return Self(*this) *= k; }
// Return the maximal value in this sample set
VType Max() const { return max_; }
// Return the minimal value in this sample set
VType Min() const { return min_; }
private:
// The - operation makes no sense with Min/Max
// unless we keep the full list of values (but we don't)
// make it private, and let it undefined so nobody can call it
Self &operator-=(const Self &stat); // senseless. let it undefined.
// The operation opposite to -
Self operator-(const Self &stat) const; // senseless. let it undefined.
// Let i be the index of the samples provided (using +=)
// and weight[i],value[i] be the data of sample #i
// then the variables have the following meaning:
VType max_; // max of value[i]
VType min_; // min of value[i]
};
// Useful printing function
template <typename VType, typename NumType>
std::ostream &operator<<(std::ostream &out,
const Stat1MinMax<VType, NumType> &s) {
out << "{ avg = " << s.Mean() << " std = " << s.StdDev()
<< " nsamples = " << s.NumSamples() << " min = " << s.Min()
<< " max = " << s.Max() << "}";
return out;
}
} // end namespace benchmark
#endif // BENCHMARK_STAT_H_

src/statistics.cc (new file)

@ -0,0 +1,175 @@
// Copyright 2016 Ismael Jimenez Martinez. All rights reserved.
// Copyright 2017 Roman Lebedev. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "benchmark/benchmark.h"
#include <algorithm>
#include <cmath>
#include <string>
#include <vector>
#include <numeric>
#include "check.h"
#include "statistics.h"
namespace benchmark {
auto StatisticsSum = [](const std::vector<double>& v) {
return std::accumulate(v.begin(), v.end(), 0.0);
};
double StatisticsMean(const std::vector<double>& v) {
if (v.size() == 0) return 0.0;
return StatisticsSum(v) * (1.0 / v.size());
}
double StatisticsMedian(const std::vector<double>& v) {
if (v.size() < 3) return StatisticsMean(v);
std::vector<double> partial;
// we need roundDown(count/2)+1 slots
partial.resize(1 + (v.size() / 2));
std::partial_sort_copy(v.begin(), v.end(), partial.begin(), partial.end());
// did we have an odd number of samples?
// if yes, then the last element of the partially-sorted vector is the median
// if no, then the average of the last two elements is the median
if(v.size() % 2 == 1)
return partial.back();
return (partial[partial.size() - 2] + partial[partial.size() - 1]) / 2.0;
}
// Return the sum of the squares of this sample set
auto SumSquares = [](const std::vector<double>& v) {
return std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
};
auto Sqr = [](const double dat) { return dat * dat; };
auto Sqrt = [](const double dat) {
// Avoid NaN due to imprecision in the calculations
if (dat < 0.0) return 0.0;
return std::sqrt(dat);
};
double StatisticsStdDev(const std::vector<double>& v) {
const auto mean = StatisticsMean(v);
if (v.size() == 0) return mean;
// Sample standard deviation is undefined for n = 1
if (v.size() == 1)
return 0.0;
const double avg_squares = SumSquares(v) * (1.0 / v.size());
return Sqrt(v.size() / (v.size() - 1.0) * (avg_squares - Sqr(mean)));
}
std::vector<BenchmarkReporter::Run> ComputeStats(
const std::vector<BenchmarkReporter::Run>& reports) {
typedef BenchmarkReporter::Run Run;
std::vector<Run> results;
auto error_count =
std::count_if(reports.begin(), reports.end(),
[](Run const& run) { return run.error_occurred; });
if (reports.size() - error_count < 2) {
// We don't report aggregated data if there was a single run.
return results;
}
// Accumulators.
std::vector<double> real_accumulated_time_stat;
std::vector<double> cpu_accumulated_time_stat;
std::vector<double> bytes_per_second_stat;
std::vector<double> items_per_second_stat;
real_accumulated_time_stat.reserve(reports.size());
cpu_accumulated_time_stat.reserve(reports.size());
bytes_per_second_stat.reserve(reports.size());
items_per_second_stat.reserve(reports.size());
// All repetitions should be run with the same number of iterations so we
// can take this information from the first benchmark.
int64_t const run_iterations = reports.front().iterations;
// create stats for user counters
struct CounterStat {
Counter c;
std::vector<double> s;
};
std::map< std::string, CounterStat > counter_stats;
for(Run const& r : reports) {
for(auto const& cnt : r.counters) {
auto it = counter_stats.find(cnt.first);
if(it == counter_stats.end()) {
counter_stats.insert({cnt.first, {cnt.second, std::vector<double>{}}});
it = counter_stats.find(cnt.first);
it->second.s.reserve(reports.size());
} else {
CHECK_EQ(counter_stats[cnt.first].c.flags, cnt.second.flags);
}
}
}
// Populate the accumulators.
for (Run const& run : reports) {
CHECK_EQ(reports[0].benchmark_name, run.benchmark_name);
CHECK_EQ(run_iterations, run.iterations);
if (run.error_occurred) continue;
real_accumulated_time_stat.emplace_back(run.real_accumulated_time);
cpu_accumulated_time_stat.emplace_back(run.cpu_accumulated_time);
items_per_second_stat.emplace_back(run.items_per_second);
bytes_per_second_stat.emplace_back(run.bytes_per_second);
// user counters
for(auto const& cnt : run.counters) {
auto it = counter_stats.find(cnt.first);
CHECK_NE(it, counter_stats.end());
it->second.s.emplace_back(cnt.second);
}
}
// Only add label if it is same for all runs
std::string report_label = reports[0].report_label;
for (std::size_t i = 1; i < reports.size(); i++) {
if (reports[i].report_label != report_label) {
report_label = "";
break;
}
}
for(const auto& Stat : *reports[0].statistics) {
// Get the data from the accumulator to BenchmarkReporter::Run's.
Run data;
data.benchmark_name = reports[0].benchmark_name + "_" + Stat.name_;
data.report_label = report_label;
data.iterations = run_iterations;
data.real_accumulated_time = Stat.compute_(real_accumulated_time_stat);
data.cpu_accumulated_time = Stat.compute_(cpu_accumulated_time_stat);
data.bytes_per_second = Stat.compute_(bytes_per_second_stat);
data.items_per_second = Stat.compute_(items_per_second_stat);
data.time_unit = reports[0].time_unit;
// user counters
for(auto const& kv : counter_stats) {
const auto uc_stat = Stat.compute_(kv.second.s);
auto c = Counter(uc_stat, counter_stats[kv.first].c.flags);
data.counters[kv.first] = c;
}
results.push_back(data);
}
return results;
}
} // end namespace benchmark

src/statistics.h (new file)

@ -0,0 +1,37 @@
// Copyright 2016 Ismael Jimenez Martinez. All rights reserved.
// Copyright 2017 Roman Lebedev. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#ifndef STATISTICS_H_
#define STATISTICS_H_
#include <vector>
#include "benchmark/benchmark.h"
namespace benchmark {
// Return a vector containing the mean, median and standard deviation information
// (and any user-specified info) for the specified list of reports. If 'reports'
// contains fewer than two non-errored runs an empty vector is returned
std::vector<BenchmarkReporter::Run> ComputeStats(
const std::vector<BenchmarkReporter::Run>& reports);
double StatisticsMean(const std::vector<double>& v);
double StatisticsMedian(const std::vector<double>& v);
double StatisticsStdDev(const std::vector<double>& v);
} // end namespace benchmark
#endif // STATISTICS_H_


@ -182,22 +182,66 @@ void BM_Repeat(benchmark::State& state) {
while (state.KeepRunning()) {
}
}
// need two repetitions min to be able to output any aggregate output
BENCHMARK(BM_Repeat)->Repetitions(2);
ADD_CASES(TC_ConsoleOut, {{"^BM_Repeat/repeats:2 %console_report$"},
{"^BM_Repeat/repeats:2 %console_report$"},
{"^BM_Repeat/repeats:2_mean %console_report$"},
{"^BM_Repeat/repeats:2_median %console_report$"},
{"^BM_Repeat/repeats:2_stddev %console_report$"}});
ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_Repeat/repeats:2\",$"},
{"\"name\": \"BM_Repeat/repeats:2\",$"},
{"\"name\": \"BM_Repeat/repeats:2_mean\",$"},
{"\"name\": \"BM_Repeat/repeats:2_median\",$"},
{"\"name\": \"BM_Repeat/repeats:2_stddev\",$"}});
ADD_CASES(TC_CSVOut, {{"^\"BM_Repeat/repeats:2\",%csv_report$"},
{"^\"BM_Repeat/repeats:2\",%csv_report$"},
{"^\"BM_Repeat/repeats:2_mean\",%csv_report$"},
{"^\"BM_Repeat/repeats:2_median\",%csv_report$"},
{"^\"BM_Repeat/repeats:2_stddev\",%csv_report$"}});
// but for two repetitions, mean and median are the same, so let's repeat...
BENCHMARK(BM_Repeat)->Repetitions(3);
ADD_CASES(TC_ConsoleOut, {{"^BM_Repeat/repeats:3 %console_report$"},
{"^BM_Repeat/repeats:3 %console_report$"},
{"^BM_Repeat/repeats:3 %console_report$"},
{"^BM_Repeat/repeats:3_mean %console_report$"},
{"^BM_Repeat/repeats:3_median %console_report$"},
{"^BM_Repeat/repeats:3_stddev %console_report$"}});
ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_Repeat/repeats:3\",$"},
{"\"name\": \"BM_Repeat/repeats:3\",$"},
{"\"name\": \"BM_Repeat/repeats:3\",$"},
{"\"name\": \"BM_Repeat/repeats:3_mean\",$"},
{"\"name\": \"BM_Repeat/repeats:3_median\",$"},
{"\"name\": \"BM_Repeat/repeats:3_stddev\",$"}});
ADD_CASES(TC_CSVOut, {{"^\"BM_Repeat/repeats:3\",%csv_report$"},
{"^\"BM_Repeat/repeats:3\",%csv_report$"},
{"^\"BM_Repeat/repeats:3\",%csv_report$"},
{"^\"BM_Repeat/repeats:3_mean\",%csv_report$"},
{"^\"BM_Repeat/repeats:3_median\",%csv_report$"},
{"^\"BM_Repeat/repeats:3_stddev\",%csv_report$"}});
// median differs between even/odd number of repetitions, so just to be sure
BENCHMARK(BM_Repeat)->Repetitions(4);
ADD_CASES(TC_ConsoleOut, {{"^BM_Repeat/repeats:4 %console_report$"},
{"^BM_Repeat/repeats:4 %console_report$"},
{"^BM_Repeat/repeats:4 %console_report$"},
{"^BM_Repeat/repeats:4 %console_report$"},
{"^BM_Repeat/repeats:4_mean %console_report$"},
{"^BM_Repeat/repeats:4_median %console_report$"},
{"^BM_Repeat/repeats:4_stddev %console_report$"}});
ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_Repeat/repeats:4\",$"},
{"\"name\": \"BM_Repeat/repeats:4\",$"},
{"\"name\": \"BM_Repeat/repeats:4\",$"},
{"\"name\": \"BM_Repeat/repeats:4\",$"},
{"\"name\": \"BM_Repeat/repeats:4_mean\",$"},
{"\"name\": \"BM_Repeat/repeats:4_median\",$"},
{"\"name\": \"BM_Repeat/repeats:4_stddev\",$"}});
ADD_CASES(TC_CSVOut, {{"^\"BM_Repeat/repeats:4\",%csv_report$"},
{"^\"BM_Repeat/repeats:4\",%csv_report$"},
{"^\"BM_Repeat/repeats:4\",%csv_report$"},
{"^\"BM_Repeat/repeats:4\",%csv_report$"},
{"^\"BM_Repeat/repeats:4_mean\",%csv_report$"},
{"^\"BM_Repeat/repeats:4_median\",%csv_report$"},
{"^\"BM_Repeat/repeats:4_stddev\",%csv_report$"}});
// Test that a non-repeated test still prints non-aggregate results even when
// only-aggregate reports have been requested
@ -219,12 +263,15 @@ BENCHMARK(BM_SummaryRepeat)->Repetitions(3)->ReportAggregatesOnly();
ADD_CASES(TC_ConsoleOut,
{{".*BM_SummaryRepeat/repeats:3 ", MR_Not},
{"^BM_SummaryRepeat/repeats:3_mean %console_report$"},
{"^BM_SummaryRepeat/repeats:3_median %console_report$"},
{"^BM_SummaryRepeat/repeats:3_stddev %console_report$"}});
ADD_CASES(TC_JSONOut, {{".*BM_SummaryRepeat/repeats:3 ", MR_Not},
{"\"name\": \"BM_SummaryRepeat/repeats:3_mean\",$"},
{"\"name\": \"BM_SummaryRepeat/repeats:3_median\",$"},
{"\"name\": \"BM_SummaryRepeat/repeats:3_stddev\",$"}});
ADD_CASES(TC_CSVOut, {{".*BM_SummaryRepeat/repeats:3 ", MR_Not},
{"^\"BM_SummaryRepeat/repeats:3_mean\",%csv_report$"},
{"^\"BM_SummaryRepeat/repeats:3_median\",%csv_report$"},
{"^\"BM_SummaryRepeat/repeats:3_stddev\",%csv_report$"}});
void BM_RepeatTimeUnit(benchmark::State& state) {
@ -238,17 +285,59 @@ BENCHMARK(BM_RepeatTimeUnit)
ADD_CASES(TC_ConsoleOut,
{{".*BM_RepeatTimeUnit/repeats:3 ", MR_Not},
{"^BM_RepeatTimeUnit/repeats:3_mean %console_us_report$"},
{"^BM_RepeatTimeUnit/repeats:3_median %console_us_report$"},
{"^BM_RepeatTimeUnit/repeats:3_stddev %console_us_report$"}});
ADD_CASES(TC_JSONOut, {{".*BM_RepeatTimeUnit/repeats:3 ", MR_Not},
{"\"name\": \"BM_RepeatTimeUnit/repeats:3_mean\",$"},
{"\"time_unit\": \"us\",?$"},
{"\"name\": \"BM_RepeatTimeUnit/repeats:3_median\",$"},
{"\"time_unit\": \"us\",?$"},
{"\"name\": \"BM_RepeatTimeUnit/repeats:3_stddev\",$"},
{"\"time_unit\": \"us\",?$"}});
ADD_CASES(TC_CSVOut,
{{".*BM_RepeatTimeUnit/repeats:3 ", MR_Not},
{"^\"BM_RepeatTimeUnit/repeats:3_mean\",%csv_us_report$"},
{"^\"BM_RepeatTimeUnit/repeats:3_median\",%csv_us_report$"},
{"^\"BM_RepeatTimeUnit/repeats:3_stddev\",%csv_us_report$"}});
// ========================================================================= //
// -------------------- Testing user-provided statistics ------------------- //
// ========================================================================= //
const auto UserStatistics = [](const std::vector<double>& v) {
return v.back();
};
void BM_UserStats(benchmark::State& state) {
while (state.KeepRunning()) {
}
}
BENCHMARK(BM_UserStats)
->Repetitions(3)
->ComputeStatistics("", UserStatistics);
// check that the user-provided statistic is calculated, and comes after the default ones
// the empty string as the name is intentional; it would sort before anything else
ADD_CASES(TC_ConsoleOut, {{"^BM_UserStats/repeats:3 %console_report$"},
{"^BM_UserStats/repeats:3 %console_report$"},
{"^BM_UserStats/repeats:3 %console_report$"},
{"^BM_UserStats/repeats:3_mean %console_report$"},
{"^BM_UserStats/repeats:3_median %console_report$"},
{"^BM_UserStats/repeats:3_stddev %console_report$"},
{"^BM_UserStats/repeats:3_ %console_report$"}});
ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_UserStats/repeats:3\",$"},
{"\"name\": \"BM_UserStats/repeats:3\",$"},
{"\"name\": \"BM_UserStats/repeats:3\",$"},
{"\"name\": \"BM_UserStats/repeats:3_mean\",$"},
{"\"name\": \"BM_UserStats/repeats:3_median\",$"},
{"\"name\": \"BM_UserStats/repeats:3_stddev\",$"},
{"\"name\": \"BM_UserStats/repeats:3_\",$"}});
ADD_CASES(TC_CSVOut, {{"^\"BM_UserStats/repeats:3\",%csv_report$"},
{"^\"BM_UserStats/repeats:3\",%csv_report$"},
{"^\"BM_UserStats/repeats:3\",%csv_report$"},
{"^\"BM_UserStats/repeats:3_mean\",%csv_report$"},
{"^\"BM_UserStats/repeats:3_median\",%csv_report$"},
{"^\"BM_UserStats/repeats:3_stddev\",%csv_report$"},
{"^\"BM_UserStats/repeats:3_\",%csv_report$"}});
// ========================================================================= //
// --------------------------- TEST CASES END ------------------------------ //
// ========================================================================= //