From a271c36af93c7a3b19dfeb2aefa9ca77a58e52e4 Mon Sep 17 00:00:00 2001
From: Roman Lebedev
Date: Thu, 24 Aug 2017 02:44:29 +0300
Subject: [PATCH] Drop Stat1, refactor statistics to be user-providable, add
 median. (#428)

* Drop Stat1, refactor statistics to be user-providable, add median.

My main goal was to add a median statistic. Since Stat1 calculated the
stats incrementally, and did not store the values themselves, that was
not possible. Thus, I have replaced Stat1 with a simple std::vector
containing all the values.

Then, I have refactored the current mean/stdev to be a function that is
provided with the vector of values and returns the statistic. While
there, it seemed to make sense to deduplicate the code by storing all
the statistics functions in a map and then simply iterating over it.
And the interface to add new statistics is intentionally exposed, so
they may be added easily.

The notable change is that Iterations are no longer displayed as 0 for
stddev. It could be changed, but I'm not sure how to fit that nicely
into the API. Similarly, the dance of sometimes (for some fields, for
some statistics) dividing by run.iterations and then multiplying the
calculated statistic back is also dropped; if you do the math, I fail
to see why it was needed there in the first place.

Since that was the only use of stat.h, it is removed.

* complexity.h: attempt to fix MSVC build

* Update README.md

* Store statistics to compute in a vector, ensures ordering.

* Add a bit more tests for repetitions.

* Partially address review notes.

* Fix gcc build: drop extra ';'

  clang, why didn't you warn me?

* Address review comments.

* double() -> 0.0

* early return
---
 README.md                     |  42 +++--
 include/benchmark/benchmark.h |  19 +++
 src/benchmark.cc              |   9 +-
 src/benchmark_api_internal.h  |   1 +
 src/benchmark_register.cc     |  15 +-
 src/complexity.cc             | 104 ------------
 src/complexity.h              |   7 +-
 src/reporter.cc               |   1 -
 src/stat.h                    | 310 ----------------------------------
 src/statistics.cc             | 175 +++++++++++++++++++
 src/statistics.h              |  37 ++++
 test/reporter_output_test.cc  |  89 ++++++++++
 12 files changed, 374 insertions(+), 435 deletions(-)
 delete mode 100644 src/stat.h
 create mode 100644 src/statistics.cc
 create mode 100644 src/statistics.h

diff --git a/README.md b/README.md
index 2430d93b..407ee66c 100644
--- a/README.md
+++ b/README.md
@@ -223,8 +223,7 @@ scope, the `RegisterBenchmark` can be called anywhere. This allows for
 benchmark tests to be registered programmatically.
 
 Additionally `RegisterBenchmark` allows any callable object to be registered
-as a benchmark. Including capturing lambdas and function objects. This
-allows the creation
+as a benchmark. Including capturing lambdas and function objects.
 
 For Example:
 ```c++
@@ -241,7 +240,7 @@ int main(int argc, char** argv) {
 ### Multithreaded benchmarks
 In a multithreaded test (benchmark invoked by multiple threads simultaneously),
 it is guaranteed that none of the threads will start until all have called
-`KeepRunning`, and all will have finished before KeepRunning returns false. As
+`KeepRunning`, and all will have finished before `KeepRunning` returns `false`. As
 such, any global setup or teardown can be wrapped in a check against the thread
 index:
@@ -274,7 +273,7 @@ Without `UseRealTime`, CPU time is used by default.
 
 ## Manual timing
 For benchmarking something for which neither CPU time nor real-time are
 correct or accurate enough, completely manual timing is supported using
-the `UseManualTime` function.
+the `UseManualTime` function.
 When `UseManualTime` is used, the benchmarked code must call
 `SetIterationTime` once per iteration of the `KeepRunning` loop to
@@ -384,7 +383,7 @@ the minimum time, or the wallclock time is 5x minimum time. The minimum
 time is set as a flag `--benchmark_min_time` or per benchmark by calling
 `MinTime` on the registered benchmark object.
 
-## Reporting the mean and standard devation by repeated benchmarks
+## Reporting the mean, median and standard deviation by repeated benchmarks
 By default each benchmark is run once and that single result is reported.
 However benchmarks are often noisy and a single result may not be representative
 of the overall behavior. For this reason it's possible to repeatedly rerun the
@@ -392,19 +391,42 @@ benchmark.
 
 The number of runs of each benchmark is specified globally by the
 `--benchmark_repetitions` flag or on a per benchmark basis by calling
-`Repetitions` on the registered benchmark object. When a benchmark is run
-more than once the mean and standard deviation of the runs will be reported.
+`Repetitions` on the registered benchmark object. When a benchmark is run more
+than once the mean, median and standard deviation of the runs will be reported.
 
 Additionally the `--benchmark_report_aggregates_only={true|false}` flag or
 `ReportAggregatesOnly(bool)` function can be used to change how repeated tests
 are reported. By default the result of each repeated run is reported. When this
-option is 'true' only the mean and standard deviation of the runs is reported.
+option is `true` only the mean, median and standard deviation of the runs are reported.
 Calling `ReportAggregatesOnly(bool)` on a registered benchmark object overrides
 the value of the flag for that benchmark.
 
+## User-defined statistics for repeated benchmarks
+While having mean, median and standard deviation is nice, this may not be
+enough for everyone. For example you may want to know the largest observation,
+e.g. because you have some real-time constraints. This is easy. The following
+code will specify a custom statistic to be calculated, defined by a lambda
+function.
+
+```c++
+void BM_spin_empty(benchmark::State& state) {
+  while (state.KeepRunning()) {
+    for (int x = 0; x < state.range(0); ++x) {
+      benchmark::DoNotOptimize(x);
+    }
+  }
+}
+
+BENCHMARK(BM_spin_empty)
+  ->ComputeStatistics("max", [](const std::vector<double>& v) -> double {
+    return *(std::max_element(std::begin(v), std::end(v)));
+  })
+  ->Arg(512);
+```
+
 ## Fixtures
 Fixture tests are created by
-first defining a type that derives from ::benchmark::Fixture and then
+first defining a type that derives from `::benchmark::Fixture` and then
 creating/registering the tests using the following macros:
 
 * `BENCHMARK_F(ClassName, Method)`
@@ -614,7 +636,7 @@ The library supports multiple output formats. Use the
 is the default format.
 
 The Console format is intended to be a human readable format. By default
-the format generates color output. Context is output on stderr and the 
+the format generates color output. Context is output on stderr and the
 tabular data on stdout. Example tabular output looks like:
 ```
 Benchmark                               Time(ns)    CPU(ns) Iterations
diff --git a/include/benchmark/benchmark.h b/include/benchmark/benchmark.h
index bd3b0ffb..a1ce704c 100644
--- a/include/benchmark/benchmark.h
+++ b/include/benchmark/benchmark.h
@@ -378,6 +378,18 @@ enum BigO { oNone, o1, oN, oNSquared, oNCubed, oLogN, oNLogN, oAuto, oLambda };
 // computational complexity for the benchmark.
 typedef double(BigOFunc)(int);
 
+// StatisticsFunc is passed to a benchmark in order to compute some descriptive
+// statistics over all the measurements of some type
+typedef double(StatisticsFunc)(const std::vector<double>&);
+
+struct Statistics {
+  std::string name_;
+  StatisticsFunc* compute_;
+
+  Statistics(std::string name, StatisticsFunc* compute)
+      : name_(name), compute_(compute) {}
+};
+
 namespace internal {
 class ThreadTimer;
 class ThreadManager;
@@ -698,6 +710,9 @@ class Benchmark {
   // the asymptotic computational complexity will be shown on the output.
   Benchmark* Complexity(BigOFunc* complexity);
 
+  // Add this statistic to be computed over all the values of the benchmark run
+  Benchmark* ComputeStatistics(std::string name, StatisticsFunc* statistics);
+
   // Support for running multiple copies of the same benchmark concurrently
   // in multiple threads. This may be useful when measuring the scaling
   // of some piece of code.
@@ -758,6 +773,7 @@ class Benchmark {
   bool use_manual_time_;
   BigO complexity_;
   BigOFunc* complexity_lambda_;
+  std::vector<Statistics> statistics_;
   std::vector<int> thread_counts_;
 
   Benchmark& operator=(Benchmark const&);
@@ -1065,6 +1081,9 @@ class BenchmarkReporter {
     BigOFunc* complexity_lambda;
     int complexity_n;
 
+    // what statistics to compute from the measurements
+    const std::vector<Statistics>* statistics;
+
     // Inform print function whether the current run is a complexity report
     bool report_big_o;
     bool report_rms;
diff --git a/src/benchmark.cc b/src/benchmark.cc
index 1ba0a50a..90ed1570 100644
--- a/src/benchmark.cc
+++ b/src/benchmark.cc
@@ -37,11 +37,11 @@
 #include "colorprint.h"
 #include "commandlineflags.h"
 #include "complexity.h"
+#include "statistics.h"
 #include "counter.h"
 #include "log.h"
 #include "mutex.h"
 #include "re.h"
-#include "stat.h"
 #include "string_util.h"
 #include "sysinfo.h"
 #include "timers.h"
@@ -256,6 +256,7 @@ BenchmarkReporter::Run CreateRunReport(
     report.complexity_n = results.complexity_n;
     report.complexity = b.complexity;
     report.complexity_lambda = b.complexity_lambda;
+    report.statistics = b.statistics;
     report.counters = results.counters;
     internal::Finish(&report.counters, seconds, b.threads);
   }
@@ -481,12 +482,16 @@ void RunBenchmarks(const std::vector<Benchmark::Instance>& benchmarks,
   // Determine the width of the name field using a minimum width of 10.
bool has_repetitions = FLAGS_benchmark_repetitions > 1; size_t name_field_width = 10; + size_t stat_field_width = 0; for (const Benchmark::Instance& benchmark : benchmarks) { name_field_width = std::max(name_field_width, benchmark.name.size()); has_repetitions |= benchmark.repetitions > 1; + + for(const auto& Stat : *benchmark.statistics) + stat_field_width = std::max(stat_field_width, Stat.name_.size()); } - if (has_repetitions) name_field_width += std::strlen("_stddev"); + if (has_repetitions) name_field_width += 1 + stat_field_width; // Print header here BenchmarkReporter::Context context; diff --git a/src/benchmark_api_internal.h b/src/benchmark_api_internal.h index 36d23404..d481dc52 100644 --- a/src/benchmark_api_internal.h +++ b/src/benchmark_api_internal.h @@ -25,6 +25,7 @@ struct Benchmark::Instance { BigO complexity; BigOFunc* complexity_lambda; UserCounters counters; + const std::vector* statistics; bool last_benchmark_instance; int repetitions; double min_time; diff --git a/src/benchmark_register.cc b/src/benchmark_register.cc index ed70d820..c1b80674 100644 --- a/src/benchmark_register.cc +++ b/src/benchmark_register.cc @@ -37,10 +37,10 @@ #include "check.h" #include "commandlineflags.h" #include "complexity.h" +#include "statistics.h" #include "log.h" #include "mutex.h" #include "re.h" -#include "stat.h" #include "string_util.h" #include "sysinfo.h" #include "timers.h" @@ -159,6 +159,7 @@ bool BenchmarkFamilies::FindBenchmarks( instance.use_manual_time = family->use_manual_time_; instance.complexity = family->complexity_; instance.complexity_lambda = family->complexity_lambda_; + instance.statistics = &family->statistics_; instance.threads = num_threads; // Add arguments to instance name @@ -236,7 +237,11 @@ Benchmark::Benchmark(const char* name) use_real_time_(false), use_manual_time_(false), complexity_(oNone), - complexity_lambda_(nullptr) {} + complexity_lambda_(nullptr) { + ComputeStatistics("mean", StatisticsMean); + ComputeStatistics("median", StatisticsMedian); + ComputeStatistics("stddev", StatisticsStdDev); +} Benchmark::~Benchmark() {} @@ -409,6 +414,12 @@ Benchmark* Benchmark::Complexity(BigOFunc* complexity) { return this; } +Benchmark* Benchmark::ComputeStatistics(std::string name, + StatisticsFunc* statistics) { + statistics_.emplace_back(name, statistics); + return this; +} + Benchmark* Benchmark::Threads(int t) { CHECK_GT(t, 0); thread_counts_.push_back(t); diff --git a/src/complexity.cc b/src/complexity.cc index 33975be5..88832698 100644 --- a/src/complexity.cc +++ b/src/complexity.cc @@ -21,7 +21,6 @@ #include #include "check.h" #include "complexity.h" -#include "stat.h" namespace benchmark { @@ -150,109 +149,6 @@ LeastSq MinimalLeastSq(const std::vector& n, return best_fit; } -std::vector ComputeStats( - const std::vector& reports) { - typedef BenchmarkReporter::Run Run; - std::vector results; - - auto error_count = - std::count_if(reports.begin(), reports.end(), - [](Run const& run) { return run.error_occurred; }); - - if (reports.size() - error_count < 2) { - // We don't report aggregated data if there was a single run. - return results; - } - // Accumulators. - Stat1_d real_accumulated_time_stat; - Stat1_d cpu_accumulated_time_stat; - Stat1_d bytes_per_second_stat; - Stat1_d items_per_second_stat; - // All repetitions should be run with the same number of iterations so we - // can take this information from the first benchmark. 
- int64_t const run_iterations = reports.front().iterations; - // create stats for user counters - struct CounterStat { - Counter c; - Stat1_d s; - }; - std::map< std::string, CounterStat > counter_stats; - for(Run const& r : reports) { - for(auto const& cnt : r.counters) { - auto it = counter_stats.find(cnt.first); - if(it == counter_stats.end()) { - counter_stats.insert({cnt.first, {cnt.second, Stat1_d{}}}); - } else { - CHECK_EQ(counter_stats[cnt.first].c.flags, cnt.second.flags); - } - } - } - - // Populate the accumulators. - for (Run const& run : reports) { - CHECK_EQ(reports[0].benchmark_name, run.benchmark_name); - CHECK_EQ(run_iterations, run.iterations); - if (run.error_occurred) continue; - real_accumulated_time_stat += - Stat1_d(run.real_accumulated_time / run.iterations); - cpu_accumulated_time_stat += - Stat1_d(run.cpu_accumulated_time / run.iterations); - items_per_second_stat += Stat1_d(run.items_per_second); - bytes_per_second_stat += Stat1_d(run.bytes_per_second); - // user counters - for(auto const& cnt : run.counters) { - auto it = counter_stats.find(cnt.first); - CHECK_NE(it, counter_stats.end()); - it->second.s += Stat1_d(cnt.second); - } - } - - // Get the data from the accumulator to BenchmarkReporter::Run's. - Run mean_data; - mean_data.benchmark_name = reports[0].benchmark_name + "_mean"; - mean_data.iterations = run_iterations; - mean_data.real_accumulated_time = - real_accumulated_time_stat.Mean() * run_iterations; - mean_data.cpu_accumulated_time = - cpu_accumulated_time_stat.Mean() * run_iterations; - mean_data.bytes_per_second = bytes_per_second_stat.Mean(); - mean_data.items_per_second = items_per_second_stat.Mean(); - mean_data.time_unit = reports[0].time_unit; - // user counters - for(auto const& kv : counter_stats) { - auto c = Counter(kv.second.s.Mean(), counter_stats[kv.first].c.flags); - mean_data.counters[kv.first] = c; - } - - // Only add label to mean/stddev if it is same for all runs - mean_data.report_label = reports[0].report_label; - for (std::size_t i = 1; i < reports.size(); i++) { - if (reports[i].report_label != reports[0].report_label) { - mean_data.report_label = ""; - break; - } - } - - Run stddev_data; - stddev_data.benchmark_name = reports[0].benchmark_name + "_stddev"; - stddev_data.report_label = mean_data.report_label; - stddev_data.iterations = 0; - stddev_data.real_accumulated_time = real_accumulated_time_stat.StdDev(); - stddev_data.cpu_accumulated_time = cpu_accumulated_time_stat.StdDev(); - stddev_data.bytes_per_second = bytes_per_second_stat.StdDev(); - stddev_data.items_per_second = items_per_second_stat.StdDev(); - stddev_data.time_unit = reports[0].time_unit; - // user counters - for(auto const& kv : counter_stats) { - auto c = Counter(kv.second.s.StdDev(), counter_stats[kv.first].c.flags); - stddev_data.counters[kv.first] = c; - } - - results.push_back(mean_data); - results.push_back(stddev_data); - return results; -} - std::vector ComputeBigO( const std::vector& reports) { typedef BenchmarkReporter::Run Run; diff --git a/src/complexity.h b/src/complexity.h index c0ca60e6..df29b48d 100644 --- a/src/complexity.h +++ b/src/complexity.h @@ -25,12 +25,6 @@ namespace benchmark { -// Return a vector containing the mean and standard devation information for -// the specified list of reports. If 'reports' contains less than two -// non-errored runs an empty vector is returned -std::vector ComputeStats( - const std::vector& reports); - // Return a vector containing the bigO and RMS information for the specified // list of reports. 
If 'reports.size() < 2' an empty vector is returned. std::vector ComputeBigO( @@ -57,4 +51,5 @@ struct LeastSq { std::string GetBigOString(BigO complexity); } // end namespace benchmark + #endif // COMPLEXITY_H_ diff --git a/src/reporter.cc b/src/reporter.cc index aacd4531..9a0830b0 100644 --- a/src/reporter.cc +++ b/src/reporter.cc @@ -22,7 +22,6 @@ #include #include "check.h" -#include "stat.h" namespace benchmark { diff --git a/src/stat.h b/src/stat.h deleted file mode 100644 index d356875b..00000000 --- a/src/stat.h +++ /dev/null @@ -1,310 +0,0 @@ -#ifndef BENCHMARK_STAT_H_ -#define BENCHMARK_STAT_H_ - -#include -#include -#include -#include - -namespace benchmark { - -template -class Stat1; - -template -class Stat1MinMax; - -typedef Stat1 Stat1_f; -typedef Stat1 Stat1_d; -typedef Stat1MinMax Stat1MinMax_f; -typedef Stat1MinMax Stat1MinMax_d; - -template -class Vector2; -template -class Vector3; -template -class Vector4; - -template -class Stat1 { - public: - typedef Stat1 Self; - - Stat1() { Clear(); } - // Create a sample of value dat and weight 1 - explicit Stat1(const VType &dat) { - sum_ = dat; - sum_squares_ = Sqr(dat); - numsamples_ = 1; - } - // Create statistics for all the samples between begin (included) - // and end(excluded) - explicit Stat1(const VType *begin, const VType *end) { - Clear(); - for (const VType *item = begin; item < end; ++item) { - (*this) += Stat1(*item); - } - } - // Create a sample of value dat and weight w - Stat1(const VType &dat, const NumType &w) { - sum_ = w * dat; - sum_squares_ = w * Sqr(dat); - numsamples_ = w; - } - // Copy operator - Stat1(const Self &stat) { - sum_ = stat.sum_; - sum_squares_ = stat.sum_squares_; - numsamples_ = stat.numsamples_; - } - - void Clear() { - numsamples_ = NumType(); - sum_squares_ = sum_ = VType(); - } - - Self &operator=(const Self &stat) { - sum_ = stat.sum_; - sum_squares_ = stat.sum_squares_; - numsamples_ = stat.numsamples_; - return (*this); - } - // Merge statistics from two sample sets. - Self &operator+=(const Self &stat) { - sum_ += stat.sum_; - sum_squares_ += stat.sum_squares_; - numsamples_ += stat.numsamples_; - return (*this); - } - // The operation opposite to += - Self &operator-=(const Self &stat) { - sum_ -= stat.sum_; - sum_squares_ -= stat.sum_squares_; - numsamples_ -= stat.numsamples_; - return (*this); - } - // Multiply the weight of the set of samples by a factor k - Self &operator*=(const VType &k) { - sum_ *= k; - sum_squares_ *= k; - numsamples_ *= k; - return (*this); - } - - // Merge statistics from two sample sets. - Self operator+(const Self &stat) const { return Self(*this) += stat; } - - // The operation opposite to + - Self operator-(const Self &stat) const { return Self(*this) -= stat; } - - // Multiply the weight of the set of samples by a factor k - Self operator*(const VType &k) const { return Self(*this) *= k; } - - // Return the total weight of this sample set - NumType numSamples() const { return numsamples_; } - - // Return the sum of this sample set - VType Sum() const { return sum_; } - - // Return the mean of this sample set - VType Mean() const { - if (numsamples_ == 0) return VType(); - return sum_ * (1.0 / numsamples_); - } - - // Return the mean of this sample set and compute the standard deviation at - // the same time. 
- VType Mean(VType *stddev) const { - if (numsamples_ == 0) return VType(); - VType mean = sum_ * (1.0 / numsamples_); - if (stddev) { - // Sample standard deviation is undefined for n = 1 - if (numsamples_ == 1) { - *stddev = VType(); - } else { - VType avg_squares = sum_squares_ * (1.0 / numsamples_); - *stddev = Sqrt(numsamples_ / (numsamples_ - 1.0) * (avg_squares - Sqr(mean))); - } - } - return mean; - } - - // Return the standard deviation of the sample set - VType StdDev() const { - VType stddev = VType(); - Mean(&stddev); - return stddev; - } - - private: - static_assert(std::is_integral::value && - !std::is_same::value, - "NumType must be an integral type that is not bool."); - // Let i be the index of the samples provided (using +=) - // and weight[i],value[i] be the data of sample #i - // then the variables have the following meaning: - NumType numsamples_; // sum of weight[i]; - VType sum_; // sum of weight[i]*value[i]; - VType sum_squares_; // sum of weight[i]*value[i]^2; - - // Template function used to square a number. - // For a vector we square all components - template - static inline SType Sqr(const SType &dat) { - return dat * dat; - } - - template - static inline Vector2 Sqr(const Vector2 &dat) { - return dat.MulComponents(dat); - } - - template - static inline Vector3 Sqr(const Vector3 &dat) { - return dat.MulComponents(dat); - } - - template - static inline Vector4 Sqr(const Vector4 &dat) { - return dat.MulComponents(dat); - } - - // Template function used to take the square root of a number. - // For a vector we square all components - template - static inline SType Sqrt(const SType &dat) { - // Avoid NaN due to imprecision in the calculations - if (dat < 0) return 0; - return sqrt(dat); - } - - template - static inline Vector2 Sqrt(const Vector2 &dat) { - // Avoid NaN due to imprecision in the calculations - return Max(dat, Vector2()).Sqrt(); - } - - template - static inline Vector3 Sqrt(const Vector3 &dat) { - // Avoid NaN due to imprecision in the calculations - return Max(dat, Vector3()).Sqrt(); - } - - template - static inline Vector4 Sqrt(const Vector4 &dat) { - // Avoid NaN due to imprecision in the calculations - return Max(dat, Vector4()).Sqrt(); - } -}; - -// Useful printing function -template -std::ostream &operator<<(std::ostream &out, const Stat1 &s) { - out << "{ avg = " << s.Mean() << " std = " << s.StdDev() - << " nsamples = " << s.NumSamples() << "}"; - return out; -} - -// Stat1MinMax: same as Stat1, but it also -// keeps the Min and Max values; the "-" -// operator is disabled because it cannot be implemented -// efficiently -template -class Stat1MinMax : public Stat1 { - public: - typedef Stat1MinMax Self; - - Stat1MinMax() { Clear(); } - // Create a sample of value dat and weight 1 - explicit Stat1MinMax(const VType &dat) : Stat1(dat) { - max_ = dat; - min_ = dat; - } - // Create statistics for all the samples between begin (included) - // and end(excluded) - explicit Stat1MinMax(const VType *begin, const VType *end) { - Clear(); - for (const VType *item = begin; item < end; ++item) { - (*this) += Stat1MinMax(*item); - } - } - // Create a sample of value dat and weight w - Stat1MinMax(const VType &dat, const NumType &w) - : Stat1(dat, w) { - max_ = dat; - min_ = dat; - } - // Copy operator - Stat1MinMax(const Self &stat) : Stat1(stat) { - max_ = stat.max_; - min_ = stat.min_; - } - - void Clear() { - Stat1::Clear(); - if (std::numeric_limits::has_infinity) { - min_ = std::numeric_limits::infinity(); - max_ = -std::numeric_limits::infinity(); - } 
else { - min_ = std::numeric_limits::max(); - max_ = std::numeric_limits::min(); - } - } - - Self &operator=(const Self &stat) { - this->Stat1::operator=(stat); - max_ = stat.max_; - min_ = stat.min_; - return (*this); - } - // Merge statistics from two sample sets. - Self &operator+=(const Self &stat) { - this->Stat1::operator+=(stat); - if (stat.max_ > max_) max_ = stat.max_; - if (stat.min_ < min_) min_ = stat.min_; - return (*this); - } - // Multiply the weight of the set of samples by a factor k - Self &operator*=(const VType &stat) { - this->Stat1::operator*=(stat); - return (*this); - } - // Merge statistics from two sample sets. - Self operator+(const Self &stat) const { return Self(*this) += stat; } - // Multiply the weight of the set of samples by a factor k - Self operator*(const VType &k) const { return Self(*this) *= k; } - - // Return the maximal value in this sample set - VType Max() const { return max_; } - // Return the minimal value in this sample set - VType Min() const { return min_; } - - private: - // The - operation makes no sense with Min/Max - // unless we keep the full list of values (but we don't) - // make it private, and let it undefined so nobody can call it - Self &operator-=(const Self &stat); // senseless. let it undefined. - - // The operation opposite to - - Self operator-(const Self &stat) const; // senseless. let it undefined. - - // Let i be the index of the samples provided (using +=) - // and weight[i],value[i] be the data of sample #i - // then the variables have the following meaning: - VType max_; // max of value[i] - VType min_; // min of value[i] -}; - -// Useful printing function -template -std::ostream &operator<<(std::ostream &out, - const Stat1MinMax &s) { - out << "{ avg = " << s.Mean() << " std = " << s.StdDev() - << " nsamples = " << s.NumSamples() << " min = " << s.Min() - << " max = " << s.Max() << "}"; - return out; -} -} // end namespace benchmark - -#endif // BENCHMARK_STAT_H_ diff --git a/src/statistics.cc b/src/statistics.cc new file mode 100644 index 00000000..5932ad43 --- /dev/null +++ b/src/statistics.cc @@ -0,0 +1,175 @@ +// Copyright 2016 Ismael Jimenez Martinez. All rights reserved. +// Copyright 2017 Roman Lebedev. All rights reserved. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +#include "benchmark/benchmark.h" + +#include +#include +#include +#include +#include +#include "check.h" +#include "statistics.h" + +namespace benchmark { + +auto StatisticsSum = [](const std::vector& v) { + return std::accumulate(v.begin(), v.end(), 0.0); +}; + +double StatisticsMean(const std::vector& v) { + if (v.size() == 0) return 0.0; + return StatisticsSum(v) * (1.0 / v.size()); +} + +double StatisticsMedian(const std::vector& v) { + if (v.size() < 3) return StatisticsMean(v); + std::vector partial; + // we need roundDown(count/2)+1 slots + partial.resize(1 + (v.size() / 2)); + std::partial_sort_copy(v.begin(), v.end(), partial.begin(), partial.end()); + // did we have odd number of samples? 
+ // if yes, then the last element of partially-sorted vector is the median + // it no, then the average of the last two elements is the median + if(v.size() % 2 == 1) + return partial.back(); + return (partial[partial.size() - 2] + partial[partial.size() - 1]) / 2.0; +} + +// Return the sum of the squares of this sample set +auto SumSquares = [](const std::vector& v) { + return std::inner_product(v.begin(), v.end(), v.begin(), 0.0); +}; + +auto Sqr = [](const double dat) { return dat * dat; }; +auto Sqrt = [](const double dat) { + // Avoid NaN due to imprecision in the calculations + if (dat < 0.0) return 0.0; + return std::sqrt(dat); +}; + +double StatisticsStdDev(const std::vector& v) { + const auto mean = StatisticsMean(v); + if (v.size() == 0) return mean; + + // Sample standard deviation is undefined for n = 1 + if (v.size() == 1) + return 0.0; + + const double avg_squares = SumSquares(v) * (1.0 / v.size()); + return Sqrt(v.size() / (v.size() - 1.0) * (avg_squares - Sqr(mean))); +} + +std::vector ComputeStats( + const std::vector& reports) { + typedef BenchmarkReporter::Run Run; + std::vector results; + + auto error_count = + std::count_if(reports.begin(), reports.end(), + [](Run const& run) { return run.error_occurred; }); + + if (reports.size() - error_count < 2) { + // We don't report aggregated data if there was a single run. + return results; + } + + // Accumulators. + std::vector real_accumulated_time_stat; + std::vector cpu_accumulated_time_stat; + std::vector bytes_per_second_stat; + std::vector items_per_second_stat; + + real_accumulated_time_stat.reserve(reports.size()); + cpu_accumulated_time_stat.reserve(reports.size()); + bytes_per_second_stat.reserve(reports.size()); + items_per_second_stat.reserve(reports.size()); + + // All repetitions should be run with the same number of iterations so we + // can take this information from the first benchmark. + int64_t const run_iterations = reports.front().iterations; + // create stats for user counters + struct CounterStat { + Counter c; + std::vector s; + }; + std::map< std::string, CounterStat > counter_stats; + for(Run const& r : reports) { + for(auto const& cnt : r.counters) { + auto it = counter_stats.find(cnt.first); + if(it == counter_stats.end()) { + counter_stats.insert({cnt.first, {cnt.second, std::vector{}}}); + it = counter_stats.find(cnt.first); + it->second.s.reserve(reports.size()); + } else { + CHECK_EQ(counter_stats[cnt.first].c.flags, cnt.second.flags); + } + } + } + + // Populate the accumulators. + for (Run const& run : reports) { + CHECK_EQ(reports[0].benchmark_name, run.benchmark_name); + CHECK_EQ(run_iterations, run.iterations); + if (run.error_occurred) continue; + real_accumulated_time_stat.emplace_back(run.real_accumulated_time); + cpu_accumulated_time_stat.emplace_back(run.cpu_accumulated_time); + items_per_second_stat.emplace_back(run.items_per_second); + bytes_per_second_stat.emplace_back(run.bytes_per_second); + // user counters + for(auto const& cnt : run.counters) { + auto it = counter_stats.find(cnt.first); + CHECK_NE(it, counter_stats.end()); + it->second.s.emplace_back(cnt.second); + } + } + + // Only add label if it is same for all runs + std::string report_label = reports[0].report_label; + for (std::size_t i = 1; i < reports.size(); i++) { + if (reports[i].report_label != report_label) { + report_label = ""; + break; + } + } + + for(const auto& Stat : *reports[0].statistics) { + // Get the data from the accumulator to BenchmarkReporter::Run's. 
+ Run data; + data.benchmark_name = reports[0].benchmark_name + "_" + Stat.name_; + data.report_label = report_label; + data.iterations = run_iterations; + + data.real_accumulated_time = Stat.compute_(real_accumulated_time_stat); + data.cpu_accumulated_time = Stat.compute_(cpu_accumulated_time_stat); + data.bytes_per_second = Stat.compute_(bytes_per_second_stat); + data.items_per_second = Stat.compute_(items_per_second_stat); + + data.time_unit = reports[0].time_unit; + + // user counters + for(auto const& kv : counter_stats) { + const auto uc_stat = Stat.compute_(kv.second.s); + auto c = Counter(uc_stat, counter_stats[kv.first].c.flags); + data.counters[kv.first] = c; + } + + results.push_back(data); + } + + return results; +} + +} // end namespace benchmark diff --git a/src/statistics.h b/src/statistics.h new file mode 100644 index 00000000..7eccc855 --- /dev/null +++ b/src/statistics.h @@ -0,0 +1,37 @@ +// Copyright 2016 Ismael Jimenez Martinez. All rights reserved. +// Copyright 2017 Roman Lebedev. All rights reserved. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +#ifndef STATISTICS_H_ +#define STATISTICS_H_ + +#include + +#include "benchmark/benchmark.h" + +namespace benchmark { + +// Return a vector containing the mean, median and standard devation information +// (and any user-specified info) for the specified list of reports. If 'reports' +// contains less than two non-errored runs an empty vector is returned +std::vector ComputeStats( + const std::vector& reports); + +double StatisticsMean(const std::vector& v); +double StatisticsMedian(const std::vector& v); +double StatisticsStdDev(const std::vector& v); + +} // end namespace benchmark + +#endif // STATISTICS_H_ diff --git a/test/reporter_output_test.cc b/test/reporter_output_test.cc index 01d12cd8..bd33dc3e 100644 --- a/test/reporter_output_test.cc +++ b/test/reporter_output_test.cc @@ -182,22 +182,66 @@ void BM_Repeat(benchmark::State& state) { while (state.KeepRunning()) { } } +// need two repetitions min to be able to output any aggregate output +BENCHMARK(BM_Repeat)->Repetitions(2); +ADD_CASES(TC_ConsoleOut, {{"^BM_Repeat/repeats:2 %console_report$"}, + {"^BM_Repeat/repeats:2 %console_report$"}, + {"^BM_Repeat/repeats:2_mean %console_report$"}, + {"^BM_Repeat/repeats:2_median %console_report$"}, + {"^BM_Repeat/repeats:2_stddev %console_report$"}}); +ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_Repeat/repeats:2\",$"}, + {"\"name\": \"BM_Repeat/repeats:2\",$"}, + {"\"name\": \"BM_Repeat/repeats:2_mean\",$"}, + {"\"name\": \"BM_Repeat/repeats:2_median\",$"}, + {"\"name\": \"BM_Repeat/repeats:2_stddev\",$"}}); +ADD_CASES(TC_CSVOut, {{"^\"BM_Repeat/repeats:2\",%csv_report$"}, + {"^\"BM_Repeat/repeats:2\",%csv_report$"}, + {"^\"BM_Repeat/repeats:2_mean\",%csv_report$"}, + {"^\"BM_Repeat/repeats:2_median\",%csv_report$"}, + {"^\"BM_Repeat/repeats:2_stddev\",%csv_report$"}}); +// but for two repetitions, mean and median is the same, so let's repeat.. 
BENCHMARK(BM_Repeat)->Repetitions(3); ADD_CASES(TC_ConsoleOut, {{"^BM_Repeat/repeats:3 %console_report$"}, {"^BM_Repeat/repeats:3 %console_report$"}, {"^BM_Repeat/repeats:3 %console_report$"}, {"^BM_Repeat/repeats:3_mean %console_report$"}, + {"^BM_Repeat/repeats:3_median %console_report$"}, {"^BM_Repeat/repeats:3_stddev %console_report$"}}); ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_Repeat/repeats:3\",$"}, {"\"name\": \"BM_Repeat/repeats:3\",$"}, {"\"name\": \"BM_Repeat/repeats:3\",$"}, {"\"name\": \"BM_Repeat/repeats:3_mean\",$"}, + {"\"name\": \"BM_Repeat/repeats:3_median\",$"}, {"\"name\": \"BM_Repeat/repeats:3_stddev\",$"}}); ADD_CASES(TC_CSVOut, {{"^\"BM_Repeat/repeats:3\",%csv_report$"}, {"^\"BM_Repeat/repeats:3\",%csv_report$"}, {"^\"BM_Repeat/repeats:3\",%csv_report$"}, {"^\"BM_Repeat/repeats:3_mean\",%csv_report$"}, + {"^\"BM_Repeat/repeats:3_median\",%csv_report$"}, {"^\"BM_Repeat/repeats:3_stddev\",%csv_report$"}}); +// median differs between even/odd number of repetitions, so just to be sure +BENCHMARK(BM_Repeat)->Repetitions(4); +ADD_CASES(TC_ConsoleOut, {{"^BM_Repeat/repeats:4 %console_report$"}, + {"^BM_Repeat/repeats:4 %console_report$"}, + {"^BM_Repeat/repeats:4 %console_report$"}, + {"^BM_Repeat/repeats:4 %console_report$"}, + {"^BM_Repeat/repeats:4_mean %console_report$"}, + {"^BM_Repeat/repeats:4_median %console_report$"}, + {"^BM_Repeat/repeats:4_stddev %console_report$"}}); +ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_Repeat/repeats:4\",$"}, + {"\"name\": \"BM_Repeat/repeats:4\",$"}, + {"\"name\": \"BM_Repeat/repeats:4\",$"}, + {"\"name\": \"BM_Repeat/repeats:4\",$"}, + {"\"name\": \"BM_Repeat/repeats:4_mean\",$"}, + {"\"name\": \"BM_Repeat/repeats:4_median\",$"}, + {"\"name\": \"BM_Repeat/repeats:4_stddev\",$"}}); +ADD_CASES(TC_CSVOut, {{"^\"BM_Repeat/repeats:4\",%csv_report$"}, + {"^\"BM_Repeat/repeats:4\",%csv_report$"}, + {"^\"BM_Repeat/repeats:4\",%csv_report$"}, + {"^\"BM_Repeat/repeats:4\",%csv_report$"}, + {"^\"BM_Repeat/repeats:4_mean\",%csv_report$"}, + {"^\"BM_Repeat/repeats:4_median\",%csv_report$"}, + {"^\"BM_Repeat/repeats:4_stddev\",%csv_report$"}}); // Test that a non-repeated test still prints non-aggregate results even when // only-aggregate reports have been requested @@ -219,12 +263,15 @@ BENCHMARK(BM_SummaryRepeat)->Repetitions(3)->ReportAggregatesOnly(); ADD_CASES(TC_ConsoleOut, {{".*BM_SummaryRepeat/repeats:3 ", MR_Not}, {"^BM_SummaryRepeat/repeats:3_mean %console_report$"}, + {"^BM_SummaryRepeat/repeats:3_median %console_report$"}, {"^BM_SummaryRepeat/repeats:3_stddev %console_report$"}}); ADD_CASES(TC_JSONOut, {{".*BM_SummaryRepeat/repeats:3 ", MR_Not}, {"\"name\": \"BM_SummaryRepeat/repeats:3_mean\",$"}, + {"\"name\": \"BM_SummaryRepeat/repeats:3_median\",$"}, {"\"name\": \"BM_SummaryRepeat/repeats:3_stddev\",$"}}); ADD_CASES(TC_CSVOut, {{".*BM_SummaryRepeat/repeats:3 ", MR_Not}, {"^\"BM_SummaryRepeat/repeats:3_mean\",%csv_report$"}, + {"^\"BM_SummaryRepeat/repeats:3_median\",%csv_report$"}, {"^\"BM_SummaryRepeat/repeats:3_stddev\",%csv_report$"}}); void BM_RepeatTimeUnit(benchmark::State& state) { @@ -238,17 +285,59 @@ BENCHMARK(BM_RepeatTimeUnit) ADD_CASES(TC_ConsoleOut, {{".*BM_RepeatTimeUnit/repeats:3 ", MR_Not}, {"^BM_RepeatTimeUnit/repeats:3_mean %console_us_report$"}, + {"^BM_RepeatTimeUnit/repeats:3_median %console_us_report$"}, {"^BM_RepeatTimeUnit/repeats:3_stddev %console_us_report$"}}); ADD_CASES(TC_JSONOut, {{".*BM_RepeatTimeUnit/repeats:3 ", MR_Not}, {"\"name\": \"BM_RepeatTimeUnit/repeats:3_mean\",$"}, {"\"time_unit\": \"us\",?$"}, 
+ {"\"name\": \"BM_RepeatTimeUnit/repeats:3_median\",$"}, + {"\"time_unit\": \"us\",?$"}, {"\"name\": \"BM_RepeatTimeUnit/repeats:3_stddev\",$"}, {"\"time_unit\": \"us\",?$"}}); ADD_CASES(TC_CSVOut, {{".*BM_RepeatTimeUnit/repeats:3 ", MR_Not}, {"^\"BM_RepeatTimeUnit/repeats:3_mean\",%csv_us_report$"}, + {"^\"BM_RepeatTimeUnit/repeats:3_median\",%csv_us_report$"}, {"^\"BM_RepeatTimeUnit/repeats:3_stddev\",%csv_us_report$"}}); +// ========================================================================= // +// -------------------- Testing user-provided statistics ------------------- // +// ========================================================================= // + +const auto UserStatistics = [](const std::vector& v) { + return v.back(); +}; +void BM_UserStats(benchmark::State& state) { + while (state.KeepRunning()) { + } +} +BENCHMARK(BM_UserStats) + ->Repetitions(3) + ->ComputeStatistics("", UserStatistics); +// check that user-provided stats is calculated, and is after the default-ones +// empty string as name is intentional, it would sort before anything else +ADD_CASES(TC_ConsoleOut, {{"^BM_UserStats/repeats:3 %console_report$"}, + {"^BM_UserStats/repeats:3 %console_report$"}, + {"^BM_UserStats/repeats:3 %console_report$"}, + {"^BM_UserStats/repeats:3_mean %console_report$"}, + {"^BM_UserStats/repeats:3_median %console_report$"}, + {"^BM_UserStats/repeats:3_stddev %console_report$"}, + {"^BM_UserStats/repeats:3_ %console_report$"}}); +ADD_CASES(TC_JSONOut, {{"\"name\": \"BM_UserStats/repeats:3\",$"}, + {"\"name\": \"BM_UserStats/repeats:3\",$"}, + {"\"name\": \"BM_UserStats/repeats:3\",$"}, + {"\"name\": \"BM_UserStats/repeats:3_mean\",$"}, + {"\"name\": \"BM_UserStats/repeats:3_median\",$"}, + {"\"name\": \"BM_UserStats/repeats:3_stddev\",$"}, + {"\"name\": \"BM_UserStats/repeats:3_\",$"}}); +ADD_CASES(TC_CSVOut, {{"^\"BM_UserStats/repeats:3\",%csv_report$"}, + {"^\"BM_UserStats/repeats:3\",%csv_report$"}, + {"^\"BM_UserStats/repeats:3\",%csv_report$"}, + {"^\"BM_UserStats/repeats:3_mean\",%csv_report$"}, + {"^\"BM_UserStats/repeats:3_median\",%csv_report$"}, + {"^\"BM_UserStats/repeats:3_stddev\",%csv_report$"}, + {"^\"BM_UserStats/repeats:3_\",%csv_report$"}}); + // ========================================================================= // // --------------------------- TEST CASES END ------------------------------ // // ========================================================================= //