[RFC] Tools: compare-bench.py: print change% with two decimal digits (#440)

* Tools: compare-bench.py: print change% with two decimal digits

Here is a comparison of before vs. after:
```diff
-Benchmark                      Time           CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------
-BM_SameTimes                  +0.00         +0.00            10            10            10            10
-BM_2xFaster                   -0.50         -0.50            50            25            50            25
-BM_2xSlower                   +1.00         +1.00            50           100            50           100
-BM_1PercentFaster             -0.01         -0.01           100            99           100            99
-BM_1PercentSlower             +0.01         +0.01           100           101           100           101
-BM_10PercentFaster            -0.10         -0.10           100            90           100            90
-BM_10PercentSlower            +0.10         +0.10           100           110           100           110
-BM_100xSlower                +99.00        +99.00           100         10000           100         10000
-BM_100xFaster                 -0.99         -0.99         10000           100         10000           100
-BM_10PercentCPUToTime         +0.10         -0.10           100           110           100            90
+Benchmark                        Time             CPU      Time Old      Time New       CPU Old       CPU New
+-------------------------------------------------------------------------------------------------------------
+BM_SameTimes                  +0.0000         +0.0000            10            10            10            10
+BM_2xFaster                   -0.5000         -0.5000            50            25            50            25
+BM_2xSlower                   +1.0000         +1.0000            50           100            50           100
+BM_1PercentFaster             -0.0100         -0.0100           100            99           100            99
+BM_1PercentSlower             +0.0100         +0.0100           100           101           100           101
+BM_10PercentFaster            -0.1000         -0.1000           100            90           100            90
+BM_10PercentSlower            +0.1000         +0.1000           100           110           100           110
+BM_100xSlower                +99.0000        +99.0000           100         10000           100         10000
+BM_100xFaster                 -0.9900         -0.9900         10000           100         10000           100
+BM_10PercentCPUToTime         +0.1000         -0.1000           100           110           100            90
+BM_ThirdFaster                -0.3333         -0.3333           100            67           100            67

```

So the first ("Time") column is exactly where it was, but with
two more decimal digits. The position of the '.' in the second
("CPU") column is shifted right by those two positions, and the
rest is unmodified, but simply shifted right by those 4 positions.
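
The column arithmetic above can be checked with a small sketch. The width specifiers below mirror the ones in this change (`+9.2f`/`+14.2f` before, `+16.4f` after), minus the color placeholders. The name width of 21 (the longest benchmark name shown above) and the `format_row` helper are assumptions made for illustration, not part of the tool.

```python
# Sketch: compare where the columns end for the old and new row formats.
NAME_WIDTH = 21  # assumed: len("BM_10PercentCPUToTime")

OLD_FMT = "{:<{}s}{:+9.2f}{:+14.2f}"   # name column was padded by 5 extra chars
NEW_FMT = "{:<{}s}{:+16.4f}{:+16.4f}"  # name column uses the bare name width


def format_row(fmt, name, name_width, time_change, cpu_change):
    """Hypothetical helper: render the name plus the Time and CPU columns."""
    return fmt.format(name, name_width, time_change, cpu_change)


old_row = format_row(OLD_FMT, "BM_2xFaster", NAME_WIDTH + 5, -0.5, -0.5)
new_row = format_row(NEW_FMT, "BM_2xFaster", NAME_WIDTH, -0.5, -0.5)

print(old_row)
print(new_row)
# Position of the '.' in the Time column (first) and CPU column (last).
print(old_row.index("."), new_row.index("."))
print(old_row.rindex("."), new_row.rindex("."))
```

Under these assumptions the Time decimal point stays at the same offset, the CPU decimal point moves right by two, and the row as a whole grows by four characters.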

As for the reasoning, I guess it is more or less the same as
with #426. Sadly, microbenchmarking is not always applicable.
In those cases, the more precise the change report is, the better.

The current formatting prints not so much a percentage as
a fraction, I'd say. That is most useful for huge changes,
much more than 100%. Changes are not always that large, especially
outside of microbenchmarks. Then, even though the change
may be good or bad, it is small (<0.5% or so),
rounding happens, and it is no longer possible to tell.
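
To make the rounding problem concrete, here is a minimal sketch with a hypothetical benchmark that goes from 1000 ns to 1004 ns. The `relative_change` helper is an assumed stand-in for the tool's change calculation, not its actual code.

```python
def relative_change(old, new):
    """Fractional change from old to new (hypothetical helper)."""
    return (new - old) / old


change = relative_change(1000.0, 1004.0)  # a 0.4% slowdown

two_digits = "{:+.2f}".format(change)
four_digits = "{:+.4f}".format(change)

print(two_digits)   # +0.00   (the slowdown is rounded away)
print(four_digits)  # +0.0040 (the slowdown is visible)
```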

I do acknowledge that this change does not fix that problem. Of
course, confidence intervals and such would be better, and they
would probably fix the problem. But I think this is good as-is
too, because now you see 2 fractional percentage digits!

The obvious downside is that the output is now even wider.

* Revisit tests, more closely document the current behavior.
Roman Lebedev 2017-08-29 02:12:18 +03:00 committed by Dominic Hamon
parent 6e06648133
commit 886585a3b7
3 changed files with 43 additions and 13 deletions


@@ -77,6 +77,20 @@
       "cpu_time": 100,
       "time_unit": "ns"
     },
+    {
+      "name": "BM_ThirdFaster",
+      "iterations": 1000,
+      "real_time": 100,
+      "cpu_time": 100,
+      "time_unit": "ns"
+    },
+    {
+      "name": "BM_BadTimeUnit",
+      "iterations": 1000,
+      "real_time": 0.4,
+      "cpu_time": 0.5,
+      "time_unit": "s"
+    },
     {
       "name": "BM_DifferentTimeUnit",
       "iterations": 1,


@@ -77,6 +77,20 @@
       "cpu_time": 90,
       "time_unit": "ns"
     },
+    {
+      "name": "BM_ThirdFaster",
+      "iterations": 1000,
+      "real_time": 66.665,
+      "cpu_time": 66.664,
+      "time_unit": "ns"
+    },
+    {
+      "name": "BM_BadTimeUnit",
+      "iterations": 1000,
+      "real_time": 0.04,
+      "cpu_time": 0.6,
+      "time_unit": "s"
+    },
     {
       "name": "BM_DifferentTimeUnit",
       "iterations": 1,


@@ -71,13 +71,13 @@ def generate_difference_report(json1, json2, use_color=True):
     Calculate and report the difference between each test of two benchmarks
     runs specified as 'json1' and 'json2'.
     """
-    first_col_width = find_longest_name(json1['benchmarks']) + 5
+    first_col_width = find_longest_name(json1['benchmarks'])
     def find_test(name):
         for b in json2['benchmarks']:
             if b['name'] == name:
                 return b
         return None
-    first_line = "{:<{}s}     Time           CPU      Time Old      Time New       CPU Old       CPU New".format(
+    first_line = "{:<{}s}            Time             CPU      Time Old      Time New       CPU Old       CPU New".format(
         'Benchmark', first_col_width)
     output_strs = [first_line, '-' * len(first_line)]
@@ -97,7 +97,7 @@ def generate_difference_report(json1, json2, use_color=True):
                 return BC_WHITE
             else:
                 return BC_CYAN
-        fmt_str = "{}{:<{}s}{endc}{}{:+9.2f}{endc}{}{:+14.2f}{endc}{:14.0f}{:14.0f}{endc}{:14.0f}{:14.0f}"
+        fmt_str = "{}{:<{}s}{endc}{}{:+16.4f}{endc}{}{:+16.4f}{endc}{:14.0f}{:14.0f}{endc}{:14.0f}{:14.0f}"
         tres = calculate_change(bn['real_time'], other_bench['real_time'])
         cpures = calculate_change(bn['cpu_time'], other_bench['cpu_time'])
         output_strs += [color_format(use_color, fmt_str,
@@ -127,16 +127,18 @@ class TestReportDifference(unittest.TestCase):
     def test_basic(self):
         expect_lines = [
-            ['BM_SameTimes', '+0.00', '+0.00', '10', '10', '10', '10'],
-            ['BM_2xFaster', '-0.50', '-0.50', '50', '25', '50', '25'],
-            ['BM_2xSlower', '+1.00', '+1.00', '50', '100', '50', '100'],
-            ['BM_1PercentFaster', '-0.01', '-0.01', '100', '99', '100', '99'],
-            ['BM_1PercentSlower', '+0.01', '+0.01', '100', '101', '100', '101'],
-            ['BM_10PercentFaster', '-0.10', '-0.10', '100', '90', '100', '90'],
-            ['BM_10PercentSlower', '+0.10', '+0.10', '100', '110', '100', '110'],
-            ['BM_100xSlower', '+99.00', '+99.00', '100', '10000', '100', '10000'],
-            ['BM_100xFaster', '-0.99', '-0.99', '10000', '100', '10000', '100'],
-            ['BM_10PercentCPUToTime', '+0.10', '-0.10', '100', '110', '100', '90'],
+            ['BM_SameTimes', '+0.0000', '+0.0000', '10', '10', '10', '10'],
+            ['BM_2xFaster', '-0.5000', '-0.5000', '50', '25', '50', '25'],
+            ['BM_2xSlower', '+1.0000', '+1.0000', '50', '100', '50', '100'],
+            ['BM_1PercentFaster', '-0.0100', '-0.0100', '100', '99', '100', '99'],
+            ['BM_1PercentSlower', '+0.0100', '+0.0100', '100', '101', '100', '101'],
+            ['BM_10PercentFaster', '-0.1000', '-0.1000', '100', '90', '100', '90'],
+            ['BM_10PercentSlower', '+0.1000', '+0.1000', '100', '110', '100', '110'],
+            ['BM_100xSlower', '+99.0000', '+99.0000', '100', '10000', '100', '10000'],
+            ['BM_100xFaster', '-0.9900', '-0.9900', '10000', '100', '10000', '100'],
+            ['BM_10PercentCPUToTime', '+0.1000', '-0.1000', '100', '110', '100', '90'],
+            ['BM_ThirdFaster', '-0.3333', '-0.3334', '100', '67', '100', '67'],
+            ['BM_BadTimeUnit', '-0.9000', '+0.2000', '0', '0', '0', '1'],
         ]
         json1, json2 = self.load_results()
         output_lines_with_header = generate_difference_report(json1, json2, use_color=False)