benchmark/test
Alexander Popov 8545dfb3ea
Fix DoNotOptimize() GCC copy overhead (#1340) (#1410)
* Fix DoNotOptimize() GCC copy overhead (#1340)

The issue is that, on GCC, DoNotOptimize() makes a full copy of its argument
if the argument is not a pointer, which slows down the benchmark. If the
argument is large enough, the copy is done with a memcpy() call. Because the
argument can be a large object, DoNotOptimize() can add significant
overhead and distort benchmark results.

The cause is GCC's handling of asm volatile constraints. GCC appears to
try the "r" (register) constraint in all cases, regardless of object size.
See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105519

The solution is to split DoNotOptimize() into two cases: the value fits
in a register, or it does not. Each case uses its own asm constraint.
The std::is_trivially_copyable trait is needed because the "+r"
constraint doesn't work with objects that are not trivially copyable.

- The fix requires the C++11 feature std::is_trivially_copyable, which GCC
  has supported since GCC 5
- A fallback for GCC versions < 5 still exists, but it uses the "m" constraint,
  which means slightly more overhead in some cases
- Add assembly tests for the cases from the issue
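
The split described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual DoNotOptimize(): the names FitsInRegister, KeepAlive, Big and Demo are hypothetical, the "fits in a register" test (sizeof(T) <= sizeof(T*)) is an assumption, and the code assumes a GNU-compatible compiler (GCC 5+ or Clang) with extended inline asm.

```cpp
#include <type_traits>

// Hypothetical trait: treat a value as "fits in a register" when it is
// trivially copyable and no larger than a pointer (an assumption).
template <class T>
struct FitsInRegister
    : std::integral_constant<bool, std::is_trivially_copyable<T>::value &&
                                       (sizeof(T) <= sizeof(T*))> {};

// Small, trivially copyable values: the compiler may keep the value in
// memory or in a register, whichever is cheaper.
template <class T>
inline typename std::enable_if<FitsInRegister<T>::value>::type KeepAlive(
    T& value) {
  asm volatile("" : "+m,r"(value) : : "memory");
}

// Large or non-trivially-copyable values: memory constraint only, so the
// compiler has no reason to copy the object into registers.
template <class T>
inline typename std::enable_if<!FitsInRegister<T>::value>::type KeepAlive(
    T& value) {
  asm volatile("" : "+m"(value) : : "memory");
}

// A type that clearly does not fit in a register.
struct Big {
  char bytes[256];
};

static_assert(FitsInRegister<int>::value, "int takes the register path");
static_assert(!FitsInRegister<Big>::value, "Big takes the memory path");

// Exercises both overloads; the empty asm leaves the values unchanged.
inline bool Demo() {
  int small = 41;
  Big big = {};
  big.bytes[0] = 7;
  KeepAlive(small);
  KeepAlive(big);
  return small == 41 && big.bytes[0] == 7;
}
```

Dispatching with std::enable_if keeps the choice at compile time, so neither path adds a runtime branch to the benchmark loop.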

Fixes #1340

* Add supported compiler versions info for assembly tests

- Assembly tests are inherently non-portable, so explicitly list the GCC
  and Clang versions required for the tests to pass reliably
- Print a warning message if the current compiler version isn't supported
2022-06-20 10:12:58 +01:00
args_product_test.cc clang-tidy: readability-redundant and performance (#1298) 2021-12-06 11:18:04 +00:00
AssemblyTests.cmake Fix DoNotOptimize() GCC copy overhead (#1340) (#1410) 2022-06-20 10:12:58 +01:00
basic_test.cc Avoid potential truncation issues for the integral type parameterized tests. (#1341) 2022-02-08 16:40:43 +00:00
benchmark_gtest.cc annotate and export public symbols (#1321) 2022-02-14 10:48:53 +00:00
benchmark_name_gtest.cc Introduce warmup phase to BenchmarkRunner (#1130) (#1399) 2022-05-23 13:50:17 +01:00
benchmark_random_interleaving_gtest.cc Expose default display reporter creation in public API (#1344) 2022-02-11 10:23:05 +00:00
benchmark_setup_teardown_test.cc lose some build warnings 2021-11-19 19:54:05 +00:00
benchmark_test.cc Enable -Wconversion (#1390) 2022-05-01 19:56:30 +01:00
BUILD add multiple OSes to bazel workflow (#1412) 2022-06-13 17:45:20 +01:00
clobber_memory_assembly_test.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
CMakeLists.txt Add option to get the verbosity provided by commandline flag -v (#1330) (#1397) 2022-05-17 17:59:36 +01:00
commandlineflags_gtest.cc Add benchmark_context flag that allows per-run custom context. (#1127) 2021-05-04 14:36:11 +01:00
complexity_test.cc fix some build warnings on type conversions 2022-06-08 10:32:20 +01:00
cxx03_test.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
diagnostics_test.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
display_aggregates_only_test.cc Introduce Coefficient of variation aggregate (#1220) 2021-09-03 18:44:10 +01:00
donotoptimize_assembly_test.cc Fix DoNotOptimize() GCC copy overhead (#1340) (#1410) 2022-06-20 10:12:58 +01:00
donotoptimize_test.cc Enable -Wconversion (#1390) 2022-05-01 19:56:30 +01:00
filter_test.cc Enable -Wconversion (#1390) 2022-05-01 19:56:30 +01:00
fixture_test.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
internal_threading_test.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
link_main_test.cc Add benchmark_main target. (#601) 2018-05-25 11:18:58 +01:00
map_test.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
memory_manager_test.cc Introduce per-family instance index (#1165) 2021-06-02 23:45:41 +03:00
multiple_ranges_test.cc clang-tidy: readability-redundant and performance (#1298) 2021-12-06 11:18:04 +00:00
options_test.cc Introduce warmup phase to BenchmarkRunner (#1130) (#1399) 2022-05-23 13:50:17 +01:00
output_test_helper.cc clang-tidy: readability-redundant and performance (#1298) 2021-12-06 11:18:04 +00:00
output_test.h clang-tidy: readability-redundant and performance (#1298) 2021-12-06 11:18:04 +00:00
perf_counters_gtest.cc Cache PerfCounters instance in PerfCountersMeasurement (#1308) 2022-01-25 10:14:20 +00:00
perf_counters_test.cc fix clang-tidy warnings (#1195) 2021-06-29 11:06:53 +01:00
register_benchmark_test.cc Filter out benchmarks that start with "DISABLED_" (#1387) 2022-05-01 10:41:34 +01:00
repetitions_test.cc Statistics: add support for percentage unit in addition to time (#1219) 2021-09-03 15:36:56 +01:00
report_aggregates_only_test.cc Introduce Coefficient of variation aggregate (#1220) 2021-09-03 18:44:10 +01:00
reporter_output_test.cc Add possibility to ask for libbenchmark version number (#1004) (#1403) 2022-06-20 09:45:50 +01:00
skip_with_error_test.cc clang-tidy: readability-redundant and performance (#1298) 2021-12-06 11:18:04 +00:00
spec_arg_test.cc Add SetBenchmarkFilter() to set --benchmark_filter flag value in user code (#1362) 2022-03-08 16:02:37 +00:00
spec_arg_verbosity_test.cc Add option to get the verbosity provided by commandline flag -v (#1330) (#1397) 2022-05-17 17:59:36 +01:00
state_assembly_test.cc Iteration counts should be uint64_t globally. (#817) 2019-05-13 12:33:11 +03:00
statistics_gtest.cc Introduce Coefficient of variation aggregate (#1220) 2021-09-03 18:44:10 +01:00
string_util_gtest.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
templated_fixture_test.cc format tests with clang-format (#1282) 2021-11-10 16:22:31 +00:00
time_unit_gtest.cc Allow setting the default time unit globally (#1337) 2022-03-04 11:07:01 +00:00
user_counters_tabular_test.cc COnsole reporter: if statistic produces percents, format it as such (#1221) 2021-09-06 11:33:27 +03:00
user_counters_test.cc Fix error with Fix Werror=old-style-cast (#1272) 2021-11-04 12:09:10 +00:00
user_counters_thousands_test.cc Statistics: add support for percentage unit in addition to time (#1219) 2021-09-03 15:36:56 +01:00