Currently, i get:
```
Run on (32 X 7326.56 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x16)
L1 Instruction 32 KiB (x16)
L2 Unified 512 KiB (x16)
L3 Unified 32768 KiB (x2)
```
which seems mostly right, except that the frequency is rather bogus.
Yes, i guess the CPU could theoretically achieve that,
but i have 3.6GHz configured, and scaling disabled.
So we clearly read the wrong thing.
With this fix, i now get the expected
```
Run on (32 X 3598.53 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x16)
L1 Instruction 32 KiB (x16)
L2 Unified 512 KiB (x16)
L3 Unified 32768 KiB (x2)
```
Use the benchmark's reported iteration count when estimating
iterations for the next repetition, rather than the requested
iteration count. When the benchmark uses KeepRunningBatch the actual
iteration count can be larger than the one the runner requested.
Prior to this fix the runner was underestimating the next iteration
count, sometimes significantly so. Consider the case of a benchmark
using a batch size of 1024. Prior to this change, the benchmark
runner would attempt iteration counts 1, 10, 100 and 1000, yet the
benchmark itself would do the same amount of work each time: a single
batch of 1024 iterations. The discrepancy could also contribute to
estimation errors once the benchmark time reached 10% of the target.
For example, if the very first batch of 1024 iterations reached 10% of
benchmark_min_min time, the runner would attempt to scale that to 100%
from a basis of one iteration rather than 1024.
This bug was particularly noticeable in benchmarks with large batch
sizes, especially when the benchmark also had slow set up or tear down
phases.
With this fix in place it is possible to use KeepRunningBatch to
achieve a kind of "minimum iteration count" feature by using a larger
fixed batch size. For example, a benchmark may build a map of 500K
elements and test a "find" operation. There is no point in running
"find" just 1, 10, 100, etc., times. The benchmark can now pick a
batch size of something like 10K, and the runner will arrive at the
final max iteration count with in noticeably fewer repetitions.
When building with gcc TSan on, and in Debug mode, we see a warning
like:
benchmark/src/timers.cc: In function ‘std::string benchmark::LocalDateTimeString()’:
src/timers.cc:241:15: warning: ‘char* strncat(char*, const char*, size_t)’ output may be truncated copying 108 bytes from a string of length 127 [-Wstringop-truncation]
241 | std::strncat(storage, tz_offset, sizeof(storage) - timestamp_len - 1);
| ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
While this is essentially a false positive (we never expect
the number of bytes in tz_offset to be too large), the compiler can't
actually tell that. Shrink the size of tz_offset to a smaller, but still safe
size to eliminate this warning.
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
* Implement custom benchmark name
The benchmark's name can be changed using the Name() function
which internally uses SetName().
* Update AUTHORS and CONTRIBUTORS
* Describe new feature in README
* Move new name function up
Fixes#1106
Fixes google#1077
Bazel clients currently cannot build the benchmark library in Release
mode. This commit adds a new target ":benchmark_release" to enable this.
The existing behavior results in the `0` value being added twice. Since
`lo` is always added to `dst`, we never want to explicitly add `0` if
`lo` is equal to `0`.
On s390 architecture, z/OS XL compiler uses HLASM inline assembly, which has different syntax and needs to be distinguished to avoid compilation error.
Noticed missing header when was building llvm with gcc-11:
```
llvm-project/llvm/utils/benchmark/src/benchmark_register.h:17:30:
error: 'numeric_limits' is not a member of 'std'
17 | static const T kmax = std::numeric_limits<T>::max();
| ^~~~~~~~~~~~~~
```
Without this commit, compilation fails on DragonFly with the following message:
```
/home/mneumann/Dev/benchmark.old/src/sysinfo.cc:446:2: error: #warning "HOST_NAME_MAX not defined. using 64" [-Werror=cpp]
^~~~~~~
```
Also note that the sysctl is actually `hw.tsc_frequency` on DragonFly:
```
$ sysctl hw.tsc_frequency
hw.tsc_frequency: 3498984022
```
Tested on:
```
$ uname -a
DragonFly box.localnet 5.9-DEVELOPMENT DragonFly v5.9.0.742.g4b29dd-DEVELOPMENT #5: Tue Aug 18 00:21:31 CEST 2020
```
As per discussions in here [1], LLVM is going to get backend support on
Motorola 68000 series CPUs (a.k.a M68K or M680x0). So it's necessary to
add CycleTimer implementation here, which is simply using `gettimeofday`
same as MIPS. This fixes#1049
[1] https://reviews.llvm.org/D88389
NOTE: This is a fresh-start of #738 pull-request which I messed up by re-editing the commiter email which I forgot to modify before pushing. Sorry for the inconvenience.
This PR brings proposed solution for functionality described in #737Fixes#737.
* Fix setup.py and reformat
* Bind benchmark
* Add benchmark option to Python
* Add Python examples for range, complexity, and thread
* Remove invalid multithreading in Python
* Bump Python bindings version to 0.2.0
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
* Bind Counter to Python
* Bind State methods to Python
* Bind state.counters to Python
* Import _benchmark.Counter
* Add Python example of state usage
Co-authored-by: Dominic Hamon <dominichamon@users.noreply.github.com>
* Create pylint.yml
* improve file matching
* fix some pylint issues
* run on PR and push (force on master only)
* more pylint fixes
* suppress noisy exit code and filter to fatals
* add conan as a dep so the module is importable
* fix lint error on unreachable branch
* Adds -lm linker flag for (Free|Open)BSD and uses github.com/bazelbuild/platforms for platform detection.
* Prefer selects.with_or to select the linkopts.
* @platforms appears to be implicitly available. @bazel_skylib would require updating every dependent repository.
* Re-enable platforms package.
Fixes#974. The `cxx_feature_check` now has an additional
optional argument which can be used to supply extra cmake flags
to pass to the `try_compile` command. The `CMAKE_CXX_STANDARD=14`
flag was determined to be the minimum flag necessary to correctly
compile and run the regex feature checks when compiling with Clang
under Windows (n.b. this does *not* refer to clang-cl, the frontend
to the MSVC compiler). The additional flag is not enabled for any
other compiler/platform tuple.
* Add CartesianProduct with associated test
* Use CartesianProduct in Ranges to avoid code duplication
* Add new cartesian_product_test to CMakeLists.txt
* Update AUTHORS & CONTRIBUTORS
* Rename CartesianProduct to ArgsProduct
* Rename test & fixture accordingly
* Add example for ArgsProduct to README
* ctest is now working
* Update README
* remove commented out lines
* Tweaked docs
Added note to use parallel and cleaned build config notes
* Response to comments
* revert all but the readme
* make error message clearer
* drop --parallel
Build instructions needlessly referred to make when CMake offers
a command-line interface to abstract away from the specific build
system.
Furthermore, CMake offers command-line "tool mode" which performs basic
filesystem operations. While the syntax is a bit more verbose than
Linux commands it is platform-independent. Now the commands can be
copy-pasted on both Linux and Windows and will just work.
Finally, the Release build type is included in initial commands. A natural flow
for a new-comer is to read and execute the commands and only then learn
that one has to go back and redo them again this time with proper parameters.
Now instead the parameters are only explained later but present already in the
initial commands.
As noted in #995, this causes issues when the command line flag already
starts with "benchmark_", which they all do.
Not caught by tests as the test flags didn't start with "benchmark".
Fixes#995
* JSONReporter: don't report on scaling if we didn't get it (#1005)
* JSONReporter: fix due to review (std::pair<bool, bool> -> enum)
* JSONReporter: scaling: fix the algo (due to review discussion)
* benchmark.h: revert to old-fashioned enum's (C++03 compatibility); rreporter_output_test: let's skip scaling