Run the external profiler the same number of iterations as the
benchmark was run normally.
This makes, for example, a trace collected via ProfilerManager
consistent with collected PMU data.
As a fix, also turn the comment in libpfm's build file into a proper Starlark
docstring.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
* Get number of CPUs with sysconf() on Linux
Avoid parsing the /proc/cpuinfo just to get number of CPUs.
Instead use the portable function provided by glibc.
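A minimal sketch of the sysconf() approach, for illustration only. The helper name `GetOnlineCpuCount` is hypothetical, and the use of `_SC_NPROCESSORS_ONLN` (rather than `_SC_NPROCESSORS_CONF`) is an assumption, not a statement about what sysinfo.cc actually calls.

```cpp
#include <cassert>
#include <unistd.h>  // sysconf, _SC_NPROCESSORS_ONLN

// Hypothetical helper: number of CPUs currently online, or -1 if the
// value cannot be determined. No /proc/cpuinfo parsing is involved.
int GetOnlineCpuCount() {
  const long n = sysconf(_SC_NPROCESSORS_ONLN);
  if (n < 1) return -1;  // sysconf() signals failure with -1
  return static_cast<int>(n);
}
```

A caller would still need to decide what to do with -1, for example by falling back to a single CPU.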
* Update sysinfo.cc
The testcase fails on sparc64, because the parsing of /proc/cpuinfo
fails and thus currently returns "0" CPUs which finally leads
to division-by-zero faults in the tests.
Fix the issue by returning at least "1" CPU which allows the
tests to run. An error message will be printed in any case.
Long-term the code should be fixed to parse the cpuinfo output
on sparc, which looks like this:
...
type : sun4v
ncpus probed : 48
ncpus active : 48
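One possible shape for the long-term fix is sketched below. It is a hedged illustration, not the code in sysinfo.cc: the function name `ParseSparcActiveCpus` is hypothetical, and it assumes that the `ncpus active` key is the right one to read. It falls back to 1 so that callers never see 0 CPUs.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: reads the "ncpus active" value from sparc-style
// cpuinfo text. Returns 1 if the key is missing or its value is not a
// positive number, so the result is always safe to divide by.
int ParseSparcActiveCpus(const std::string& cpuinfo) {
  const std::string key = "ncpus active";
  std::istringstream in(cpuinfo);
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, key.size(), key) != 0) continue;  // not our key
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos) break;
    try {
      // std::stoi skips the whitespace that follows the colon.
      const int n = std::stoi(line.substr(colon + 1));
      if (n > 0) return n;
    } catch (const std::exception&) {
      // Malformed number: fall through to the default below.
    }
    break;
  }
  return 1;
}
```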
The Linux kernel has provided clock_gettime() for a long time, so it
can serve as a generic fallback for any architecture for which no
other (better) option has been provided.
I noticed the benchmark package failed to build on Debian on the SH-4
architecture, so with this change SH-4 is now the first user of this
fallback option.
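A rough sketch of what such a fallback might look like. The function name `FallbackCycleCount` is hypothetical, and the choice of `CLOCK_MONOTONIC` and of nanoseconds as the unit are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>  // clock_gettime, CLOCK_MONOTONIC (POSIX)

// Hypothetical generic fallback: a monotonically increasing counter in
// nanoseconds, usable where no architecture-specific cycle counter exists.
std::int64_t FallbackCycleCount() {
  struct timespec ts = {0, 0};
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return static_cast<std::int64_t>(ts.tv_sec) * 1000000000 + ts.tv_nsec;
}
```

The value is not a true cycle count, but it is monotonic, which is what matters when differences between two readings are compared.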
- The MP flag only applies to cl, not to cl-compatible frontends of other compilers (e.g. clang-cl, icx-cl).
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
In a future version of Bazel, select() on cpu produces a warning. In this
case, matching on the Windows platform alone is enough. Fixes:
```
WARNING: /.../benchmark/BUILD.bazel:29:15: in config_setting rule //:windows: select() on cpu is deprecated. Use platform constraints instead: https://bazel.build/docs/configurable-attributes#platforms.
```
The fix is, unsurprisingly, to not invoke clang at all, because we use
Bazel to build everything anyway.
This also means that we can drop the setuptools pin.
The new solution was too smart (read: dense), because it did not account for
the fact that we look for the Windows libs of the interpreter building
the wheel, not the hermetic one supplying the header files.
The fix is to just align the versions again, so that the libs and headers
come from the same minor version.
Also contains a run of `pre-commit autoupdate`, and a bump of cibuildwheel
to its latest tag for CPython 3.13 support.
But, since we build for 3.10+ with SABI from 3.12 onwards, we don't even
need a dedicated Python 3.13 build job or toolchain - the wheels from 3.12
can be reused.
Simplifies some version-dependent logic around assembling the bazel
build command in setup.py, and fixes a possible unbound local error in
the toolchain patch context manager.
* Verify RegisterProfilerManager doesn't overwrite an existing registration
Tested:
Add a second registration to test/profiler_manager_test.cc and
verify the test crashes as expected.
* Verify RegisterProfilerManager doesn't overwrite an existing registration
Tested:
Configure with:
cmake -GNinja -DCMAKE_BUILD_TYPE=Debug -DBENCHMARK_DOWNLOAD_DEPENDENCIES=on
Then run:
ctest -R profiler_manager_gtest
Before the change the test fails (expected); after the change it passes (expected).
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Adds support for free-threaded nanobind extension builds, though we
don't currently build a free-threaded wheel.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Disables 'misc-use-anonymous-namespace' for usage of the BENCHMARK
macro. This warning is spurious, and the variable declared by the
BENCHMARK macro can't be moved into an anonymous namespace.
We don't want to disable it globally, but it can be disabled locally,
for the `BENCHMARK` statement, as this warning appears downstream for
users.
See:
https://clang.llvm.org/extra/clang-tidy/#suppressing-undesired-diagnostics
* Add enum value from newest Windows SDK
Windows SDK version 10.0.26100.0 adds a cache type value, `CacheUnknown`. This adds a case for that type to `sysinfo.cc`, which will otherwise complain about the switch statement being non-exhaustive when building with the new SDK.
Since the value doesn't exist in prior SDK versions, we only add the case conditionally. The condition can be removed if we ever decide to bump up the required SDK version.
* Fix SDK version macro
Make sure the version macro we're using for the SDK is properly indicative of version 10.0.26100.0. Also fix formatting complaints from the linter.
* Add space to satisfy formatter
Formatter insists on two spaces before a comment after a macro...
* Change preprocessor condition
Try detecting the current SDK version in a slightly different way.
* Replace NTDDI_WIN11_GE with its value
Undefined constants are treated as 0 by the preprocessor, which causes the check to trivially return true for previous SDK versions. Replace the constant with its value (from the newest SDK version) instead.
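The pitfall can be reproduced in isolation. In the sketch below, `DEMO_SDK_VERSION` and `DEMO_NEWEST_SDK` are hypothetical stand-ins with made-up values; they are not the real SDK macros or version numbers.

```cpp
#include <cassert>

// Pretend we are building against an older SDK (made-up value).
#define DEMO_SDK_VERSION 100

// DEMO_NEWEST_SDK is deliberately left undefined, the way a constant
// introduced by a newer SDK is absent from older headers.
#if DEMO_SDK_VERSION >= DEMO_NEWEST_SDK  // undefined name evaluates to 0
constexpr bool kSymbolicCheckPasses = true;
#else
constexpr bool kSymbolicCheckPasses = false;
#endif

// Comparing against the literal value (made up here: 200) works as intended.
#if DEMO_SDK_VERSION >= 200
constexpr bool kLiteralCheckPasses = true;
#else
constexpr bool kLiteralCheckPasses = false;
#endif

static_assert(kSymbolicCheckPasses, "100 >= 0 is trivially true");
static_assert(!kLiteralCheckPasses, "100 >= 200 is false");
```

Compilers can flag the first form when asked (for example with `-Wundef` on GCC and Clang), but it is accepted silently by default.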
* Added benchmark_dry_run boolean flag to command line options
* Dry run logic to exit early and override iterations, repetitions, min time, min warmup time
* Changed dry run override logic structure and added dry run to context
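The override step described above could look roughly like this. `RunSettings` and `ApplyDryRun` are hypothetical names, and the specific override values (one iteration, one repetition, zero minimum times) are assumptions about what a dry run would want, not a transcription of the library's code.

```cpp
#include <cassert>

// Hypothetical container for the settings a dry run overrides.
struct RunSettings {
  long iterations;
  int repetitions;
  double min_time;         // seconds
  double min_warmup_time;  // seconds
};

// Returns the settings to use. With dry_run set, each benchmark is
// reduced to a single iteration and repetition with no time minimums.
RunSettings ApplyDryRun(RunSettings settings, bool dry_run) {
  if (!dry_run) return settings;
  settings.iterations = 1;
  settings.repetitions = 1;
  settings.min_time = 0.0;
  settings.min_warmup_time = 0.0;
  return settings;
}
```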
---------
Co-authored-by: Shaan <shaanmistry03@gmail.com>
Co-authored-by: Shaan Mistry <49106143+Shaan-Mistry@users.noreply.github.com>
* Supply MacOS deployment target to delocate, use build+uv frontend
This shaves off multiple minutes from the wheel builds alone.
Also revert to trusted publishing for wheel uploads as it is now set up.
* Bump oldest supported Python to 3.10, eliminate setuptools-scm
The version is now a string again, under the same attribute as it was
before. This is a pragmatic decision in order to be able to upload wheels
again, possibly directly from main.
We could in the future also set the Python version to a development version
if we want to avoid accidental uploads of `main`.
* Add a note on supported Python versions in the docs
Also fixes the setuptools failure observed in the latest CI by pinning
to the last version before v73 until the problem is identified and resolved.
* Fix C4459: Rename a function parameter `profiler_manager` to avoid hiding the global declaration.
* Treat warnings as errors for MSVC
* disable one warning for MSVC
* Align benchmark::State to a cacheline.
This can avoid interference with neighboring objects and stabilize
benchmark results.
* separate cacheline definition from alignment attribute macro
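As an illustration of the two bullets above, the sketch below keeps the cacheline size and the alignment attribute in separate macros. All names (`DEMO_CACHELINE_SIZE`, `DEMO_ALIGNAS`, `DemoState`) are hypothetical, and 64 bytes is an assumed cacheline size that does not hold on every target.

```cpp
#include <cassert>
#include <cstdint>

// Assumed cacheline size in bytes, defined on its own so it can be
// tuned per platform without touching the attribute macro.
#define DEMO_CACHELINE_SIZE 64

// Alignment attribute macro, independent of any particular size.
#define DEMO_ALIGNAS(n) alignas(n)

// Stand-in for a per-benchmark state object. Aligning it to a cacheline
// pads it so that it does not share a line with neighboring objects.
struct DEMO_ALIGNAS(DEMO_CACHELINE_SIZE) DemoState {
  long iterations;
  bool started;
};

static_assert(alignof(DemoState) == DEMO_CACHELINE_SIZE,
              "DemoState starts on a cacheline boundary");
static_assert(sizeof(DemoState) % DEMO_CACHELINE_SIZE == 0,
              "DemoState occupies whole cachelines");
```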
Co-authored-by: Roman Lebedev <lebedev.ri@gmail.com>
---------
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>
Co-authored-by: Roman Lebedev <lebedev.ri@gmail.com>
According to the user guide, manual timing must be enabled explicitly by calling the `UseManualTime` function. Its Python equivalent is `use_manual_time()`, which the example did not call.
To verify that this call affects the measurement, add another `time.sleep(0.01)` at the end of the iteration. The reported time differs by a factor of 2 depending on whether `use_manual_time()` is used.
Co-authored-by: dominic <510002+dmah42@users.noreply.github.com>