Translated Why const Doesn't Make C Code Faster.

This commit is contained in:
LazyWolf Lin 2019-09-12 16:52:13 +08:00
parent e7cfe337f3
commit 2bf6f0201b
2 changed files with 22 additions and 422 deletions


@ -1,402 +0,0 @@
[#]: collector: (lujun9972)
[#]: translator: (LazyWolfLin)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Why const Doesn't Make C Code Faster)
[#]: via: (https://theartofmachinery.com/2019/08/12/c_const_isnt_for_performance.html)
[#]: author: (Simon Arneaud https://theartofmachinery.com)
Why const Doesn't Make C Code Faster
======
In a post a few months back I said [it's a popular myth that `const` is helpful for enabling compiler optimisations in C and C++][1]. I figured I should explain that one, especially because I used to believe it was obviously true, myself. I'll start off with some theory and artificial examples, then I'll do some experiments and benchmarks on a real codebase: Sqlite.
### A simple test
Let's start with what I used to think was the simplest and most obvious example of how `const` can make C code faster. First, let's say we have these two function declarations:
```
void func(int *x);
void constFunc(const int *x);
```
And suppose we have these two versions of some code:
```
void byArg(int *x)
{
    printf("%d\n", *x);
    func(x);
    printf("%d\n", *x);
}

void constByArg(const int *x)
{
    printf("%d\n", *x);
    constFunc(x);
    printf("%d\n", *x);
}
```
To do the `printf()`, the CPU has to fetch the value of `*x` from RAM through the pointer. Obviously, `constByArg()` can be made slightly faster because the compiler knows that `*x` is constant, so there's no need to load its value a second time after `constFunc()` does its thing. It's just printing the same thing. Right? Let's see the assembly code generated by GCC with optimisations cranked up:
```
$ gcc -S -Wall -O3 test.c
$ view test.s
```
Here's the full assembly output for `byArg()`:
```
byArg:
.LFB23:
.cfi_startproc
pushq %rbx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
movl (%rdi), %edx
movq %rdi, %rbx
leaq .LC0(%rip), %rsi
movl $1, %edi
xorl %eax, %eax
call __printf_chk@PLT
movq %rbx, %rdi
call func@PLT # The only instruction that's different in constByArg
movl (%rbx), %edx
leaq .LC0(%rip), %rsi
xorl %eax, %eax
movl $1, %edi
popq %rbx
.cfi_def_cfa_offset 8
jmp __printf_chk@PLT
.cfi_endproc
```
The only difference between the generated assembly code for `byArg()` and `constByArg()` is that `constByArg()` has a `call constFunc@PLT`, just like the source code asked. The `const` itself has literally made zero difference.
Okay, that's GCC. Maybe we just need a sufficiently smart compiler. Is Clang any better?
```
$ clang -S -Wall -O3 -emit-llvm test.c
$ view test.ll
```
Here's the IR. It's more compact than assembly, so I'll dump both functions so you can see what I mean by “literally zero difference except for the call”:
```
; Function Attrs: nounwind uwtable
define dso_local void @byArg(i32*) local_unnamed_addr #0 {
%2 = load i32, i32* %0, align 4, !tbaa !2
%3 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %2)
tail call void @func(i32* %0) #4
%4 = load i32, i32* %0, align 4, !tbaa !2
%5 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %4)
ret void
}
; Function Attrs: nounwind uwtable
define dso_local void @constByArg(i32*) local_unnamed_addr #0 {
%2 = load i32, i32* %0, align 4, !tbaa !2
%3 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %2)
tail call void @constFunc(i32* %0) #4
%4 = load i32, i32* %0, align 4, !tbaa !2
%5 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %4)
ret void
}
```
### Something that (sort of) works
Here's some code where `const` actually does make a difference:
```
void localVar()
{
    int x = 42;
    printf("%d\n", x);
    constFunc(&x);
    printf("%d\n", x);
}

void constLocalVar()
{
    const int x = 42;  // const on the local variable
    printf("%d\n", x);
    constFunc(&x);
    printf("%d\n", x);
}
```
Here's the assembly for `localVar()`, which has two instructions that have been optimised out of `constLocalVar()`:
```
localVar:
.LFB25:
.cfi_startproc
subq $24, %rsp
.cfi_def_cfa_offset 32
movl $42, %edx
movl $1, %edi
movq %fs:40, %rax
movq %rax, 8(%rsp)
xorl %eax, %eax
leaq .LC0(%rip), %rsi
movl $42, 4(%rsp)
call __printf_chk@PLT
leaq 4(%rsp), %rdi
call constFunc@PLT
movl 4(%rsp), %edx # not in constLocalVar()
xorl %eax, %eax
movl $1, %edi
leaq .LC0(%rip), %rsi # not in constLocalVar()
call __printf_chk@PLT
movq 8(%rsp), %rax
xorq %fs:40, %rax
jne .L9
addq $24, %rsp
.cfi_remember_state
.cfi_def_cfa_offset 8
ret
.L9:
.cfi_restore_state
call __stack_chk_fail@PLT
.cfi_endproc
```
The LLVM IR is a little clearer. The `load` just before the second `printf()` call has been optimised out of `constLocalVar()`:
```
; Function Attrs: nounwind uwtable
define dso_local void @localVar() local_unnamed_addr #0 {
%1 = alloca i32, align 4
%2 = bitcast i32* %1 to i8*
call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %2) #4
store i32 42, i32* %1, align 4, !tbaa !2
%3 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 42)
call void @constFunc(i32* nonnull %1) #4
%4 = load i32, i32* %1, align 4, !tbaa !2
%5 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 %4)
call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %2) #4
ret void
}
```
Okay, so, `constLocalVar()` has successfully elided the reloading of `*x`, but maybe you've noticed something a bit confusing: it's the same `constFunc()` call in the bodies of `localVar()` and `constLocalVar()`. If the compiler can deduce that `constFunc()` didn't modify `*x` in `constLocalVar()`, why can't it deduce that the exact same function call didn't modify `*x` in `localVar()`?
The explanation gets closer to the heart of why C `const` is impractical as an optimisation aid. C `const` effectively has two meanings: it can mean the variable is a read-only alias to some data that may or may not be constant, or it can mean the variable is actually constant. If you cast away `const` from a pointer to a constant value and then write to it, the result is undefined behaviour. On the other hand, it's okay if it's just a `const` pointer to a value that's not constant.
This possible implementation of `constFunc()` shows what that means:
```
// x is just a read-only pointer to something that may or may not be a constant
void constFunc(const int *x)
{
    // local_var is a true constant
    const int local_var = 42;

    // Definitely undefined behaviour by C rules
    doubleIt((int*)&local_var);
    // Who knows if this is UB?
    doubleIt((int*)x);
}

void doubleIt(int *x)
{
    *x *= 2;
}
```
`localVar()` gave `constFunc()` a `const` pointer to a non-`const` variable. Because the variable wasn't originally `const`, `constFunc()` can be a liar and forcibly modify it without triggering UB. So the compiler can't assume the variable has the same value after `constFunc()` returns. The variable in `constLocalVar()` really is `const`, though, so the compiler can assume it won't change, because this time it _would_ be UB for `constFunc()` to cast `const` away and write to it.
The `byArg()` and `constByArg()` functions in the first example are hopeless because the compiler has no way of knowing if `*x` really is `const`.
But why the inconsistency? If the compiler can assume that `constFunc()` doesn't modify its argument when called in `constLocalVar()`, surely it can go ahead and apply the same optimisations to other `constFunc()` calls, right? Nope. The compiler can't assume `constLocalVar()` is ever run at all. If it isn't (say, because it's just some unused extra output of a code generator or macro), `constFunc()` can sneakily modify data without ever triggering UB.
You might want to read the above explanation and examples a few times, but don't worry if it sounds absurd: it is. Unfortunately, writing to `const` variables is the worst kind of UB: most of the time the compiler can't know if it even would be UB. So most of the time the compiler sees `const`, it has to assume that someone, somewhere could cast it away, which means the compiler can't use it for optimisation. This is true in practice because enough real-world C code has “I know what I'm doing” casting away of `const`.
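One familiar example of that pattern (my own illustration, not something taken from the article's benchmarks): the standard library's `strchr()` accepts a `const char *` but hands back a plain `char *` into the same buffer, so a `const`-qualified parameter tells the compiler very little about what ultimately happens to the pointed-to data.
```
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[] = "hello,world";

    /* strchr() takes a const char * but returns a writable char *
     * into the very same buffer. */
    char *comma = strchr(buf, ',');
    if (comma)
        *comma = ' ';   /* legal: buf was never actually const */

    printf("%s\n", buf);
    return 0;
}
```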
In short, a whole lot of things can prevent the compiler from using `const` for optimisation, including receiving data from another scope using a pointer, or allocating data on the heap. Even worse, in most cases where `const` can be used by the compiler, it's not even necessary. For example, any decent compiler can figure out that `x` is constant in the following code, even without `const`:
```
int x = 42, y = 0;
printf("%d %d\n", x, y);
y += x;
printf("%d %d\n", x, y);
```
TL;DR: `const` is almost useless for optimisation because
1. Except for special cases, the compiler has to ignore it because other code might legally cast it away
2. In most of the exceptions to #1, the compiler can figure out a variable is constant, anyway
### C++
There's another way `const` can affect code generation if you're using C++: function overloads. You can have `const` and non-`const` overloads of the same function, and maybe one of the overloads can be optimised (by the programmer, not the compiler) to do less copying or something.
```
void foo(int *p)
{
    // Needs to do more copying of data
}

void foo(const int *p)
{
    // Doesn't need defensive copies
}

int main()
{
    const int x = 42;
    // const-ness affects which overload gets called
    foo(&x);
    return 0;
}
```
On the one hand, I don't think this is exploited much in practical C++ code. On the other hand, to make a real difference, the programmer has to make assumptions that the compiler can't make because they're not guaranteed by the language.
### An experiment with Sqlite3
That's enough theory and contrived examples. How much effect does `const` have on a real codebase? I thought I'd do a test on the Sqlite database (version 3.30.0) because
* It actually uses `const`
* Its a non-trivial codebase (over 200KLOC)
* As a database, it includes a range of things from string processing to arithmetic to date handling
* It can be tested with CPU-bound loads
Also, the author and contributors have put years of effort into performance optimisation already, so I can assume they haven't missed anything obvious.
#### The setup
I made two copies of [the source code][2] and compiled one normally. For the other copy, I used this hacky preprocessor snippet to turn `const` into a no-op:
```
#define const
```
(GNU) `sed` can add that to the top of each file with something like `sed -i '1i#define const' *.c *.h`.
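To make the effect concrete, here's a sketch of what that object-like macro does (the declaration below is hypothetical, not taken from Sqlite):
```
#define const   /* every use of the keyword now expands to nothing */

/* A declaration written like this in the source... */
const char *lookupName(const struct Entry *e);

/* ...is compiled as if it had been written:
 *
 *     char *lookupName(struct Entry *e);
 */
```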
Sqlite makes things slightly more complicated by generating code using scripts at build time. Fortunately, compilers make a lot of noise when `const` and non-`const` code are mixed, so it was easy to detect when this happened, and tweak the scripts to include my anti-`const` snippet.
Directly diffing the compiled results is a bit pointless because a tiny change can affect the whole memory layout, which can change pointers and function calls throughout the code. Instead I took a fingerprint of the disassembly (`objdump -d libsqlite3.so.0.8.6`), using the binary size and mnemonic for each instruction. For example, this function:
```
000000000005d570 <sqlite3_blob_read>:
5d570: 4c 8d 05 59 a2 ff ff lea -0x5da7(%rip),%r8 # 577d0 <sqlite3BtreePayloadChecked>
5d577: e9 04 fe ff ff jmpq 5d380 <blobReadWrite>
5d57c: 0f 1f 40 00 nopl 0x0(%rax)
```
would turn into something like this:
```
sqlite3_blob_read 7lea 5jmpq 4nopl
```
I left all the Sqlite build settings as-is when compiling anything.
#### Analysing the compiled code
The `const` version of libsqlite3.so was 4,740,704 bytes, about 0.1% larger than the 4,736,712 bytes of the non-`const` version. Both had 1374 exported functions (not including low-level helpers like stuff in the PLT), and a total of 13 had any difference in fingerprint.
A few of the changes were because of the dumb preprocessor hack. For example, here's one of the changed functions (with some Sqlite-specific definitions edited out):
```
#define LARGEST_INT64 (0xffffffff|(((int64_t)0x7fffffff)<<32))
#define SMALLEST_INT64 (((int64_t)-1) - LARGEST_INT64)
static int64_t doubleToInt64(double r){
/*
** Many compilers we encounter do not define constants for the
** minimum and maximum 64-bit integers, or they define them
** inconsistently. And many do not understand the "LL" notation.
** So we define our own static constants here using nothing
** larger than a 32-bit integer constant.
*/
static const int64_t maxInt = LARGEST_INT64;
static const int64_t minInt = SMALLEST_INT64;
if( r<=(double)minInt ){
return minInt;
}else if( r>=(double)maxInt ){
return maxInt;
}else{
return (int64_t)r;
}
}
```
Removing `const` makes those constants into `static` variables. I don't see why anyone who didn't care about `const` would make those variables `static`. Removing both `static` and `const` makes GCC recognise them as constants again, and we get the same output. Three of the 13 functions had spurious changes because of local `static const` variables like this, but I didn't bother fixing any of them.
Sqlite uses a lot of global variables, and that's where most of the real `const` optimisations came from. Typically they were things like a comparison with a variable being replaced with a constant comparison, or a loop being partially unrolled a step. (The [Radare toolkit][3] was handy for figuring out what the optimisations did.) A few changes were underwhelming. `sqlite3ParseUri()` is 487 instructions, but the only difference `const` made was taking this pair of comparisons:
```
test %al, %al
je <sqlite3ParseUri+0x717>
cmp $0x23, %al
je <sqlite3ParseUri+0x717>
```
And swapping their order:
```
cmp $0x23, %al
je <sqlite3ParseUri+0x717>
test %al, %al
je <sqlite3ParseUri+0x717>
```
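To give a feel for the kind of global-variable optimisation described above, here's a minimal sketch of my own (the names `kLimit` and `withinLimit` are made up, not from Sqlite): with `const`, the compiler is free to fold the comparison into a compare-against-immediate; drop the `const` and, because another translation unit could legally assign to `kLimit`, it has to load the global and compare at run time. Compiling both variants with `gcc -O2 -S` and diffing the assembly should show the difference.
```
/* Illustrative sketch, not taken from Sqlite. */
const int kLimit = 16;   /* the #define const hack turns this into a mutable global */

int withinLimit(int n)
{
    /* With const: effectively `return n < 16;` (immediate operand).
     * Without const: the compiler must first load kLimit from memory. */
    return n < kLimit;
}
```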
#### Benchmarking
Sqlite comes with a performance regression test, so I tried running it a hundred times for each version of the code, still using the default Sqlite build settings. Here are the timing results in seconds:
&nbsp; | const | No const
---|---|---
Minimum | 10.658s | 10.803s
Median | 11.571s | 11.519s
Maximum | 11.832s | 11.658s
Mean | 11.531s | 11.492s
Personally, I'm not seeing enough evidence of a difference worth caring about. I mean, I removed `const` from the entire program, so if it made a significant difference, I'd expect it to be easy to see. But maybe you care about any tiny difference because you're doing something absolutely performance critical. Let's try some statistical analysis.
I like using the Mann-Whitney U test for stuff like this. It's similar to the more-famous t test for detecting differences in groups, but it's more robust to the kind of complex random variation you get when timing things on computers (thanks to unpredictable context switches, page faults, etc). Here's the result:
&nbsp; | const | No const
---|---|---
N | 100 | 100
Mean rank | 121.38 | 79.62

Statistic | Value
---|---
Mann-Whitney U | 2912
Z | -5.10
2-sided p value | <10⁻⁶
HL median difference | -0.056s
95% confidence interval | -0.077s to -0.038s
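As a quick sanity check (my own arithmetic using the textbook rank-sum identity, not something from the original analysis), the reported U and Z values are consistent with the mean ranks above:
```
R_{\text{no const}} = 100 \times 79.62 = 7962
U = R_{\text{no const}} - \frac{n(n+1)}{2} = 7962 - 5050 = 2912
z = \frac{U - n_1 n_2 / 2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1)/12}} = \frac{2912 - 5000}{\sqrt{167500}} \approx -5.10
```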
The U test has detected a statistically significant difference in performance. But, surprise, it's actually the non-`const` version that's faster, by about 60ms, or 0.5%. It seems like the small number of “optimisations” that `const` enabled weren't worth the cost of extra code. It's not like `const` enabled any major optimisations like auto-vectorisation. Of course, your mileage may vary with different compiler flags, or compiler versions, or codebases, or whatever, but I think it's fair to say that if `const` were effective at improving C performance, we'd have seen it by now.
### So, what's `const` for?
For all its flaws, C/C++ `const` is still useful for type safety. In particular, combined with C++ move semantics and `std::unique_ptr`s, `const` can make pointer ownership explicit. Pointer ownership ambiguity was a huge pain in old C++ codebases over ~100KLOC, so personally I'm grateful for that alone.
However, I used to go beyond using `const` for meaningful type safety. I'd heard it was best practice to use `const` literally as much as possible for performance reasons. I'd heard that when performance really mattered, it was important to refactor code to add more `const`, even in ways that made it less readable. That made sense at the time, but I've since learned that it's just not true.
--------------------------------------------------------------------------------
via: https://theartofmachinery.com/2019/08/12/c_const_isnt_for_performance.html
Author: [Simon Arneaud][a]
Topic selected by: [lujun9972][b]
Translated by: [LazyWolfLin](https://github.com/LazyWolfLin)
Proofread by: [校对者ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and proudly presented by [Linux中国 (Linux China)](https://linux.cn/)
[a]: https://theartofmachinery.com
[b]: https://github.com/lujun9972
[1]: https://theartofmachinery.com/2019/04/05/d_as_c_replacement.html#const-and-immutable
[2]: https://sqlite.org/src/doc/trunk/README.md
[3]: https://rada.re/r/


@ -7,10 +7,10 @@
[#]: via: (https://theartofmachinery.com/2019/08/12/c_const_isnt_for_performance.html)
[#]: author: (Simon Arneaud https://theartofmachinery.com)
Why can't `const` make `C` code run faster?
Why is `const` unable to make `C` code run faster?
======
In an article a few months back I said, "[There's a popular myth that `const` helps the compiler optimise `C` and `C++` code][1]." I felt I should explain that, especially since I myself used to think it was obviously true. I'll argue the point with some theory and artificially constructed examples, then do some experiments and benchmarks on a real codebase, `Sqlite`.
In an article a few months back I said, "[There's a popular myth that `const` helps the compiler optimise `C` and `C++` code][1]." I felt I should explain that, especially since I myself used to think it was obviously true. I'll make the argument with some theory and a few constructed examples, then do some experiments and benchmarks on a real codebase, `Sqlite`.
### A simple test
@ -39,7 +39,7 @@ void constByArg(const int *x)
}
```
To do the `printf()`, the CPU fetches the value of `*x` from RAM through the pointer. Obviously, `constByArg()` should be slightly faster, because the compiler knows `*x` is constant and doesn't need to fetch its value again after `constFunc()` is called. It's just printing the same thing. Right? Let's look at the assembly code GCC generates with the following compilation options:
To do the `printf()`, the CPU fetches the value of `*x` from RAM through the pointer. Obviously, `constByArg()` should be slightly faster, because the compiler knows `*x` is constant and doesn't need to fetch its value again after `constFunc()` is called. It's just printing the same thing. No problem, right? Let's look at the assembly code GCC generates with the following compilation options:
```
$ gcc -S -Wall -O3 test.c
@ -209,13 +209,13 @@ void doubleIt(int *x)
`localVar()` passed `constFunc()` a `const` pointer to a non-`const` variable. Because the variable isn't a constant, `constFunc()` can lie and forcibly modify it without triggering undefined behaviour. So the compiler can't conclude that the variable still holds the same value after the call to `constFunc()`. The variable in `constLocalVar()` really is a constant, though, so the compiler can conclude that it won't change, because this time it *would* be undefined behaviour for `constFunc()` to cast away the `const` and write to it.
The functions `byArg()` and `constByArg()` in the first example are hopeless, because the compiler has no way of knowing whether `*x` really is a `const` constant.
The functions `byArg()` and `constByArg()` in the first example cannot be optimised, because the compiler has no way at all of knowing whether `*x` really is a `const` constant.
But why the inconsistency? If the compiler can infer that the `constFunc()` called in `constLocalVar()` doesn't modify its argument, then surely it can go on to apply the same optimisation to other `constFunc()` calls, right? No. The compiler can't assume `constLocalVar()` is ever run at all. If it isn't (say, because it's just some unused extra output of a code generator or macro), `constFunc()` can sneakily modify data without ever triggering UB.
But why the inconsistency? If the compiler can infer that the `constFunc()` called in `constLocalVar()` doesn't modify its argument, then surely it can go on to apply the same optimisation to other `constFunc()` calls, can't it? Not so. The compiler can't assume `constLocalVar()` is ever run at all. If it isn't (for example, because it's just some unused extra output of a code generator or a macro), `constFunc()` can sneakily modify data without ever triggering undefined behaviour.
You may need to reread the explanation and examples above a few times, but don't worry if it sounds absurd: it is. Unfortunately, writing to a `const` variable is the worst kind of undefined behaviour: most of the time the compiler can't know whether it would even be undefined behaviour. So, most of the time, when the compiler sees `const` it has to assume the `const` may later be cast away, which means it can't use it for optimisation. This is true in practice, because real `C` code casts away `const` while "knowing exactly what the consequences are".
You may need to reread the explanation and examples above a few times, but don't worry if it sounds absurd: it really is. Unfortunately, writing to a `const` variable is the worst kind of undefined behaviour: most of the time the compiler can't know whether it would even be undefined behaviour. So, most of the time, when the compiler sees `const` it has to assume the `const` may later be cast away, which means it can't use it for optimisation. This is true in practice, because real `C` code casts away `const` after "careful deliberation".
In short, many things can prevent the compiler from using `const` for optimisation, including receiving data from another scope through a pointer, or allocating data on the heap. Even worse, in most situations where the compiler could use `const`, it isn't even necessary. For example, any decent compiler can infer that `x` in the following code is a constant, even without `const`:
In short, many things can prevent the compiler from using `const` for optimisation, including receiving data from another scope through a pointer, or allocating data on the heap. Even worse, in most situations where the compiler could use `const` for optimisation, it isn't even necessary. For example, any decent compiler can infer that `x` in the following code is a constant, even without `const`:
```
int x = 42, y = 0;
@ -247,7 +247,7 @@ void foo(const int *p)
int main()
{
const int x = 42;
// const-ness affects which overload gets called
// const-ness affects which overloaded function gets called
foo(&x);
return 0;
}
@ -257,7 +257,7 @@ int main()
### An experiment with `Sqlite3`
Enough theory and examples. How much impact does `const` have on a real codebase? I'm going to run a test on `Sqlite` (version 3.30.0) because:
Enough theory and examples. How much impact does `const` have on a real codebase? I'm going to run a test on the `Sqlite` codebase (version 3.30.0) because:
* It actually uses `const`
* It's a non-trivial codebase (more than 200,000 lines of code)
@ -328,7 +328,7 @@ static int64_t doubleToInt64(double r){
}
```
Removing `const` turns those constants into `static` variables. I don't see why anyone who didn't care about `const` would make those variables `static`. Removing both `static` and `const` makes GCC recognise them as constants again, and we get the same compiled output. Local `static const` variables like these caused spurious changes in 3 of the 13 functions, but I didn't bother fixing any of them.
Removing `const` turns those constants into `static` variables. I don't see why anyone who didn't care about `const` would make those variables `static`. Removing both `static` and `const` makes GCC recognise them as constants again, and we get the same compiled output. Local `static const` variables similar to these caused spurious changes in 3 of the 13 functions, but I didn't bother fixing any of them.
`Sqlite` uses a lot of global variables, and that's where most of the real `const` optimisations came from. Typically they were things like a comparison with a variable being replaced by a comparison with a constant, or a loop being partially unrolled by a step. (The [Radare toolkit][3] was handy for working out what these optimisations did.) A few of the changes were disappointing. `sqlite3ParseUri()` is 487 instructions, but the only difference `const` made was this pair of comparisons:
@ -361,26 +361,28 @@ Mean | 11.531s | 11.492s
Personally, I don't see enough evidence here of a difference worth caring about. I mean, I removed `const` from the entire program, so if it made a noticeable difference, I'd expect it to be obvious. But maybe you care about any tiny difference, because you're doing something where performance is absolutely critical. In that case, let's try some statistical analysis.
I like using the Mann-Whitney U test for stuff like this. It's similar to the more-famous t test for detecting differences in groups, but it's more robust to the kind of complex random variation you get when timing things on computers (thanks to unpredictable context switches, page faults, etc). Here's the result:
I like to use something like the Mann-Whitney U test for this sort of thing. It's similar to the better-known t test, but it's more robust to the complex random variation you get when timing things on a machine (due to unpredictable context switches, page faults, and so on). Here are the results:
&nbsp; | const | No const
---|---|---
N | 100 | 100
Mean rank | 121.38 | 79.62

Statistic | Value
---|---
Mann-Whitney U | 2912
Z | -5.10
2-sided p value | <10⁻⁶
HL median difference | -0.056s
95% confidence interval | -0.077s to -0.038s
The U test has detected a statistically significant difference in performance. But, surprise, it's actually the non-`const` version that's faster, by about 60ms, or 0.5%. It seems like the small number of "optimisations" that `const` enabled weren't worth the cost of extra code. It's not like `const` enabled any major optimisations like auto-vectorisation. Of course, your mileage may vary with different compiler flags, or compiler versions, or codebases, or whatever, but I think it's fair to say that if `const` were effective at improving C performance, we'd have seen it by now.
The U test has detected a statistically significant difference in performance. But, surprisingly, it's actually the non-`const` version that's faster, by about 60ms, or 0.5%. It seems the handful of "optimisations" that `const` enabled weren't worth the overhead of the extra code. It's not as if `const` enabled any important optimisations such as auto-vectorisation. Of course, your results may vary with different compiler configurations, compiler versions, or codebases, but I think it's fair to say that if `const` were effective at improving `C` performance, we'd have seen it by now.
### So, what's `const` for?
### So then, what is `const` actually good for?
For all its flaws, C/C++ `const` is still useful for type safety. In particular, combined with C++ move semantics and `std::unique_ptr`s, `const` can make pointer ownership explicit. Pointer ownership ambiguity was a huge pain in old C++ codebases over ~100KLOC, so personally I'm grateful for that alone.
Despite its flaws, `C`/`C++`'s `const` is still helpful for type safety. In particular, combined with `C++` move semantics and `std::unique_ptr`, `const` can make pointer ownership explicit. In old `C++` codebases of over a hundred thousand lines, ambiguity about pointer ownership was a huge pain, something I know from experience, so I'm grateful for that alone.
However, I used to go beyond using `const` for meaningful type safety. I'd heard it was best practice to use `const` literally as much as possible for performance reasons. I'd heard that when performance really mattered, it was important to refactor code to add more `const`, even in ways that made it less readable. That made sense at the time, but I've since learned that it's just not true.
But I used to use `const` for more than just meaningful type safety. I'd heard that, for performance reasons, it was best to use `const` as much as possible. I'd heard that when performance really mattered, it was important to refactor code and add more `const`, even in ways that made it less readable. That seemed right at the time, but I've since learned that it simply isn't true.
--------------------------------------------------------------------------------