just a few lines

Kenneth Hawk 2017-03-23 19:15:01 +08:00
parent 60dea920ee
commit 1de30b3c6f


@@ -84,23 +84,33 @@ The most fundamental problem with C# is its poor support for value-based programming.
* _References_ are severely restricted. You can pass a value by reference to a function, but that's about it. You can't hold a reference to an element inside a `List<int>`; you have to store a reference to the list plus an index. You can't take a pointer to a variable on the stack or to a field inside an object (or inside another value); you can only copy them, except when passing them to a function by reference. This is understandable, of course: if type safety is a prerequisite, it's very hard (though not impossible) to support flexible references like that while still guaranteeing type safety.
* [Fixed size buffers][6] don't support custom types, and they also require you to use the `unsafe` keyword.
* Limited “array slice” functionality. There's an `ArraySegment<T>` class, but it's hardly used by anyone, which means that in order to pass a range of elements from an array you have to create an `IEnumerable`, which means allocation (boxing). Even if an API accepted `ArraySegment<T>` parameters it still wouldn't be good enough: you could only use it for normal arrays, not for `List<T>`, not for [stack-allocated array][4]s, etc. (see the sketch after this list).
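A minimal C# sketch of what that last bullet describes; the `SumRange`/`SumSegment` helpers are illustrative, not from the article, and this predates `Span<T>`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SliceDemo
{
    // Passing a sub-range as IEnumerable<int>: no copy of the data, but the
    // Skip/Take iterators are heap allocations and enumeration goes through
    // interface calls.
    static int SumRange(IEnumerable<int> xs)
    {
        int sum = 0;
        foreach (var x in xs) sum += x;
        return sum;
    }

    // Passing a sub-range as ArraySegment<int>: no allocation, but it only
    // works for plain arrays, not for List<int>.
    static int SumSegment(ArraySegment<int> seg)
    {
        int sum = 0;
        for (int i = 0; i < seg.Count; i++)
            sum += seg.Array[seg.Offset + i];
        return sum;
    }

    static void Main()
    {
        int[] arr = { 1, 2, 3, 4, 5, 6 };
        List<int> list = new List<int>(arr);

        SumRange(arr.Skip(1).Take(3));                // allocates iterator objects
        SumSegment(new ArraySegment<int>(arr, 1, 3)); // no allocation, arrays only

        // There is no segment type for List<int>: you either copy
        // (list.GetRange(1, 3) allocates a new list) or fall back to IEnumerable.
        SumRange(list.Skip(1).Take(3));
    }
}
```

The `ArraySegment<int>` path avoids the allocation but is locked to plain arrays; for a `List<int>` you are back to copying or to allocating iterators.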
The bottom line is that for all but very simple cases, the language pushes you very strongly towards heap allocations. If all your data is on the heap, accessing it is likely to cause cache misses (since you can't decide how objects are organized in the heap). So while a C++ program poses few challenges to ensuring that data is organized in cache-efficient ways, C# typically encourages you to put each piece of that data in a separate heap allocation. This means the programmer loses control over data layout, which means unnecessary cache misses are introduced and performance drops precipitously. It doesn't matter that [you can now compile C# programs natively][11] ahead of time; the improvement in code quality is a drop in the bucket compared to the cost of poor memory locality.
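To make the layout point concrete, here is a rough sketch contrasting the two approaches; the `Particle` types and the update loop are illustrative assumptions, not code from the article:

```csharp
using System.Collections.Generic;

// Idiomatic C#: each particle is a separate heap allocation and the list only
// stores references, so a linear pass chases a pointer per element to wherever
// the GC happened to place that particle.
class ParticleClass
{
    public float X, Y, VX, VY;
}

// Value-based layout: the struct's fields are stored inline in the array, so a
// linear pass touches contiguous memory and makes good use of cache lines.
struct ParticleStruct
{
    public float X, Y, VX, VY;
}

static class LayoutDemo
{
    static void Update(List<ParticleClass> particles, float dt)
    {
        foreach (var p in particles)   // one dereference (and a likely cache miss) per particle
        {
            p.X += p.VX * dt;
            p.Y += p.VY * dt;
        }
    }

    static void Update(ParticleStruct[] particles, float dt)
    {
        for (int i = 0; i < particles.Length; i++)   // contiguous 16-byte elements
        {
            particles[i].X += particles[i].VX * dt;
            particles[i].Y += particles[i].VY * dt;
        }
    }
}
```

In the class-based version the programmer has no say in where the particles end up; in the struct array the fields sit back to back in a single allocation.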
Plus, there's storage overhead. Each reference is 8 bytes on a 64-bit machine, and each allocation carries its own overhead in the form of various metadata. A heap full of tiny objects with lots of references everywhere has a lot of space overhead compared to a heap with very few large allocations where most data is simply stored embedded within its owner at fixed offsets. Even if you don't care about memory requirements, the fact that the heap is bloated with header words and references means that cache lines carry more waste, which in turn means even more cache misses and reduced performance.
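As a rough back-of-the-envelope illustration (the header and per-instance figures below are approximate 64-bit CLR numbers, used as assumptions rather than exact measurements):

```csharp
// A two-field point, stored two different ways on a 64-bit runtime.

struct PointStruct { public int X, Y; }   // 8 bytes, stored inline
class  PointClass  { public int X, Y; }   // ~16-byte object header + 8 bytes of fields per allocation

static class OverheadDemo
{
    const int N = 1000000;

    static void Main()
    {
        // One allocation: N * 8 bytes of payload, laid out contiguously.
        var packed = new PointStruct[N];

        // N + 1 allocations: an 8-byte reference per array element, plus
        // roughly 24 bytes per PointClass instance scattered on the heap,
        // i.e. around 4x the memory and far worse locality.
        var scattered = new PointClass[N];
        for (int i = 0; i < N; i++)
            scattered[i] = new PointClass();
    }
}
```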
There are sometimes workarounds, for example you can use structs and allocate them in a pool backed by a big `List<T>`. This allows you to, e.g., traverse the pool and update all of the objects in bulk, getting good locality. It does get pretty messy though, because now anything else that wants to refer to one of these objects has to hold a reference to the pool as well as an index, and then keep doing array indexing all over the place. For this reason, and the reasons above, it is significantly more painful to do this sort of thing in C# than in C++, because it's just not something the language was designed to do. Furthermore, accessing a single element in the pool is now more expensive than just having an allocation per object: you get _two_ cache misses, because you have to first dereference the pool itself (since it's a class). OK, so you can duplicate the functionality of `List<T>` in struct form to avoid this extra cache miss, and make things even uglier. I've written plenty of code just like this, and it's extremely low level and error prone. A sketch of the pattern follows.
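A minimal sketch of the pool-plus-index pattern described above; the `Enemy`, `EnemyPool`, and `EnemyHandle` names are illustrative:

```csharp
using System.Collections.Generic;

// Plain data stored by value inside the pool.
struct Enemy
{
    public float X, Y;
    public int Hp;
}

// The pool is a class wrapping a List<Enemy>, so reaching one element costs
// two dereferences: pool object -> list's internal array -> element.
class EnemyPool
{
    public readonly List<Enemy> Items = new List<Enemy>();

    public int Add(Enemy e)          // returns an index, the only "reference" you get
    {
        Items.Add(e);
        return Items.Count - 1;
    }
}

// Anything that wants to point at an enemy has to carry (pool, index) instead
// of a direct reference, and every access is an explicit array-indexing step.
struct EnemyHandle
{
    public EnemyPool Pool;
    public int Index;

    public void Damage(int amount)
    {
        Enemy e = Pool.Items[Index];   // copy out (List<T>'s indexer returns a copy of the struct)
        e.Hp -= amount;
        Pool.Items[Index] = e;         // copy back in
    }
}
```

Bulk passes over `Items` get good locality, but any random access pays the double indirection, plus the copy-out/copy-back dance that `List<T>`'s indexer forces on mutable structs.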
Finally, I want to point out that this isn't just an issue for “hot spot” code. Idiomatically written C# code tends to have classes and references basically _everywhere_. This means that all over your code, at a relatively uniform frequency, there are random multi-hundred-cycle stalls that dwarf the cost of the surrounding operations. Yes, there may be hotspots too, but after you've optimized them you're left with a program that's just [uniformly slow][12]. So unless you want to write all your code with memory pools and indices, effectively operating at a lower level of abstraction than even C++ does (and at that point, why bother with C#?), there's not a ton you can do to avoid this issue.
### Garbage Collection
I'm just going to assume in what follows that you already understand why garbage collection is a performance problem in a lot of cases: pausing randomly for many milliseconds is usually unacceptable for anything that needs to animate smoothly. I won't linger on it; instead I'll move on to explaining why the language design itself exacerbates this issue.
Because of the limitations when it comes to dealing with values, the language very strongly discourages you from using big chunky allocations consisting mostly of values embedded within other values (perhaps stored on the stack), pressuring you instead to use lots of small classes which have to be allocated on the heap. Roughly speaking, more allocations means more time spent collecting garbage.
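To illustrate how idiomatic code quietly feeds the collector, compare a LINQ-style query with a value-oriented loop; this is a sketch assuming a hypothetical per-frame query over a list of `Monster` objects (names are illustrative):

```csharp
using System.Collections.Generic;
using System.Linq;

class Monster { public float X; public int Hp; }

static class GcPressureDemo
{
    // Idiomatic version: the closure over 'minX', its delegate, and the LINQ
    // iterator are all heap allocations that happen on every call, handing
    // short-lived garbage to the collector for no lasting benefit.
    static int CountAliveIdiomatic(List<Monster> monsters, float minX)
    {
        return monsters.Where(m => m.Hp > 0 && m.X >= minX).Count();
    }

    // Value-oriented version: the same work with no heap allocations at all,
    // so nothing is handed to the GC.
    static int CountAliveManual(List<Monster> monsters, float minX)
    {
        int count = 0;
        for (int i = 0; i < monsters.Count; i++)
        {
            Monster m = monsters[i];
            if (m.Hp > 0 && m.X >= minX) count++;
        }
        return count;
    }
}
```

Run once, the difference is noise; run every frame, the first version keeps the collector busy while the second produces no garbage at all.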
There are benchmarks that show C# or Java beating C++ in some particular case, because an allocator based on a GC can have decent throughput (allocations are cheap, and deallocations are batched up). However, this isn't a common real-world scenario. It takes a huge amount of effort to write a C# program with the same low allocation rate that even a very naïve C++ program has, so those kinds of comparisons are really comparing a highly tuned managed program with a naïve native one. Once you spend the same amount of effort on the C++ program, you'd be miles ahead of C# again.
I'm relatively convinced that you could write a GC more suitable for high-performance and low-latency applications (e.g. an incremental GC where you spend a fixed amount of time per frame doing collection), but this is not enough on its own. At the end of the day, the biggest issue with most high-level languages is simply that the design encourages far too much garbage being created in the first place. If idiomatic C# allocated at the same low rate a C program does, the GC would pose far fewer problems for high-performance applications. And if you _did_ have an incremental GC to support soft real-time applications, you'd probably need a write barrier for it, which, as cheap as it is, means that a language that encourages pointers adds a performance tax to the mutators.