Mirror of https://github.com/LCTT/TranslateProject.git (synced 2025-02-28 01:01:09 +08:00)
Commit 80f5a25d2f: Merge remote-tracking branch 'LCTT/master'
@ -1,8 +1,8 @@
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: (lxbwolf)
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
[#]: reviewer: (wxy)
|
||||
[#]: publisher: (wxy)
|
||||
[#]: url: (https://linux.cn/article-12176-1.html)
|
||||
[#]: subject: (Inlining optimisations in Go)
|
||||
[#]: via: (https://dave.cheney.net/2020/04/25/inlining-optimisations-in-go)
|
||||
[#]: author: (Dave Cheney https://dave.cheney.net/author/davecheney)
|
||||
@ -10,33 +10,35 @@
|
||||
Go 中的内联优化
|
||||
======
|
||||
|
||||
本文讨论 Go 编译器是如何实现内联的以及这种优化方法如何影响你的 Go 代码。
|
||||
> 本文讨论 Go 编译器是如何实现内联的,以及这种优化方法如何影响你的 Go 代码。
|
||||
|
||||
*请注意:*本文重点讨论 *gc*,实际上是 [golang.org](https://github.com/golang/go) 的 Go 编译器。讨论到的概念可以广泛用于其他 Go 编译器,如 gccgo 和 llgo,但它们在实现方式和功能上可能有所差异。
|
||||

|
||||
|
||||
*请注意:*本文重点讨论 *gc*,这是来自 [golang.org](https://github.com/golang/go) 的事实标准的 Go 编译器。讨论到的概念可以广泛适用于其它 Go 编译器,如 gccgo 和 llgo,但它们在实现方式和功效上可能有所差异。
|
||||
|
||||
### 内联是什么?
|
||||
|
||||
内联就是把简短的函数在调用它的地方展开。在计算机发展历程的早期,这个优化是由程序员手动实现的。现在,内联已经成为编译过程中自动实现的基本优化过程的其中一步。
|
||||
<ruby>内联<rt>inlining</rt></ruby>就是把简短的函数在调用它的地方展开。在计算机发展历程的早期,这个优化是由程序员手动实现的。现在,内联已经成为编译过程中自动实现的基本优化过程的其中一步。
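为了更直观,下面给出一个假设的最小示例(并非原文代码),用来说明“在调用处展开”的含义:

```go
package main

// add 是一个很小的函数,通常满足内联条件。
func add(a, b int) int {
	return a + b
}

func main() {
	// 若编译器决定内联 add,这一行的效果大致等价于
	// 直接写 s := 3 + 4,从而省去一次函数调用。
	s := add(3, 4)
	println(s)
}
```

可以用 `go build -gcflags=-m` 编译并观察编译器是否报告可以内联 `add`。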
|
||||
|
||||
### 为什么内联很重要?
|
||||
|
||||
有两个原因。第一个是它消除了函数调用本身的虚耗。第二个是它使得编译器能更高效地执行其他的优化策略。
|
||||
有两个原因。第一个是它消除了函数调用本身的开销。第二个是它使得编译器能更高效地执行其他的优化策略。
|
||||
|
||||
#### 函数调用的虚耗
|
||||
#### 函数调用的开销
|
||||
|
||||
在任何语言中,调用一个函数 [1][2] 都会有消耗。把参数编组进寄存器或放入栈中(取决于 ABI),在返回结果时倒序取出时会有虚耗。引入一次函数调用会导致程序计数器从指令流的一点跳到另一点,这可能导致管道阻塞。函数内部通常有前置处理,需要为函数执行准备新的栈帧,还有与前置相似的后续处理,需要在返回给调用方之前释放栈帧空间。
|
||||
在任何语言中,调用一个函数 [^1] 都会有消耗。把参数编组进寄存器或放入栈中(取决于 ABI),在返回结果时的逆反过程都会有开销。引入一次函数调用会导致程序计数器从指令流的一点跳到另一点,这可能导致管道滞后。函数内部通常有<ruby>前置处理<rt>preamble</rt></ruby>,需要为函数执行准备新的栈帧,还有与前置相似的<ruby>后续处理<rt>epilogue</rt></ruby>,需要在返回给调用方之前释放栈帧空间。
|
||||
|
||||
在 Go 中函数调用会消耗额外的资源来支持栈的动态增长。在进入函数时,goroutine 可用的栈空间与函数需要的空间大小相等。如果可用空间不同,前置处理就会跳到把数据复制到一块新的、更大的空间的运行时逻辑,而这会导致栈空间变大。当这个复制完成后,运行时跳回到原来的函数入口,再执行栈空间检查,函数调用继续执行。这种方式下,goroutine 开始时可以申请很小的栈空间,在有需要时再申请更大的空间。[2][3]
|
||||
在 Go 中函数调用会消耗额外的资源来支持栈的动态增长。在进入函数时,会把 goroutine 可用的栈空间与函数需要的空间大小进行比较。如果可用空间不足,前置处理就会跳到<ruby>运行时<rt>runtime</rt></ruby>的逻辑中,通过把数据复制到一块新的、更大的空间来增长栈空间。当这个复制完成后,运行时就会跳回到原来的函数入口,再执行栈空间检查,这次检查通过了,函数调用继续执行。这种方式下,goroutine 开始时可以申请很小的栈空间,在有需要时再申请更大的空间。[^2]
|
||||
|
||||
这个检查消耗很小 — 只有几个指令 — 而且由于 goroutine 是成几何级数增长的,因此这个检查很少失败。这样,现代处理器的分支预测单元会通过假定检查肯定会成功来隐藏栈空间检查的消耗。当处理器预测错了栈空间检查,必须要抛弃它推测性执行的操作时,与为了增加 goroutine 的栈空间运行时所需的操作消耗的资源相比,管道阻塞的代价更小。
|
||||
这个检查消耗很小,只有几个指令,而且由于 goroutine 的栈是成几何级数增长的,因此这个检查很少失败。这样,现代处理器的分支预测单元可以通过假定检查肯定会成功来隐藏栈空间检查的消耗。当处理器预测错了栈空间检查,不得不放弃它在推测性执行所做的操作时,与为了增加 goroutine 的栈空间运行时所需的操作消耗的资源相比,管道滞后的代价更小。
|
||||
|
||||
虽然现代处理器可以用预测性执行技术优化每次函数调用中的泛型和 Go 特定的元素的虚耗,但那些虚耗不能被完全消除,因此在每次函数调用执行必要的工作过程中都会有性能消耗。一次函数调用本身的虚耗是固定的,与更大的函数相比,调用小函数的代价更大,因为在每次调用过程中它们做的有用的工作更少。
|
||||
虽然现代处理器可以用预测性执行技术优化每次函数调用中的泛型和 Go 特定的元素的开销,但那些开销不能被完全消除,因此在每次函数调用执行必要的工作过程中都会有性能消耗。一次函数调用本身的开销是固定的,与更大的函数相比,调用小函数的代价更大,因为在每次调用过程中它们做的有用的工作更少。
|
||||
|
||||
消除这些虚耗的方法必须是要消除函数调用本身,Go 的编译器就是这么做的,在某些条件下通过用函数的内容来替换函数调用来实现。这个过程被称为*内联*,因为它在函数调用处把函数体展开了。
|
||||
因此,消除这些开销的方法必须是要消除函数调用本身,Go 的编译器就是这么做的,在某些条件下通过用函数的内容来替换函数调用来实现。这个过程被称为*内联*,因为它在函数调用处把函数体展开了。
|
||||
|
||||
#### 改进的优化机会
|
||||
|
||||
Cliff Click 博士把内联描述为现代编译器做的优化措施,像常量传播(译注:此处作者笔误,原文为 constant proportion,修正为 constant propagation)和死码消除一样,都是编译器的基本优化方法。实际上,内联可以让编译器看得更深,使编译器可以观察调用的特定函数的上下文内容,可以看到能继续简化或彻底消除的逻辑。由于可以递归地执行内联,因此不仅可以在每个独立的函数上下文处进行这种优化,也可以在整个函数调用链中进行。
|
||||
Cliff Click 博士把内联描述为现代编译器做的优化措施,像常量传播(LCTT 译注:此处作者笔误,原文为 constant proportion,修正为 constant propagation)和死代码消除一样,都是编译器的基本优化方法。实际上,内联可以让编译器看得更深,使编译器可以观察调用的特定函数的上下文内容,可以看到能继续简化或彻底消除的逻辑。由于可以递归地执行内联,因此不仅可以在每个独立的函数上下文处进行这种优化决策,也可以在整个函数调用链中进行。
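下面是一个假设的示意(非原文代码),展示内联如何让常量传播和死代码消除有机会把整个分支删掉:

```go
package main

const debug = false

// logf 本身很小,可以被内联。
func logf(msg string) {
	if debug { // 内联后,编译器看到 debug 是编译期常量 false
		println(msg) // 这个分支成为死代码,可被整体消除
	}
}

func main() {
	// logf 被内联并优化后,这个调用实际上不会产生任何代码。
	logf("hello")
}
```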
|
||||
|
||||
### 实践中的内联
|
||||
|
||||
@ -66,7 +68,7 @@ func BenchmarkMax(b *testing.B) {
|
||||
}
|
||||
```
|
||||
|
||||
运行这个基准,会得到如下结果:[3][4]
|
||||
运行这个基准,会得到如下结果:[^3]
|
||||
|
||||
```bash
|
||||
% go test -bench=.
|
||||
@ -90,7 +92,7 @@ Max-4 2.21ns ± 1% 0.49ns ± 6% -77.96% (p=0.000 n=18+19)
|
||||
|
||||
这个提升是从哪儿来的呢?
|
||||
|
||||
首先,移除掉函数调用以及与之关联的前置处理 [4][5] 是主要因素。把 `max` 函数的函数体在调用处展开,减少了处理器执行的指令数量并且消除了一些分支。
|
||||
首先,移除掉函数调用以及与之关联的前置处理 [^4] 是主要因素。把 `max` 函数的函数体在调用处展开,减少了处理器执行的指令数量并且消除了一些分支。
|
||||
|
||||
现在由于编译器优化了 `BenchmarkMax`,因此它可以看到 `max` 函数的内容,进而可以做更多的提升。当 `max` 被内联后,`BenchmarkMax` 呈现给编译器的样子,看起来是这样的:
|
||||
|
||||
@ -116,7 +118,7 @@ name old time/op new time/op delta
|
||||
Max-4 2.21ns ± 1% 0.48ns ± 3% -78.14% (p=0.000 n=18+18)
|
||||
```
|
||||
|
||||
现在编译器能看到在 `BenchmarkMax` 里内联 `max` 的结果,可以执行以前不能执行的优化措施。例如,编译器注意到 `i` 初始值为 `0`,仅做自增操作,因此所有与 `i` 的比较都可以假定 `i` 不是负值。这样条件表达式 `-1 > i` 永远不是 true。[5][6]
|
||||
现在编译器能看到在 `BenchmarkMax` 里内联 `max` 的结果,可以执行以前不能执行的优化措施。例如,编译器注意到 `i` 初始值为 `0`,仅做自增操作,因此所有与 `i` 的比较都可以假定 `i` 不是负值。这样条件表达式 `-1 > i` 永远不是 `true`。[^5]
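作为示意,下面是一个假设的片段(并非原文的基准代码),展示的正是这类编译器可以证明分支恒为假的模式:

```go
package main

func max(a, b int) int {
	if a > b {
		return a
	}
	return b
}

func sum(n int) int {
	var r int
	for i := 0; i < n; i++ {
		// max 内联后,编译器可以证明 i >= 0,
		// 因此 -1 > i 恒为假,这个分支可被消除,
		// 该调用退化为 r += i。
		r += max(-1, i)
	}
	return r
}

func main() {
	println(sum(10))
}
```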
|
||||
|
||||
证明了 `-1 > i` 永远不为 true 后,编译器可以把代码简化为:
|
||||
|
||||
@ -150,7 +152,7 @@ func BenchmarkMax(b *testing.B) {
|
||||
|
||||
### 内联的限制
|
||||
|
||||
本文中我论述的内联称作*叶子*内联;把函数调用栈中最底层的函数在调用它的函数处展开的行为。内联是个递归的过程,当把函数内联到调用它的函数 A 处后,编译器会把内联后的结果代码再内联到 A 的调用方,这样持续内联下去。例如,下面的代码:
|
||||
本文中我论述的内联称作<ruby>叶子内联<rt>leaf inlining</rt></ruby>:把函数调用栈中最底层的函数在调用它的函数处展开的行为。内联是个递归的过程,当把函数内联到调用它的函数 A 处后,编译器会把内联后的结果代码再内联到 A 的调用方,这样持续内联下去。例如,下面的代码:
|
||||
|
||||
```go
|
||||
func BenchmarkMaxMaxMax(b *testing.B) {
|
||||
@ -166,11 +168,11 @@ func BenchmarkMaxMaxMax(b *testing.B) {
|
||||
|
||||
下一篇文章中,我会论述当 Go 编译器想要内联函数调用栈中间的某个函数时选用的另一种内联策略。最后我会论述编译器为了内联代码准备好要达到的极限,这个极限 Go 现在的能力还达不到。
|
||||
|
||||
1. 在 Go 中,一个方法就是一个有预先定义的形参和接受者的函数。假设这个方法不是通过接口调用的,调用一个无消耗的函数所消耗的代价与引入一个方法是相同的。[][7]
|
||||
2. 在 Go 1.14 以前,栈检查的前置处理也被 gc 用于 STW,通过把所有活跃的 goroutine 栈空间设为 0,来强制它们切换为下一次函数调用时的运行时状态。这个机制[最近被替换][8]为一种新机制,新机制下运行时可以不用等 goroutine 进行函数调用就可以暂停 goroutine。[][9]
|
||||
3. 我用 `//go:noinline` 编译指令来阻止编译器内联 `max`。这是因为我想把内联 `max` 的影响与其他影响隔离开,而不是用 `-gcflags='-l -N'` 选项在全局范围内禁止优化。关于 `//go:` 注释在[这篇文章][10]中详细论述。[][11]
|
||||
4. 你可以自己通过比较 `go test -bench=. -gcflags=-S`有无 `//go:noinline` 注释时的不同结果来验证一下。[][12]
|
||||
5. 你可以用 `-gcflags=-d=ssa/prove/debug=on` 选项来自己验证一下。[][13]
|
||||
[^1]: 在 Go 中,方法就是一个带有预先声明的形参(即接收者)的函数。假设这个方法不是通过接口调用的,调用一个普通函数和调用一个方法的开销是相同的。
|
||||
[^2]: 在 Go 1.14 以前,栈检查的前置处理也被垃圾回收器用于 STW,通过把所有活跃的 goroutine 栈空间设为 0,来强制它们切换为下一次函数调用时的运行时状态。这个机制[最近被替换][8]为一种新机制,新机制下运行时可以不用等 goroutine 进行函数调用就可以暂停 goroutine。
|
||||
[^3]: 我用 `//go:noinline` 编译指令来阻止编译器内联 `max`。这是因为我想把内联 `max` 的影响与其他影响隔离开,而不是用 `-gcflags='-l -N'` 选项在全局范围内禁止优化。关于 `//go:` 注释在[这篇文章][10]中详细论述。
|
||||
[^4]: 你可以自己通过比较 `go test -bench=. -gcflags=-S` 有无 `//go:noinline` 注释时的不同结果来验证一下。
|
||||
[^5]: 你可以用 `-gcflags=-d=ssa/prove/debug=on` 选项来自己验证一下。
|
||||
|
||||
#### 相关文章:
|
||||
|
||||
@ -186,7 +188,7 @@ via: https://dave.cheney.net/2020/04/25/inlining-optimisations-in-go
|
||||
作者:[Dave Cheney][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[lxbwolf](https://github.com/lxbwolf)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
校对:[wxy](https://github.com/wxy)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
@ -0,0 +1,129 @@
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: ( )
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
[#]: subject: (The real impact of canceling PyCon due to COVID-19)
|
||||
[#]: via: (https://opensource.com/article/20/5/pycon-covid-19)
|
||||
[#]: author: (Matthew Broberg https://opensource.com/users/mbbroberg)
|
||||
|
||||
The real impact of canceling PyCon due to COVID-19
|
||||
======
|
||||
An interview with Ewa Jodlowska on how the Python Software Foundation is
|
||||
responding to the cancelation of in-person events.
|
||||
![A dollar sign in a network][1]
|
||||
|
||||
The Python Software Foundation (PSF) had to [cancel its popular PyCon US][2] event in response to COVID-19. I interviewed [Ewa Jodlowska][3], Executive Director of the PSF, to talk about the experience and see what we all can learn, and how we can be supportive of the non-profit that supports one of my favorite programming languages.
|
||||
|
||||
### The impact on PSF employees
|
||||
|
||||
I asked Jodlowska "how have you had to adjust your work in light of COVID-19?"
|
||||
|
||||
In her response, the day-to-day didn't sound like much of a change. PSF staff "have always worked remotely." The organization practices a [fully remote work][4] culture and doesn’t have an office. The small staff of seven employees is well versed in collaborating outside of an office.
|
||||
|
||||
Familiarity aside, the emotional impact of needing to cancel an event they put a year’s worth of planning into hurt.
|
||||
|
||||
> **"We all believe in what we do. Which is particularly why we’re such a great small team. So it really impacted us emotionally and mentally. And it continues to."**
|
||||
|
||||
We spoke about how the team is reliving what the days would have looked like if PyCon wasn't interrupted by COVID-19–keynotes would start _now_, the sponsor booths would be in full motion right _now_–and just how emotionally taxing it all was. Throughout the discussion, Jodlowska always came back to recognizing the staff for their resiliency and energy to pivot the event online.
|
||||
|
||||
### The cascading impact of event cancellation
|
||||
|
||||
Jodlowska has been incredibly transparent about the experience. In her March 31st [article on the financial outcome][5], she outlines it clearly: the Python Software Foundation would take a hit from the event cancelation.
|
||||
|
||||
Jodlowska notes that part of the challenge is that PyCon accounts for too much of the organization’s financial health. About 63% of the 2020 revenue was projected to come from the show. While that number is down from the [2017 estimate of 80%][6], it’s still a concern when in-person events will remain limited to keep attendees safe during the COVID-19 outbreak.
|
||||
|
||||
> **"We don’t want to rely on one event**–**or events in general**–**to operate and provide community support."**
|
||||
|
||||
The PSF board of directors is hard at work looking into the diversification of funding. In the meantime, PyCon remains essential to sustainably running the organization.
|
||||
|
||||
### Community support makes all the difference
|
||||
|
||||
It's at this point that Jodlowska again recognizes the incredible work of the PSF staff. They quickly pivoted the vision of the event, and the community of attendees, sponsors, and speakers were all supportive of the move.
|
||||
|
||||
> **"[We] have been brought to tears many times by the generosity of our sponsors and our individual donors."**
|
||||
|
||||
Jodlowska noted that the generosity of so many resulted in reducing the financial impact on the PSF. An incredible number of individual attendees are donating their registration costs to the PSF. They are also showing up across social media sites to participate in their own distributed virtual experience of PyCon.
|
||||
|
||||
Another important part of the community, the corporate sponsors of the show, are also showing up to support the non-profit. Many sponsors had already canceled physical presence at the show before the event was officially moved online. Some of them were kind enough, as Jodlowska noted, to donate the cost of sponsorship to the PSF. In a huge turn of events, the list of sponsors **grew** as the online event came together.
|
||||
|
||||
> **[M]any sponsors have opted into participating in PyCon 2020 online. Because of this we have decreased the amount needed from our reserve by 77%! The PSF will now only need $141,713 from its financial reserve to get through 2020.**
|
||||
|
||||
For more on the data side, see Jodlowska’s article _[Thank you to donors & sponsors][7]_.
|
||||
|
||||
Support in all its forms led to the conference feeling like it is well on its way. Some sponsors are even moving to a virtual booth experience.
|
||||
|
||||
> Since our sponsors can’t be with you in person, we’ve created a place to provide their content online - <https://t.co/oGDz3jNZWD>. [#PyCon2020][8] Gold Sponsor Weekly Python Exercise shared this video to introduce you to their offerings: <https://t.co/6VFF8AwMEK>.
|
||||
>
|
||||
> — PyCon US (@pycon) [April 18, 2020][9]
|
||||
|
||||
Maybe most impressively, many speakers and tutorial instructors made the effort of recording their sessions. That’s helped PyCon to [gradually unfold online][10] with incredible educational content. The audience is still able to interact as well: YouTube comments are open for moderation so speakers can interact with their audience.
|
||||
|
||||
Lastly, there remains an army of volunteers who shifted their in-person plans online and continue to help in any way possible.
|
||||
|
||||
### Some of the surprising positives from this difficult change
|
||||
|
||||
While it is without a doubt a challenging time for the organization, Jodlowska noted a number of positives that are unfolding due to this move to virtual.
|
||||
|
||||
To start, the staff of the PSF “have never been closer,” as they bond over the experience and spend more time getting to know each other through weekly video calls and baking competitions.
|
||||
|
||||
Jodlowska was inspired to get involved in another open source effort, [FOSS responders][11], who are helping organizations respond to the cancelation of events due to COVID-19. (If you’ve been affected as well, they are there to help.)
|
||||
|
||||
The generosity mentioned above is a silver lining to the experience and encouraging to the hardworking team that uplifts the popular Python programming language.
|
||||
|
||||
There is also a broader impact on participation in PyCon. While the final numbers are not in yet, an international audience has access to all of PyCon as it unfolds, which gives the entire world a chance to be part of an excellent event [I got to attend][12] last year. On the development side, Jodlowska mentioned that the [core-dev team][13] that maintains Python, who would normally meet in person, shifted to a virtual meeting. As a result of that shift, some participants got to attend that otherwise would not have had the opportunity to join in person.
|
||||
|
||||
### How you can help the Python Software Foundation
|
||||
|
||||
I reached out to Jodlowska because I am impressed with and supportive of their mission to support the Python community. If you want to support them as well, you have options:
|
||||
|
||||
* Become a [free or supporting member][14] of the PSF to get involved in our future.
|
||||
* [Sign up for the PSF’s free newsletter][15] to stay up to date.
|
||||
* [Donate][16] directly to the PSF (and thank you to those that already have).
|
||||
* Ask your employer to [sponsor the PSF][17].
|
||||
* Ask your employer if they match donations to 501(c)(3) non-profits, and ask for your donations to the PSF to be matched.
|
||||
|
||||
|
||||
|
||||
Last but not least, participate in PyCon over the next few weeks. You can learn from all kinds of smart people on a range of topics like [Matt Harrison][18]’s [Hands-on Python for Programmers][19] that guides attendees through analyzing COVID-19 data to [Katie McLaughlin][20]’s thoughtful talk on [What is deployment, anyway?][21]
|
||||
|
||||
Be sure to [review the full][10] list and engage with the amazing lineup of speakers.
|
||||
|
||||
* * *
|
||||
|
||||
_Are you part of a non-profit looking to connect with your open source community at this time of social distancing? Let me know at matt @ opensource.com._
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://opensource.com/article/20/5/pycon-covid-19
|
||||
|
||||
作者:[Matthew Broberg][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[译者ID](https://github.com/译者ID)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://opensource.com/users/mbbroberg
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/osdc_whitehurst_money.png?itok=ls-SOzM0 (A dollar sign in a network)
|
||||
[2]: https://pycon.blogspot.com/2020/03/pycon-us-2020-in-pittsburgh.html
|
||||
[3]: https://www.python.org/psf/records/staff/
|
||||
[4]: https://opensource.com/tags/wfh
|
||||
[5]: http://pyfound.blogspot.com/2020/03/psfs-projected-2020-financial-outcome.html
|
||||
[6]: https://www.youtube.com/watch?v=79AIzbjLzdk
|
||||
[7]: http://pyfound.blogspot.com/2020/04/thank-you-to-donors-sponsors.html
|
||||
[8]: https://twitter.com/hashtag/PyCon2020?src=hash&ref_src=twsrc%5Etfw
|
||||
[9]: https://twitter.com/pycon/status/1251563142641000455?ref_src=twsrc%5Etfw
|
||||
[10]: https://us.pycon.org/2020/online/
|
||||
[11]: https://fossresponders.com/
|
||||
[12]: https://opensource.com/article/19/5/jupyterlab-python-developers-magic
|
||||
[13]: https://devguide.python.org/coredev/
|
||||
[14]: https://www.python.org/psf/membership/
|
||||
[15]: https://www.python.org/psf/newsletter/
|
||||
[16]: https://www.python.org/psf/donations/
|
||||
[17]: https://www.python.org/psf/sponsorship/
|
||||
[18]: https://us.pycon.org/2020/speaker/profile/454/
|
||||
[19]: https://youtu.be/fuJcSNUMrW0
|
||||
[20]: https://opensource.com/users/glasnt
|
||||
[21]: https://youtu.be/8vstov3Y7uE
|
@ -1,207 +0,0 @@
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: (robsean)
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
[#]: subject: (How to compress files on Linux 5 ways)
|
||||
[#]: via: (https://www.networkworld.com/article/3538471/how-to-compress-files-on-linux-5-ways.html)
|
||||
[#]: author: (Sandra Henry-Stocker https://www.networkworld.com/author/Sandra-Henry_Stocker/)
|
||||
|
||||
How to compress files on Linux 5 ways
|
||||
======
|
||||
There are a number of tools that you use to compress files on Linux systems, but they don't all behave the same way or yield the same level of compression. In this post, we compare five of them.
|
||||
Getty Images
|
||||
|
||||
There are quite a few commands on Linux for compressing files. One of the newest and most effective is **xz**, but they all have advantages for both saving disk space and preserving files for later use. In this post, we compare the compression commands and point out the significant differences.
|
||||
|
||||
### tar
|
||||
|
||||
The tar command is not specifically a compression command. It’s generally used to pull a number of files into a single file for easy transport to another system or to back the files up as a related group. It also provides compression as a feature, which makes a lot of sense, and the addition of the **z** compression option is available to make this happen.
|
||||
|
||||
When compression is added to a **tar** command with the **z** option, tar uses **gzip** to do the compressing.
|
||||
|
||||
You can use **tar** to compress a single file as easily as a group though this offers no particular advantage over using **gzip** directly. To use **tar** for this, just identify the file as you would a group of files with a “tar cfz newtarfile filename” command like this:
|
||||
|
||||
```
|
||||
$ tar cfz bigfile.tgz bigfile
|
||||
^ ^
|
||||
| |
|
||||
+- new file +- file to be compressed
|
||||
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 103270400 Apr 16 16:09 bigfile
|
||||
-rw-rw-r-- 1 shs shs 21608325 Apr 16 16:08 bigfile.tgz
|
||||
```
|
||||
|
||||
Note the significant reduction in the file size.
|
||||
|
||||
If you prefer, you can use the **tar.gz** extension which might make the character of the file a bit more obvious, but most Linux users will probably recognize **tgz** as meaning the same thing – the combination of **tar** and **gz** to indicate that the file is a compressed tar file. You will be left with both the original file and the compressed file once the compression is complete.
|
||||
|
||||
To collect a number of files together and compress the resultant “tar ball” in one command, use the same basic syntax, but specify the files to be included as a group in place of the single file. Here’s an example:
|
||||
|
||||
[][1]
|
||||
|
||||
```
|
||||
$ tar cfz bin.tgz bin/*
|
||||
^ ^
|
||||
| +-- files to include
|
||||
+ new file
|
||||
```
|
||||
|
||||
### zip
|
||||
|
||||
The **zip** command creates a compressed file while leaving the original file intact. The syntax is straightforward except that, as with **tar**, you have to remember that your original file should be the last argument on the command line.
|
||||
|
||||
```
|
||||
$ zip ./bigfile.zip bigfile
|
||||
updating: bigfile (deflated 79%)
|
||||
$ ls -l bigfile bigfile.zip
|
||||
-rw-rw-r-- 1 shs shs 103270400 Apr 16 11:18 bigfile
|
||||
-rw-rw-r-- 1 shs shs 21606889 Apr 16 11:19 bigfile.zip
|
||||
```
|
||||
|
||||
### gzip
|
||||
|
||||
The **gzip** command is very simple to use. You just type "gzip" followed by the name of the file you want to compress. Unlike the commands described above, **gzip** will compress the file "in place". In other words, the original file will be replaced by the compressed file.
|
||||
|
||||
```
|
||||
$ gzip bigfile
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 21606751 Apr 15 17:57 bigfile.gz
|
||||
```
|
||||
|
||||
### bzip2
|
||||
|
||||
As with the **gzip** command, **bzip2** will compress the file that you select "in place", leaving only the compressed file.
|
||||
|
||||
```
|
||||
$ bzip2 bigfile
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 18115234 Apr 15 17:57 bigfile.bz2
|
||||
```
|
||||
|
||||
### xz
|
||||
|
||||
A relative newcomer to the compression command team, **xz** is a front runner in terms of how well it compresses files. Like the two previous commands, you only need to supply the file name to the command. Again, the original file is compressed in place.
|
||||
|
||||
```
|
||||
$ xz bigfile
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 13427236 Apr 15 17:30 bigfile.xz
|
||||
```
|
||||
|
||||
For large files, you are likely to notice that **xz** takes longer to run than other compression commands, but the compression results are very impressive.
|
||||
|
||||
### Comparisons to consider
|
||||
|
||||
Most people have heard it said that "size isn't everything". So, let's compare file size as well as some other issues to be considered when you make plans for how you want to compress your files.
|
||||
|
||||
The stats shown below all relate to compressing the single file – bigfile – used in the example commands shown above. This file is a large and fairly random text file. Compression rates will depend to some extent on the content of the files.
|
||||
|
||||
#### Size reduction
|
||||
|
||||
When compared, the various compression commands shown above yielded the following results. The percentages represent how the compressed files compare with the original file.
|
||||
|
||||
```
|
||||
-rw-rw-r-- 1 shs shs 103270400 Apr 16 14:01 bigfile
|
||||
------------------------------------------------------
|
||||
-rw-rw-r-- 1 shs shs 18115234 Apr 16 13:59 bigfile.bz2 ~17%
|
||||
-rw-rw-r-- 1 shs shs 21606751 Apr 16 14:00 bigfile.gz ~21%
|
||||
-rw-rw-r-- 1 shs shs 21608322 Apr 16 13:59 bigfile.tgz ~21%
|
||||
-rw-rw-r-- 1 shs shs 13427236 Apr 16 14:00 bigfile.xz ~13%
|
||||
-rw-rw-r-- 1 shs shs 21606889 Apr 16 13:59 bigfile.zip ~21%
|
||||
```
|
||||
|
||||
The **xz** commands wins, ending up at only 13% the size of the original file, but all of these compression commands reduced the original file size quite significantly.
|
||||
|
||||
#### Whether the original files are replaced
|
||||
|
||||
The **bzip2**, **gzip** and **xz** commands all replace the original files with compressed versions. The **tar** and **zip** commands do not.
|
||||
|
||||
#### Run time
|
||||
|
||||
The **xz** command seems to take more time than the other commands to compress the files. For bigfile, the approximate times were:
|
||||
|
||||
```
|
||||
command run-time
|
||||
tar 4.9 seconds
|
||||
zip 5.2 seconds
|
||||
bzip2 22.8 seconds
|
||||
gzip 4.8 seconds
|
||||
xz 50.4 seconds
|
||||
```
|
||||
|
||||
Decompression times are likely to be considerably smaller than compression times.
|
||||
|
||||
#### File permissions
|
||||
|
||||
Regardless of what permissions you have set on your original file, permissions for the compressed file will be based on your **umask** setting, except for **bzip2** which retains the original file's permissions.
|
||||
|
||||
#### Compatibility with Windows
|
||||
|
||||
The **zip** command creates a file which can be used (i.e., decompressed) on Windows systems as well as Linux and other Unix systems without having to install other tools which may or may not be available.
|
||||
|
||||
### Decompressing files
|
||||
|
||||
The commands for decompressing files are similar to those used to compress the files. These commands would work for decompressing bigfile after the compression commands shown above were run.
|
||||
|
||||
* tar: **tar xf bigfile.tgz**
|
||||
* zip: **unzip bigfile.zip**
|
||||
* gzip: **gunzip bigfile.gz**
|
||||
* bzip2: **bunzip2 bigfile.bz2**
|
||||
* xz: **xz -d bigfile.xz** or **unxz bigfile.xz**
|
||||
|
||||
|
||||
|
||||
### Running your own compression comparisons
|
||||
|
||||
If you'd like to run some tests on your own, grab a large but replaceable file and compress it using each of the commands shown above – preferably using a new subdirectory. You might have to first install **xz** if you want to include it in the tests. This script can make the comparison easier, but will likely take a few minutes to complete.
|
||||
|
||||
```
|
||||
#!/bin/bash
|
||||
|
||||
# ask user for filename
|
||||
echo -n "filename> "
|
||||
read filename
|
||||
|
||||
# you need this because some commands will replace the original file
|
||||
cp $filename $filename-2
|
||||
|
||||
# clean up first (in case previous results are still available)
|
||||
rm $filename.*
|
||||
|
||||
tar cvfz ./$filename.tgz $filename > /dev/null
|
||||
zip $filename.zip $filename > /dev/null
|
||||
bzip2 $filename
|
||||
# recover original file
|
||||
cp $filename-2 $filename
|
||||
gzip $filename
|
||||
# recover original file
|
||||
cp $filename-2 $filename
|
||||
xz $filename
|
||||
|
||||
# show results
|
||||
ls -l $filename.*
|
||||
|
||||
# replace the original file
|
||||
mv $filename-2 $filename
|
||||
```
|
||||
|
||||
Join the Network World communities on [Facebook][2] and [LinkedIn][3] to comment on topics that are top of mind.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://www.networkworld.com/article/3538471/how-to-compress-files-on-linux-5-ways.html
|
||||
|
||||
作者:[Sandra Henry-Stocker][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[译者ID](https://github.com/译者ID)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://www.networkworld.com/author/Sandra-Henry_Stocker/
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://www.networkworld.com/blog/itaas-and-the-corporate-storage-technology/?utm_source=IDG&utm_medium=promotions&utm_campaign=HPE22140&utm_content=sidebar (ITAAS and Corporate Storage Strategy)
|
||||
[2]: https://www.facebook.com/NetworkWorld/
|
||||
[3]: https://www.linkedin.com/company/network-world
|
@ -0,0 +1,207 @@
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: (robsean)
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
[#]: subject: (How to compress files on Linux 5 ways)
|
||||
[#]: via: (https://www.networkworld.com/article/3538471/how-to-compress-files-on-linux-5-ways.html)
|
||||
[#]: author: (Sandra Henry-Stocker https://www.networkworld.com/author/Sandra-Henry_Stocker/)
|
||||
|
||||
在 Linux 上压缩文件的 5 种方法
|
||||
======
|
||||
在 Linux 系统上有很多可以用于压缩文件的工具,但它们的行为方式和压缩程度并不相同。在这篇文章中,我们比较其中的五个。
|
||||
Getty Images
|
||||
|
||||
在 Linux 上有不少用于压缩文件的命令。最新、最有效的一个是 **xz**,但所有这些命令都有节省磁盘空间、为以后使用保留备份文件的优点。在这篇文章中,我们将比较这些压缩命令并指出它们的显著差异。
|
||||
|
||||
### tar
|
||||
|
||||
tar 命令不是专门的压缩命令。它通常用于把多个文件打包成一个单独的文件,以便轻松地传输到其他系统,或者把一组相关文件备份在一起。它也提供了压缩功能,这很合理,附加 **z** 压缩选项就能实现压缩。
|
||||
|
||||
当压缩过程被附加到一个使用 **z** 选项的 **tar** 命令时,tar 使用 **gzip** 来进行压缩。
|
||||
|
||||
你可以像压缩一组文件一样轻松地用 **tar** 压缩单个文件,尽管这与直接使用 **gzip** 相比没有特别的优势。要这样使用 **tar**,只需像指定一组文件那样,用 "tar cfz newtarfile filename" 这样的命令指定这个文件,像这样:
|
||||
|
||||
```
|
||||
$ tar cfz bigfile.tgz bigfile
|
||||
^ ^
|
||||
| |
|
||||
+- 新的文件 +- 将被压缩的文件
|
||||
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 103270400 Apr 16 16:09 bigfile
|
||||
-rw-rw-r-- 1 shs shs 21608325 Apr 16 16:08 bigfile.tgz
|
||||
```
|
||||
|
||||
注意,文件的大小显著减少。
|
||||
|
||||
如果你愿意,可以使用 **tar.gz** 扩展名,这可能会让文件的性质更明显,但大多数 Linux 用户很可能明白 **tgz** 的意思是一样的:它是 **tar** 和 **gz** 的组合,表示这是一个被压缩的 tar 文件。压缩完成后,原始文件和压缩文件都会保留下来。
|
||||
|
||||
要把很多文件收集在一起,并用一条命令压缩生成的 “tar ball”,使用相同的语法,只是要把待包含的文件指定为一组,而不是单个文件。这里有一个示例:
|
||||
|
||||
[][1]
|
||||
|
||||
```
|
||||
$ tar cfz bin.tgz bin/*
|
||||
^ ^
|
||||
| +-- 将被包含的文件
|
||||
+ 新的文件
|
||||
```
|
||||
|
||||
### zip
|
||||
|
||||
**zip** 命令会创建一个压缩文件,同时保持原始文件完好无损。语法很简单,只是和 **tar** 一样,你必须记住,原始文件名应该是命令行上的最后一个参数。
|
||||
|
||||
```
|
||||
$ zip ./bigfile.zip bigfile
|
||||
updating: bigfile (deflated 79%)
|
||||
$ ls -l bigfile bigfile.zip
|
||||
-rw-rw-r-- 1 shs shs 103270400 Apr 16 11:18 bigfile
|
||||
-rw-rw-r-- 1 shs shs 21606889 Apr 16 11:19 bigfile.zip
|
||||
```
|
||||
|
||||
### gzip
|
||||
|
||||
**gzip** 命令非常容易使用。你只需要键入 "gzip",后面跟上你想要压缩的文件名。与上面描述的命令不同,**gzip** 会“就地”压缩文件。换句话说,原始文件会被压缩后的文件替换。
|
||||
|
||||
```
|
||||
$ gzip bigfile
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 21606751 Apr 15 17:57 bigfile.gz
|
||||
```
|
||||
|
||||
### bzip2
|
||||
|
||||
像使用 **gzip** 命令一样,**bzip2** 会“就地”压缩你所选择的文件,只留下压缩后的文件。
|
||||
|
||||
```
|
||||
$ bzip2 bigfile
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 18115234 Apr 15 17:57 bigfile.bz2
|
||||
```
|
||||
|
||||
### xz
|
||||
|
||||
**xz** 是压缩命令家族中相对较新的成员,就压缩效果而言是领跑者。像前面两个命令一样,你只需要把文件名提供给命令即可。再说一次,原始文件会被就地压缩。
|
||||
|
||||
```
|
||||
$ xz bigfile
|
||||
$ ls -l bigfile*
|
||||
-rw-rw-r-- 1 shs shs 13427236 Apr 15 17:30 bigfile.xz
|
||||
```
|
||||
|
||||
对于大文件来说,你可能会注意到 **xz** 将比其它的压缩命令花费更多的运行时间,但是压缩的结果却是非常令人赞叹的。
|
||||
|
||||
### 需要考虑的比较因素
|
||||
|
||||
大多数人都听过“大小不是一切”这句话。所以,让我们比较一下文件大小,以及在计划如何压缩文件时需要考虑的其他一些问题。
|
||||
|
||||
下面显示的统计数据都与压缩上面示例中所用的单个文件 bigfile 相关。这个文件是一个很大且相当随机的文本文件。压缩率在一定程度上取决于文件的内容。
|
||||
|
||||
#### 大小减缩率
|
||||
|
||||
经过比较,上面显示的各种压缩命令产生了下面的结果。百分比表示压缩文件与原始文件大小之比。
|
||||
|
||||
```
|
||||
-rw-rw-r-- 1 shs shs 103270400 Apr 16 14:01 bigfile
|
||||
------------------------------------------------------
|
||||
-rw-rw-r-- 1 shs shs 18115234 Apr 16 13:59 bigfile.bz2 ~17%
|
||||
-rw-rw-r-- 1 shs shs 21606751 Apr 16 14:00 bigfile.gz ~21%
|
||||
-rw-rw-r-- 1 shs shs 21608322 Apr 16 13:59 bigfile.tgz ~21%
|
||||
-rw-rw-r-- 1 shs shs 13427236 Apr 16 14:00 bigfile.xz ~13%
|
||||
-rw-rw-r-- 1 shs shs 21606889 Apr 16 13:59 bigfile.zip ~21%
|
||||
```
|
||||
|
||||
**xz** 命令获胜,压缩后的文件大小只有原始文件的约 13%,但所有这些压缩命令都相当显著地减小了原始文件的大小。
|
||||
|
||||
#### 是否替换原始文件
|
||||
|
||||
**bzip2**,**gzip** 和 **xz** 命令都将使用压缩文件替换原始文件。**tar** 和 **zip** 命令不替换。
|
||||
|
||||
#### 运行时间
|
||||
|
||||
**xz** 命令似乎比其它命令需要花费更多的时间来压缩文件。对于 bigfile 来说,大致时间是:
|
||||
|
||||
```
|
||||
命令 运行时间
|
||||
tar 4.9 秒
|
||||
zip 5.2 秒
|
||||
bzip2 22.8 秒
|
||||
gzip 4.8 秒
|
||||
xz 50.4 秒
|
||||
```
|
||||
|
||||
解压缩文件很可能比压缩时间要短得多。
|
||||
|
||||
#### 文件权限
|
||||
|
||||
不管你对原始文件设置了什么权限,压缩文件的权限都将基于你的 **umask** 设置,但 **bzip2** 例外,它会保留原始文件的权限。
|
||||
|
||||
#### 与 Windows 的兼容性
|
||||
|
||||
**zip** 命令创建的文件可以在 Windows 系统以及 Linux 和其它 Unix 系统上直接使用(即解压缩),无需安装其它未必可用的工具。
|
||||
|
||||
### 解压缩文件
|
||||
|
||||
解压缩文件的命令与压缩文件的命令类似。在运行上述压缩命令之后,可以用下面这些命令来解压缩 bigfile。
|
||||
|
||||
* tar: **tar xf bigfile.tgz**
|
||||
* zip: **unzip bigfile.zip**
|
||||
* gzip: **gunzip bigfile.gz**
|
||||
* bzip2: **bunzip2 bigfile.bz2**
|
||||
* xz: **xz -d bigfile.xz** 或 **unxz bigfile.xz**
|
||||
|
||||
|
||||
|
||||
### 运行你自己的压缩对比测试
|
||||
|
||||
如果你想自己运行一些测试,找一个大的、可以随意处置的文件,并用上面显示的每个命令来压缩它,最好在一个新的子目录中进行。如果想把 **xz** 包含在测试中,你可能需要先安装它。下面这个脚本可以让比较变得更容易,但可能需要几分钟才能运行完。
|
||||
|
||||
```
|
||||
#!/bin/bash
|
||||
|
||||
# 询问用户文件名称
|
||||
echo -n "filename> "
|
||||
read filename
|
||||
|
||||
# 你需要这个,因为一些命令将替换原始文件
|
||||
cp $filename $filename-2
|
||||
|
||||
# 先清理(以免先前的结果仍然可用)
|
||||
rm $filename.*
|
||||
|
||||
tar cvfz ./$filename.tgz $filename > /dev/null
|
||||
zip $filename.zip $filename > /dev/null
|
||||
bzip2 $filename
|
||||
# 恢复原始文件
|
||||
cp $filename-2 $filename
|
||||
gzip $filename
|
||||
# 恢复原始文件
|
||||
cp $filename-2 $filename
|
||||
xz $filename
|
||||
|
||||
# 显示结果
|
||||
ls -l $filename.*
|
||||
|
||||
# 替换原始文件
|
||||
mv $filename-2 $filename
|
||||
```
|
||||
|
||||
在 [Facebook][2] 和 [LinkedIn][3] 上加入 Network World 社区,就你最关心的话题发表评论。
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://www.networkworld.com/article/3538471/how-to-compress-files-on-linux-5-ways.html
|
||||
|
||||
作者:[Sandra Henry-Stocker][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[译者ID](https://github.com/译者ID)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://www.networkworld.com/author/Sandra-Henry_Stocker/
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://www.networkworld.com/blog/itaas-and-the-corporate-storage-technology/?utm_source=IDG&utm_medium=promotions&utm_campaign=HPE22140&utm_content=sidebar (ITAAS and Corporate Storage Strategy)
|
||||
[2]: https://www.facebook.com/NetworkWorld/
|
||||
[3]: https://www.linkedin.com/company/network-world
|
translated/tech/20200502 Mid-stack inlining in Go.md (new file, 211 lines)
@ -0,0 +1,211 @@
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: (lxbwolf)
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
[#]: subject: (Mid-stack inlining in Go)
|
||||
[#]: via: (https://dave.cheney.net/2020/05/02/mid-stack-inlining-in-go)
|
||||
[#]: author: (Dave Cheney https://dave.cheney.net/author/davecheney)
|
||||
|
||||
Go 中对栈中函数进行内联
|
||||
======
|
||||
|
||||
[上一篇文章][1]中我论述了叶子内联是怎样让 Go 编译器减少函数调用的开销的,以及延伸出了跨函数边界的优化的机会。本文中,我要论述内联的限制以及叶子与栈中内联的对比。
|
||||
|
||||
### 内联的限制
|
||||
|
||||
把函数内联到它的调用处消除了调用的开销,为编译器进行其他的优化提供了更好的机会,那么问题来了,既然内联这么好,内联得越多开销就越少,_为什么不尽可能多地内联呢?_
|
||||
|
||||
内联是用可能增大的程序体积换取可能更快的执行时间。限制内联的最主要原因是,为函数创建太多的内联副本会增加编译时间,并导致生成更大的二进制文件。即使把内联带来的进一步优化机会考虑在内,太激进的内联也会增加生成的二进制文件的大小和编译时间。
|
||||
|
||||
内联收益最大的是[小函数][2],相对于调用它们的开销来说,这些函数做很少的工作。随着函数大小的增长,函数内部做的工作与函数调用的开销相比省下的时间越来越少。函数越大通常越复杂,因此对它们内联后进行优化与不内联相比的收益没有(对小函数进行内联)那么大。
|
||||
|
||||
### 内联预算
|
||||
|
||||
在编译过程中,每个函数的内联能力是用_内联预算_来衡量的。开销的计算方式不太容易凭直觉掌握,但大体上,像一元和二元运算这样的简单操作,在抽象语法树(Abstract Syntax Tree,AST)中通常每个节点算一个单位,而像 `make` 这样更复杂的操作所占的单位可能更多。考虑下面的例子:
|
||||
|
||||
```go
|
||||
package main
|
||||
|
||||
func small() string {
|
||||
s := "hello, " + "world!"
|
||||
return s
|
||||
}
|
||||
|
||||
func large() string {
|
||||
s := "a"
|
||||
s += "b"
|
||||
s += "c"
|
||||
s += "d"
|
||||
s += "e"
|
||||
s += "f"
|
||||
s += "g"
|
||||
s += "h"
|
||||
s += "i"
|
||||
s += "j"
|
||||
s += "k"
|
||||
s += "l"
|
||||
s += "m"
|
||||
s += "n"
|
||||
s += "o"
|
||||
s += "p"
|
||||
s += "q"
|
||||
s += "r"
|
||||
s += "s"
|
||||
s += "t"
|
||||
s += "u"
|
||||
s += "v"
|
||||
s += "w"
|
||||
s += "x"
|
||||
s += "y"
|
||||
s += "z"
|
||||
return s
|
||||
}
|
||||
|
||||
func main() {
|
||||
small()
|
||||
large()
|
||||
}
|
||||
```
|
||||
|
||||
使用 `-gcflags=-m=2` 参数编译这个函数能让我们看到编译器分配给每个函数的开销:
|
||||
|
||||
```bash
|
||||
% go build -gcflags=-m=2 inl.go
|
||||
# command-line-arguments
|
||||
./inl.go:3:6: can inline small with cost 7 as: func() string { s := "hello, world!"; return s }
|
||||
./inl.go:8:6: cannot inline large: function too complex: cost 82 exceeds budget 80
|
||||
./inl.go:38:6: can inline main with cost 68 as: func() { small(); large() }
|
||||
./inl.go:39:7: inlining call to small func() string { s := "hello, world!"; return s }
|
||||
```
|
||||
|
||||
编译器根据函数 `func small()` 的开销(7)决定可以对它内联,而 `func large()` 的开销太大,编译器决定不进行内联。`func main()` 被标记为适合内联的,分配了 68 的开销;其中内联 `small` 占用 7,调用 `large` 函数占用 57,剩余的 4 是它自己的开销。
|
||||
|
||||
可以用 `-gcflag=-l` 参数控制内联预算的等级。下面是可使用的值:
|
||||
|
||||
* `-gcflags=-l=0` 默认的内联等级。
|
||||
* `-gcflags=-l` (或 `-gcflags=-l=1`) 取消内联。
|
||||
* `-gcflags=-l=2` 和 `-gcflags=-l=3` 现在已经不使用了。不影响 `-gcflags=-l=0`
|
||||
* `-gcflags=-l=4` 减少非叶子函数和通过接口调用的函数的开销。[2][4]
|
||||
|
||||
|
||||
|
||||
#### 难以理解的优化
|
||||
|
||||
一些函数虽然内联的开销很小,但由于太复杂它们仍不适合进行内联。这就是函数的不确定性,因为一些操作的语义在内联后很难去推导,如 `recover`,`break`。其他的操作,如 `select` 和 `go` 涉及运行时的协调,因此内联后引入的额外的开销不能抵消内联带来的收益。
|
||||
|
||||
难理解的语句也包括 `for` 和 `range`,这些语句不一定开销很大,但目前为止还没有对它们进行优化。
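作为示意(假设的例子,非原文代码),下面两个函数的开销都不大,但含有 range 循环的那个按上述规则目前不会被内联:

```go
package main

// double 只包含一个简单表达式,可以被内联。
func double(x int) int { return x * 2 }

// sum 包含 range 循环,属于上文所说的“难以理解”的语句,
// 因此(在本文写作时的编译器中)不会被内联。
func sum(xs []int) int {
	t := 0
	for _, x := range xs {
		t += x
	}
	return t
}

func main() {
	println(double(3), sum([]int{1, 2, 3}))
}
```

可以用 `-gcflags=-m=2` 编译来验证:编译器会报告可以内联 `double`,并给出不能内联 `sum` 的原因。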
|
||||
|
||||
### 栈中函数优化
|
||||
|
||||
在过去,Go 编译器只对叶子函数进行内联,即只有那些不调用其他函数的函数才有资格。按照上一节关于难以理解的语句的讨论,一次函数调用就会让这个函数失去被内联的资格。
|
||||
|
||||
栈中内联,顾名思义,能够内联位于函数调用栈中间的函数,而不需要其下层调用的所有函数都先具备内联资格。栈中内联是 David Lazar 在 Go 1.9 中引入的,并在随后的版本中做了改进。[这份演示文稿][5]深入探讨了在被深度内联的代码路径中保留栈追踪和 `runtime.Callers` 行为的难点。
|
||||
|
||||
在前面的例子中我们看到了栈中函数内联。内联后,`func main()` 包含了 `func small()` 的函数体和对 `func large()` 的一次调用,因此它被判定为非叶子函数。在过去,这会阻止它被继续内联,虽然它的联合开销小于内联预算。
|
||||
|
||||
栈中内联的最主要的应用案例就是减少贯穿函数调用栈的开销。考虑下面的例子:
|
||||
|
||||
```go
|
||||
package main
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"strconv"
|
||||
)
|
||||
|
||||
type Rectangle struct {}
|
||||
|
||||
//go:noinline
|
||||
func (r *Rectangle) Height() int {
|
||||
h, _ := strconv.ParseInt("7", 10, 0)
|
||||
return int(h)
|
||||
}
|
||||
|
||||
func (r *Rectangle) Width() int {
|
||||
return 6
|
||||
}
|
||||
|
||||
func (r *Rectangle) Area() int { return r.Height() * r.Width() }
|
||||
|
||||
func main() {
|
||||
var r Rectangle
|
||||
fmt.Println(r.Area())
|
||||
}
|
||||
```
|
||||
|
||||
在这个例子中, `r.Area()` 是个简单的函数,调用了两个函数。`r.Width()` 可以被内联,`r.Height()` 这里用 `//go:noinline` 指令标注了,不能被内联。[3][6]
|
||||
|
||||
```bash
|
||||
% go build -gcflags='-m=2' square.go
|
||||
# command-line-arguments
|
||||
./square.go:12:6: cannot inline (*Rectangle).Height: marked go:noinline
|
||||
./square.go:17:6: can inline (*Rectangle).Width with cost 2 as: method(*Rectangle) func() int { return 6 }
|
||||
./square.go:21:6: can inline (*Rectangle).Area with cost 67 as: method(*Rectangle) func() int { return r.Height() * r.Width() } ./square.go:21:61: inlining call to (*Rectangle).Width method(*Rectangle) func() int { return 6 }
|
||||
./square.go:23:6: cannot inline main: function too complex: cost 150 exceeds budget 80
|
||||
./square.go:25:20: inlining call to (*Rectangle).Area method(*Rectangle) func() int { return r.Height() * r.Width() }
|
||||
./square.go:25:20: inlining call to (*Rectangle).Width method(*Rectangle) func() int { return 6 }
|
||||
```
|
||||
|
||||
由于 `r.Area()` 中的乘法与调用它的开销相比很廉价,因此内联它这个单一表达式是纯收益,即使它下游调用的 `r.Height()` 仍然没有内联资格。
|
||||
|
||||
#### 快速路径内联
|
||||
|
||||
关于栈中内联效果的最令人吃惊的例子,是 2019 年 [Carlo Alberto Ferraris][7] 通过允许把 `sync.Mutex.Lock()` 的快速路径(即非竞争的情况)内联到它的调用方来[提升它的性能][7]。在这个修改之前,`sync.Mutex.Lock()` 是个很大的函数,包含很多难以理解的条件,使得它没有资格被内联。即使锁可用,调用者也要付出调用 `sync.Mutex.Lock()` 的代价。
|
||||
|
||||
Carlo 把 `sync.Mutex.Lock()` 分成了两个函数(他自己称之为*外联*)。外层的 `sync.Mutex.Lock()` 方法现在调用 `sync/atomic.CompareAndSwapInt32()`,如果 CAS(比较并交换)成功就立即返回给调用者;如果 CAS 失败,函数就会走到 `sync.Mutex.lockSlow()` 慢速路径,在那里登记对锁的等待并暂停 goroutine。[4][8]
|
||||
|
||||
```bash
|
||||
% go build -gcflags='-m=2 -l=0' sync 2>&1 | grep '(*Mutex).Lock'
|
||||
../go/src/sync/mutex.go:72:6: can inline (*Mutex).Lock with cost 69 as: method(*Mutex) func() { if "sync/atomic".CompareAndSwapInt32(&m.state, 0, mutexLocked) { if race.Enabled { }; return }; m.lockSlow() }
|
||||
```
|
||||
|
||||
通过把函数分割成一个容易被内联的简单外层函数,和一个只在快速路径失败时才会调用的、处理慢速路径的复杂内层函数,Carlo 组合了栈中函数内联和[编译器对基础操作的支持][9],把非竞争锁的开销减少了 14%。之后他在 `sync.RWMutex.Unlock()` 上重复了这个技巧,又节省了 9% 的开销。
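这种“外联”手法也可以用在自己的代码里。下面是一个假设的示意(并非 sync 包的源码):把快速路径留在足够小、可被内联的外层函数中,把慢速路径拆到单独的函数里:

```go
package main

import "sync/atomic"

type gate struct {
	state int32
}

// Acquire 保持得足够小,可以被内联:
// 快速路径只做一次 CAS,失败时才调用慢速路径。
func (g *gate) Acquire() {
	if atomic.CompareAndSwapInt32(&g.state, 0, 1) {
		return
	}
	g.acquireSlow()
}

// acquireSlow 承载复杂的慢速路径逻辑(这里仅用自旋示意),
// 它本身不能被内联也没有关系。
func (g *gate) acquireSlow() {
	for !atomic.CompareAndSwapInt32(&g.state, 0, 1) {
	}
}

// Release 释放 gate。
func (g *gate) Release() {
	atomic.StoreInt32(&g.state, 0)
}

func main() {
	var g gate
	g.Acquire()
	g.Release()
}
```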
|
||||
|
||||
1. 不同发布版本中,在考虑该函数是否适合内联时,Go 编译器对同一函数的预算是不同的。[][10]
|
||||
2. 时刻记着编译器的作者警告过[“更高的内联等级(比 -l 更高)可能导致 bug 或不被支持”][11]。 Caveat emptor。[][12]
|
||||
3. 编译器有足够的能力来内联像 `strconv.ParseInt` 的复杂函数。作为一个实验,你可以尝试去掉 `//go:noinline` 注释,使用 `-gcflags=-m=2` 编译后观察。[][13]
|
||||
4. `race.Enable` 表达式是通过传递给 `go` 工具的 `-race` 参数控制的一个常量。对于普通编译,它的值是 `false`,此时编译器可以完全省略代码路径。[][14]
|
||||
|
||||
|
||||
|
||||
#### 相关文章:
|
||||
|
||||
1. [Go 中的内联优化][15]
|
||||
2. [goroutine 的栈为什么会无限增长?][16]
|
||||
3. [栈追踪和 errors 包][17]
|
||||
4. [零值是什么,为什么它很有用?][18]
|
||||
|
||||
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://dave.cheney.net/2020/05/02/mid-stack-inlining-in-go
|
||||
|
||||
作者:[Dave Cheney][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[lxbwolf](https://github.com/lxbwolf)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://dave.cheney.net/author/davecheney
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://dave.cheney.net/2020/04/25/inlining-optimisations-in-go
|
||||
[2]: https://medium.com/@joshsaintjacque/small-functions-considered-awesome-c95b3fd1812f
|
||||
[3]: tmp.FyRthF1bbF#easy-footnote-bottom-1-4076 (The budget the Go compiler applies to each function when considering if it is eligible for inlining changes release to release.)
|
||||
[4]: tmp.FyRthF1bbF#easy-footnote-bottom-2-4076 (Keep in mind that the compiler authors warn that “<a href="https://github.com/golang/go/blob/be08e10b3bc07f3a4e7b27f44d53d582e15fd6c7/src/cmd/compile/internal/gc/inl.go#L11">Additional levels of inlining (beyond -l) may be buggy and are not supported”</a>. Caveat emptor.)
|
||||
[5]: https://docs.google.com/presentation/d/1Wcblp3jpfeKwA0Y4FOmj63PW52M_qmNqlQkNaLj0P5o/edit#slide=id.p
|
||||
[6]: tmp.FyRthF1bbF#easy-footnote-bottom-3-4076 (The compiler is powerful enough that it can inline complex functions like <code>strconv.ParseInt</code>. As a experiment, try removing the <code>//go:noinline</code> annotation and observe the result with <code>-gcflags=-m=2</code>.)
|
||||
[7]: https://go-review.googlesource.com/c/go/+/148959
|
||||
[8]: tmp.FyRthF1bbF#easy-footnote-bottom-4-4076 (The expression <code>race.Enable</code> is a constant controlled by the <code>-race</code> flag passed to the <code>go</code> tool. It is <code>false</code> for normal builds which allows the compiler to elide those code paths entirely.)
|
||||
[9]: https://dave.cheney.net/2019/08/20/go-compiler-intrinsics
|
||||
[10]: tmp.FyRthF1bbF#easy-footnote-1-4076
|
||||
[11]: https://github.com/golang/go/blob/be08e10b3bc07f3a4e7b27f44d53d582e15fd6c7/src/cmd/compile/internal/gc/inl.go#L11
|
||||
[12]: tmp.FyRthF1bbF#easy-footnote-2-4076
|
||||
[13]: tmp.FyRthF1bbF#easy-footnote-3-4076
|
||||
[14]: tmp.FyRthF1bbF#easy-footnote-4-4076
|
||||
[15]: https://dave.cheney.net/2020/04/25/inlining-optimisations-in-go (Inlining optimisations in Go)
|
||||
[16]: https://dave.cheney.net/2013/06/02/why-is-a-goroutines-stack-infinite (Why is a Goroutine’s stack infinite ?)
|
||||
[17]: https://dave.cheney.net/2016/06/12/stack-traces-and-the-errors-package (Stack traces and the errors package)
|
||||
[18]: https://dave.cheney.net/2013/01/19/what-is-the-zero-value-and-why-is-it-useful (What is the zero value, and why is it useful?)
|