mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-10 22:21:11 +08:00
TSL
This commit is contained in:
parent
6c1aefb84a
commit
3e91d556b1
@ -1,111 +0,0 @@
|
||||
[#]: subject: "below: a time traveling resource monitor"
|
||||
[#]: via: "https://fedoramagazine.org/below-a-time-traveling-resource-monitor/"
|
||||
[#]: author: "Daniel Xu https://fedoramagazine.org/author/dxuu/"
|
||||
[#]: collector: "lujun9972"
|
||||
[#]: translator: "wxy"
|
||||
[#]: reviewer: " "
|
||||
[#]: publisher: " "
|
||||
[#]: url: " "
|
||||
|
||||
below: a time traveling resource monitor
|
||||
======
|
||||
|
||||
![][1]
|
||||
|
||||
In this article, we introduce _below_: an Apache 2.0 licensed resource monitor for modern Linux systems. _below_ allows you to replay previously recorded data.
|
||||
|
||||
### Background
|
||||
|
||||
One of the kernel’s primary responsibilities is mediating access to resources. Sometimes this might mean parceling out physical memory such that multiple processes can share the same host. Other times it might mean ensuring equitable distribution of CPU time. In all these contexts, the kernel provides the mechanism and leaves the policy to “someone else”. In more recent times, this “someone else” is usually a runtime like systemd or dockerd. The runtime takes input from a scheduler or end user (something along the lines of what to run and how to run it) and turns the right knobs and pulls the right levers on the kernel such that the workload can, well, get to work.
|
||||
|
||||
In a perfect world this would be the end of the story. However, the reality is that resource management is a complex and rather opaque amalgam of technologies that has evolved over decades of computing. Despite some of this technology having various warts and dead ends, the end result — a container — works relatively well. While the user does not usually need to concern themselves with the details, it is crucial for infrastructure operators to have visibility into their stack. Visibility and debuggability are essential for detecting and investigating misconfigurations, bugs, and systemic issues.
|
||||
|
||||
To make matters more complicated, resource outages are often difficult to reproduce. It is not unusual to spend weeks waiting for an issue to reoccur so that the root cause can be investigated. Scale further compounds this issue: one cannot run a custom script on _every_ host in the hopes of logging bits of crucial state if the bug happens again. Therefore, more sophisticated tooling is required. Enter _below_.
|
||||
|
||||
### Motivation
|
||||
|
||||
Historically Facebook has been a heavy user of _atop_ [0]. _atop_ is a performance monitor for Linux that is capable of reporting the activity of all processes as well as various pieces of system level activity. One of the most compelling features _atop_ has over tools like _htop_ is the ability to record historical data as a daemon. This sounds like a simple feature, but in practice this has enabled debugging countless production issues. With long enough data retention, it is possible to go backwards in time and look at the host state before, during, and after the issue or outage.
|
||||
|
||||
Unfortunately, it became clear over the years that _atop_ had certain deficiencies. First, cgroups [1] have emerged as the de facto way to control and monitor resources on a Linux machine. _atop_ still lacks support for this fundamental building block. Second, _atop_ stores data on disk with custom delta compression. This works fine under normal circumstances, but under heavy resource pressure the host is likely to lose data points. Since delta compression is in use, huge swaths of data can be lost for periods of time where the data is most important. Third, the user experience has a steep learning curve. We frequently heard from _atop_ power users that they love the dense layout and numerous keybindings. However, this is a double-edged sword. When someone new to the space wants to debug a production issue, they’re solving two problems at once now: the issue at hand and how to use _atop_.
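The delta compression problem described above can be illustrated with a minimal sketch. It is a deliberately simplified model, not _atop_'s actual on-disk format: the function names and the memory figures are made up for illustration, and real delta schemes may limit the damage in ways this sketch does not model.

```python
def encode_deltas(samples):
    """Store the first sample in full, then only the change from the previous one."""
    deltas = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev)
    return deltas


def decode_deltas(deltas, lost):
    """Rebuild samples; once a delta is lost, later values cannot be trusted."""
    out, running, broken = [], 0, False
    for i, d in enumerate(deltas):
        if i in lost:
            broken = True
        if broken:
            out.append(None)
        else:
            running += d
            out.append(running)
    return out


def decode_full(samples, lost):
    """With full samples, a lost record affects only itself."""
    return [None if i in lost else s for i, s in enumerate(samples)]


# Hypothetical memory usage in MB; the sample at index 2 is lost under pressure.
memory_mb = [100, 120, 500, 900, 950, 400]
lost = {2}
delta_result = decode_deltas(encode_deltas(memory_mb), lost)
full_result = decode_full(memory_mb, lost)
print(delta_result)  # [100, 120, None, None, None, None]
print(full_result)   # [100, 120, None, 900, 950, 400]
```

In this model a single lost data point hides the whole spike when deltas are chained, while storing every sample in full loses only the one record.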
|
||||
|
||||
_below_ was designed and developed by and for the resource control team at Facebook with input from production _atop_ users. The resource control team is responsible for, as the name suggests, resource management at scale. The team is composed of kernel developers, container runtime developers, and hardware folks. Recognizing the opportunity for a next-generation system monitor, we designed _below_ with the following in mind:
|
||||
|
||||
  * Ease of use: _below_ must be both intuitive for new users and powerful for daily users.
|
||||
* Opinionated statistics: _below_ displays accurate and useful statistics. We try to avoid collecting and dumping stats just because we can.
|
||||
* Flexibility: when the default settings are not enough, we allow the user to customize their experience. Examples include configurable keybindings, configurable default view, and a scripting interface (the default being a terminal user interface).
|
||||
|
||||
|
||||
|
||||
### Install
|
||||
|
||||
To install the package:
|
||||
|
||||
```
|
||||
# dnf install -y below
|
||||
```
|
||||
|
||||
To turn on the recording daemon:
|
||||
|
||||
```
|
||||
# systemctl enable --now below
|
||||
```
|
||||
|
||||
### Quick tour
|
||||
|
||||
_below_’s most commonly used mode is replay mode. As the name implies, replay mode replays previously recorded data. Assuming you’ve already started the recording daemon, start a session by running:
|
||||
|
||||
```
|
||||
$ below replay --time "5 minutes ago"
|
||||
```
|
||||
|
||||
You will then see the cgroup view:
|
||||
|
||||
![][2]
|
||||
|
||||
If you get stuck or forget a keybinding, press **?** to access the help menu.
|
||||
|
||||
The very top of the screen is the status bar. The status bar displays information about the current sample. You can move forwards and backwards through samples by pressing **t** and **T**, respectively. The middle section is the system overview. The system overview contains statistics about the system as a whole that are generally always useful to see. The third and lowest section is the multipurpose view. The image above shows the cgroup view. Additionally, there are process and system views, accessible by pressing **p** and **s**, respectively.
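To make the idea of stepping through samples concrete, here is a small sketch of how a requested time such as "5 minutes ago" could be mapped onto a recorded sample. This is not _below_'s implementation: `find_sample_index`, the in-memory list of timestamps, and the 5-second recording interval are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def find_sample_index(timestamps, target):
    """Return the index of the latest sample taken at or before target.

    timestamps must be sorted ascending. Falls back to the first sample
    when target is older than everything recorded.
    """
    if not timestamps:
        raise ValueError("no samples recorded")
    return max(bisect_right(timestamps, target) - 1, 0)


now = datetime(2021, 8, 1, 12, 0, 0)
# Hypothetical store: one sample every 5 seconds for the last 10 minutes.
timestamps = [now - timedelta(seconds=s) for s in range(600, -1, -5)]

index = find_sample_index(timestamps, now - timedelta(minutes=5))
print(timestamps[index])

# Moving to the next or previous sample is then just index + 1 or index - 1,
# clamped to the recorded range.
next_index = min(index + 1, len(timestamps) - 1)
prev_index = max(index - 1, 0)
```

A binary search keeps the lookup cheap even with long data retention, which is why the sketch uses `bisect` rather than a linear scan.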
|
||||
|
||||
Press **↑** and **↓** to move the list selection. Press **<Enter>** to collapse and expand cgroups. Suppose you’ve found an interesting cgroup and you want to see what processes are running inside it. To zoom into the process view, select the cgroup and press **z**:
|
||||
|
||||
![][3]
|
||||
|
||||
Press **z** again to return to the cgroup view. The cgroup view can be somewhat long at times. If you have a vague idea of what you’re looking for, you can filter by cgroup name by pressing **/** and entering a filter:
|
||||
|
||||
![][4]
|
||||
|
||||
At this point, you may have noticed a tab system we haven’t explored yet. To cycle forwards and backwards through tabs, press **<Tab>** and **<Shift> + <Tab>**, respectively. We’ll leave this as an exercise for the reader.
|
||||
|
||||
### Other features
|
||||
|
||||
Under the hood, _below_ has a powerful design and architecture. Facebook is constantly upgrading to newer kernels, so we never assume a data source is available. This design principle enables total backwards and forwards compatibility between kernels and _below_ versions. Furthermore, each data point is zstd compressed and stored in full. This solves the issues with delta compression we’ve seen _atop_ have at scale. Based on our tests, our per-sample compression achieves a 5x compression ratio on average.
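The two ideas in this paragraph, tolerating missing data sources and compressing each sample on its own, can be sketched as follows. This is an illustration under stated assumptions rather than _below_'s code: it uses Python's standard `zlib` as a stand-in for zstd, JSON as a made-up serialization format, and the helper names and the nonexistent `/proc` path are hypothetical.

```python
import json
import zlib
from pathlib import Path
from typing import Optional


def read_optional(path: str) -> Optional[str]:
    """Read a data source, returning None if this kernel does not provide it."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


def pack_sample(sample: dict) -> bytes:
    """Serialize and compress one complete sample on its own (zlib stands in for zstd)."""
    return zlib.compress(json.dumps(sample).encode("utf-8"))


def unpack_sample(blob: bytes) -> dict:
    """Decompress one sample without needing any neighbouring sample."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


sample = {
    "timestamp": 1628000000,
    "loadavg": read_optional("/proc/loadavg"),
    # Hypothetical source that does not exist: recorded as None, not an error.
    "made_up_source": read_optional("/proc/this_file_does_not_exist"),
}
blob = pack_sample(sample)
restored = unpack_sample(blob)
print(restored["made_up_source"])  # None
```

Because every field may be absent and every blob is self-contained, a reader in this sketch can open any single sample regardless of which kernel produced it or what happened to the samples around it.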
|
||||
|
||||
_below_ also uses eBPF [2] to collect information about short-lived processes (processes that live for shorter than the data collection interval). In contrast, _atop_ implements this feature with BSD process accounting, a known slow and priority-inversion-prone kernel interface.
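Why short-lived processes need special handling can be shown with a small simulation. The sketch below does not use eBPF or BSD process accounting; it only models a collector that polls at a fixed interval, and the process names, lifetimes, and interval are made up for illustration.

```python
def visible_to_polling(processes, interval, duration):
    """Return names of processes alive at any polling instant.

    processes maps a name to its (start, end) lifetime in seconds.
    Polls happen at 0, interval, 2 * interval, ... up to duration.
    """
    polls = range(0, duration + 1, interval)
    seen = set()
    for name, (start, end) in processes.items():
        if any(start <= t < end for t in polls):
            seen.add(name)
    return seen


processes = {
    "long_running_daemon": (0, 60),
    "short_lived_script": (6, 9),  # starts and exits between two polls
}
seen = visible_to_polling(processes, interval=5, duration=60)
missed = set(processes) - seen
print(sorted(missed))  # ['short_lived_script']
```

An event-driven source that is notified when a process exits does not depend on the polling instants, so it can account for processes like the one missed here.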
|
||||
|
||||
For the user, _below_ also supports a live mode and a dump interface. Live mode combines the recording daemon and the TUI session into one process. This is convenient for browsing system state without committing to a long-running daemon or disk space for data storage. The dump interface is a scriptable interface to all the data _below_ stores. Dump is both powerful and flexible: detailed data is available in CSV, JSON, and human-readable formats.
|
||||
|
||||
### Conclusion
|
||||
|
||||
_below_ is an Apache 2.0 licensed open source project that we (the _below_ developers) think offers compelling advantages over existing tools in the resource monitoring space. We’ve spent a great deal of effort preparing _below_ for open source use so we hope that readers and the community get a chance to try _below_ out and report back with bugs and feature requests.
|
||||
|
||||
[0]: <https://www.atoptool.nl/>
|
||||
[1]: <https://en.wikipedia.org/wiki/Cgroups>
|
||||
[2]: <https://ebpf.io/>
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://fedoramagazine.org/below-a-time-traveling-resource-monitor/
|
||||
|
||||
Author: [Daniel Xu][a]
|
||||
Topic selection: [lujun9972][b]
|
||||
Translator: [Translator ID](https://github.com/译者ID)
|
||||
Proofreader: [Proofreader ID](https://github.com/校对者ID)
|
||||
|
||||
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
|
||||
|
||||
[a]: https://fedoramagazine.org/author/dxuu/
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://fedoramagazine.org/wp-content/uploads/2021/08/below_resource_monitor-816x345.jpg
|
||||
[2]: https://fedoramagazine.org/wp-content/uploads/2021/08/image-1024x800.png
|
||||
[3]: https://fedoramagazine.org/wp-content/uploads/2021/08/image-1-1024x800.png
|
||||
[4]: https://fedoramagazine.org/wp-content/uploads/2021/08/image-2-1024x800.png
|
@ -0,0 +1,108 @@
|
||||
[#]: subject: "below: a time traveling resource monitor"
|
||||
[#]: via: "https://fedoramagazine.org/below-a-time-traveling-resource-monitor/"
|
||||
[#]: author: "Daniel Xu https://fedoramagazine.org/author/dxuu/"
|
||||
[#]: collector: "lujun9972"
|
||||
[#]: translator: "wxy"
|
||||
[#]: reviewer: " "
|
||||
[#]: publisher: " "
|
||||
[#]: url: " "
|
||||
|
||||
Below: a time-traveling resource monitor
|
||||
======
|
||||
|
||||
![][1]
|
||||
|
||||
In this article, we introduce below: an Apache 2.0 licensed resource monitor for modern Linux systems. below lets you replay previously recorded data.
|
||||
|
||||
### Background
|
||||
|
||||
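One of the kernel's primary responsibilities is mediating access to resources. Sometimes this means allocating physical memory so that multiple processes can share the same host. Other times it means ensuring a fair distribution of CPU time. In all of these cases, the kernel provides the mechanism and leaves the policy to "someone else". These days, this "someone else" is usually a runtime such as systemd or dockerd. The runtime takes input from a scheduler or end user (something along the lines of what to run and how to run it) and turns the right knobs and pulls the right levers on the kernel so that the workload can *properly* get to work.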
|
||||
|
||||
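In a perfect world, this would be the end of the story. In reality, however, resource management is a complex and rather opaque mix of technologies that has evolved over decades of computing. Although some of this technology has various flaws and dead ends, the end result, a container, works relatively well. While users usually do not need to care about the details, it is crucial for infrastructure operators to have visibility into their stack. Visibility and debuggability are essential for detecting and investigating misconfigurations, bugs, and systemic issues.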
|
||||
|
||||
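To make matters more complicated, resource outages are often hard to reproduce. It is common to spend weeks waiting for an issue to recur so that its root cause can be investigated. Scale compounds the problem further: one cannot run a custom script on *every* host in the hope of logging pieces of crucial state when the bug happens again. More sophisticated tooling is therefore required. Enter below.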
|
||||
|
||||
### Motivation
|
||||
|
||||
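Historically, Facebook has been a heavy user of [atop][5]. `atop` is a performance monitor for Linux that can report the activity of all processes as well as various kinds of system-level activity. Compared with tools like `htop`, one of `atop`'s most compelling features is its ability to record historical data as a daemon. This sounds like a simple feature, but in practice it has made it possible to debug countless production issues. With long enough data retention, you can go back in time and look at the host state before, during, and after an issue or outage.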
|
||||
|
||||
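Unfortunately, it became clear over time that `atop` has certain deficiencies. First, [cgroups][6] (control groups) have become the de facto way to control and monitor resources on a Linux machine. `atop` still lacks support for this fundamental building block. Second, `atop` stores data on disk using a custom delta compression scheme. This works well under normal circumstances, but under heavy resource pressure the host is likely to lose data points. Because delta compression is used, large stretches of data can be lost during exactly the periods when the data matters most. Third, the user experience has a steep learning curve. We often hear from `atop` power users that they love the dense layout and the numerous keybindings. However, this is a double-edged sword. When someone new to the field wants to debug a production issue, they now have to solve two problems at once: the issue at hand and how to use `atop`.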
|
||||
|
||||
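`below` was designed and developed by and for the resource control team at Facebook, with input from production `atop` users. As the name suggests, the resource control team is responsible for resource management at scale. The team is made up of kernel developers, container runtime developers, and hardware engineers. Recognizing the opportunity for a next-generation system monitor, we designed `below` with the following in mind: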
|
||||
|
||||
* Ease of use: `below` must be both intuitive for new users and powerful for daily users.
|
||||
* Opinionated statistics: `below` displays accurate and useful statistics. We try to avoid collecting and dumping statistics just because we can.
|
||||
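* Flexibility: when the default settings are not enough, we let users customize their experience. Examples include configurable keybindings, a configurable default view, and a scripting interface (the default being a terminal user interface).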
|
||||
|
||||
### Install
|
||||
|
||||
To install the package:
|
||||
|
||||
```
|
||||
# dnf install -y below
|
||||
```
|
||||
|
||||
To turn on the recording daemon:
|
||||
|
||||
```
|
||||
# systemctl enable --now below
|
||||
```
|
||||
|
||||
### Quick tour
|
||||
|
||||
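`below`'s most commonly used mode is replay mode. As the name implies, replay mode replays previously recorded data. Assuming you have already started the recording daemon, start a session by running: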
|
||||
|
||||
```
|
||||
$ below replay --time "5 minutes ago"
|
||||
```
|
||||
|
||||
You will then see the cgroup view:
|
||||
|
||||
![][2]
|
||||
|
||||
If you get stuck or forget a keybinding, press `?` to open the help menu.
|
||||
|
||||
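The very top of the screen is the status bar. The status bar displays information about the current sample. You can move forwards and backwards through samples by pressing `t` and `T`, respectively. The middle section is the system overview. It contains statistics about the system as a whole that are generally always useful to see. The third and lowest section is the multipurpose view. The image above shows the cgroup view. There are also process and system views, accessible by pressing `p` and `s`, respectively.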
|
||||
|
||||
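Press `↑` and `↓` to move the list selection. Press Enter to collapse and expand cgroups. Suppose you have found an interesting cgroup and want to see which processes are running inside it. To zoom into the process view, select the cgroup and press `z`: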
|
||||
|
||||
![][3]
|
||||
|
||||
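Press `z` again to return to the cgroup view. This view can get somewhat long at times. If you have a rough idea of what you are looking for, you can filter by cgroup name by pressing `/` and entering a filter: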
|
||||
|
||||
![][4]
|
||||
|
||||
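At this point, you may have noticed a tab system that we have not explored yet. To cycle forwards and backwards through the tabs, press `Tab` and `Shift` + `Tab`, respectively. We leave this as an exercise for the reader.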
|
||||
|
||||
### Other features
|
||||
|
||||
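Under the hood, `below` has a powerful design and architecture. Facebook is constantly upgrading to newer kernels, so we never assume that a data source is available. This design principle enables full forwards and backwards compatibility between kernels and `below` versions. In addition, each data point is compressed with zstd and stored in full. This solves the delta compression problems we have seen `atop` run into at scale. Based on our tests, our per-sample compression achieves a 5x compression ratio on average.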
|
||||
|
||||
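`below` also uses [eBPF][7] to collect information about short-lived processes (processes that live for less than the data collection interval). In contrast, `atop` implements this feature with BSD process accounting, a kernel interface known to be slow and prone to priority inversion.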
|
||||
|
||||
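For users, `below` also supports a live mode and a dump interface. Live mode combines the recording daemon and the TUI session into a single process. This is convenient for browsing system state without committing to a long-running daemon or disk space for data storage. The dump interface is a scriptable interface to all the data that `below` stores. Dump is both powerful and flexible: detailed data is available in CSV, JSON, and human-readable formats.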
|
||||
|
||||
### Conclusion
|
||||
|
||||
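`below` is an Apache 2.0 licensed open source project that we (the `below` developers) believe offers compelling advantages over existing tools in the resource monitoring space. We have put a great deal of effort into preparing `below` for open source use, so we hope that readers and the community get a chance to try `below` and report back with bugs and feature requests.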
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://fedoramagazine.org/below-a-time-traveling-resource-monitor/
|
||||
|
||||
Author: [Daniel Xu][a]
|
||||
Topic selection: [lujun9972][b]
|
||||
Translator: [wxy](https://github.com/wxy)
|
||||
Proofreader: [Proofreader ID](https://github.com/校对者ID)
|
||||
|
||||
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
|
||||
|
||||
[a]: https://fedoramagazine.org/author/dxuu/
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://fedoramagazine.org/wp-content/uploads/2021/08/below_resource_monitor-816x345.jpg
|
||||
[2]: https://fedoramagazine.org/wp-content/uploads/2021/08/image-1024x800.png
|
||||
[3]: https://fedoramagazine.org/wp-content/uploads/2021/08/image-1-1024x800.png
|
||||
[4]: https://fedoramagazine.org/wp-content/uploads/2021/08/image-2-1024x800.png
|
||||
[5]: <https://www.atoptool.nl/>
|
||||
[6]: <https://en.wikipedia.org/wiki/Cgroups>
|
||||
[7]: <https://ebpf.io/>
|