mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-13 22:30:37 +08:00
parent
8d1ebfdb25
commit
0094f3f872
@ -0,0 +1,162 @@
|
||||
[#]: subject: "A 10-minute guide to the Linux ABI"
|
||||
[#]: via: "https://opensource.com/article/22/12/linux-abi"
|
||||
[#]: author: "Alison Chaiken https://opensource.com/users/chaiken"
|
||||
[#]: collector: "lkxed"
|
||||
[#]: translator: "ChatGPT"
|
||||
[#]: reviewer: "wxy"
|
||||
[#]: publisher: "wxy"
|
||||
[#]: url: "https://linux.cn/article-16002-1.html"
|
||||
|
||||
10 分钟让你了解 Linux ABI
|
||||
======
|
||||
|
||||
![][0]
|
||||
|
||||
> 熟悉 ABI 的概念、ABI 稳定性的重要性以及 Linux 稳定 ABI 中包含的内容。
|
||||
|
||||
> LCTT 译注:昨天,AlmaLinux 称将 [放弃](https://linux.cn/article-16000-1.html) 对 RHEL 的 1:1 兼容性,但将保持对 RHEL 的 ABI 兼容,以便在 RHEL 上运行的软件可以无缝地运行在 AlmaLinux 上。可能有的同学对 ABI 的概念还不是很清楚,因此翻译此文供大家了解。
|
||||
|
||||
许多 Linux 爱好者都熟悉 Linus Torvalds 的 [著名告诫][1]:“我们不破坏用户空间”,但可能并非每个听到这句话的人都清楚其含义。
|
||||
|
||||
这个“第一规则”提醒开发人员关于应用程序的二进制接口(ABI)的稳定性,该接口用于应用程序与内核之间的通信和配置。接下来的内容旨在使读者熟悉 ABI 的概念,阐述为什么 ABI 的稳定性很重要,并讨论 Linux 稳定 ABI 中包含了哪些内容。Linux 的持续增长和演进需要对 ABI 进行变更,其中一些变更引起了争议。
|
||||
|
||||
### 什么是 ABI?
|
||||
|
||||
ABI 表示 <ruby>应用程序二进制接口<rt>Applications Binary Interface</rt></ruby>。理解 ABI 概念的一种方式是考虑它与其他概念的区别。对于许多开发人员来说,<ruby>应用程序编程接口<rt>Applications Programming Interface</rt></ruby>(API)更为熟悉。通常,库的头文件和文档被认为是其 API,以及还有像 [HTML5][2] 这样的标准文档。调用库或交换字符串格式数据的程序必须遵守 API 中所描述的约定,否则可能得到意外的结果。
|
||||
|
||||
ABI 类似于 API,因为它们规定了命令的解释和二进制数据的交换方式。对于 C 程序,ABI 通常包括函数的返回类型和参数列表、结构体的布局,以及枚举类型的含义、顺序和范围。截至 2022 年,Linux 内核仍然几乎完全是 C 程序,因此必须遵守这些规范。
|
||||
|
||||
“[内核系统调用接口][3]” 的描述可以在《[Linux 手册第 2 节][4]》中找到,并包括了可从中间件应用程序调用的类似 `mount` 和 `sync` 的 C 版本函数。这些函数的二进制布局是 Linux ABI 的第一个重要组成部分。对于问题 “Linux 的稳定 ABI 包括哪些内容?”,许多用户和开发人员的回答是 “sysfs(`/sys`)和 procfs(`/proc`)的内容”。而实际上,[官方 Linux ABI 文档][5] 确实主要集中在这些 [虚拟文件系统][6] 上。
|
||||
|
||||
前面着重介绍了 Linux ABI 在程序中的应用方式,但未涵盖同等重要的人为因素。正如下图所示,ABI 的功能需要内核社区、C 编译器(如 [GCC][7] 或 [clang][8])、创建用户空间 C 库(通常是 [glibc][9])的开发人员,以及按照 [可执行与链接格式(ELF)][10] 布局的二进制应用程序之间的合作努力。
|
||||
|
||||
![开发社区内的合作][11]
|
||||
|
||||
### 为什么我们关注 ABI?
|
||||
|
||||
来自 Torvalds 本人的 Linux ABI 的稳定性保证,使得 Linux 发行版和个人用户能够独立更新内核,而不受操作系统的影响。
|
||||
|
||||
如果 Linux 没有稳定的 ABI,那么每次内核需要修补以解决安全问题时,操作系统的大部分甚至全部内容都需要重新安装。显然,二进制接口的稳定性是 Linux 的可用性和广泛采用的重要因素之一。
|
||||
|
||||
![Terminal output][12]
|
||||
|
||||
如上图所示,内核(在 `linux-libc-dev` 中)和 Glibc(在 `libc6-dev` 中)都提供了定义文件权限的位掩码。显然,这两个定义集必须一致!`apt` 软件包管理器会识别软件包提供每个文件。Glibc ABI 的潜在不稳定部分位于 `bits/` 目录中。
|
||||
|
||||
在大部分情况下,Linux ABI 的稳定性保证运作良好。按照 <ruby>[康韦定律][13]<rt>Conway's Law</rt></ruby>,在开发过程中出现的烦人技术问题往往是由于不同软件开发社区之间的误解或分歧所致,而这些社区都为 Linux 做出了贡献。不同社区之间的接口可以通过 Linux 包管理器的元数据轻松地进行想象,如上图所示。
|
||||
|
||||
### Y2038:一个 ABI 破坏的例子
|
||||
|
||||
通过考虑当前正在进行的、[缓慢发生][14] 的 “Y2038” ABI 破坏的例子,可以更好地理解 Linux ABI。在 2038 年 1 月,32 位时间计数器将回滚到全零,就像较旧车辆的里程表一样。2038 年 1 月听起来还很遥远,但可以肯定的是,如今销售的许多物联网设备仍将处于运行状态。像今年安装的 [智能电表][15] 和 [智能停车系统][16] 这样的普通产品可能采用的是 32 位处理器架构,而且也可能不支持软件更新。
|
||||
|
||||
Linux 内核已经在内部转向使用 64 位的 `time_t` 不透明数据类型来表示更晚的时间点。这意味着像 `time()` 这样的系统调用在 64 位系统上已经变更了它们的函数签名。这些努力的艰难程度可以在内核头文件中(例如 [time_types.h][17])清楚地看到,在那里放着新的和 `_old` 版本的数据结构。
|
||||
|
||||
![里程表翻转][18]
|
||||
|
||||
Glibc 项目也 [支持 64 位时间][19],那么就大功告成了,对吗?不幸的是,根据 [Debian 邮件列表中的讨论][20] 来看,情况并非如此。发行版面临难以选择的问题,要么为 32 位系统提供所有二进制软件包的两个版本,要么为安装介质提供两个版本。在后一种情况下,32 位时间的用户将不得不重新编译其应用程序并重新安装。正如往常一样,专有应用程序才是一个真正的头疼问题。
|
||||
|
||||
### Linux 稳定 ABI 里到底包括什么内容?
|
||||
|
||||
理解稳定 ABI 有些微妙。需要考虑的是,尽管大部分 sysfs 是稳定 ABI,但调试接口肯定是不稳定的,因为它们将内核内部暴露给用户空间。Linus Torvalds 曾表示,“不要破坏用户空间”,通常情况下,他是指保护那些 “只想它能工作” 的普通用户,而不是系统程序员和内核工程师,后者应该能够阅读内核文档和源代码,以了解不同版本之间发生了什么变化。下图展示了这个区别。
|
||||
|
||||
![稳定性保证][21]
|
||||
|
||||
普通用户不太可能与 Linux ABI 的不稳定部分进行交互,但系统程序员可能无意中这样做。除了 `/sys/kernel/debug` 以外,sysfs(`/sys`)和 procfs(`/proc`)的所有部分都是稳定的。
|
||||
|
||||
那么其他对用户空间可见的二进制接口如何呢,包括 `/dev` 中的设备文件、内核日志文件(可通过 `dmesg` 命令读取)、文件系统元数据或在内核的 “命令行” 中提供的 “引导参数”(在引导加载程序如 GRUB 或 u-boot 中可见)呢?当然,“这要视情况而定”。
|
||||
|
||||
### 挂载旧文件系统
|
||||
|
||||
除了 Linux 系统在引导过程中出现挂起之外,文件系统无法挂载是最令人失望的事情。如果文件系统位于付费客户的固态硬盘上,那么问题确实十分严重。当内核升级时,一个能够在旧内核版本下挂载的 Linux 文件系统应该仍然能够挂载,对吗?实际上,“这要视情况而定”。
|
||||
|
||||
在 2020 年,一位受到伤害的 Linux 开发人员在内核的邮件列表上 [抱怨道][23]:
|
||||
|
||||
> 内核已经接受这个作为一个有效的可挂载文件系统格式,没有任何错误或任何类型的警告,而且已经这样稳定地工作了多年……我一直普遍地以为,挂载现有的根文件系统属于内核<->用户空间或内核<->现有系统边界的范围,由内核接受并被现有用户空间成功使用的内容所定义,升级内核应该与现有用户空间和系统兼容。
|
||||
|
||||
但是有一个问题:这些无法挂载的文件系统是使用一种依赖于内核定义,但并未被内核使用的标志的专有工具创建的。该标志未出现在 Linux 的 API 头文件或 procfs/sysfs 中,而是一种 [实现细节][24]。因此,在用户空间代码中解释该标志意味着依赖于“[未定义行为][25]”,这是个几乎会让每个软件开发人员都感到战栗的短语。当内核社区改进其内部测试并开始进行新的一致性检查时,“[man 2 mount][26]” 系统调用突然开始拒绝具有专有格式的文件系统。由于该格式的创建者明确是一位软件开发人员,因此他未能得到内核文件系统维护者的同情。
|
||||
|
||||
![施工标志上写着工作人员在树上进行工作][27]
|
||||
|
||||
### 线程化内核的 dmesg 日志
|
||||
|
||||
`/dev` 目录中的文件格式是否保证稳定或不稳定?[dmesg 命令][28] 会从文件 `/dev/kmsg` 中读取内容。2018 年,一位开发人员 [为 dmesg 输出实现了线程化][29],使内核能够“在打印一系列 `printk()` 消息到控制台时,不会被中断和/或被其他线程的并发 `printk()` 干扰”。听起来很棒!通过在 `/dev/kmsg` 输出的每一行添加线程 ID,实现了线程化。密切关注的读者将意识到这个改动改变了 `/dev/kmsg` 的 ABI,这意味着解析该文件的应用程序也需要进行相应的修改。由于许多发行版没有编译启用新功能的内核,大多数使用 `/bin/dmesg` 的用户可能没有注意到这件事,但这个改动破坏了 [GDB 调试器][30] 读取内核日志的能力。
|
||||
|
||||
确实,敏锐的读者会认为 GDB 的用户运气不佳,因为调试器是开发人员工具。实际上并非如此,因为需要更新以支持新的 `/dev/kmsg` 格式的代码位于内核自己的 Git 源代码库的 “树内” 部分。对于一个正常的项目来说,单个代码库内的程序无法协同工作就是一个明显的错误,因此已经合并了一份 [使 GDB 能够与线程化的 /dev/kmsg 一起工作的补丁][32]。
|
||||
|
||||
### 那么 BPF 程序呢?
|
||||
|
||||
[BPF][33] 是一种强大的工具,可以在运行的内核中监控甚至实时进行配置。BPF 最初的目的是通过允许系统管理员即时从命令行修改数据包过滤器,从而支持实时网络配置。[Alexei Starovoitov 和其他人极大地扩展了 BPF][34],使其能够跟踪任意内核函数。跟踪明显是开发人员的领域,而不是普通用户,因此它显然不受任何 ABI 保证的约束(尽管 [bpf() 系统调用][35] 具有与其他系统调用相同的稳定性承诺)。另一方面,创建新功能的 BPF 程序为“[取代内核模块成为扩展内核的事实标准手段][36]”提供了可能性。内核模块使设备、文件系统、加密、网络等工作正常,因此明显是“只希望它工作”的普通用户所依赖的设施。问题是,与大多数开源内核模块不同,BPF 程序传统上不在内核源代码中。
|
||||
|
||||
2022 年春季,[一个提案][37] 成为了焦点,该提案提议使用微型 BPF 程序而不是设备驱动程序补丁,对广泛的人机接口设备(如鼠标和键盘)提供支持。
|
||||
|
||||
随后进行了一场激烈的讨论,但这个问题显然在 [Torvalds 在开源峰会上的评论][38] 中得到解决:
|
||||
|
||||
> 他指出,如果你破坏了“普通(非内核开发人员)用户使用的真实用户空间工具”,那么你需要修复它,无论是否使用了 eBPF。
|
||||
|
||||
一致意见似乎正在形成,即希望其 BPF 程序在内核更新后仍能正常工作的开发人员 [将需要将其提交到内核源代码库中一个尚未指定的位置][39]。敬请关注后继发展,以了解内核社区对于 BPF 和 ABI 稳定性将采取什么样的政策。
|
||||
|
||||
### 结论
|
||||
|
||||
内核的 ABI 稳定性保证适用于 procfs、sysfs 和系统调用接口,但也存在重要的例外情况。当内核变更破坏了“树内”代码或用户空间应用程序时,通常会迅速回滚有问题的补丁。对于依赖内核实现细节的专有代码,尽管这些细节可以从用户空间访问,但它并没有受到保护,并且在出现问题时得到的同情有限。当像 Y2038 这样的问题无法避免 ABI 破坏时,会以尽可能慎重和系统化的方式进行过渡。而像 BPF 程序这样的新功能提出了关于 ABI 稳定性边界的尚未解答的问题。
|
||||
|
||||
### 致谢
|
||||
|
||||
感谢 [Akkana Peck][40]、[Sarah R. Newman][41] 和 [Luke S. Crawford][42] 对早期版本材料的有益评论。
|
||||
|
||||
*(题图:MJ/da788385-ca24-4be5-bc27-ad7e7ef75973)*
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://opensource.com/article/22/12/linux-abi
|
||||
|
||||
作者:[Alison Chaiken][a]
|
||||
选题:[lkxed][b]
|
||||
译者:ChatGPT
|
||||
校对:[wxy](https://github.com/wxy)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://opensource.com/users/chaiken
|
||||
[b]: https://github.com/lkxed
|
||||
[1]: https://lkml.org/lkml/2018/12/22/232
|
||||
[2]: https://www.w3.org/TR/2014/REC-html5-20141028/
|
||||
[3]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi-stable.html#the-kernel-syscall-interface
|
||||
[4]: https://www.man7.org/linux/man-pages/dir_section_2.html
|
||||
[5]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi.html
|
||||
[6]: https://opensource.com/article/19/3/virtual-filesystems-linux
|
||||
[7]: https://gcc.gnu.org/
|
||||
[8]: https://clang.llvm.org/get_started.html
|
||||
[9]: https://www.gnu.org/software/libc/
|
||||
[10]: https://www.man7.org/linux/man-pages/man5/elf.5.html
|
||||
[11]: https://opensource.com/sites/default/files/2022-11/1cooperation.png
|
||||
[12]: https://opensource.com/sites/default/files/2022-12/better_apt-file-find_ABI-boundary.png
|
||||
[13]: https://en.wikipedia.org/wiki/Conway's_law
|
||||
[14]: https://www.phoronix.com/news/MTc2Mjg
|
||||
[15]: https://www.lfenergy.org/projects/super-advanced-meter-sam/
|
||||
[16]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506899/
|
||||
[17]: https://github.com/torvalds/linux/blob/master/include/uapi/linux/time_types.h
|
||||
[18]: https://opensource.com/sites/default/files/2022-11/3speedometerrollingover_0.jpg
|
||||
[19]: https://www.phoronix.com/scan.php?page=news_item&px=Glibc-More-Y2038-Work
|
||||
[20]: https://groups.google.com/g/linux.debian.ports.arm/c/_KBFSz4YRZs
|
||||
[21]: https://opensource.com/sites/default/files/2022-11/4stability.png
|
||||
[22]: https://lwn.net/Articles/833696/
|
||||
[23]: https://lwn.net/ml/linux-kernel/20201006050306.GA8098@localhost/
|
||||
[24]: https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)
|
||||
[25]: https://en.wikipedia.org/wiki/Undefined_behavior
|
||||
[26]: https://www.man7.org/linux/man-pages/man2/mount.2.html
|
||||
[27]: https://opensource.com/sites/default/files/2022-11/5crewworkingintrees.jpg
|
||||
[28]: https://www.man7.org/linux/man-pages/man1/dmesg.1.html
|
||||
[29]: https://lkml.org/lkml/2018/11/24/180
|
||||
[30]: https://sourceware.org/gdb/current/onlinedocs/gdb/
|
||||
[31]: https://unix.stackexchange.com/questions/208638/linux-kernel-meaning-of-source-tree-in-tree-and-out-of-tree
|
||||
[32]: https://lore.kernel.org/all/20191011142500.2339-1-joel.colledge@linbit.com/
|
||||
[33]: https://opensource.com/article/19/8/introduction-bpftrace
|
||||
[34]: https://lwn.net/Articles/740157/
|
||||
[35]: https://www.man7.org/linux/man-pages/man2/bpf.2.html
|
||||
[36]: https://lwn.net/Articles/909095/
|
||||
[37]: https://lwn.net/ml/ksummit-discuss/CAO-hwJJxCteD_BHZTeqQ1f7gWOHoj+05qP8bmFsRYVfMc_3FxQ@mail.gmail.com/
|
||||
[38]: https://lwn.net/ml/ksummit-discuss/20220621110514.6ef174d0@rorschach.local.home/
|
||||
[39]: https://lwn.net/ml/ksummit-discuss/20220616125128.68151432@gandalf.local.home/
|
||||
[40]: https://shallowsky.com/blog/
|
||||
[41]: https://www.socallinuxexpo.org/scale/19x/presentations/live-patching-down-trenches-view
|
||||
[42]: https://www.amazon.com/Book-Xen-Practical-System-Administrator/dp/1593271867
|
||||
[0]: https://img.linux.net.cn/data/attachment/album/202307/15/114240eo7her2zbdqqp448.jpg
|
@ -1,151 +0,0 @@
|
||||
[#]: subject: "A 10-minute guide to the Linux ABI"
|
||||
[#]: via: "https://opensource.com/article/22/12/linux-abi"
|
||||
[#]: author: "Alison Chaiken https://opensource.com/users/chaiken"
|
||||
[#]: collector: "lkxed"
|
||||
[#]: translator: " "
|
||||
[#]: reviewer: " "
|
||||
[#]: publisher: " "
|
||||
[#]: url: " "
|
||||
|
||||
A 10-minute guide to the Linux ABI
|
||||
======
|
||||
|
||||
Many Linux enthusiasts are familiar with Linus Torvalds' [famous admonition][1], "we don't break user space," but perhaps not everyone who recognizes the phrase is certain about what it means.
|
||||
|
||||
The "#1 rule" reminds developers about the stability of the applications' binary interface via which applications communicate with and configure the kernel. What follows is intended to familiarize readers with the concept of an ABI, describe why ABI stability matters, and discuss precisely what is included in Linux's stable ABI. The ongoing growth and evolution of Linux necessitate changes to the ABI, some of which have been controversial.
|
||||
|
||||
### What is an ABI?
|
||||
|
||||
ABI stands for Applications Binary Interface. One way to understand the concept of an ABI is to consider what it is not. Applications Programming Interfaces (APIs) are more familiar to many developers. Generally, the headers and documentation of libraries are considered to be their API, as are standards documents like those for [HTML5][2], for example. Programs that call into libraries or exchange string-formatted data must comply with the conventions described in the API or expect unwanted results.
|
||||
|
||||
ABIs are similar to APIs in that they govern the interpretation of commands and exchange of binary data. For C programs, the ABI generally comprises the return types and parameter lists of functions, the layout of structs, and the meaning, ordering, and range of enumerated types. The Linux kernel remains, as of 2022, almost entirely a C program, so it must adhere to these specifications.
|
||||
|
||||
"[The kernel syscall interface][3]" is described by [Section 2 of the Linux man pages][4] and includes the C versions of familiar functions like "mount" and "sync" that are callable from middleware applications. The binary layout of these functions is the first major part of Linux's ABI. In answer to the question, "What is in Linux's stable ABI?" many users and developers will respond with "the contents of sysfs (/sys) and procfs (/proc)." In fact, the [official Linux ABI documentation][5] concentrates mostly on these [virtual filesystems][6].
|
||||
|
||||
The preceding text focuses on how the Linux ABI is exercised by programs but fails to capture the equally important human aspect. As the figure below illustrates, the functionality of the ABI requires a joint, ongoing effort by the kernel community, C compilers (such as [GCC][7] or [clang][8]), the developers who create the userspace C library (most commonly [glibc][9]) that implements system calls, and binary applications, which much be laid out in accordance with the Executable and Linking Format ([ELF][10]).
|
||||
|
||||
![Cooperation within the development community][11]
|
||||
|
||||
### Why do we care about the ABI?
|
||||
|
||||
The Linux ABI stability guarantee that comes from Torvalds himself enables Linux distros and individual users to update the kernel independently of the operating system.
|
||||
|
||||
If Linux did not have a stable ABI, then every time the kernel needed patching to address a security problem, a large part of the operating system, if not the entirety, would need to be reinstalled. Obviously, the stability of the binary interface is a major contributing factor to Linux's usability and wide adoption.
|
||||
|
||||
![Terminal output][12]
|
||||
|
||||
As the second figure illustrates, both the kernel (in linux-libc-dev) and Glibc (in `libc6-dev`) provide bitmasks that define file permissions. Obviously the two sets of definitions must agree! The `apt` package manager identifies which software project provided each file. The potentially unstable part of Glibc's ABI is found in the `bits/` directory.
|
||||
|
||||
For the most part, the Linux ABI stability guarantee works just fine. In keeping with [Conway's Law][13], vexing technical issues that arise in the course of development most frequently occur due to misunderstandings or disagreements between different software development communities that contribute to Linux. The interface between communities is easy to envision via Linux package-manager metadata, as shown in the image above.
|
||||
|
||||
### Y2038: An example of an ABI break
|
||||
|
||||
The Linux ABI is best understood by considering the example of the ongoing, [slow-motion][14] "Y2038" ABI break. In January 2038, 32-bit time counters will roll over to all zeroes, just like the odometer of an older vehicle. January 2038 sounds far away, but assuredly many IoT devices sold in 2022 will still be operational. Mundane products like [smart electrical meters][15] and [smart parking systems][16] installed this year may or may not have 32-bit processor architectures and may or may not support software updates.
|
||||
|
||||
The Linux kernel has already moved to a 64-bit `time_t` opaque data type internally to represent later timepoints. The implication is that system calls like `time()` have already changed their function signature on 64-bit systems. The arduousness of these efforts is on ready display in kernel headers like [time_types.h][17], which includes new and "_old" versions of data structures.
|
||||
|
||||
![Odometer rolling over][18]
|
||||
|
||||
The Glibc project also [supports 64-bit time][19], so yay, we're done, right? Unfortunately, no, as a [discussion on the Debian mailing list][20] makes clear. Distros are faced with the unenviable choice of either providing two versions of all binary packages for 32-bit systems or two versions of installation media. In the latter case, users of 32-bit time will have to recompile their applications and reinstall. As always, proprietary applications will be a real headache.
|
||||
|
||||
### What precisely is in the Linux stable ABI anyway?
|
||||
|
||||
Understanding the stable ABI is a bit subtle. Consider that, while most of sysfs is stable ABI, the debug interfaces are guaranteed to be _un_stable since they expose kernel internals to userspace. In general, Linus Torvalds has pronounced that by "don't break userspace," he means to protect ordinary users who "just want it to work" rather than system programmers and kernel engineers, who should be able to read the kernel documentation and source code to figure out what has changed between releases. The distinction is illustrated in the figure below.
|
||||
|
||||
![Stability guarantee][21]
|
||||
|
||||
Ordinary users are unlikely to interact with unstable parts of the Linux ABI, but system programmers may do so inadvertently. All of sysfs (`/sys`) and procfs (`/proc`) are guaranteed stable except for `/sys/kernel/debug`.
|
||||
|
||||
But what about other binary interfaces that are userspace-visible, including miscellaneous ABI bits like device files in `/dev`, the kernel log file (readable with the `dmesg` command), filesystem metadata, or "bootargs" provided on the kernel "command line" that are visible in a bootloader like GRUB or u-boot? Naturally, "it depends."
|
||||
|
||||
### Mounting old filesystems
|
||||
|
||||
Next to observing a Linux system hang during the boot sequence, having a filesystem fail to mount is the greatest disappointment. If the filesystem resides on an SSD belonging to a paying customer, the matter is grave indeed. Surely a Linux filesystem that mounts with an old kernel version will still mount when the kernel is upgraded, right? Actually, "[it depends][22]."
|
||||
|
||||
In 2020 an aggrieved Linux developer [complained on the kernel's mailing list][23]:
|
||||
|
||||
> The kernel already accepted this as a valid mountable filesystem format, without a single error or warning of any kind, and has done so stably for years. . . . I was generally under the impression that mounting existing root filesystems fell under the scope of the kernel<->userspace or kernel<->existing-system boundary, as defined by what the kernel accepts and existing userspace has used successfully, and that upgrading the kernel should work with existing userspace and systems.
|
||||
|
||||
But there was a catch: The filesystems that failed to mount were created with a proprietary tool that relied on a flag that was defined but not used by the kernel. The flag did not appear in Linux's API header files or procfs/sysfs but was instead an [implementation detail][24]. Therefore, interpreting the flag in userspace code meant relying on "[undefined behavior][25]," a phrase that will make software developers almost universally shudder. When the kernel community improved its internal testing and started making new consistency checks, the "[man 2 mount][26]" system call suddenly began rejecting filesystems with the proprietary format. Since the format creator was decidedly a software developer, he got little sympathy from kernel filesystem maintainers.
|
||||
|
||||
![Construction sign reading crews working in trees][27]
|
||||
|
||||
### Threading the kernel dmesg log
|
||||
|
||||
Is the format of files in `/dev` guaranteed stable or not? The [command dmesg][28] reads from the file `/dev/kmsg`. In 2018, a developer [made output to dmesg threaded][29], enabling the kernel "to print a series of printk() messages to consoles without being disturbed by concurrent printk() from interrupts and/or other threads." Sounds excellent! Threading was made possible by adding a thread ID to each line of the `/dev/kmsg` output. Readers following closely will realize that the addition changed the ABI of `/dev/kmsg`, meaning that applications that parse that file needed to change too. Since many distros didn't compile their kernels with the new feature enabled, most users of `/bin/dmesg` won't have noticed, but the change broke the [GDB debugger][30]'s ability to read the kernel log.
|
||||
|
||||
Assuredly, astute readers will think users of GDB are out of luck because debuggers are developer tools. Actually, no, since the code that needed to be updated to support the new `/dev/kmsg` format was "[in-tree][31]," meaning part of the kernel's own Git source repository. The failure of programs within a single repo to work together is just an out-and-out bug for any sane project, and a [patch that made GDB work with threaded /dev/kmsg][32] was merged.
|
||||
|
||||
### What about BPF programs?
|
||||
|
||||
[BPF][33] is a powerful tool to monitor and even configure the running kernel dynamically. BPF's original purpose was to support on-the-fly network configuration by allowing sysadmins to modify packet filters from the command line instantly. [Alexei Starovoitov and others greatly extended BPF][34], giving it the power to trace arbitrary kernel functions. Tracing is clearly the domain of developers rather than ordinary users, so it is certainly not subject to any ABI guarantee (although the [bpf() system call][35] has the same stability promise as any other). On the other hand, BPF programs that create new functionality present the possibility of "[replacing kernel modules as the de-facto means of extending the kernel][36]." Kernel modules make devices, filesystems, crypto, networks, and the like work, and therefore clearly are a facility on which the "just want it to work" user relies. The problem arises that BFP programs have not traditionally been "in-tree" as most open-source kernel modules are. A proposal in spring 2022 to [provide support to the vast array of human interface devices (HIDs) like mice and keyboards via tiny BPF programs][37] rather than patches to device drivers brought the issue into sharp focus.
|
||||
|
||||
A rather heated discussion followed, but the issue was apparently settled by [Torvalds' comments at Open Source Summit][38]:
|
||||
|
||||
> He specified if you break 'real user space tools, that normal (non-kernel developers) users use,' then you need to fix it, regardless of whether it is using eBPF or not.
|
||||
|
||||
A consensus appears to be forming that developers who expect their BPF programs to withstand kernel updates [will need to submit them to an as-yet unspecified place in the kernel source repository][39]. Stay tuned to find out what policy the kernel community adopts regarding BPF and ABI stability.
|
||||
|
||||
### Conclusion
|
||||
|
||||
The kernel ABI stability guarantee applies to procfs, sysfs, and the system call interface, with important exceptions. When "in-tree" code or userspace applications are "broken" by kernel changes, the offending patches are typically quickly reverted. When proprietary code relies on kernel implementation details that are incidentally accessible from userspace, it is not protected and garners little sympathy when it breaks. When, as with Y2038, there is no way to avoid an ABI break, the transition is made as thoughtfully and methodically as possible. Newer features like BPF programs present as-yet-unanswered questions about where exactly the ABI-stability border lies.
|
||||
|
||||
##### Acknowledgments
|
||||
|
||||
Thanks to [Akkana Peck][40], [Sarah R. Newman][41], and [Luke S. Crawford][42] for their helpful comments on early versions of this material.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://opensource.com/article/22/12/linux-abi
|
||||
|
||||
作者:[Alison Chaiken][a]
|
||||
选题:[lkxed][b]
|
||||
译者:[译者ID](https://github.com/译者ID)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://opensource.com/users/chaiken
|
||||
[b]: https://github.com/lkxed
|
||||
[1]: https://lkml.org/lkml/2018/12/22/232
|
||||
[2]: https://www.w3.org/TR/2014/REC-html5-20141028/
|
||||
[3]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi-stable.html#the-kernel-syscall-interface
|
||||
[4]: https://www.man7.org/linux/man-pages/dir_section_2.html
|
||||
[5]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi.html
|
||||
[6]: https://opensource.com/article/19/3/virtual-filesystems-linux
|
||||
[7]: https://gcc.gnu.org/
|
||||
[8]: https://clang.llvm.org/get_started.html
|
||||
[9]: https://www.gnu.org/software/libc/
|
||||
[10]: https://www.man7.org/linux/man-pages/man5/elf.5.html
|
||||
[11]: https://opensource.com/sites/default/files/2022-11/1cooperation.png
|
||||
[12]: https://opensource.com/sites/default/files/2022-12/better_apt-file-find_ABI-boundary.png
|
||||
[13]: https://en.wikipedia.org/wiki/Conway's_law
|
||||
[14]: https://www.phoronix.com/news/MTc2Mjg
|
||||
[15]: https://www.lfenergy.org/projects/super-advanced-meter-sam/
|
||||
[16]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506899/
|
||||
[17]: https://github.com/torvalds/linux/blob/master/include/uapi/linux/time_types.h
|
||||
[18]: https://opensource.com/sites/default/files/2022-11/3speedometerrollingover_0.jpg
|
||||
[19]: https://www.phoronix.com/scan.php?page=news_item&px=Glibc-More-Y2038-Work
|
||||
[20]: https://groups.google.com/g/linux.debian.ports.arm/c/_KBFSz4YRZs
|
||||
[21]: https://opensource.com/sites/default/files/2022-11/4stability.png
|
||||
[22]: https://lwn.net/Articles/833696/
|
||||
[23]: https://lwn.net/ml/linux-kernel/20201006050306.GA8098@localhost/
|
||||
[24]: https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)
|
||||
[25]: https://en.wikipedia.org/wiki/Undefined_behavior
|
||||
[26]: https://www.man7.org/linux/man-pages/man2/mount.2.html
|
||||
[27]: https://opensource.com/sites/default/files/2022-11/5crewworkingintrees.jpg
|
||||
[28]: https://www.man7.org/linux/man-pages/man1/dmesg.1.html
|
||||
[29]: https://lkml.org/lkml/2018/11/24/180
|
||||
[30]: https://sourceware.org/gdb/current/onlinedocs/gdb/
|
||||
[31]: https://unix.stackexchange.com/questions/208638/linux-kernel-meaning-of-source-tree-in-tree-and-out-of-tree
|
||||
[32]: https://lore.kernel.org/all/20191011142500.2339-1-joel.colledge@linbit.com/
|
||||
[33]: https://opensource.com/article/19/8/introduction-bpftrace
|
||||
[34]: https://lwn.net/Articles/740157/
|
||||
[35]: https://www.man7.org/linux/man-pages/man2/bpf.2.html
|
||||
[36]: https://lwn.net/Articles/909095/
|
||||
[37]: https://lwn.net/ml/ksummit-discuss/CAO-hwJJxCteD_BHZTeqQ1f7gWOHoj+05qP8bmFsRYVfMc_3FxQ@mail.gmail.com/
|
||||
[38]: https://lwn.net/ml/ksummit-discuss/20220621110514.6ef174d0@rorschach.local.home/
|
||||
[39]: https://lwn.net/ml/ksummit-discuss/20220616125128.68151432@gandalf.local.home/
|
||||
[40]: https://shallowsky.com/blog/
|
||||
[41]: https://www.socallinuxexpo.org/scale/19x/presentations/live-patching-down-trenches-view
|
||||
[42]: https://www.amazon.com/Book-Xen-Practical-System-Administrator/dp/1593271867
|
Loading…
Reference in New Issue
Block a user