From 0094f3f87275b9f5eb9fb5bbf07efcd7b599174d Mon Sep 17 00:00:00 2001 From: Xingyu Wang Date: Sat, 15 Jul 2023 11:43:42 +0800 Subject: [PATCH] ATRP @wxy https://linux.cn/article-16002-1.html --- ...⭐️⭐️ A 10-minute guide to the Linux ABI.md | 162 ++++++++++++++++++ ...⭐️⭐️ A 10-minute guide to the Linux ABI.md | 151 ---------------- 2 files changed, 162 insertions(+), 151 deletions(-) create mode 100644 published/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md delete mode 100644 sources/tech/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md diff --git a/published/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md b/published/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md new file mode 100644 index 0000000000..1a698f48c6 --- /dev/null +++ b/published/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md @@ -0,0 +1,162 @@ +[#]: subject: "A 10-minute guide to the Linux ABI" +[#]: via: "https://opensource.com/article/22/12/linux-abi" +[#]: author: "Alison Chaiken https://opensource.com/users/chaiken" +[#]: collector: "lkxed" +[#]: translator: "ChatGPT" +[#]: reviewer: "wxy" +[#]: publisher: "wxy" +[#]: url: "https://linux.cn/article-16002-1.html" + +Understand the Linux ABI in 10 minutes +====== + +![][0] + +> Get familiar with the concept of an ABI, why ABI stability matters, and what is included in Linux's stable ABI. + +> LCTT translator's note: Yesterday, AlmaLinux said it would [abandon](https://linux.cn/article-16000-1.html) 1:1 compatibility with RHEL but would keep ABI compatibility with RHEL, so that software that runs on RHEL can run seamlessly on AlmaLinux. Some readers may not yet be clear on the concept of an ABI, so this article has been translated for their reference. + +Many Linux enthusiasts are familiar with Linus Torvalds' [famous admonition][1], "we don't break user space," but perhaps not everyone who has heard the phrase is clear about what it means. + +This "#1 rule" reminds developers about the stability of the application binary interface (ABI), through which applications communicate with and configure the kernel. What follows is intended to familiarize readers with the concept of an ABI, explain why ABI stability matters, and discuss what is included in Linux's stable ABI. The ongoing growth and evolution of Linux require changes to the ABI, some of which have been controversial. + +### What is an ABI? 
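+ +ABI stands for Application Binary Interface. One way to understand the concept of an ABI is to consider how it differs from other concepts. Application Programming Interfaces (APIs) are more familiar to many developers. Generally, the headers and documentation of libraries are considered to be their API, as are standards documents like those for [HTML5][2]. Programs that call into libraries or exchange string-formatted data must comply with the conventions described in the API or risk unexpected results. + +ABIs are similar to APIs in that they govern the interpretation of commands and the exchange of binary data. For C programs, the ABI generally comprises the return types and parameter lists of functions, the layout of structs, and the meaning, ordering, and range of enumerated types. As of 2022, the Linux kernel remains almost entirely a C program, so it must adhere to these specifications. + +"[The kernel syscall interface][3]" is described in [Section 2 of the Linux man pages][4] and includes the C versions of functions like `mount` and `sync` that are callable from middleware applications. The binary layout of these functions is the first major part of Linux's ABI. In answer to the question, "What is in Linux's stable ABI?" many users and developers will respond with "the contents of sysfs (`/sys`) and procfs (`/proc`)." In fact, the [official Linux ABI documentation][5] does concentrate mostly on these [virtual filesystems][6]. + +The preceding text focuses on how the Linux ABI is exercised by programs but does not cover the equally important human aspect. As the figure below illustrates, the functionality of the ABI requires a cooperative effort among the kernel community, C compilers (such as [GCC][7] or [clang][8]), the developers who create the userspace C library (most commonly [glibc][9]), and binary applications laid out in accordance with the [Executable and Linking Format (ELF)][10]. + +![Cooperation within the development community][11] + +### Why do we care about the ABI? 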
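+ +The Linux ABI stability guarantee that comes from Torvalds himself enables Linux distros and individual users to update the kernel independently of the rest of the operating system. + +If Linux did not have a stable ABI, then every time the kernel needed patching to address a security problem, a large part of the operating system, if not the entirety, would need to be reinstalled. Clearly, the stability of the binary interface is a major contributing factor to Linux's usability and wide adoption. + +![Terminal output][12] + +As the figure above illustrates, both the kernel (in `linux-libc-dev`) and Glibc (in `libc6-dev`) provide bitmasks that define file permissions. Obviously, the two sets of definitions must agree! The `apt` package manager identifies which package provides each file. The potentially unstable part of Glibc's ABI is found in the `bits/` directory. + +For the most part, the Linux ABI stability guarantee works well. In keeping with [Conway's Law][13], vexing technical issues that arise in the course of development are most often due to misunderstandings or disagreements between the different software development communities that contribute to Linux. The interface between communities is easy to envision via Linux package-manager metadata, as shown in the figure above. + +### Y2038: An example of an ABI break + +The Linux ABI is best understood by considering the example of the ongoing, [slow-motion][14] "Y2038" ABI break. In January 2038, signed 32-bit time counters will overflow and wrap around to a negative value, much like the odometer of an older vehicle rolling over. January 2038 sounds far away, but assuredly many IoT devices sold today will still be operational. Mundane products like [smart electrical meters][15] and [smart parking systems][16] installed this year may have 32-bit processor architectures and may not support software updates. + +The Linux kernel has already moved to a 64-bit `time_t` opaque data type internally to represent later points in time. The implication is that system calls like `time()` have already changed their function signature on 64-bit systems. The arduousness of these efforts is on ready display in kernel headers like [time_types.h][17], which hold both new and `_old` versions of data structures. + +![Odometer rolling over][18] + +The Glibc project also [supports 64-bit time][19], so we're done, right? Unfortunately not, according to a [discussion on the Debian mailing list][20]. Distros face the unenviable choice of either providing two versions of all binary packages for 32-bit systems or providing two versions of installation media. In the latter case, users of 32-bit time will have to recompile their applications and reinstall. As always, proprietary applications will be the real headache. + +### What precisely is in the Linux stable ABI anyway? 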
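+ +Understanding the stable ABI is a bit subtle. Consider that, while most of sysfs is stable ABI, the debug interfaces are guaranteed to be unstable since they expose kernel internals to userspace. Linus Torvalds has said that by "don't break userspace," he generally means to protect ordinary users who "just want it to work" rather than system programmers and kernel engineers, who should be able to read the kernel documentation and source code to figure out what has changed between releases. The distinction is illustrated in the figure below. + +![Stability guarantee][21] + +Ordinary users are unlikely to interact with unstable parts of the Linux ABI, but system programmers may do so inadvertently. All of sysfs (`/sys`) and procfs (`/proc`) are stable except for `/sys/kernel/debug`. + +But what about other binary interfaces that are visible to userspace, including device files in `/dev`, the kernel log file (readable with the `dmesg` command), filesystem metadata, or "bootargs" provided on the kernel "command line" (visible in a bootloader like GRUB or u-boot)? Naturally, "it depends." + +### Mounting old filesystems + +Next to a Linux system hanging during the boot sequence, having a filesystem fail to mount is the greatest disappointment. If the filesystem resides on an SSD belonging to a paying customer, the matter is grave indeed. Surely a Linux filesystem that mounts with an old kernel version will still mount when the kernel is upgraded, right? Actually, "it depends." + +In 2020, an aggrieved Linux developer [complained on the kernel's mailing list][23]: + +> The kernel already accepted this as a valid mountable filesystem format, without a single error or warning of any kind, and has done so stably for years. . . . I was generally under the impression that mounting existing root filesystems fell under the scope of the kernel<->userspace or kernel<->existing-system boundary, as defined by what the kernel accepts and existing userspace has used successfully, and that upgrading the kernel should work with existing userspace and systems. + +But there was a catch: the filesystems that failed to mount were created with a proprietary tool that relied on a flag that was defined but not used by the kernel. The flag did not appear in Linux's API header files or procfs/sysfs but was instead an [implementation detail][24]. Therefore, interpreting the flag in userspace code meant relying on "[undefined behavior][25]," a phrase that makes almost every software developer shudder. When the kernel community improved its internal testing and started making new consistency checks, the "[man 2 mount][26]" system call suddenly began rejecting filesystems with the proprietary format. Since the format's creator was decidedly a software developer, he got little sympathy from kernel filesystem maintainers. + +![Construction sign reading crews working in trees][27] + +### Threading the kernel dmesg log + +Is the format of files in `/dev` guaranteed stable or not? The [dmesg command][28] reads from the file `/dev/kmsg`. In 2018, a developer [made dmesg output threaded][29], enabling the kernel "to print a series of `printk()` messages to consoles without being disturbed by concurrent `printk()` from interrupts and/or other threads." Sounds excellent! Threading was made possible by adding a thread ID to each line of the `/dev/kmsg` output. Readers following closely will realize that the addition changed the ABI of `/dev/kmsg`, meaning that applications that parse that file needed to change too. Since many distros didn't compile their kernels with the new feature enabled, most users of `/bin/dmesg` won't have noticed, but the change broke the [GDB debugger][30]'s ability to read the kernel log. 
+ +Assuredly, astute readers will think users of GDB are out of luck because debuggers are developer tools. Actually, no, since the code that needed to be updated to support the new `/dev/kmsg` format was "in-tree," meaning part of the kernel's own Git source repository. The failure of programs within a single repo to work together is an out-and-out bug for any sane project, so a [patch that made GDB work with threaded /dev/kmsg][32] was merged. + +### What about BPF programs? + +[BPF][33] is a powerful tool to monitor and even configure the running kernel dynamically. BPF's original purpose was to support on-the-fly network configuration by allowing sysadmins to modify packet filters from the command line instantly. [Alexei Starovoitov and others greatly extended BPF][34], giving it the power to trace arbitrary kernel functions. Tracing is clearly the domain of developers rather than ordinary users, so it is certainly not subject to any ABI guarantee (although the [bpf() system call][35] has the same stability promise as any other system call). On the other hand, BPF programs that create new functionality present the possibility of "[replacing kernel modules as the de-facto means of extending the kernel][36]." Kernel modules make devices, filesystems, crypto, networks, and the like work, and therefore clearly are a facility on which the "just want it to work" user relies. The problem is that, unlike most open-source kernel modules, BPF programs have not traditionally been in the kernel source tree. + +In spring 2022, [a proposal][37] brought the issue into sharp focus: it suggested supporting the vast array of human interface devices (such as mice and keyboards) via tiny BPF programs rather than patches to device drivers. + +A rather heated discussion followed, but the issue was apparently settled by [Torvalds' comments at Open Source Summit][38]: + +> He specified if you break 'real user space tools, that normal (non-kernel developers) users use,' then you need to fix it, regardless of whether it is using eBPF or not. + +A consensus appears to be forming that developers who expect their BPF programs to keep working after kernel updates [will need to submit them to an as-yet unspecified place in the kernel source repository][39]. Stay tuned to find out what policy the kernel community adopts regarding BPF and ABI stability. + +### Conclusion + +The kernel ABI stability guarantee applies to procfs, sysfs, and the system call interface, with important exceptions. When "in-tree" code or userspace applications are broken by kernel changes, the offending patches are typically quickly reverted. When proprietary code relies on kernel implementation details that are incidentally accessible from userspace, it is not protected and garners little sympathy when it breaks. When, as with Y2038, there is no way to avoid an ABI break, the transition is made as thoughtfully and methodically as possible. Newer features like BPF programs raise as-yet-unanswered questions about where exactly the ABI-stability border lies. + +### Acknowledgments + +Thanks to [Akkana Peck][40], [Sarah R. Newman][41], and [Luke S. Crawford][42] for their helpful comments on early versions of this material. 
+ +*(Cover image: MJ/da788385-ca24-4be5-bc27-ad7e7ef75973)* + +-------------------------------------------------------------------------------- + +via: https://opensource.com/article/22/12/linux-abi + +Author: [Alison Chaiken][a] +Topic selection: [lkxed][b] +Translator: ChatGPT +Proofreader: [wxy](https://github.com/wxy) + +This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/) + +[a]: https://opensource.com/users/chaiken +[b]: https://github.com/lkxed +[1]: https://lkml.org/lkml/2018/12/22/232 +[2]: https://www.w3.org/TR/2014/REC-html5-20141028/ +[3]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi-stable.html#the-kernel-syscall-interface +[4]: https://www.man7.org/linux/man-pages/dir_section_2.html +[5]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi.html +[6]: https://opensource.com/article/19/3/virtual-filesystems-linux +[7]: https://gcc.gnu.org/ +[8]: https://clang.llvm.org/get_started.html +[9]: https://www.gnu.org/software/libc/ +[10]: https://www.man7.org/linux/man-pages/man5/elf.5.html +[11]: https://opensource.com/sites/default/files/2022-11/1cooperation.png +[12]: https://opensource.com/sites/default/files/2022-12/better_apt-file-find_ABI-boundary.png +[13]: https://en.wikipedia.org/wiki/Conway's_law +[14]: https://www.phoronix.com/news/MTc2Mjg +[15]: https://www.lfenergy.org/projects/super-advanced-meter-sam/ +[16]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506899/ +[17]: https://github.com/torvalds/linux/blob/master/include/uapi/linux/time_types.h +[18]: https://opensource.com/sites/default/files/2022-11/3speedometerrollingover_0.jpg +[19]: https://www.phoronix.com/scan.php?page=news_item&px=Glibc-More-Y2038-Work +[20]: https://groups.google.com/g/linux.debian.ports.arm/c/_KBFSz4YRZs +[21]: https://opensource.com/sites/default/files/2022-11/4stability.png +[22]: https://lwn.net/Articles/833696/ +[23]: https://lwn.net/ml/linux-kernel/20201006050306.GA8098@localhost/ 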
+[24]: https://en.wikipedia.org/wiki/Encapsulation_(computer_programming) +[25]: https://en.wikipedia.org/wiki/Undefined_behavior +[26]: https://www.man7.org/linux/man-pages/man2/mount.2.html +[27]: https://opensource.com/sites/default/files/2022-11/5crewworkingintrees.jpg +[28]: https://www.man7.org/linux/man-pages/man1/dmesg.1.html +[29]: https://lkml.org/lkml/2018/11/24/180 +[30]: https://sourceware.org/gdb/current/onlinedocs/gdb/ +[31]: https://unix.stackexchange.com/questions/208638/linux-kernel-meaning-of-source-tree-in-tree-and-out-of-tree +[32]: https://lore.kernel.org/all/20191011142500.2339-1-joel.colledge@linbit.com/ +[33]: https://opensource.com/article/19/8/introduction-bpftrace +[34]: https://lwn.net/Articles/740157/ +[35]: https://www.man7.org/linux/man-pages/man2/bpf.2.html +[36]: https://lwn.net/Articles/909095/ +[37]: https://lwn.net/ml/ksummit-discuss/CAO-hwJJxCteD_BHZTeqQ1f7gWOHoj+05qP8bmFsRYVfMc_3FxQ@mail.gmail.com/ +[38]: https://lwn.net/ml/ksummit-discuss/20220621110514.6ef174d0@rorschach.local.home/ +[39]: https://lwn.net/ml/ksummit-discuss/20220616125128.68151432@gandalf.local.home/ +[40]: https://shallowsky.com/blog/ +[41]: https://www.socallinuxexpo.org/scale/19x/presentations/live-patching-down-trenches-view +[42]: https://www.amazon.com/Book-Xen-Practical-System-Administrator/dp/1593271867 +[0]: https://img.linux.net.cn/data/attachment/album/202307/15/114240eo7her2zbdqqp448.jpg \ No newline at end of file diff --git a/sources/tech/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md b/sources/tech/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md deleted file mode 100644 index eec93fd329..0000000000 --- a/sources/tech/20221206.4 ⭐️⭐️⭐️ A 10-minute guide to the Linux ABI.md +++ /dev/null @@ -1,151 +0,0 @@ -[#]: subject: "A 10-minute guide to the Linux ABI" -[#]: via: "https://opensource.com/article/22/12/linux-abi" -[#]: author: "Alison Chaiken https://opensource.com/users/chaiken" -[#]: collector: "lkxed" 
-[#]: translator: " " -[#]: reviewer: " " -[#]: publisher: " " -[#]: url: " " - -A 10-minute guide to the Linux ABI -====== - -Many Linux enthusiasts are familiar with Linus Torvalds' [famous admonition][1], "we don't break user space," but perhaps not everyone who recognizes the phrase is certain about what it means. - -The "#1 rule" reminds developers about the stability of the applications' binary interface via which applications communicate with and configure the kernel. What follows is intended to familiarize readers with the concept of an ABI, describe why ABI stability matters, and discuss precisely what is included in Linux's stable ABI. The ongoing growth and evolution of Linux necessitate changes to the ABI, some of which have been controversial. - -### What is an ABI? - -ABI stands for Applications Binary Interface. One way to understand the concept of an ABI is to consider what it is not. Applications Programming Interfaces (APIs) are more familiar to many developers. Generally, the headers and documentation of libraries are considered to be their API, as are standards documents like those for [HTML5][2], for example. Programs that call into libraries or exchange string-formatted data must comply with the conventions described in the API or expect unwanted results. - -ABIs are similar to APIs in that they govern the interpretation of commands and exchange of binary data. For C programs, the ABI generally comprises the return types and parameter lists of functions, the layout of structs, and the meaning, ordering, and range of enumerated types. The Linux kernel remains, as of 2022, almost entirely a C program, so it must adhere to these specifications. - -"[The kernel syscall interface][3]" is described by [Section 2 of the Linux man pages][4] and includes the C versions of familiar functions like "mount" and "sync" that are callable from middleware applications. The binary layout of these functions is the first major part of Linux's ABI. 
In answer to the question, "What is in Linux's stable ABI?" many users and developers will respond with "the contents of sysfs (/sys) and procfs (/proc)." In fact, the [official Linux ABI documentation][5] concentrates mostly on these [virtual filesystems][6]. - -The preceding text focuses on how the Linux ABI is exercised by programs but fails to capture the equally important human aspect. As the figure below illustrates, the functionality of the ABI requires a joint, ongoing effort by the kernel community, C compilers (such as [GCC][7] or [clang][8]), the developers who create the userspace C library (most commonly [glibc][9]) that implements system calls, and binary applications, which must be laid out in accordance with the Executable and Linking Format ([ELF][10]). - -![Cooperation within the development community][11] - -### Why do we care about the ABI? - -The Linux ABI stability guarantee that comes from Torvalds himself enables Linux distros and individual users to update the kernel independently of the operating system. - -If Linux did not have a stable ABI, then every time the kernel needed patching to address a security problem, a large part of the operating system, if not the entirety, would need to be reinstalled. Obviously, the stability of the binary interface is a major contributing factor to Linux's usability and wide adoption. - -![Terminal output][12] - -As the second figure illustrates, both the kernel (in `linux-libc-dev`) and Glibc (in `libc6-dev`) provide bitmasks that define file permissions. Obviously the two sets of definitions must agree! The `apt` package manager identifies which software project provided each file. The potentially unstable part of Glibc's ABI is found in the `bits/` directory. - -For the most part, the Linux ABI stability guarantee works just fine.
In keeping with [Conway's Law][13], vexing technical issues that arise in the course of development most frequently occur due to misunderstandings or disagreements between different software development communities that contribute to Linux. The interface between communities is easy to envision via Linux package-manager metadata, as shown in the image above. - -### Y2038: An example of an ABI break - -The Linux ABI is best understood by considering the example of the ongoing, [slow-motion][14] "Y2038" ABI break. In January 2038, signed 32-bit time counters will overflow and wrap around to a negative value, much like the odometer of an older vehicle rolling over. January 2038 sounds far away, but assuredly many IoT devices sold in 2022 will still be operational. Mundane products like [smart electrical meters][15] and [smart parking systems][16] installed this year may or may not have 32-bit processor architectures and may or may not support software updates. - -The Linux kernel has already moved to a 64-bit `time_t` opaque data type internally to represent later timepoints. The implication is that system calls like `time()` have already changed their function signature on 64-bit systems. The arduousness of these efforts is on ready display in kernel headers like [time_types.h][17], which includes new and "_old" versions of data structures. - -![Odometer rolling over][18] - -The Glibc project also [supports 64-bit time][19], so yay, we're done, right? Unfortunately, no, as a [discussion on the Debian mailing list][20] makes clear. Distros are faced with the unenviable choice of either providing two versions of all binary packages for 32-bit systems or two versions of installation media. In the latter case, users of 32-bit time will have to recompile their applications and reinstall. As always, proprietary applications will be a real headache. - -### What precisely is in the Linux stable ABI anyway? - -Understanding the stable ABI is a bit subtle.
Consider that, while most of sysfs is stable ABI, the debug interfaces are guaranteed to be _un_stable since they expose kernel internals to userspace. In general, Linus Torvalds has pronounced that by "don't break userspace," he means to protect ordinary users who "just want it to work" rather than system programmers and kernel engineers, who should be able to read the kernel documentation and source code to figure out what has changed between releases. The distinction is illustrated in the figure below. - -![Stability guarantee][21] - -Ordinary users are unlikely to interact with unstable parts of the Linux ABI, but system programmers may do so inadvertently. All of sysfs (`/sys`) and procfs (`/proc`) are guaranteed stable except for `/sys/kernel/debug`. - -But what about other binary interfaces that are userspace-visible, including miscellaneous ABI bits like device files in `/dev`, the kernel log file (readable with the `dmesg` command), filesystem metadata, or "bootargs" provided on the kernel "command line" that are visible in a bootloader like GRUB or u-boot? Naturally, "it depends." - -### Mounting old filesystems - -Next to observing a Linux system hang during the boot sequence, having a filesystem fail to mount is the greatest disappointment. If the filesystem resides on an SSD belonging to a paying customer, the matter is grave indeed. Surely a Linux filesystem that mounts with an old kernel version will still mount when the kernel is upgraded, right? Actually, "[it depends][22]." - -In 2020 an aggrieved Linux developer [complained on the kernel's mailing list][23]: - -> The kernel already accepted this as a valid mountable filesystem format, without a single error or warning of any kind, and has done so stably for years. . . . 
I was generally under the impression that mounting existing root filesystems fell under the scope of the kernel<->userspace or kernel<->existing-system boundary, as defined by what the kernel accepts and existing userspace has used successfully, and that upgrading the kernel should work with existing userspace and systems. - -But there was a catch: The filesystems that failed to mount were created with a proprietary tool that relied on a flag that was defined but not used by the kernel. The flag did not appear in Linux's API header files or procfs/sysfs but was instead an [implementation detail][24]. Therefore, interpreting the flag in userspace code meant relying on "[undefined behavior][25]," a phrase that will make software developers almost universally shudder. When the kernel community improved its internal testing and started making new consistency checks, the "[man 2 mount][26]" system call suddenly began rejecting filesystems with the proprietary format. Since the format creator was decidedly a software developer, he got little sympathy from kernel filesystem maintainers. - -![Construction sign reading crews working in trees][27] - -### Threading the kernel dmesg log - -Is the format of files in `/dev` guaranteed stable or not? The [command dmesg][28] reads from the file `/dev/kmsg`. In 2018, a developer [made output to dmesg threaded][29], enabling the kernel "to print a series of printk() messages to consoles without being disturbed by concurrent printk() from interrupts and/or other threads." Sounds excellent! Threading was made possible by adding a thread ID to each line of the `/dev/kmsg` output. Readers following closely will realize that the addition changed the ABI of `/dev/kmsg`, meaning that applications that parse that file needed to change too. Since many distros didn't compile their kernels with the new feature enabled, most users of `/bin/dmesg` won't have noticed, but the change broke the [GDB debugger][30]'s ability to read the kernel log. 
- -Assuredly, astute readers will think users of GDB are out of luck because debuggers are developer tools. Actually, no, since the code that needed to be updated to support the new `/dev/kmsg` format was "[in-tree][31]," meaning part of the kernel's own Git source repository. The failure of programs within a single repo to work together is just an out-and-out bug for any sane project, and a [patch that made GDB work with threaded /dev/kmsg][32] was merged. - -### What about BPF programs? - -[BPF][33] is a powerful tool to monitor and even configure the running kernel dynamically. BPF's original purpose was to support on-the-fly network configuration by allowing sysadmins to modify packet filters from the command line instantly. [Alexei Starovoitov and others greatly extended BPF][34], giving it the power to trace arbitrary kernel functions. Tracing is clearly the domain of developers rather than ordinary users, so it is certainly not subject to any ABI guarantee (although the [bpf() system call][35] has the same stability promise as any other). On the other hand, BPF programs that create new functionality present the possibility of "[replacing kernel modules as the de-facto means of extending the kernel][36]." Kernel modules make devices, filesystems, crypto, networks, and the like work, and therefore clearly are a facility on which the "just want it to work" user relies. The problem is that BPF programs have not traditionally been "in-tree" as most open-source kernel modules are. A proposal in spring 2022 to [provide support to the vast array of human interface devices (HIDs) like mice and keyboards via tiny BPF programs][37] rather than patches to device drivers brought the issue into sharp focus. 
- -A rather heated discussion followed, but the issue was apparently settled by [Torvalds' comments at Open Source Summit][38]: - -> He specified if you break 'real user space tools, that normal (non-kernel developers) users use,' then you need to fix it, regardless of whether it is using eBPF or not. - -A consensus appears to be forming that developers who expect their BPF programs to withstand kernel updates [will need to submit them to an as-yet unspecified place in the kernel source repository][39]. Stay tuned to find out what policy the kernel community adopts regarding BPF and ABI stability. - -### Conclusion - -The kernel ABI stability guarantee applies to procfs, sysfs, and the system call interface, with important exceptions. When "in-tree" code or userspace applications are "broken" by kernel changes, the offending patches are typically quickly reverted. When proprietary code relies on kernel implementation details that are incidentally accessible from userspace, it is not protected and garners little sympathy when it breaks. When, as with Y2038, there is no way to avoid an ABI break, the transition is made as thoughtfully and methodically as possible. Newer features like BPF programs present as-yet-unanswered questions about where exactly the ABI-stability border lies. - -##### Acknowledgments - -Thanks to [Akkana Peck][40], [Sarah R. Newman][41], and [Luke S. Crawford][42] for their helpful comments on early versions of this material. 
- --------------------------------------------------------------------------------- - -via: https://opensource.com/article/22/12/linux-abi - -Author: [Alison Chaiken][a] -Topic selection: [lkxed][b] -Translator: [translator ID](https://github.com/译者ID) -Proofreader: [proofreader ID](https://github.com/校对者ID) - -This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/) - -[a]: https://opensource.com/users/chaiken -[b]: https://github.com/lkxed -[1]: https://lkml.org/lkml/2018/12/22/232 -[2]: https://www.w3.org/TR/2014/REC-html5-20141028/ -[3]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi-stable.html#the-kernel-syscall-interface -[4]: https://www.man7.org/linux/man-pages/dir_section_2.html -[5]: https://www.kernel.org/doc/html/v6.0/admin-guide/abi.html -[6]: https://opensource.com/article/19/3/virtual-filesystems-linux -[7]: https://gcc.gnu.org/ -[8]: https://clang.llvm.org/get_started.html -[9]: https://www.gnu.org/software/libc/ -[10]: https://www.man7.org/linux/man-pages/man5/elf.5.html -[11]: https://opensource.com/sites/default/files/2022-11/1cooperation.png -[12]: https://opensource.com/sites/default/files/2022-12/better_apt-file-find_ABI-boundary.png -[13]: https://en.wikipedia.org/wiki/Conway's_law -[14]: https://www.phoronix.com/news/MTc2Mjg -[15]: https://www.lfenergy.org/projects/super-advanced-meter-sam/ -[16]: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506899/ -[17]: https://github.com/torvalds/linux/blob/master/include/uapi/linux/time_types.h -[18]: https://opensource.com/sites/default/files/2022-11/3speedometerrollingover_0.jpg -[19]: https://www.phoronix.com/scan.php?page=news_item&px=Glibc-More-Y2038-Work -[20]: https://groups.google.com/g/linux.debian.ports.arm/c/_KBFSz4YRZs -[21]: https://opensource.com/sites/default/files/2022-11/4stability.png -[22]: https://lwn.net/Articles/833696/ -[23]: https://lwn.net/ml/linux-kernel/20201006050306.GA8098@localhost/ -[24]: https://en.wikipedia.org/wiki/Encapsulation_(computer_programming) -[25]: 
https://en.wikipedia.org/wiki/Undefined_behavior -[26]: https://www.man7.org/linux/man-pages/man2/mount.2.html -[27]: https://opensource.com/sites/default/files/2022-11/5crewworkingintrees.jpg -[28]: https://www.man7.org/linux/man-pages/man1/dmesg.1.html -[29]: https://lkml.org/lkml/2018/11/24/180 -[30]: https://sourceware.org/gdb/current/onlinedocs/gdb/ -[31]: https://unix.stackexchange.com/questions/208638/linux-kernel-meaning-of-source-tree-in-tree-and-out-of-tree -[32]: https://lore.kernel.org/all/20191011142500.2339-1-joel.colledge@linbit.com/ -[33]: https://opensource.com/article/19/8/introduction-bpftrace -[34]: https://lwn.net/Articles/740157/ -[35]: https://www.man7.org/linux/man-pages/man2/bpf.2.html -[36]: https://lwn.net/Articles/909095/ -[37]: https://lwn.net/ml/ksummit-discuss/CAO-hwJJxCteD_BHZTeqQ1f7gWOHoj+05qP8bmFsRYVfMc_3FxQ@mail.gmail.com/ -[38]: https://lwn.net/ml/ksummit-discuss/20220621110514.6ef174d0@rorschach.local.home/ -[39]: https://lwn.net/ml/ksummit-discuss/20220616125128.68151432@gandalf.local.home/ -[40]: https://shallowsky.com/blog/ -[41]: https://www.socallinuxexpo.org/scale/19x/presentations/live-patching-down-trenches-view -[42]: https://www.amazon.com/Book-Xen-Practical-System-Administrator/dp/1593271867