Merge pull request #3854 from cposture/master

Translated Data Structures in the Linux Kernel
2025-02-25 00:50:15 +08:00 · 2016-02-28 18:30:55 +08:00 · 2016-02-28 18:30:55 +08:00 · abd6f048f4
commit abd6f048f4
parent d977e0b5f5 938cbe4b34
2 changed files with 198 additions and 202 deletions
--- a/sources/tech/20151123
+++ b/sources/tech/20151123
@@ -1,202 +0,0 @@
-【Translating By cposture 2016-02-26】
-Data Structures in the Linux Kernel
-================================================================================
-
-Radix tree
--------------------------------------------------------------------------------
-
-As you already know, the Linux kernel provides many libraries and functions that implement different data structures and algorithms. In this part we will consider one of these data structures: the [Radix tree](http://en.wikipedia.org/wiki/Radix_tree). Two files are related to the `radix tree` implementation and API in the Linux kernel:
-
-* [include/linux/radix-tree.h](https://github.com/torvalds/linux/blob/master/include/linux/radix-tree.h)
-* [lib/radix-tree.c](https://github.com/torvalds/linux/blob/master/lib/radix-tree.c)
-
-Let's talk about what a `radix tree` is. A radix tree is a `compressed trie`, where a [trie](http://en.wikipedia.org/wiki/Trie) is a data structure that implements the interface of an associative array and stores values as `key-value` pairs. The keys are usually strings, but any data type can be used. A trie differs from an `n-ary tree` in its nodes: a node of a trie does not store a key; instead, it stores a single-character label. The key associated with a given node is derived by traversing from the root of the tree to that node. For example:
-
-
-```
-               +-----------+
-               |           |
-               |    " "    |
-               |           |
-        +------+-----------+------+
-        |                         |
-        |                         |
-   +----v------+            +-----v-----+
-   |           |            |           |
-   |    g      |            |     c     |
-   |           |            |           |
-   +-----------+            +-----------+
-        |                         |
-        |                         |
-   +----v------+            +-----v-----+
-   |           |            |           |
-   |    o      |            |     a     |
-   |           |            |           |
-   +-----------+            +-----------+
-                                  |
-                                  |
-                            +-----v-----+
-                            |           |
-                            |     t     |
-                            |           |
-                            +-----------+
-```
-
-So in this example we can see a `trie` with two keys, `go` and `cat`. The compressed trie, or `radix tree`, differs from a `trie` in that every intermediate node that has only one child is removed by merging it with that child.
-
-The radix tree in the Linux kernel is a data structure that maps integer keys to values. It is represented by the following structures from the file [include/linux/radix-tree.h](https://github.com/torvalds/linux/blob/master/include/linux/radix-tree.h):
-
-```C
-struct radix_tree_root {
-         unsigned int            height;
-         gfp_t                   gfp_mask;
-         struct radix_tree_node  __rcu *rnode;
-};
-```
-
-This structure represents the root of a radix tree and contains three fields:
-
-* `height`   - height of the tree;
-* `gfp_mask` - tells how memory allocations will be performed;
-* `rnode`    - pointer to the child node.
-
-The first field we will discuss is `gfp_mask`:
-
-Low-level kernel memory allocation functions take a set of flags, the `gfp_mask`, which describes how the allocation is to be performed. Examples of the `GFP_` flags that control the allocation process:
-
-* `GFP_NOIO` - the allocation can sleep and wait for memory, but must not start any I/O;
-* `__GFP_HIGHMEM` - high memory can be used;
-* `GFP_ATOMIC` - the allocation is high-priority and cannot sleep;
-
-etc.
-
-The next field is `rnode`, a pointer to a `radix_tree_node`:
-
-```C
-struct radix_tree_node {
-        unsigned int    path;
-        unsigned int    count;
-        union {
-                struct {
-                        struct radix_tree_node *parent;
-                        void *private_data;
-                };
-                struct rcu_head rcu_head;
-        };
-        /* For tree user */
-        struct list_head private_list;
-        void __rcu      *slots[RADIX_TREE_MAP_SIZE];
-        unsigned long   tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS];
-};
-```
-
-This structure contains the offset in the parent and the height from the bottom, the count of child nodes, and fields for accessing and freeing a node. These fields are described below:
-
-* `path` - offset in parent & height from the bottom;
-* `count` - count of the child nodes;
-* `parent` - pointer to the parent node;
-* `private_data` - used by the user of a tree;
-* `rcu_head` - used for freeing a node;
-* `private_list` - used by the user of a tree;
-
-The last two fields of the `radix_tree_node`, `tags` and `slots`, are important and interesting. Every node can contain a set of slots, which store pointers to the data. Empty slots in the Linux kernel radix tree implementation store `NULL`. Radix trees in the Linux kernel also support tags, which are kept in the `tags` field of the `radix_tree_node` structure. Tags allow individual bits to be set on records stored in the radix tree.
-
-Now that we know the structure of a radix tree, it is time to look at its API.
-
-Linux kernel radix tree API
---------------------------------------------------------------------------------
-
-We start with the initialization of the data structure. There are two ways to initialize a new radix tree. The first is to use the `RADIX_TREE` macro:
-
-```C
-RADIX_TREE(name, gfp_mask);
-```
-
-As you can see, we pass the `name` parameter, so the `RADIX_TREE` macro both defines and initializes a radix tree with the given name. The implementation of `RADIX_TREE` is simple:
-
-```C
-#define RADIX_TREE(name, mask) \
-         struct radix_tree_root name = RADIX_TREE_INIT(mask)
-
-#define RADIX_TREE_INIT(mask)   { \
-        .height = 0,              \
-        .gfp_mask = (mask),       \
-        .rnode = NULL,            \
-}
-```
-
-The `RADIX_TREE` macro first defines an instance of the `radix_tree_root` structure with the given name and then expands the `RADIX_TREE_INIT` macro with the given mask. The `RADIX_TREE_INIT` macro just initializes the `radix_tree_root` structure with the default values and the given mask.
-
-The second way is to define a `radix_tree_root` structure by hand and pass a pointer to it, together with the mask, to the `INIT_RADIX_TREE` macro:
-
-```C
-struct radix_tree_root my_radix_tree;
-INIT_RADIX_TREE(&my_radix_tree, gfp_mask_for_my_radix_tree);
-```
-
-where:
-
-```C
-#define INIT_RADIX_TREE(root, mask)  \
-do {                                 \
-        (root)->height = 0;          \
-        (root)->gfp_mask = (mask);   \
-        (root)->rnode = NULL;        \
-} while (0)
-```
-
-performs the same initialization with default values as the `RADIX_TREE_INIT` macro does.
-
-Next come two functions for inserting records into and deleting records from a radix tree:
-
-* `radix_tree_insert`;
-* `radix_tree_delete`;
-
-The first function, `radix_tree_insert`, takes three parameters:
-
-* root of a radix tree;
-* index key;
-* data to insert;
-
-The `radix_tree_delete` function takes the same parameters as `radix_tree_insert`, except for the data.
-
-Searching a radix tree is implemented by three functions:
-
-* `radix_tree_lookup`;
-* `radix_tree_gang_lookup`;
-* `radix_tree_lookup_slot`.
-
-The first function, `radix_tree_lookup`, takes two parameters:
-
-* root of a radix tree;
-* index key;
-
-This function tries to find the given key in the tree and returns the record associated with this key. The second function, `radix_tree_gang_lookup`, has the following signature:
-
-```C
-unsigned int radix_tree_gang_lookup(struct radix_tree_root *root,
-                                    void **results,
-                                    unsigned long first_index,
-                                    unsigned int max_items);
-```
-
-and returns the number of records found; the records are stored through `results`, sorted by key, starting from the first index. The number of returned records will not be greater than the `max_items` value.
-
-The last function, `radix_tree_lookup_slot`, returns the slot that contains the data.
-
-Links
---------------------------------------------------------------------------------
-
-* [Radix tree](http://en.wikipedia.org/wiki/Radix_tree)
-* [Trie](http://en.wikipedia.org/wiki/Trie)
-
--------------------------------------------------------------------------------
-
-via: https://github.com/0xAX/linux-insides/edit/master/DataStructures/radix-tree.md
-
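-Author: [0xAX]
-Translator: [translator ID](https://github.com/译者ID)
-Proofreader: [proofreader ID](https://github.com/校对者ID)
-
-This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](http://linux.cn/)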
-
--- a/translated/tech/20151123
+++ b/translated/tech/20151123
@@ -0,0 +1,198 @@
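+Data Structures in the Linux Kernel
+================================================================================
+
+Radix tree
+--------------------------------------------------------------------------------
+As you already know, the Linux kernel provides many libraries and functions that implement different data structures and algorithms. In this part we will study one of these data structures: the [Radix tree](http://en.wikipedia.org/wiki/Radix_tree). Two files are related to the radix tree implementation and API in the Linux kernel: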
+
+* [include/linux/radix-tree.h](https://github.com/torvalds/linux/blob/master/include/linux/radix-tree.h)
+* [lib/radix-tree.c](https://github.com/torvalds/linux/blob/master/lib/radix-tree.c)
+
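+Let's discuss what a `radix tree` is. A radix tree is a `compressed trie`, where a [trie](http://en.wikipedia.org/wiki/Trie) is a data structure that implements the interface of an associative array and stores values as `key-value` pairs. The keys are usually strings, but any data type can be used. A trie differs from an `n-ary tree` in its nodes: a node of a trie does not store a key; instead, it stores a single-character label. The key associated with a given node is obtained by traversing from the root to that node. For example: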
+
+```
+               +-----------+
+               |           |
+               |    " "    |
+               |           |
+        +------+-----------+------+
+        |                         |
+        |                         |
+   +----v------+            +-----v-----+
+   |           |            |           |
+   |    g      |            |     c     |
+   |           |            |           |
+   +-----------+            +-----------+
+        |                         |
+        |                         |
+   +----v------+            +-----v-----+
+   |           |            |           |
+   |    o      |            |     a     |
+   |           |            |           |
+   +-----------+            +-----------+
+                                  |
+                                  |
+                            +-----v-----+
+                            |           |
+                            |     t     |
+                            |           |
+                            +-----------+
+```
+
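+So in this example we can see a `trie` with two keys, `go` and `cat`. The compressed trie, or `radix tree`, differs from a `trie` in that every intermediate node that has only one child is removed by merging it with that child.
+
+The radix tree in the Linux kernel is a data structure that maps integer keys to values. It is represented by the following structures from the file [include/linux/radix-tree.h](https://github.com/torvalds/linux/blob/master/include/linux/radix-tree.h):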
+
+```C
+struct radix_tree_root {
+         unsigned int            height;
+         gfp_t                   gfp_mask;
+         struct radix_tree_node  __rcu *rnode;
+};
+```
+
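+This structure represents the root of a radix tree and contains three fields:
+
+* `height`   - height of the tree;
+* `gfp_mask` - tells how memory allocations will be performed;
+* `rnode`    - pointer to the child node.
+
+The first field we will discuss is `gfp_mask`:
+
+Low-level kernel memory allocation functions take a set of flags, the `gfp_mask`, which describes how the allocation is to be performed. Examples of the `GFP_` flags that control the allocation process:
+
+* `GFP_NOIO` - the allocation can sleep and wait for memory, but must not start any I/O;
+* `__GFP_HIGHMEM` - high memory can be used;
+* `GFP_ATOMIC` - the allocation is high-priority and cannot sleep;
+
+etc.
+
+The next field is `rnode`: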
+
+```C
+struct radix_tree_node {
+        unsigned int    path;
+        unsigned int    count;
+        union {
+                struct {
+                        struct radix_tree_node *parent;
+                        void *private_data;
+                };
+                struct rcu_head rcu_head;
+        };
+        /* For tree user */
+        struct list_head private_list;
+        void __rcu      *slots[RADIX_TREE_MAP_SIZE];
+        unsigned long   tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS];
+};
+```
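+This structure contains the offset in the parent and the height from the bottom (the leaves), the count of child nodes, and fields for accessing and freeing a node. These fields are described below:
+
+* `path` - offset in the parent and height from the bottom (the leaves);
+* `count` - count of the child nodes;
+* `parent` - pointer to the parent node;
+* `private_data` - used by the user of the tree;
+* `rcu_head` - used for freeing a node;
+* `private_list` - used by the user of the tree;
+
+The last two fields of the `radix_tree_node`, `tags` and `slots`, are important and interesting. Every node of the Linux kernel radix tree contains a set of slots, which store pointers to the data. Empty slots in the Linux kernel radix tree implementation store `NULL`. Radix trees in the Linux kernel also support tags, which are kept in the `tags` field of the `radix_tree_node` structure. Tags allow individual bits to be set on records stored in the radix tree.
+
+Now that we know the structure of a radix tree, it is time to look at its API.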
+
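+Linux kernel radix tree API
+---------------------------------------------------------------------------------
+
+We start with the initialization of the data structure. There are two ways to initialize a new radix tree. The first is to use the `RADIX_TREE` macro: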
+
+```C
+RADIX_TREE(name, gfp_mask);
+```
+
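+As you can see, we pass the `name` parameter, so the `RADIX_TREE` macro both defines and initializes a radix tree with the given name. The implementation of `RADIX_TREE` is simple: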
+
+```C
+#define RADIX_TREE(name, mask) \
+         struct radix_tree_root name = RADIX_TREE_INIT(mask)
+
+#define RADIX_TREE_INIT(mask)   { \
+        .height = 0,              \
+        .gfp_mask = (mask),       \
+        .rnode = NULL,            \
+}
+```
+
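+The `RADIX_TREE` macro first defines an instance of the `radix_tree_root` structure with the given name and then expands the `RADIX_TREE_INIT` macro with the given mask. The `RADIX_TREE_INIT` macro just initializes the `radix_tree_root` structure with the default values and the given mask.
+
+The second way is to define a `radix_tree_root` structure by hand and pass a pointer to it, together with the mask, to the `INIT_RADIX_TREE` macro: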
+
+```C
+struct radix_tree_root my_radix_tree;
+INIT_RADIX_TREE(&my_radix_tree, gfp_mask_for_my_radix_tree);
+```
+
+where:
+
+```C
+#define INIT_RADIX_TREE(root, mask)  \
+do {                                 \
+        (root)->height = 0;          \
+        (root)->gfp_mask = (mask);   \
+        (root)->rnode = NULL;        \
+} while (0)
+```
+
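+performs the same initialization with default values as the `RADIX_TREE_INIT` macro does.
+
+Next come two functions for inserting records into and deleting records from a radix tree: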
+
+* `radix_tree_insert`;
+* `radix_tree_delete`;
+
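+The first function, `radix_tree_insert`, takes three parameters:
+
+* root of a radix tree;
+* index key;
+* data to insert;
+
+The `radix_tree_delete` function takes the same parameters as `radix_tree_insert`, except for the data.
+
+Searching a radix tree is implemented by three functions: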
+
+* `radix_tree_lookup`;
+* `radix_tree_gang_lookup`;
+* `radix_tree_lookup_slot`.
+
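+The first function, `radix_tree_lookup`, takes two parameters:
+
+* root of a radix tree;
+* index key;
+
+This function tries to find the given key in the tree and returns the record associated with this key. The second function, `radix_tree_gang_lookup`, has the following signature: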
+
+```C
+unsigned int radix_tree_gang_lookup(struct radix_tree_root *root,
+                                    void **results,
+                                    unsigned long first_index,
+                                    unsigned int max_items);
+```
+
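+and returns the number of records found; the records (stored through `results`) are sorted by key and start from the first index. The number of returned records will not exceed `max_items`.
+
+The last function, `radix_tree_lookup_slot`, returns the slot that contains the data.
+
+Links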
+---------------------------------------------------------------------------------
+
+* [Radix tree](http://en.wikipedia.org/wiki/Radix_tree)
+* [Trie](http://en.wikipedia.org/wiki/Trie)
+
+--------------------------------------------------------------------------------
+
+via: https://github.com/0xAX/linux-insides/edit/master/DataStructures/radix-tree.md
+
+Author: [0xAX]
+Translator: [cposture](https://github.com/cposture)
+Proofreader: [proofreader ID](https://github.com/校对者ID)
+
+This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](http://linux.cn/)
+