mirror of
https://github.com/LCTT/TranslateProject.git
synced 2024-12-23 21:20:42 +08:00
hankchow translated
This commit is contained in:
parent
07164621eb
commit
fc9abeb0a7
@ -1,149 +0,0 @@
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: (HankChow)
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
[#]: subject: (A practical guide to learning awk)
|
||||
[#]: via: (https://opensource.com/article/20/9/awk-ebook)
|
||||
[#]: author: (Seth Kenlon https://opensource.com/users/seth)
|
||||
|
||||
A practical guide to learning awk
|
||||
======
|
||||
Get a better handle on the awk command by downloading our free eBook.
|
||||
![Person programming on a laptop on a building][1]
|
||||
|
||||
Of all the [Linux][2] commands out there (and there are many), the three most quintessential seem to be `sed`, `awk`, and `grep`. Maybe it's the arcane sound of their names, or the breadth of their potential use, or just their age, but when someone's giving an example of a "Linuxy" command, it's usually one of those three. And while `sed` and `grep` have several simple one-line standards, the less prestigious `awk` remains persistently prominent for being particularly puzzling.
|
||||
|
||||
You're likely to use `sed` for a quick string replacement or `grep` to filter for a pattern on a daily basis. You're far less likely to compose an `awk` command. I often wonder why this is, and I attribute it to a few things. First of all, many of us barely use `sed` and `grep` for anything but some variation upon these two commands:
|
||||
|
||||
|
||||
```
|
||||
$ sed -e 's/foo/bar/g' file.txt
|
||||
$ grep foo file.txt
|
||||
```
|
||||
|
||||
So, even though you might feel more comfortable with `sed` and `grep`, you may not use their full potential. Of course, there's no obligation to learn more about `sed` or `grep`, but I sometimes wonder about the way I "learn" commands. Instead of learning _how_ a command works, I often learn a specific incantation that includes a command. As a result, I often feel a false familiarity with the command. I think I know a command because I can name three or four options off the top of my head, even though I don't know what the options do and can't quite put my finger on the syntax.
|
||||
|
||||
And that's the problem, I believe, that many people face when confronted with the power and flexibility of `awk`.
|
||||
|
||||
### Learning awk to use awk
|
||||
|
||||
The basics of `awk` are surprisingly simple. It's often noted that `awk` is a programming language, and although it's a relatively basic one, it's true. This means you can learn `awk` the same way you learn a new coding language: learn its syntax using some basic commands, learn its vocabulary so you can build up to complex actions, and then practice, practice, practice.
|
||||
|
||||
### How awk parses input
|
||||
|
||||
`Awk` sees input, essentially, as an array. When `awk` scans over a text file, it treats each line, individually and in succession, as a _record_. Each record is broken into _fields_. Of course, `awk` must keep track of this information, and you can see that data using the `NR` (number of records) and `NF` (number of fields) built-in variables. For example, this gives you the line count of a file:
|
||||
|
||||
|
||||
```
|
||||
$ awk 'END { print NR;}' example.txt
|
||||
36
|
||||
```
|
||||
|
||||
This also reveals something about `awk` syntax. Whether you're writing `awk` as a one-liner or as a self-contained script, the structure of an `awk` instruction is:
|
||||
|
||||
|
||||
```
|
||||
`pattern or keyword { actions }`
|
||||
```
|
||||
|
||||
In this example, the word `END` is a special, reserved keyword rather than a pattern. A similar keyword is `BEGIN`. With both of these keywords, `awk` just executes the action in braces at the start or end of parsing data.
|
||||
|
||||
You can use a _pattern_ as a filter or qualifier so that `awk` only executes a given action when it is able to match your pattern to the current record. For instance, suppose you want to use `awk`, much as you would `grep`, to find the word _Linux_ in a file of text:
|
||||
|
||||
|
||||
```
|
||||
$ awk '/Linux/ { print $0; }' os.txt
|
||||
OS: CentOS Linux (10.1.1.8)
|
||||
OS: CentOS Linux (10.1.1.9)
|
||||
OS: Red Hat Enterprise Linux (RHEL) (10.1.1.11)
|
||||
OS: Elementary Linux (10.1.2.4)
|
||||
OS: Elementary Linux (10.1.2.5)
|
||||
OS: Elementary Linux (10.1.2.6)
|
||||
```
|
||||
|
||||
For `awk`, each line in the file is a record, and each word in a record is a field. By default, fields are separated by a space. You can change that with the `--field-separator` option, which sets the `FS` (field separator) variable to whatever you want it to be:
|
||||
|
||||
|
||||
```
|
||||
$ awk --field-separator ':' '/Linux/ { print $2; }' os.txt
|
||||
CentOS Linux (10.1.1.8)
|
||||
CentOS Linux (10.1.1.9)
|
||||
Red Hat Enterprise Linux (RHEL) (10.1.1.11)
|
||||
Elementary Linux (10.1.2.4)
|
||||
Elementary Linux (10.1.2.5)
|
||||
Elementary Linux (10.1.2.6)
|
||||
```
|
||||
|
||||
In this sample, there's an empty space before each listing because there's a blank space after each colon (`:`) in the source text. This isn't `cut`, though, so the field separator needn't be limited to one character:
|
||||
|
||||
|
||||
```
|
||||
$ awk --field-separator ': ' '/Linux/ { print $2; }' os.txt
|
||||
CentOS Linux (10.1.1.8)
|
||||
CentOS Linux (10.1.1.9)
|
||||
Red Hat Enterprise Linux (RHEL) (10.1.1.11)
|
||||
Elementary Linux (10.1.2.4)
|
||||
Elementary Linux (10.1.2.5)
|
||||
Elementary Linux (10.1.2.6)
|
||||
```
|
||||
|
||||
### Functions in awk
|
||||
|
||||
You can build your own functions in `awk` using this syntax:
|
||||
|
||||
|
||||
```
|
||||
`name(parameters) { actions }`
|
||||
```
|
||||
|
||||
Functions are important because they allow you to write code once and reuse it throughout your work. When constructing one-liners, custom functions are a little less useful than they are in scripts, but `awk` defines many functions for you already. They work basically the same as any function in any other language or spreadsheet: You learn the order that the function needs information from you, and you can feed it whatever you want to get the results.
|
||||
|
||||
There are functions to perform mathematical operations and string processing. The math ones are often fairly straightforward. You provide a number, and it crunches it:
|
||||
|
||||
|
||||
```
|
||||
$ awk 'BEGIN { print sqrt(1764); }'
|
||||
42
|
||||
```
|
||||
|
||||
String functions can be more complex but are well documented in the [GNU awk manual][3]. For example, the `split` function takes an entity that `awk` views as a single field and splits it into different parts. It requires a field, a variable to use as an array containing each part of the split, and the character you want to use as the delimiter.
|
||||
|
||||
Using the output of the previous examples, I know that there's an IP address at the very end of each record. In this case, I can send just the last field of a record to the `split` function by referencing the variable `NF` because it contains the number of fields (and the final field must be the highest number):
|
||||
|
||||
|
||||
```
|
||||
$ awk --field-separator ': ' '/Linux/ { split($NF, IP, "."); print "subnet: " IP[3]; }' os.txt
|
||||
subnet: 1
|
||||
subnet: 1
|
||||
subnet: 1
|
||||
subnet: 2
|
||||
subnet: 2
|
||||
subnet: 2
|
||||
```
|
||||
|
||||
There are many more functions, and there's no reason to limit yourself to one per block of `awk` code. You can construct complex pipelines with `awk` in your terminal, or you can write `awk` scripts to define and utilize your own functions.
|
||||
|
||||
### Download the eBook
|
||||
|
||||
Learning `awk` is mostly a matter of using `awk`. Use it even if it means duplicating functionality you already have with `sed` or `grep` or `cut` or `tr` or any other perfectly valid commands. Once you get comfortable with it, you can write Bash functions that invoke your custom `awk` commands for easier use. And eventually, you'll be able to write scripts to parse complex datasets.
|
||||
|
||||
**[Download our][4]** **[eBook][4] **to learn everything you need to know about `awk`, and start using it today.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://opensource.com/article/20/9/awk-ebook
|
||||
|
||||
作者:[Seth Kenlon][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[译者ID](https://github.com/译者ID)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://opensource.com/users/seth
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/computer_code_programming_laptop.jpg?itok=ormv35tV (Person programming on a laptop on a building)
|
||||
[2]: https://opensource.com/resources/linux
|
||||
[3]: https://www.gnu.org/software/gawk/manual/gawk.html
|
||||
[4]: https://opensource.com/downloads/awk-ebook
|
149
translated/tech/20200903 A practical guide to learning awk.md
Normal file
149
translated/tech/20200903 A practical guide to learning awk.md
Normal file
@ -0,0 +1,149 @@
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: (HankChow)
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
[#]: subject: (A practical guide to learning awk)
|
||||
[#]: via: (https://opensource.com/article/20/9/awk-ebook)
|
||||
[#]: author: (Seth Kenlon https://opensource.com/users/seth)
|
||||
|
||||
awk 实用学习指南
|
||||
======
|
||||
下载我们的电子书,学习如何更好地使用 `awk`。
|
||||
![Person programming on a laptop on a building][1]
|
||||
|
||||
在众多 [Linux][2] 命令中,`sed`、`awk` 和 `grep` 恐怕是其中最经典的三个命令了。它们引人注目或许是由于名字发音与众不同,也可能是它们无处不在,甚至是因为它们存在已久,但无论如何,如果要问哪些命令很有 Linux 风格,这三个命令是当之无愧的。其中 `sed` 和 `grep` 已经有很多简洁的标准用法了,但 `awk` 的使用难度却相对突出。
|
||||
|
||||
在日常使用中,通过 `sed` 实现字符串替换、通过 `grep` 实现过滤,这些都是司空见惯的操作了,但 `awk` 命令相对来说是用得比较少的。在我看来,可能的原因是大多数人都只使用 `sed` 或者 `grep` 的一些变体实现某些功能,例如:
|
||||
|
||||
|
||||
```
|
||||
$ sed -e 's/foo/bar/g' file.txt
|
||||
$ grep foo file.txt
|
||||
```
|
||||
|
||||
因此,尽管你可能会觉得 `sed` 和 `grep` 使用起来更加顺手,但实际上它们还有更多更强大的作用没有发挥出来。当然,我们没有必要在这两个命令上钻研得很深入,但我还是想理解自己是如何学习一个命令的。很多时候我会把一整串命令记住,但不会去了解其中的运行过程,这就让我产生了一种很熟悉命令的错觉,我可以随口说出某个命令的好几个选项参数,但这些参数具体有什么作用,以及它们的相关语法,我都并不明确。
|
||||
|
||||
这大概就是很多人对 `awk` 缺乏了解的原因了。
|
||||
|
||||
### 为使用而学习 awk
|
||||
|
||||
`awk` 并不深奥。它是一种相对基础的编程语言,因此你可以把它当成一门新的编程语言来学习:使用一些基本命令来熟悉语法、了解语言中的关键字并实现更复杂的功能,然后再多加练习就可以了。
|
||||
|
||||
### awk 是如何解析输入内容的
|
||||
|
||||
`awk` 的本质是将输入的内容看作是一个数组。当 `awk` 扫描一个文本文件时,会把每一行作为一条<ruby>记录<rt>record</rt></ruby>,每一条记录中又分割为多个<ruby>字段<rt>field</rt></ruby>。`awk` 记录了各条记录各个字段的信息,并通过内置变量 `NR`(记录数) 和 `NF`(字段数) 来调用相关信息。例如一下这个命令可以查看文件的行数:
|
||||
|
||||
|
||||
```
|
||||
$ awk 'END { print NR;}' example.txt
|
||||
36
|
||||
```
|
||||
|
||||
从上面的命令可以看出 `awk` 的基本语法,无论是一个单行命令还是一整个脚本,语法都是这样的:
|
||||
|
||||
|
||||
```
|
||||
`样式或关键字 { 操作 }`
|
||||
```
|
||||
|
||||
在上面的例子中,`END` 是一个关键字而不是样式,与此类似的另一个关键字是 `BEGIN`。使用 `BEGIN` 或 `END` 可以让 `awk` 在解析内容前或解析内容后执行大括号中指定的操作。
|
||||
|
||||
你可以使用<ruby>样式<rt>pattern</rt></ruby>作为过滤器或限定符,这样 `awk` 只会对匹配样式的对应记录执行指定的操作。以下这个例子就是使用 `awk` 实现 `grep` 命令在文件中查找“Linux”字符串的功能:
|
||||
|
||||
|
||||
```
|
||||
$ awk '/Linux/ { print $0; }' os.txt
|
||||
OS: CentOS Linux (10.1.1.8)
|
||||
OS: CentOS Linux (10.1.1.9)
|
||||
OS: Red Hat Enterprise Linux (RHEL) (10.1.1.11)
|
||||
OS: Elementary Linux (10.1.2.4)
|
||||
OS: Elementary Linux (10.1.2.5)
|
||||
OS: Elementary Linux (10.1.2.6)
|
||||
```
|
||||
|
||||
`awk` 会将文件中的每一行作为一条记录,将一条记录中的每个单词作为一个字段,默认情况下会按照空格作为<ruby>分隔符<rt>field separator</rt></ruby>(`FS`)切割出记录中的字段。如果想要使用其它内容作为分隔符,可以使用 `--field-separator` 选项指定分隔符:
|
||||
|
||||
|
||||
```
|
||||
$ awk --field-separator ':' '/Linux/ { print $2; }' os.txt
|
||||
CentOS Linux (10.1.1.8)
|
||||
CentOS Linux (10.1.1.9)
|
||||
Red Hat Enterprise Linux (RHEL) (10.1.1.11)
|
||||
Elementary Linux (10.1.2.4)
|
||||
Elementary Linux (10.1.2.5)
|
||||
Elementary Linux (10.1.2.6)
|
||||
```
|
||||
|
||||
在上面的例子中,可以看到在 `awk` 处理后每一行的行首都有一个空格,那是因为在源文件中每个冒号(`:`)后面都带有一个空格。和 `cut` 有所不同的是,`awk` 可以指定一个字符串作为分隔符,就像这样:
|
||||
|
||||
|
||||
```
|
||||
$ awk --field-separator ': ' '/Linux/ { print $2; }' os.txt
|
||||
CentOS Linux (10.1.1.8)
|
||||
CentOS Linux (10.1.1.9)
|
||||
Red Hat Enterprise Linux (RHEL) (10.1.1.11)
|
||||
Elementary Linux (10.1.2.4)
|
||||
Elementary Linux (10.1.2.5)
|
||||
Elementary Linux (10.1.2.6)
|
||||
```
|
||||
|
||||
### awk 中的函数
|
||||
|
||||
可以通过这样的语法在 `awk` 中自定义函数:
|
||||
|
||||
|
||||
```
|
||||
`函数名称(参数) { 操作 }`
|
||||
```
|
||||
|
||||
函数的好处在于只需要编写一次就可以多次复用,因此函数在脚本中起到的作用会比在构造单行命令时大。同时 `awk` 自身也带有很多预定义的函数,并且工作原理和其它编程语言或电子表格保持一致。你只需要了解函数需要接受什么参数,就可以放心使用了。
|
||||
|
||||
`awk` 中提供了数学运算和字符串处理的相关函数。数学运算函数通常比较简单,传入一个数字,它就会传出一个结果:
|
||||
|
||||
|
||||
```
|
||||
$ awk 'BEGIN { print sqrt(1764); }'
|
||||
42
|
||||
```
|
||||
|
||||
而字符串处理函数则稍微复杂一点,但 [GNU awk 手册][3]中也有充足的文档。例如 `split()` 函数需要传入一个待分割的单一字段、一个数组用于存放分割结果,以及用于分割的<ruby>定界符<rt>delimiter</rt></ruby>。
|
||||
|
||||
例如前面示例中的输出内容,每条记录的末尾都包含了一个 IP 地址。由于变量 `NF` 代表的是每条记录的字段数量,刚好对应的是每条记录中最后一个字段的序号,因此可以通过引用 `NF` 将每条记录的最后一个字段传入 `split()` 函数:
|
||||
|
||||
|
||||
```
|
||||
$ awk --field-separator ': ' '/Linux/ { split($NF, IP, "."); print "subnet: " IP[3]; }' os.txt
|
||||
subnet: 1
|
||||
subnet: 1
|
||||
subnet: 1
|
||||
subnet: 2
|
||||
subnet: 2
|
||||
subnet: 2
|
||||
```
|
||||
|
||||
实际上 `awk` 的功能还远远不止于此,你还可以跳出 `awk` 本身,通过命令管道和脚本来自定义更多功能。
|
||||
|
||||
### 下载电子书
|
||||
|
||||
使用 `awk` 本身就是一个学习 `awk` 的过程,即使某些操作使用 `sed`、`grep`、`cut`、`tr` 命令已经完全足够了,也可以尝试使用 `awk` 来实现。只要熟悉了 `awk`,就可以在 Bash 中自定义一些 `awk` 函数,进而解析复杂的数据。
|
||||
|
||||
[下载我们的电子书][4]学习并开始使用 `awk` 吧!
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://opensource.com/article/20/9/awk-ebook
|
||||
|
||||
作者:[Seth Kenlon][a]
|
||||
选题:[lujun9972][b]
|
||||
译者:[HankChow](https://github.com/hankchow)
|
||||
校对:[校对者ID](https://github.com/校对者ID)
|
||||
|
||||
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
||||
|
||||
[a]: https://opensource.com/users/seth
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/computer_code_programming_laptop.jpg?itok=ormv35tV (Person programming on a laptop on a building)
|
||||
[2]: https://opensource.com/resources/linux
|
||||
[3]: https://www.gnu.org/software/gawk/manual/gawk.html
|
||||
[4]: https://opensource.com/downloads/awk-ebook
|
Loading…
Reference in New Issue
Block a user