[Translated] How to use awk command in Linux

2024-12-26 21:30:55 +08:00 · 2014-07-31 20:58:15 +08:00 · 2014-07-31 20:58:15 +08:00 · 98d1cb1930
commit 98d1cb1930
parent 8152973849
2 changed files with 121 additions and 133 deletions
--- a/sources/tech/20140729
+++ b/sources/tech/20140729
@@ -1,133 +0,0 @@
-Translating------geekpi
-
-How to use awk command in Linux
-================================================================================
-Text processing is at the heart of Unix. From pipes to the /proc subsystem, the "everything is a file" philosophy pervades the operating system and all of the tools built for it. Because of this, getting comfortable with text-processing is one of the most important skills for an aspiring Linux system administrator, or even any power user, and awk is one of the most powerful text-processing tools available outside general-purpose programming languages.
-
-The simplest awk task is selecting fields from stdin; if you never learn any more about awk than this, you'll still have at your disposal an extremely useful tool.
-
-By default, awk splits each input line into fields on whitespace. To select the first field of the input, you just need to tell awk to print $1:
-
-    $ echo 'one two three four' | awk '{print $1}'
-
-> one
-
-(Yes, the curly-brace syntax is a little weird, but I promise that's about as weird as it gets in this lesson.)
-
-Can you guess how you'd select the second, third, or fourth fields? That's right, with $2, $3, and $4, respectively.
-
-    $ echo 'one two three four' | awk '{print $3}' 
-
-(Yes, the curly-brace syntax is a little weird, but I promise that's about as weird as it gets in this lesson.)
-
-Can you guess how you'd select the second, third, or fourth fields? That's right, with $2, $3, and $4, respectively.
-
-    $ echo 'one two three four' | awk '{print $3}' 
-
-> three
-
-Often when text munging, you need to create a specific format of data, and that covers more than just a single word. The good news is that awk makes it easy to print multiple fields, or even include static strings:
-
-    $ echo 'one two three four' | awk '{print $3,$1}'
-
-> three one
-
----------
-
-    $ echo 'one two three four' | awk '{print "foo:",$3,"| bar:",$1}' 
-
-> foo: three | bar: one
-
-OK, but what if your input isn't separated by whitespace? Just pass awk the '-F' flag followed by your separator:
-
-    $ echo 'one mississippi,two mississippi,three mississippi,four mississippi' | awk -F , '{print $4}' 
-
-> four mississippi
-
-Occasionally, you may find yourself working with data that has a varying number of fields, and all you know is that you want the *last* one. awk prepopulates the NF variable with the *number of fields*, so you can use $NF to grab the last element:
-
-    $ echo 'one two three four' | awk '{print $NF}' 
-
-> four
-
-You can also do simple math on NF, in case you need the next-to-last field:
-
-    $ echo 'one two three four' | awk '{print $(NF-1)}' 
-
-> three
-
-Or even the middle field (the field index is truncated to an integer, so this works for an odd number of fields too):
-
-    $ echo 'one two three four' | awk '{print $((NF/2)+1)}' 
-
-> three
-
-    $ echo 'one two three four five' | awk '{print $((NF/2)+1)}' 
-
-> three
-
-While all of this is useful, you could also coax sed, cut, and grep into producing these results (albeit with a lot more work).
-
-So I'll leave you with one last introductory feature of awk: maintaining state across lines.
-
-    $ echo -e 'one 1\ntwo 2' | awk '{print $2}'
-
-> 1
-> 
-> 2
-
-    $ echo -e 'one 1\ntwo 2' | awk '{sum+=$2} END {print sum}' 
-
-> 3
-
-(The END indicates that we should only perform the following block **after** we finish processing every line.)
-
-The case where I've used this is to sum up bytes from web server request logs. Imagine we have an access log that looks like this:
-
-    $ cat requests.log 
-
-> Jul 23 18:57:12 httpd[31950]: "GET /foo/bar HTTP/1.1" 200 344
-> 
-> Jul 23 18:57:13 httpd[31950]: "GET / HTTP/1.1" 200 9300
-> 
-> Jul 23 19:01:27 httpd[31950]: "GET / HTTP/1.1" 200 9300
-> 
-> Jul 23 19:01:55 httpd[31950]: "GET /foo/baz HTTP/1.1" 200 6401
-> 
-> Jul 23 19:02:31 httpd[31950]: "GET /foo/baz?page=2 HTTP/1.1" 200 6312
-
-We know the last field is the number of bytes of the response. We've already learned how to extract them using print and $NF:
-
-    $ < requests.log awk '{print $NF}' 
-
-> 344
-> 
-> 9300
-> 
-> 9300
-> 
-> 6401
-> 
-> 6312
-
-And so we can sum into a variable to get the total number of bytes our web server has served to clients during the time span covered by the log:
-
-    $ < requests.log awk '{totalBytes+=$NF} END {print totalBytes}' 
-
-> 31657
-
-If you're looking for more to do with awk, you can find used copies of [the original awk book][1] for under 15 USD on Amazon. You may also enjoy Eric Pement's [collection of awk one-liners][2].
-
--------------------------------------------------------------------------------
-
-via: http://xmodulo.com/2014/07/use-awk-command-linux.html
-
-Author: [James Pearson][a]
-Translator: [Translator ID](https://github.com/译者ID)
-Proofreader: [Proofreader ID](https://github.com/校对者ID)
-
-This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](http://linux.cn/)
-
-[a]:http://xmodulo.com/author/james
-[1]:http://www.amazon.com/gp/product/020107981X/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=020107981X&linkCode=as2&tag=xmodulo-20&linkId=6NW62B2WBRBXRFJB
-[2]:http://www.pement.org/awk/awk1line.txt
--- a/translated/tech/20140729
+++ b/translated/tech/20140729
@@ -0,0 +1,121 @@
+How to use the awk command in Linux
+================================================================================
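+Text processing is at the heart of Unix. From pipes to the /proc subsystem, the "everything is a file" philosophy pervades the operating system and all of the tools built on it. Because of this, getting comfortable with text processing is one of the most important skills for an aspiring Linux system administrator, or even a power user, and awk is one of the most powerful text-processing tools outside general-purpose programming languages.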
+
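+The simplest awk task is selecting fields from standard input; even if you never learn anything more about awk than this, it will still be a very useful tool to have at hand.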
+
+By default, awk splits each input line into fields on whitespace. To select the first field of the input, you just need to tell awk to print $1:
+
+    $ echo 'one two three four' | awk '{print $1}'
+
+> one
+
+(Yes, the curly-brace syntax is a little odd, but I promise that's about as odd as it gets in this lesson.)
+
+Can you guess how to select the second, third, or fourth field? That's right: with $2, $3, and $4, respectively.
+
+    $ echo 'one two three four' | awk '{print $3}'
+
+> three
+
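+Often when munging text, you need to produce data in a specific format that covers more than a single word. The good news is that awk makes it easy to print multiple fields, or even to include static strings: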
+
+    $ echo 'one two three four' | awk '{print $3,$1}'
+
+> three one
+
+----------
+
+    $ echo 'one two three four' | awk '{print "foo:",$3,"| bar:",$1}' 
+
+> foo: three | bar: one
+
+OK, but what if your input isn't separated by whitespace? Just pass awk the '-F' flag followed by your separator:
+
+    $ echo 'one mississippi,two mississippi,three mississippi,four mississippi' | awk -F , '{print $4}' 
+
+> four mississippi
+
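+Occasionally you'll find yourself working with data that has a varying number of fields, and all you know is that you want the *last* one. awk's built-in NF variable holds the *number of fields*, so you can use $NF to grab the last element: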
+
+    $ echo 'one two three four' | awk '{print $NF}' 
+
+> four
+
+You can also do simple math on NF, in case you need the next-to-last field:
+
+    $ echo 'one two three four' | awk '{print $(NF-1)}' 
+
+> three
+
+Or even the middle field:
+
+    $ echo 'one two three four' | awk '{print $((NF/2)+1)}' 
+
+> three
+
+While all of this is useful, you could also coax sed, cut, and grep into producing these results (albeit with a lot more work).
+
+So I'll leave you with one last introductory feature of awk: maintaining state across lines.
+
+    $ echo -e 'one 1\ntwo 2' | awk '{print $2}'
+
+> 1
+> 
+> 2
+
+    $ echo -e 'one 1\ntwo 2' | awk '{sum+=$2} END {print sum}' 
+
+> 3
+
+(END indicates that the block that follows runs only **after** every line has been processed.)
+
+The case where I've used this is summing up bytes from web server request logs. Imagine we have an access log that looks like this:
+
+    $ cat requests.log 
+
+> Jul 23 18:57:12 httpd[31950]: "GET /foo/bar HTTP/1.1" 200 344
+> 
+> Jul 23 18:57:13 httpd[31950]: "GET / HTTP/1.1" 200 9300
+> 
+> Jul 23 19:01:27 httpd[31950]: "GET / HTTP/1.1" 200 9300
+> 
+> Jul 23 19:01:55 httpd[31950]: "GET /foo/baz HTTP/1.1" 200 6401
+> 
+> Jul 23 19:02:31 httpd[31950]: "GET /foo/baz?page=2 HTTP/1.1" 200 6312
+
+We know the last field is the size of the response in bytes. We've already learned how to extract it with print and $NF:
+
+    $ < requests.log awk '{print $NF}' 
+
+> 344
+> 
+> 9300
+> 
+> 9300
+> 
+> 6401
+> 
+> 6312
+
+And so we can sum into a variable to get the total number of bytes our web server has served to clients during the time span covered by the log:
+
+    $ < requests.log awk '{totalBytes+=$NF} END {print totalBytes}' 
+
+> 31657
+
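+If you're looking for more to do with awk, you can find used copies of [the original awk book][1] for under 15 USD on Amazon. You may also enjoy Eric Pement's [collection of awk one-liners][2].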
+
+--------------------------------------------------------------------------------
+
+via: http://xmodulo.com/2014/07/use-awk-command-linux.html
+
+Author: [James Pearson][a]
+Translator: [geekpi](https://github.com/geekpi)
+Proofreader: [Proofreader ID](https://github.com/校对者ID)
+
+This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](http://linux.cn/)
+
+[a]:http://xmodulo.com/author/james
+[1]:http://www.amazon.com/gp/product/020107981X/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=020107981X&linkCode=as2&tag=xmodulo-20&linkId=6NW62B2WBRBXRFJB
+[2]:http://www.pement.org/awk/awk1line.txt
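
The log-summing walkthrough in the article can be tried end to end with the sketch below. It is not part of the original article or of this commit. The temporary file created with `mktemp` and the shell variable names (`log`, `total`, `statuses`) are assumptions made for illustration; the log lines and the awk programs are the ones shown in the article.

```shell
#!/bin/sh
# Recreate the article's sample access log in a temporary file.
# (Using mktemp instead of requests.log is an assumption of this sketch.)
log=$(mktemp)
cat > "$log" <<'EOF'
Jul 23 18:57:12 httpd[31950]: "GET /foo/bar HTTP/1.1" 200 344
Jul 23 18:57:13 httpd[31950]: "GET / HTTP/1.1" 200 9300
Jul 23 19:01:27 httpd[31950]: "GET / HTTP/1.1" 200 9300
Jul 23 19:01:55 httpd[31950]: "GET /foo/baz HTTP/1.1" 200 6401
Jul 23 19:02:31 httpd[31950]: "GET /foo/baz?page=2 HTTP/1.1" 200 6312
EOF

# NF is the number of fields on the current line; $NF is the last field.
# The first block runs once per line; the END block runs after the last line.
total=$(awk '{ totalBytes += $NF } END { print totalBytes }' "$log")
echo "total bytes: $total"

# $(NF-1) is the next-to-last field, which in this log is the status code.
statuses=$(awk '{ print $(NF-1) }' "$log" | sort -u)
echo "distinct status codes: $statuses"

rm -f "$log"
```

Keeping the sum in an awk variable means the log is read only once, which is the "state across lines" idea the article closes with.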