mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-02-03 23:40:14 +08:00
commit
d2cda53175
@ -1,214 +0,0 @@
translating by wwy-hust

How to Use Awk and Regular Expressions to Filter Text or String in Files
=============================================================================

![](http://www.tecmint.com/wp-content/uploads/2016/04/Linux-Awk-Command-Examples.png)

When we run commands in Unix/Linux to read or edit text from a string or a file, we usually want to filter the output down to a section of interest. This is where regular expressions come in handy.

### What are Regular Expressions?

A regular expression is a string that describes a set of character sequences. Among other things, regular expressions let you filter the output of a command or the contents of a file, and edit a section of a text or configuration file.

### Features of Regular Expression

Regular expressions are made of:

- Ordinary characters such as space, underscore (_), A-Z, a-z, 0-9.
- Meta characters, which carry a special meaning instead of matching themselves. They include:
  - `(.)` matches any single character except a newline.
  - `(*)` matches zero or more occurrences of the character immediately preceding it.
  - `[ character(s) ]` matches any one of the characters listed in character(s). A hyphen (-) denotes a range of characters, such as [a-f], [1-5], and so on.
  - `^` matches the beginning of a line in a file.
  - `$` matches the end of a line in a file.
  - `\` is an escape character.

To filter text, you need a text filtering tool such as awk. Awk is a programming language in its own right, but in this guide we treat it as a simple command line filtering tool.

The general syntax of awk is:

```
# awk 'script' filename
```

Where `'script'` is a set of commands that awk understands and executes on the file `filename`.

Awk reads one line of the file at a time, makes a copy of the line, and executes the script on that copy. This is repeated for every line in the file.

The `'script'` has the form `'/pattern/ action'`, where `pattern` is a regular expression and `action` is what awk does when it finds the pattern in a line.

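As a rough illustration of the `/pattern/ action` form, the sketch below runs three small scripts over the same input. The sample lines and variable names are made up for this example; the only awk behaviour it relies on is that an omitted action defaults to printing the line, and an omitted pattern matches every line.

```shell
# Hypothetical sample input, made up for this illustration.
input=$(printf '%s\n' 'alpha 1' 'beta 2' 'alphabet 3')

# Pattern plus action: print the lines that contain "alpha".
with_action=$(printf '%s\n' "$input" | awk '/alpha/ {print}')

# Pattern only: awk falls back to its default action, printing the line.
pattern_only=$(printf '%s\n' "$input" | awk '/alpha/')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{print}')

printf '%s\n' "$with_action"
```

The first two commands select the same two lines, while the third reproduces the whole input.
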
### How to Use Awk Filtering Tool in Linux

In the following examples, we shall focus on the meta characters discussed above under the features of regular expressions.

#### A simple example of using awk:

The example below prints all the lines in the file /etc/hosts, since the empty pattern `//` matches every line.

```
# awk '//{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Awk-Command-Example.gif)

>Awk Prints all Lines in a File

#### Use Awk with Pattern:

In the example below, the pattern `localhost` is given, so awk matches only the lines in the `/etc/hosts` file that contain localhost.

```
# awk '/localhost/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-Command-with-Pattern.gif)

>Awk Print Given Matching Line in a File

#### Using Awk with (.) wild card in a Pattern

In the example below, the `(.)` makes the pattern match strings containing loc, such as localhost and localnet.

That is to say: `l`, then any single character, then `c`.

```
# awk '/l.c/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-with-Wild-Cards.gif)

>Use Awk to Print Matching Strings in a File

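To see what "any single character" means in practice, here is a minimal sketch that feeds a few made-up strings (not taken from /etc/hosts) through the same `l.c` pattern.

```shell
# Hypothetical test strings: only those with exactly one character
# between "l" and "c" should be printed.
dot_matches=$(printf '%s\n' 'localhost' 'l-c' 'lc' 'llama' | awk '/l.c/')

printf '%s\n' "$dot_matches"
```

Here `localhost` and `l-c` are printed, while `lc` is not, because `(.)` must consume one character.
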
#### Using Awk with (*) Character in a Pattern

It matches lines containing localhost, localnet or capable in the example below. Note that, because `(*)` also allows zero occurrences of `l`, the pattern `l*c` in fact matches any line that contains a `c`; a word such as lines matches only when a `c` appears elsewhere on the same line.

```
# awk '/l*c/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Match-Strings-in-File.gif)

>Use Awk to Match Strings in File

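The "zero or more" part is easy to overlook, so the following sketch compares `l*c` with `ll*c` (at least one `l` before the `c`). The input strings are invented for the illustration.

```shell
# Hypothetical input: one line with "c" but no "l", one with "llc", one without "c".
input=$(printf '%s\n' 'abc' 'llc' 'hello')

# "l*c" accepts zero l's, so any line containing "c" matches.
star_matches=$(printf '%s\n' "$input" | awk '/l*c/')

# "ll*c" requires at least one "l" directly before the "c".
one_or_more=$(printf '%s\n' "$input" | awk '/ll*c/')

printf 'l*c:\n%s\nll*c:\n%s\n' "$star_matches" "$one_or_more"
```
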
You will also notice that, when combined with `(.)`, `(*)` gives you the longest match it can find.

Let us look at a case that demonstrates this. Take the regular expression `t.*t`, which matches strings that start with the letter `t` and end with `t`, in the line below:

```
this is tecmint, where you get the best good tutorials, how to's, guides, tecmint.
```

With the pattern `/t.*t/`, the possible matches include the following:

```
this is t
this is tecmint
this is tecmint, where you get t
this is tecmint, where you get the best good t
this is tecmint, where you get the best good tutorials, how t
this is tecmint, where you get the best good tutorials, how to's, guides, t
this is tecmint, where you get the best good tutorials, how to's, guides, tecmint
```

The `.*` in `/t.*t/` makes awk choose the last, longest option:

```
this is tecmint, where you get the best good tutorials, how to's, guides, tecmint
```

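Since `print` always shows the whole line, the longest-match behaviour is easier to observe with awk's built-in `match()` function, which sets `RSTART` and `RLENGTH` to the position and length of the matched text. The sketch below is one way to check it; the shell variable names are arbitrary.

```shell
# The sample sentence from the text above.
line="this is tecmint, where you get the best good tutorials, how to's, guides, tecmint."

# match() records where the leftmost-longest match starts and how long it is;
# substr() then extracts exactly that part of the line.
longest=$(printf '%s\n' "$line" |
  awk 'match($0, /t.*t/) { print substr($0, RSTART, RLENGTH) }')

printf '%s\n' "$longest"
```

The extracted text runs from the first `t` to the last `t` on the line, leaving out only the final full stop.
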
#### Using Awk with set [ character(s) ]

Take the set [al1] for example: awk matches every line in the file /etc/hosts that contains the character a, l or 1.

```
# awk '/[al1]/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-Matching-Character.gif)

>Use Awk to Print Matching Character in File

The next example matches strings starting with either `K` or `k` followed by `T`:

```
# awk '/[Kk]T/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-Matched-String-in-File.gif)

>Use Awk to Print Matched String in File

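A set stands for exactly one character, and everything outside the brackets is matched literally. The following sketch uses invented strings to show which combinations `[Kk]T` accepts.

```shell
# Hypothetical strings: the set accepts "K" or "k", but the "T" must be upper case
# and must come directly after it.
set_matches=$(printf '%s\n' 'KT-01' 'kT-02' 'kt-03' 'TK-04' | awk '/[Kk]T/')

printf '%s\n' "$set_matches"
```
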
#### Specifying Characters in a Range

Awk understands the following character ranges:

- `[0-9]` matches a single digit
- `[a-z]` matches a single lower case letter
- `[A-Z]` matches a single upper case letter
- `[a-zA-Z]` matches a single letter
- `[a-zA-Z 0-9]` matches a single letter or digit (as written, the set also contains a space, so it matches a space too)

Let's look at an example below:

```
# awk '/[0-9]/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-To-Print-Matching-Numbers-in-File.gif)

>Use Awk To Print Matching Numbers in File

All the lines from the file /etc/hosts that are printed in the above example contain at least one digit [0-9].

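The sketch below applies the `[0-9]` range to a few hosts-style lines made up for the illustration, and uses awk's `!` operator to print the lines that do not match.

```shell
# Hypothetical hosts-style input.
input=$(printf '%s\n' 'localhost' '127.0.0.1 localhost' 'ff02::1 ip6-allnodes' '# comment')

# Lines that contain at least one digit.
digit_lines=$(printf '%s\n' "$input" | awk '/[0-9]/')

# Negating the pattern selects the lines without any digit.
no_digit_lines=$(printf '%s\n' "$input" | awk '!/[0-9]/')

printf 'with digits:\n%s\nwithout digits:\n%s\n' "$digit_lines" "$no_digit_lines"
```
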
#### Use Awk with (^) Meta Character

It matches all the lines that start with the pattern provided as in the example below:

```
# awk '/^fe/{print}' /etc/hosts
# awk '/^ff/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-All-Matching-Lines-with-Pattern.gif)

>Use Awk to Print All Matching Lines with Pattern

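To make the effect of the anchor visible, this sketch runs `fe` with and without `^` over the same made-up lines.

```shell
# Hypothetical input: "fe" appears at the start of one line and inside "safe" on another.
input=$(printf '%s\n' 'fe00::0 ip6-localnet' 'ff02::1 ip6-allnodes' '# safe default')

# Without the anchor, "fe" may appear anywhere in the line.
anywhere=$(printf '%s\n' "$input" | awk '/fe/')

# With "^", the line itself has to begin with "fe".
at_start=$(printf '%s\n' "$input" | awk '/^fe/')

printf 'anywhere:\n%s\nat start:\n%s\n' "$anywhere" "$at_start"
```
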
#### Use Awk with ($) Meta Character

It matches all the lines that end with the pattern provided:

```
# awk '/ab$/{print}' /etc/hosts
# awk '/ost$/{print}' /etc/hosts
# awk '/rs$/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-Given-Pattern-String.gif)

>Use Awk to Print Given Pattern String

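One practical detail with `$` is that invisible trailing spaces count as characters. The sketch below, using invented input, shows a line that fails to match `ost$` for that reason, and one possible way to tolerate trailing spaces with ` *$`.

```shell
# Hypothetical input; note the trailing space after "remote-host".
input=$(printf '%s\n' '127.0.0.1 localhost' 'localhost alias' 'remote-host ')

# Strict: "ost" must be the very last characters of the line.
strict=$(printf '%s\n' "$input" | awk '/ost$/')

# Tolerant: allow any number of spaces between "ost" and the end of the line,
# and count how many lines match.
tolerant_count=$(printf '%s\n' "$input" | awk '/ost *$/ { n++ } END { print n }')

printf 'strict match: %s\ntolerant matches: %s\n' "$strict" "$tolerant_count"
```
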
#### Use Awk with (\) Escape Character

It lets you treat the character that follows it as a literal, that is, exactly as it is written.

In the example below, the first command prints all the lines in the file. The second command prints nothing: it is meant to match a line that contains $25.00, but without an escape character the `$` is read as the end-of-line anchor.

The third command is correct, since an escape character makes awk read `$` literally. Note that the `.` is still a meta character here, so a stricter pattern would be `/\$25\.00/`.

```
# awk '//{print}' deals.txt
# awk '/$25.00/{print}' deals.txt
# awk '/\$25.00/{print}' deals.txt
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-with-Escape-Character.gif)

>Use Awk with Escape Character

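The difference between escaping only `$` and escaping `.` as well can be checked with a small sketch. The lines below are invented and merely stand in for the contents of a file like deals.txt.

```shell
# Hypothetical price lines standing in for a deals file.
input=$(printf '%s\n' 'pen $25.00' 'ink $25x00' 'cap $2.50')

# Only "$" is escaped: the "." still matches any single character.
loose=$(printf '%s\n' "$input" | awk '/\$25.00/')

# Both "$" and "." are escaped: only the literal text "$25.00" matches.
strict=$(printf '%s\n' "$input" | awk '/\$25\.00/')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```
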
### Summary

There is more to the awk command line filtering tool than this; the examples above cover only its basic operations. In the next parts we shall move on to the more complex features of awk. Thanks for reading, and for any additions or clarifications, post a comment in the comments section.

--------------------------------------------------------------------------------

via: http://www.tecmint.com/use-linux-awk-command-to-filter-text-string-in-files/

Author: [Aaron Kili][a]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled and translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)

[a]: http://www.tecmint.com/author/aaronkili/
@ -1,4 +1,4 @@
How to Use ‘next’ Command with Awk in Linux
[translating]How to Use ‘next’ Command with Awk in Linux
=============================================

![](http://www.tecmint.com/wp-content/uploads/2016/06/Use-next-Command-with-Awk-in-Linux.png)

@ -0,0 +1,212 @@
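How to Use Awk and Regular Expressions to Filter Text or Strings in Files
=============================================================================

![](http://www.tecmint.com/wp-content/uploads/2016/04/Linux-Awk-Command-Examples.png)

When we use certain commands in Unix/Linux to read or edit text from a string or a file, we often try to filter the output down to the part we are interested in. This is where regular expressions come in handy.

### What are Regular Expressions?

A regular expression can be defined as a string that represents a number of character sequences. Its most important feature is that it lets you filter the output of a command or the contents of a file, and edit a part of a text or configuration file.
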
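### Features of Regular Expressions

Regular expressions are made up of the following:

- Ordinary characters, such as space, underscore, A-Z, a-z, 0-9.
- Meta characters, which carry a special meaning instead of matching themselves. They include:
  - `(.)` matches any single character except a newline.
  - `(*)` matches zero or more occurrences of the character immediately preceding it.
  - `[ character(s) ]` matches any one of the characters listed in character(s). You can use a hyphen (-) to denote a range of characters, such as [a-f], [1-5], and so on.
  - `^` matches the beginning of a line in a file.
  - `$` matches the end of a line in a file.
  - `\` is an escape character.
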
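To filter text, you have to use a text filtering tool such as awk. You can also think of awk as a programming language in its own right. But since this guide is about using awk, I will present it as a simple command line filtering tool.

The general syntax of awk is:

```
# awk 'script' filename
```

Here `'script'` is a set of commands that awk understands and applies to filename.

Awk works by reading one line of the file at a time, making a copy of the line, and executing the script on it. This is repeated for every line in the file.

The `'script'` has the form `'/pattern/ action'`, where `pattern` is a regular expression and `action` is what awk should do when it finds the pattern in a line.

### How to Use the Awk Filtering Tool in Linux

In the following examples, we shall focus on the meta characters discussed earlier.
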
#### A simple example of using awk:

The example below prints all the lines in the file /etc/hosts, since the empty pattern `//` matches every line.

```
# awk '//{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Awk-Command-Example.gif)

>Awk prints all the lines in a file

#### Using Awk with a Pattern

In the example below, the pattern `localhost` is given, so awk matches the lines in the file `/etc/hosts` that contain `localhost`.

```
# awk '/localhost/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-Command-with-Pattern.gif)

>Awk prints the lines in a file that match the pattern

#### Using the (.) Wild Card in an Awk Pattern

In the example below, the `(.)` makes the pattern match strings containing loc, such as localhost and localnet.

That is to say: `l`, then any single character, then `c`.

```
# awk '/l.c/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-with-Wild-Cards.gif)

>Use Awk to print the strings in a file that match the pattern

#### Using the (*) Character in an Awk Pattern

The example below matches lines containing localhost, localnet or capable. Note that, because `(*)` also allows zero occurrences of `l`, the pattern `l*c` in fact matches any line that contains a `c`; a word such as lines matches only when a `c` appears elsewhere on the same line.

```
# awk '/l*c/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Match-Strings-in-File.gif)

>Use Awk to match strings in a file

You may also notice that, when combined with `(.)`, `(*)` tries to find the longest match it can.

Let us look at an example that demonstrates this. The regular expression `t.*t` matches strings that start with `t` and end with `t` in the line below:

```
this is tecmint, where you get the best good tutorials, how to's, guides, tecmint.
```

With the pattern `/t.*t/`, the possible matches include the following:

```
this is t
this is tecmint
this is tecmint, where you get t
this is tecmint, where you get the best good t
this is tecmint, where you get the best good tutorials, how t
this is tecmint, where you get the best good tutorials, how to's, guides, t
this is tecmint, where you get the best good tutorials, how to's, guides, tecmint
```

The `.*` in `/t.*t/` makes awk choose the last, longest option:

```
this is tecmint, where you get the best good tutorials, how to's, guides, tecmint
```

#### Using Awk with a Set [ character(s) ]

Take the set [al1] for example: awk matches every line in the file /etc/hosts that contains the character a, l or 1.

```
# awk '/[al1]/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-Matching-Character.gif)

>Use Awk to print the matching characters in a file

The next example matches strings that start with `K` or `k`, followed by a `T`:

```
# awk '/[Kk]T/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-Matched-String-in-File.gif)

>Use Awk to print the matched string in a file

#### Specifying Characters in a Range

Character ranges that awk understands:

- `[0-9]` matches a single digit
- `[a-z]` matches a single lower case letter
- `[A-Z]` matches a single upper case letter
- `[a-zA-Z]` matches a single letter
- `[a-zA-Z 0-9]` matches a single letter or digit (as written, the set also contains a space, so it matches a space too)

Let's look at the example below:

```
# awk '/[0-9]/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-To-Print-Matching-Numbers-in-File.gif)

>Use Awk to print the matching numbers in a file

In the example above, all the lines printed from the file /etc/hosts contain at least one digit [0-9].

#### Using Awk with the (\^) Meta Character

In the example below, it matches all the lines that start with the given pattern:

```
# awk '/^fe/{print}' /etc/hosts
# awk '/^ff/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-All-Matching-Lines-with-Pattern.gif)

>Use Awk to print the lines that match the pattern

#### Using Awk with the ($) Meta Character

It matches all the lines that end with the given pattern:

```
# awk '/ab$/{print}' /etc/hosts
# awk '/ost$/{print}' /etc/hosts
# awk '/rs$/{print}' /etc/hosts
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-to-Print-Given-Pattern-String.gif)

>Use Awk to print the strings that match the pattern

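#### Using Awk with the (\\) Escape Character

It lets you treat the character that follows the escape character as a literal, that is, exactly as it is written.

In the example below, the first command prints all the lines in the file. In the second command I want to match a line that contains $25.00, but no escape character is used, so nothing is printed.

The third command is correct, because an escape character is used to escape `$`, so that it is read as a literal '$' rather than as a meta character. Note that the `.` is still a meta character here, so a stricter pattern would be `/\$25\.00/`.

```
# awk '//{print}' deals.txt
# awk '/$25.00/{print}' deals.txt
# awk '/\$25.00/{print}' deals.txt
```

![](http://www.tecmint.com/wp-content/uploads/2016/04/Use-Awk-with-Escape-Character.gif)

>Using Awk with the escape character
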
### Summary

This is not everything the awk command can do as a filtering tool; the examples above are all basic awk operations. In the following parts, I will go further into the advanced features of awk. Thank you for reading, and please post your comments in the comments section.

--------------------------------------------------------------------------------

via: http://www.tecmint.com/use-linux-awk-command-to-filter-text-string-in-files/

Author: [Aaron Kili][a]
Translator: [wwy-hust](https://github.com/wwy-hust)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled and translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)

[a]: http://www.tecmint.com/author/aaronkili/