mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-28 23:20:10 +08:00
TSL
This commit is contained in:
parent
f5167f6a52
commit
ed0fcd39d6
@ -1,207 +0,0 @@
|
||||
[#]: subject: (Practice using the Linux grep command)
|
||||
[#]: via: (https://opensource.com/article/21/3/grep-cheat-sheet)
|
||||
[#]: author: (Seth Kenlon https://opensource.com/users/seth)
|
||||
[#]: collector: (lujun9972)
|
||||
[#]: translator: (lxbwolf)
|
||||
[#]: reviewer: ( )
|
||||
[#]: publisher: ( )
|
||||
[#]: url: ( )
|
||||
|
||||
Practice using the Linux grep command
|
||||
======
|
||||
Learn the basics on searching for info in your files, then download our
|
||||
cheat sheet for a quick reference guide to grep and regex.
|
||||
![Hand putting a Linux file folder into a drawer][1]
|
||||
|
||||
One of the classic Unix commands, developed way back in 1974 by Ken Thompson, is the Global Regular Expression Print (grep) command. It's so ubiquitous in computing that it's frequently used as a verb ("grepping through a file") and, depending on how geeky your audience is, it fits nicely into real-world scenarios, too. (For example, "I'll have to grep my memory banks to recall that information.") In short, grep is a way to search through a file for a specific pattern of characters. If that sounds like the modern Find function available in any word processor or text editor, then you've already experienced grep's effects on the computing industry.
|
||||
|
||||
Far from just being a quaint old command that's been supplanted by modern technology, grep's true power lies in two aspects:
|
||||
|
||||
* Grep works in the terminal and operates on streams of data, so you can incorporate it into complex processes. You can not only _find_ a word in a text file; you can extract the word, send it to another command, and so on.
* Grep uses regular expressions to provide a flexible search capability.
|
||||
|
||||
|
||||
|
||||
Learning the `grep` command is easy, although it does take some practice. This article introduces you to some of its features I find most useful.
|
||||
|
||||
**[Download our free [grep cheat sheet][2]]**
|
||||
|
||||
### Installing grep
|
||||
|
||||
If you're using Linux, you already have grep installed.
|
||||
|
||||
On macOS, you have the BSD version of grep. This differs slightly from the GNU version, so if you want to follow along exactly with this article, then install GNU grep from a project like [Homebrew][3] or [MacPorts][4].
|
||||
|
||||
### Basic grep
|
||||
|
||||
The basic grep syntax is always the same. You provide the `grep` command a pattern and a file you want it to search. In return, it prints each line that contains a match to your terminal.
|
||||
|
||||
|
||||
```
$ grep gnu gpl-3.0.txt
along with this program. If not, see <http://www.gnu.org/licenses/>.
<http://www.gnu.org/licenses/>.
<http://www.gnu.org/philosophy/why-not-lgpl.html>.
```
|
||||
|
||||
By default, the `grep` command is case-sensitive, so "gnu" is different from "GNU" or "Gnu." You can make it ignore capitalization with the `--ignore-case` option.
|
||||
|
||||
|
||||
```
$ grep --ignore-case gnu gpl-3.0.txt
GNU GENERAL PUBLIC LICENSE
The GNU General Public License is a free, copyleft license for
the GNU General Public License is intended to guarantee your freedom to
GNU General Public License for most of our software; it applies also to
[...16 more results...]
<http://www.gnu.org/licenses/>.
<http://www.gnu.org/philosophy/why-not-lgpl.html>.
```
|
||||
|
||||
You can also make the `grep` command return all lines _without_ a match by using the `--invert-match` option:
|
||||
|
||||
|
||||
```
$ grep --invert-match \
--ignore-case gnu gpl-3.0.txt
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
[...648 lines...]
Public License instead of this License. But first, please read
```
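To experiment with these options without the license file at hand, here is a minimal sketch. The file `sample.txt` and its three lines are made up for illustration; `--count` is a standard grep option that prints the number of matching lines instead of the lines themselves.

```shell
# Build a throwaway sample file (hypothetical contents, for illustration only).
workdir=$(mktemp -d)
printf '%s\n' 'GNU General Public License' 'see gnu.org' 'Free Software Foundation' > "$workdir/sample.txt"

# Case-sensitive search: only the line with lowercase "gnu" matches.
sensitive=$(grep --count gnu "$workdir/sample.txt")

# Case-insensitive search: the "GNU" and "gnu" lines both match.
insensitive=$(grep --count --ignore-case gnu "$workdir/sample.txt")

# Inverted, case-insensitive search: only the line with no "gnu" at all.
inverted=$(grep --invert-match --ignore-case gnu "$workdir/sample.txt")

echo "case-sensitive: $sensitive, case-insensitive: $insensitive"
echo "without a match: $inverted"
rm -r "$workdir"
```

Counting matches first is a cheap way to check that a pattern behaves as expected before printing hundreds of lines.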
|
||||
|
||||
### Pipes
|
||||
|
||||
It's useful to be able to find text in a file, but the true power of [POSIX][8] is its ability to chain commands together through "pipes." I find that my best use of grep is when it's combined with other tools, like cut, tr, or [curl][9].
|
||||
|
||||
For instance, assume I have a file that lists some technical papers I want to download. I could open the file and manually click on each link, and then click through Firefox options to save each file to my hard drive, but that's a lot of time and clicking. Instead, I could grep for the links in the file, printing _only_ the matching string by using the `--only-matching` option:
|
||||
|
||||
|
||||
```
$ grep --only-matching 'http://.*pdf' example.html
http://example.com/linux_whitepaper.pdf
http://example.com/bsd_whitepaper.pdf
http://example.com/important_security_topic.pdf
```
|
||||
|
||||
The output is a list of URLs, one per line. This is a natural fit for how Bash processes data. Because `curl` takes URLs as arguments rather than reading them from standard input, I pipe the list through `xargs`, which runs `curl` once for each URL:

```
$ grep --only-matching 'http://.*pdf' \
example.html | xargs -n 1 curl --remote-name
```
|
||||
|
||||
This downloads each file, saving it according to its remote filename onto my hard drive.
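The same idea can be tried without touching the network. The sketch below assumes a made-up `links.txt` with one duplicated link; it extracts the URLs with `--only-matching` and passes them through `sort -u`, so a downloader would fetch each file only once. The pattern `[^ ]*` (any run of non-space characters) is an assumption that suits this sample, not a general URL matcher.

```shell
# Hypothetical input file standing in for a page full of links.
workdir=$(mktemp -d)
printf '%s\n' \
  'Linux paper: http://example.com/linux_whitepaper.pdf (read first)' \
  'BSD paper: http://example.com/bsd_whitepaper.pdf' \
  'Linux paper again: http://example.com/linux_whitepaper.pdf' \
  'No link on this line' > "$workdir/links.txt"

# Print only the matching part of each line, then drop duplicate URLs.
urls=$(grep --only-matching 'http://[^ ]*\.pdf' "$workdir/links.txt" | sort -u)

# Count the distinct URLs (every non-empty line matches the pattern ".").
count=$(printf '%s\n' "$urls" | grep --count .)

echo "$urls"
echo "distinct URLs: $count"
rm -r "$workdir"
```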
|
||||
|
||||
My search pattern in this example may seem cryptic. That's because it is a regular expression, a kind of "wildcard" language that's particularly useful when searching broadly through lots of text.
|
||||
|
||||
### Regular expression
|
||||
|
||||
Nobody is under the illusion that regular expression ("regex" for short) is easy. However, I find it often has a worse reputation than it deserves. Admittedly, there's the potential for people to get a little _too clever_ with regex until it's so unreadable and so broad that it folds in on itself, but you don't have to overdo your regex. Here's a brief introduction to regex the way I use it.
|
||||
|
||||
First, create a file called `example.txt` and enter this text into it:
|
||||
|
||||
|
||||
```
|
||||
Albania
|
||||
Algeria
|
||||
Canada
|
||||
0
|
||||
1
|
||||
3
|
||||
11
|
||||
```
|
||||
|
||||
The most basic element of regex is the humble `.` character. It matches any single character.
|
||||
|
||||
|
||||
```
|
||||
$ grep Can.da example.txt
|
||||
Canada
|
||||
```
|
||||
|
||||
The pattern `Can.da` successfully returned `Canada` because the `.` character represented any _one_ character.
|
||||
|
||||
The `.` wildcard (or any other item) can be repeated by following it with one of these quantifiers:

* `?` matches the preceding item zero or one time
* `*` matches the preceding item zero or more times
* `+` matches the preceding item one or more times
* `{4}` matches the preceding item exactly four times (or any number you enter in the braces); `{2,4}` matches it two to four times

In grep's default basic syntax, only `*` works as written. The `?`, `+`, and `{}` quantifiers need either a preceding backslash or the `--extended-regexp` option.
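The quantifiers can be compared side by side with a small sketch. The test strings and the file name `quantifiers.txt` are made up for illustration, and the `^` and `$` anchors (start and end of line) are an addition not covered in this article; they make each pattern describe the whole line.

```shell
# Hypothetical test strings: "a", then zero, one, two, or four "b"s, then "c".
workdir=$(mktemp -d)
printf '%s\n' 'ac' 'abc' 'abbc' 'abbbbc' > "$workdir/quantifiers.txt"

# Count the lines matched when each quantifier is applied to the letter "b".
optional=$(grep --count --extended-regexp '^ab?c$' "$workdir/quantifiers.txt")
any=$(grep --count --extended-regexp '^ab*c$' "$workdir/quantifiers.txt")
some=$(grep --count --extended-regexp '^ab+c$' "$workdir/quantifiers.txt")

# {4} requires exactly four repetitions, so only one line qualifies.
four=$(grep --extended-regexp '^ab{4}c$' "$workdir/quantifiers.txt")

echo "?: $optional  *: $any  +: $some  {4}: $four"
rm -r "$workdir"
```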
|
||||
|
||||
|
||||
|
||||
Armed with this knowledge, you can practice regex on `example.txt` all afternoon, seeing what interesting combinations you come up with. Some won't work; others will. The important thing is to analyze the results, so you understand why.
|
||||
|
||||
For instance, this fails to return any country:
|
||||
|
||||
|
||||
```
$ grep A.a example.txt
```
|
||||
|
||||
It fails because the `.` character only ever matches a single character unless you add a quantifier. With the `*` character, you can tell `grep` to match any character zero or more times. Because you know the list you're dealing with, you know that _zero times_ is useless in this instance: no country name in this list contains `Aa`. So instead, you can use `+` to match any character at least once and then as many more times as necessary. Because `+` is an extended regular expression operator, pass the `--extended-regexp` option:

```
$ grep --extended-regexp 'A.+a' example.txt
Albania
Algeria
```
|
||||
|
||||
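You can use square brackets to provide a set of characters, any one of which may match. List the characters without separators, because a comma inside the brackets is treated as one more character to match:

```
$ grep --extended-regexp '[AC].+a' example.txt
Albania
Algeria
Canada
```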
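A bracket expression matches each listed character literally, separators included. This short sketch, using made-up input lines, shows the difference a comma makes.

```shell
# Three hypothetical input lines: "A", a lone comma, and "C".
# With the comma inside the brackets, the comma line matches too.
with_comma=$(printf '%s\n' 'A' ',' 'C' | grep --count '[A,C]')

# Without the comma, only the two letters match.
without_comma=$(printf '%s\n' 'A' ',' 'C' | grep --count '[AC]')

echo "with comma: $with_comma, without comma: $without_comma"
```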
|
||||
|
||||
This works for numbers, too. The results may surprise you:
|
||||
|
||||
|
||||
```
$ grep '[1-9]' example.txt
1
3
11
```
|
||||
|
||||
Are you surprised to see 11 in a search for digits 1 to 9?
|
||||
|
||||
What happens if you add 13 to your list?
|
||||
|
||||
These numbers are returned because grep prints every line that contains at least one match, and both 11 and 13 contain a 1, which is among the digits to match.
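If the goal is to print only lines that consist of a single digit, one option is to anchor the pattern. The `^` and `$` anchors (start and end of line) go beyond what this article covers, so treat this as a sketch; it recreates `example.txt` in a temporary directory.

```shell
# Recreate the article's example.txt in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Albania Algeria Canada 0 1 3 11 > "$workdir/example.txt"

# Unanchored: any line containing a digit from 1 to 9, so 11 is included.
contains_digit=$(grep '[1-9]' "$workdir/example.txt")

# Anchored: the whole line must be exactly one digit from 1 to 9.
single_digit=$(grep '^[1-9]$' "$workdir/example.txt")

echo "$contains_digit"
echo "---"
echo "$single_digit"
rm -r "$workdir"
```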
|
||||
|
||||
As you can see, regex is something of a puzzle, but through experimentation and practice, you can get comfortable with it and use it to improve the way you grep through your data.
|
||||
|
||||
### Download the cheatsheet
|
||||
|
||||
The `grep` command has far more options than I demonstrated in this article. There are options to better format results, list files and line numbers containing matches, provide context for results by printing the lines surrounding a match, and much more. If you're learning grep, or you just find yourself using it often and resorting to searching through its `info` pages, you'll do yourself a favor by downloading our cheat sheet for it. The cheat sheet uses short options (`-v` instead of `--invert-match`, for instance) as a way to get you familiar with common grep shorthand. It also contains a regex section to help you remember the most common regex codes. [Download the grep cheat sheet today!][2]
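As a small taste of that shorthand, the sketch below pairs a few short options with their long forms. The file `notes.txt` and its contents are made up for illustration.

```shell
# Hypothetical three-line file.
workdir=$(mktemp -d)
printf '%s\n' 'first line' 'GNU is here' 'last line' > "$workdir/notes.txt"

# -i is short for --ignore-case, -n for --line-number.
numbered=$(grep -i -n gnu "$workdir/notes.txt")

# -v is short for --invert-match, -c for --count.
others=$(grep -v -c -i gnu "$workdir/notes.txt")

# -C 1 is short for --context=1: one line of context on each side of a match.
context=$(grep -C 1 GNU "$workdir/notes.txt")

echo "$numbered"
echo "lines without a match: $others"
echo "$context"
rm -r "$workdir"
```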
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://opensource.com/article/21/3/grep-cheat-sheet
|
||||
|
||||
Author: [Seth Kenlon][a]
Topic selection: [lujun9972][b]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
|
||||
|
||||
[a]: https://opensource.com/users/seth
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/yearbook-haff-rx-linux-file-lead_0.png?itok=-i0NNfDC (Hand putting a Linux file folder into a drawer)
|
||||
[2]: https://opensource.com/downloads/grep-cheat-sheet
|
||||
[3]: https://opensource.com/article/20/6/homebrew-mac
|
||||
[4]: https://opensource.com/article/20/11/macports
|
||||
[5]: http://www.gnu.org/licenses/
[6]: http://www.gnu.org/philosophy/why-not-lgpl.html
[7]: http://fsf.org/
|
||||
[8]: https://opensource.com/article/19/7/what-posix-richard-stallman-explains
|
||||
[9]: https://opensource.com/downloads/curl-command-cheat-sheet
|
@ -0,0 +1,206 @@
|
||||
[#]: subject: "Practice using the Linux grep command"
|
||||
[#]: via: "https://opensource.com/article/21/3/grep-cheat-sheet"
|
||||
[#]: author: "Seth Kenlon https://opensource.com/users/seth"
|
||||
[#]: collector: "lujun9972"
|
||||
[#]: translator: "lxbwolf"
|
||||
[#]: reviewer: " "
|
||||
[#]: publisher: " "
|
||||
[#]: url: " "
|
||||
|
||||
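Practice using the Linux grep command
======
Learn the basics of searching for content in your files, then download our cheat sheet as a quick reference guide to grep and regular expressions.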
|
||||
![Hand putting a Linux file folder into a drawer][1]
|
||||
|
||||
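grep (Global Regular Expression Print) is one of the basic Unix commands, developed back in 1974 by Ken Thompson. It is ubiquitous in computing and is often used as a verb ("grepping through a file"). If your audience is geeky enough, it also works in real-life scenarios. (For example, "I'll have to grep my memory banks to recall that information.") In short, grep is a way to search the contents of a file for a specific pattern of characters. If that sounds like the modern Find function of a word processor or text editor, then you've already experienced grep's influence on the computing industry.

Far from being an ancient command abandoned by modern technology, grep draws its power from two aspects:

* grep operates on streams of data in the terminal, so you can embed it in complex processes. You can not only *find* text in a text file; you can also extract it and send it to another command.
* grep uses regular expressions to provide a flexible search capability.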
|
||||
|
||||
|
||||
|
||||
Learning the `grep` command is easy, although it does take some practice. This article introduces the grep features I find most useful.

**[Download our free [grep cheat sheet][2]]**
|
||||
|
||||
### Installing grep

grep is installed by default on Linux.

macOS ships with the BSD version of grep. It differs slightly from the GNU version, so if you want to follow this article exactly, install GNU grep using [Homebrew][3] or [MacPorts][4].
|
||||
|
||||
### Basic grep

The basic syntax is the same for every version of grep. The arguments are a pattern and the file you want to search. grep prints each matching line to your terminal.

```
$ grep gnu gpl-3.0.txt
along with this program. If not, see <http://www.gnu.org/licenses/>.
<http://www.gnu.org/licenses/>.
<http://www.gnu.org/philosophy/why-not-lgpl.html>.
```
|
||||
|
||||
By default, the `grep` command is case-sensitive, so "gnu", "GNU", and "Gnu" are three different values. You can ignore case with the `--ignore-case` option.

```
$ grep --ignore-case gnu gpl-3.0.txt
GNU GENERAL PUBLIC LICENSE
The GNU General Public License is a free, copyleft license for
the GNU General Public License is intended to guarantee your freedom to
GNU General Public License for most of our software; it applies also to
[...16 more results...]
<http://www.gnu.org/licenses/>.
<http://www.gnu.org/philosophy/why-not-lgpl.html>.
```
|
||||
|
||||
You can also print all lines that do *not* match by using the `--invert-match` option:

```
$ grep --invert-match \
--ignore-case gnu gpl-3.0.txt
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
[...648 lines...]
Public License instead of this License. But first, please read
```
|
||||
|
||||
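### Pipes

Being able to search for text in a file is useful, but the true power of [POSIX][8] is its ability to chain commands together through "pipes." I find that I get the most out of grep when I combine it with other tools, such as cut, tr, or [curl][9].

Suppose I have a file that lists technical papers I want to download. I could open the file, click each link by hand, and then click through Firefox's options to save each file to my hard drive, but that takes a lot of clicking and a lot of time. Instead, I can search for the links in the file and print *only* the matching strings with the `--only-matching` option.

```
$ grep --only-matching 'http://.*pdf' example.html
http://example.com/linux_whitepaper.pdf
http://example.com/bsd_whitepaper.pdf
http://example.com/important_security_topic.pdf
```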
|
||||
|
||||
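The output is a list of URLs, one per line. This fits perfectly with how Bash processes data. Because `curl` takes URLs as arguments rather than reading them from standard input, I pipe the list through `xargs`, which runs `curl` once for each URL, instead of printing the URLs to my terminal:

```
$ grep --only-matching 'http://.*pdf' \
example.html | xargs -n 1 curl --remote-name
```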
|
||||
|
||||
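This command downloads each file and saves it to my hard drive under its remote filename.

My search pattern in this example may seem cryptic. That's because it is a regular expression, a kind of "wildcard" language that is very useful for broad searches through large amounts of text.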
|
||||
|
||||
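### Regular expression

Nobody thinks regular expressions ("regex" for short) are easy. However, I find that regex often has a worse reputation than it deserves. Admittedly, many people get *too clever* with regex until it becomes unreadable and so broad that it folds in on itself, but you don't have to overdo your regex. Here is a brief introduction to regex the way I use it.

First, create a file called `example.txt` and enter this text into it: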
|
||||
|
||||
|
||||
```
|
||||
Albania
|
||||
Algeria
|
||||
Canada
|
||||
0
|
||||
1
|
||||
3
|
||||
11
|
||||
```
|
||||
|
||||
The most basic element is the humble `.` character. It matches any single character.
|
||||
|
||||
|
||||
```
|
||||
$ grep Can.da example.txt
|
||||
Canada
|
||||
```
|
||||
|
||||
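The pattern `Can.da` successfully matches `Canada` because the `.` character stands for any *one* character.

The `.` wildcard (or any other item) can be repeated by following it with one of these quantifiers:

* `?` matches the preceding item zero or one time
* `*` matches the preceding item zero or more times
* `+` matches the preceding item one or more times
* `{4}` matches the preceding item exactly four times (or any number you enter in the braces); `{2,4}` matches it two to four times

In grep's default basic syntax, only `*` works as written. The `?`, `+`, and `{}` quantifiers need either a preceding backslash or the `--extended-regexp` option.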
|
||||
|
||||
|
||||
|
||||
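Armed with this knowledge, you can practice on `example.txt` with any patterns you find interesting. Some will work; others won't. The important thing is to analyze the results so you understand why.

For instance, this command fails to match any country:

```
$ grep A.a example.txt
```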
|
||||
|
||||
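It fails because the `.` character only ever matches a single character unless you add a quantifier. With the `*` character, you can tell `grep` to match any character zero or more times. Because you know the content you're dealing with, you know that *zero times* is useless in this instance: no country name in this list contains `Aa`. So instead, you can use `+` to match any character at least once and then as many more times as necessary. Because `+` is an extended regular expression operator, pass the `--extended-regexp` option:

```
$ grep --extended-regexp 'A.+a' example.txt
Albania
Algeria
```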
|
||||
|
||||
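You can use square brackets to provide a set of characters, any one of which may match. List the characters without separators, because a comma inside the brackets is treated as one more character to match:

```
$ grep --extended-regexp '[AC].+a' example.txt
Albania
Algeria
Canada
```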
|
||||
|
||||
This works for numbers, too. The results may surprise you:

```
$ grep '[1-9]' example.txt
1
3
11
```
|
||||
|
||||
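Are you surprised to see 11 in the results of a search for the digits 1 to 9?

What happens if you add 13 to your list?

These numbers are matched because grep prints every line that contains at least one match, and both 11 and 13 contain a 1, which is among the digits to match.

As you can see, regex can be puzzling at times, but through experimentation and practice you can master it and use it to improve the way you search through your data.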
|
||||
|
||||
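### Download the cheat sheet

The `grep` command has many more options than this article covers. There are options to better format results, list files, list the line numbers of matches, show context by printing the lines surrounding a match, and more. If you're learning grep, or you use it often and keep consulting its help pages to look up options, you can download our cheat sheet. The cheat sheet uses short options (for example, `-v` instead of `--invert-match`) to help you get familiar with grep shorthand. It also contains a regex section to help you remember the most widely used regex codes. [Download the grep cheat sheet today!][2]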
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
via: https://opensource.com/article/21/3/grep-cheat-sheet
|
||||
|
||||
Author: [Seth Kenlon][a]
Topic selection: [lujun9972][b]
Translator: [lxbwolf](https://github.com/lxbwolf)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
|
||||
|
||||
[a]: https://opensource.com/users/seth
|
||||
[b]: https://github.com/lujun9972
|
||||
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/yearbook-haff-rx-linux-file-lead_0.png?itok=-i0NNfDC "Hand putting a Linux file folder into a drawer"
|
||||
[2]: https://opensource.com/downloads/grep-cheat-sheet
|
||||
[3]: https://opensource.com/article/20/6/homebrew-mac
|
||||
[4]: https://opensource.com/article/20/11/macports
|
||||
[5]: http://www.gnu.org/licenses/
[6]: http://www.gnu.org/philosophy/why-not-lgpl.html
[7]: http://fsf.org/
|
||||
[8]: https://opensource.com/article/19/7/what-posix-richard-stallman-explains
|
||||
[9]: https://opensource.com/downloads/curl-command-cheat-sheet
|