Merge pull request #9841 from lujun9972/translate-MjAxODA4MDYgQSBnYXdrIHNjcmlwdCB0byBjb252ZXJ0IHNtYXJ0IHF1b3Rlcy5tZAo=

translate done: 20180806 A gawk script to convert smart quotes.md
Xingyu.Wang 2018-08-16 15:33:42 +08:00 committed by GitHub
commit 9d5f4018c9
GPG Key ID: 4AEE18F83AFDEB23


@ -1,20 +1,19 @@
translating by lujun9972
A gawk script to convert smart quotes
一个转换花引号的 gawk 脚本
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/osdc_520x292_opensourceprescription.png?itok=gFrc_GTH)
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/osdc_520x292_opensourceprescription.pngitok=gFrc_GTH)
I manage a personal website and edit the web pages by hand. Since I don't have many pages on my site, this works well for me, letting me "scratch the itch" of getting into the site's code.
我管理着一个个人网站,同时手工编辑网站上的网页。由于网站上的页面并不多,这种方法对我很适合,可以让我对网站代码的细节一清二楚。
When I updated my website's design recently, I decided to turn all the plain quotes into "smart quotes," or quotes that look like those used in print material: “” instead of "".
最近我升级了网站的设计样式,我决定把所有的普通引号都转换成 "花引号",即在打印材料中使用的那种引号:用 “” 来代替 ""。
Editing all of the quotes by hand would take too long, so I decided to automate the process of converting the quotes in all of my HTML files. But doing so via a script or program requires some intelligence. The script needs to know when to convert a plain quote to a smart quote, and which quote to use.
手工修改所有的引号太耗时了,因此我决定将转换所有 HTML 文件中引号的过程自动化。不过通过程序或脚本来实现该功能需要费点劲。这个脚本需要知道何时将普通引号转换成花引号,并决定使用哪种引号(译注:左引号还是右引号,单引号还是双引号)。
You can use different methods to convert quotes. Greg Pittman wrote a [Python script][1] for fixing smart quotes in text. I wrote mine in GNU [awk][2] (gawk).
有多种方法可以转换引号。Greg Pittman 写过一个 [Python 脚本 ][1] 来修正文本中的花引号。而我自己使用 GNU [awk][2] (gawk) 来实现。
> Get our awk cheat sheet. [Free download][3].
> 下载我的 awk 备忘录。[免费下载 ][3]。
To start, I wrote a simple gawk function to evaluate a single character. If that character is a quote, the function determines if it should output a plain quote or a smart quote. The function looks at the previous character; if the previous character is a space, the function outputs a left smart quote. Otherwise, the function outputs a right smart quote. The script does the same for single quotes.
开始之前,我写了一个简单的 gawk 函数来评估单个字符。若该字符是一个引号,这该函数判断是输出普通引号还是花引号。函数查看前一个字符; 若前一个字符是空格,则函数输出左花引号。否则函数输出右花引号。脚本对单引号的处理方式也一样。
```
function smartquote (char, prevchar) {
        # print smart quotes depending on the previous character
@ -47,7 +46,7 @@ function smartquote (char, prevchar) {
}
```
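The diff elides most of the function body, so the following is only a minimal sketch of the rule described above, not the author's code. It assumes that smart quotes are emitted as the HTML entities `&ldquo;`, `&rdquo;`, `&lsquo;`, and `&rsquo;`, that a quote at the start of a line counts as an opening quote, and that the function returns the replacement text instead of printing it. The shell wrapper `smartquote_demo` is a hypothetical name used only to make the sketch runnable.

```shell
# Hypothetical wrapper: reads text on stdin and converts plain quotes
# to HTML smart-quote entities, one line at a time.
smartquote_demo() {
  awk '
  # Return the replacement for one character, given the character before it.
  # Assumption: a quote after a space, a tab, or the start of the line opens;
  # any other quote closes. "\047" is the single-quote character.
  function smartquote(char, prevchar) {
      if (prevchar == "" || prevchar == " " || prevchar == "\t") {
          if (char == "\"")   return "&ldquo;"
          if (char == "\047") return "&lsquo;"
      } else {
          if (char == "\"")   return "&rdquo;"
          if (char == "\047") return "&rsquo;"
      }
      return char   # every other character passes through unchanged
  }
  {
      out = ""; prev = ""
      for (i = 1; i <= length($0); i++) {
          c = substr($0, i, 1)
          out = out smartquote(c, prev)
          prev = c
      }
      print out
  }'
}
```

For example, `printf '%s\n' 'say "hi" now' | smartquote_demo` prints `say &ldquo;hi&rdquo; now`. Because the rule looks only at the previous character, an apostrophe inside a word such as `it's` becomes a right single quote, which is usually the desired result.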
With that function, the body of the gawk script processes the HTML input file character by character. The script prints all text verbatim when inside an HTML tag (for example, `<html lang="en">`). Outside any HTML tags, the script uses the `smartquote()` function to print text. The `smartquote()` function does the work of evaluating when to print plain quotes or smart quotes.
这个 gawk 脚本的主体部分通过该函数处理 HTML 输入文件的一个个字符。该脚本在 HTML 标签内部逐字原样输出所有内容(比如,`<html lang="en">`)。在 HTML 标签外,脚本使用 `smartquote()` 函数来输出文本。`smartquote()` 函数来评估是输出普通引号还是花引号。
```
function smartquote (char, prevchar) {
        ...
@ -87,12 +86,12 @@ BEGIN {htmltag = 0}
}
```
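The script body is also elided by the diff, so here is a rough sketch of how such a character-by-character loop might look. Only the `htmltag` flag and its `BEGIN {htmltag = 0}` initialization appear in the diff context. The loop structure, the entity output, the returning form of `smartquote()`, and the wrapper name `smartquote_html` are assumptions made for illustration.

```shell
# Hypothetical wrapper: converts quotes in HTML read from stdin,
# leaving everything between "<" and ">" untouched.
smartquote_html() {
  awk '
  # Same assumed rule as described in the text: a quote after whitespace
  # (or at the start of a line) opens, any other quote closes.
  function smartquote(char, prevchar) {
      if (prevchar == "" || prevchar == " " || prevchar == "\t") {
          if (char == "\"")   return "&ldquo;"
          if (char == "\047") return "&lsquo;"
      } else {
          if (char == "\"")   return "&rdquo;"
          if (char == "\047") return "&rsquo;"
      }
      return char
  }
  BEGIN { htmltag = 0 }            # 1 while inside an HTML tag
  {
      out = ""; prev = ""
      for (i = 1; i <= length($0); i++) {
          c = substr($0, i, 1)
          if (c == "<") htmltag = 1
          if (htmltag) out = out c  # inside a tag: copy verbatim
          else out = out smartquote(c, prev)
          if (c == ">") htmltag = 0
          prev = c
      }
      print out
  }'
}
```

With this sketch, `printf '%s\n' '<p class="x">He said "hi".</p>' | smartquote_html` prints `<p class="x">He said &ldquo;hi&rdquo;.</p>`: the attribute quotes survive and only the quotes in the text are converted. Since `htmltag` is not reset per line, a tag that spans several lines stays protected. One limitation of the previous-character rule is that a quote directly after a closing `>` is treated as a right quote.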
Here's an example:
下面是一个例子:
```
gawk -f quotes.awk test.html > test2.html
```
Sample input:
其输入为:
```
<!DOCTYPE html>
<html lang="en">
@ -111,7 +110,7 @@ Sample input:
```
Sample output:
其输出为:
```
<!DOCTYPE html>
<html lang="en">
@ -142,5 +141,5 @@ via: https://opensource.com/article/18/8/gawk-script-convert-smart-quotes
[a]:https://opensource.com/users/jim-hall
[1]:https://opensource.com/article/17/3/python-scribus-smart-quotes
[2]:/downloads/cheat-sheet-awk-features
[2]:https://opensource.com/downloads/cheat-sheet-awk-features
[3]:https://opensource.com/downloads/cheat-sheet-awk-features