TranslateProject/sources/tech/20180111 BASH drivers, start your engines.md

91 lines
4.2 KiB
Markdown

Translating by Torival BASH drivers, start your engines
======
![](http://www.thelinuxrain.com/content/01-articles/201-bash-drivers-start-your-engines/headimage.jpg)
There's always more than one way to do a job in the shell, and there may not be One Best Way to do that job, either.
Nevertheless, different commands with the same output can differ in how long they take, how much memory they use and how hard they make the CPU work.
Out of curiosity I trialled 6 different ways to get the last 5 characters from each line of a text file, which is a simple text-processing task. The 6 commands are explained below and are abbreviated here as awk5, echo5, grep5, rev5, sed5 and tail5. These were also the names of the files generated by the commands.
### Tracking performance
I ran the trial on a 1.6GB UTF-8 text file with 1559391514 characters on 3570866 lines, or an average of 437 characters per line, and no blank lines. The last 5 characters on every line were alphanumeric.
To time the 6 commands I used **time** (the BASH shell built-in, not GNU **time** ) and while the commands were running I checked **top** to follow memory and CPU usage. My system is the Dell OptiPlex 9020 Micro described [here][1] and runs Debian 9.
All 6 commands used between 1 and 1.4GB of memory (VIRT in **top** ), and awk5, echo5, grep5 and sed5 ran at close to 100% CPU usage. Interestingly,
rev5 ran at ca 30% CPU and tail5 at ca 15%.
To ensure that all 6 commands had done the same job, I did a **diff** on the 6 output files, each about 21 MB:
![][2]
### And the winner is...
Here are the elapsed times:
![][3]
Well, AWK (GNU AWK 4.1.4) is really fast. Sure, all 6 commands could process a 100-line file zippety-quick, but for big text-processing jobs, fire up your AWK.
### Commands used
```
awk '{print substr($0,length($0)-4,5)}' file > awk5
```
awk5 used AWK's substring function. The function works on the whole line ($0), starts at the 4th character back from the last character (length($0)-4) and returns 5 characters (5).
```
#!/bin/bash
while read line; do echo "${line: -5}"; done < file > echo5
exit
```
echo5 was run as a script and uses a **while** loop for processing one line at a time. The BASH string function "${line: -5}" returns the last 5 characters in "$line".
```
grep -o '.....$' file > grep5
```
In grep5, **grep** searches each line for the last 5 characters (.....$) and returns (with the -o option) just that searched-for string.
```
#!/bin/bash
while read line; do rev <<<"$line" | cut -c1-5 | rev; done < file > rev5
exit
```
The rev5 trick in this script has appeared often in online forums. Each line is first reversed with **rev** , then **cut** is used to return the first 5 characters, then the 5-character string is reversed with **rev**.
```
sed 's/.*\(.....\)/\1/' file > sed5
```
sed5 is a simple use of **sed** (GNU sed 4.4) but was surprisingly slow in the trial. In each line, **sed** replaces zero or more characters leading up to the last 5 with just those last 5 (as a backreference).
```
#!/bin/bash
while read line; do tail -c 6 <<<"$line"; done < file > tail5
exit
```
The "-c 6" in the tail5 script means that **tail** captures the last 5 characters in each line plus the newline character at the end.
Actually, the "-c" option captures bytes, not characters, meaning if the line ends in multi-byte characters the output will be corrupt. But would you really want to use the ultra-slow **tail** for this job in the first place?
### About the Author
Bob Mesibov is Tasmanian, retired and a keen Linux tinkerer.
--------------------------------------------------------------------------------
via: http://www.thelinuxrain.com/articles/bash-drivers-start-your-engines
作者:[Bob Mesibov][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:http://www.thelinuxrain.com
[1]:http://www.thelinuxrain.com/articles/debian-9-on-a-dell-optiplex-9020-micro
[2]:http://www.thelinuxrain.com/content/01-articles/201-bash-drivers-start-your-engines/1.png
[3]:http://www.thelinuxrain.com/content/01-articles/201-bash-drivers-start-your-engines/2.png