Merge pull request #9447 from BriFuture/master

Translation Finished
This commit is contained in:
Xingyu.Wang 2018-07-13 20:20:10 +08:00 committed by GitHub
commit c396e419cb
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 291 additions and 270 deletions

View File

@ -1,270 +0,0 @@
BriFuture is translating
You don't know Bash: An introduction to Bash arrays
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/programming-code-keyboard-laptop.png?itok=pGfEfu2S)
Although software engineers regularly use the command line for many aspects of development, arrays are likely one of the more obscure features of the command line (although not as obscure as the regex operator `=~`). But obscurity and questionable syntax aside, [Bash][1] arrays can be very powerful.
### Wait, but why?
Writing about Bash is challenging because it's remarkably easy for an article to devolve into a manual that focuses on syntax oddities. Rest assured, however, the intent of this article is to avoid having you RTFM.
#### A real (actually useful) example
To that end, let's consider a real-world scenario and how Bash can help: You are leading a new effort at your company to evaluate and optimize the runtime of your internal data pipeline. As a first step, you want to do a parameter sweep to evaluate how well the pipeline makes use of threads. For the sake of simplicity, we'll treat the pipeline as a compiled C++ black box where the only parameter we can tweak is the number of threads reserved for data processing: `./pipeline --threads 4`.
### The basics
`--threads` parameter that we want to test:
```
allThreads=(1 2 4 8 16 32 64 128)
```
The first thing we'll do is define an array containing the values of theparameter that we want to test:
In this example, all the elements are numbers, but it need not be the case—arrays in Bash can contain both numbers and strings, e.g., `myArray=(1 2 "three" 4 "five")` is a valid expression. And just as with any other Bash variable, make sure to leave no spaces around the equal sign. Otherwise, Bash will treat the variable name as a program to execute, and the `=` as its first parameter!
Now that we've initialized the array, let's retrieve a few of its elements. You'll notice that simply doing `echo $allThreads` will output only the first element.
To understand why that is, let's take a step back and revisit how we usually output variables in Bash. Consider the following scenario:
```
type="article"
echo "Found 42 $type"
```
Say the variable `$type` is given to us as a singular noun and we want to add an `s` at the end of our sentence. We can't simply add an `s` to `$type` since that would turn it into a different variable, `$types`. And although we could utilize code contortions such as `echo "Found 42 "$type"s"`, the best way to solve this problem is to use curly braces: `echo "Found 42 ${type}s"`, which allows us to tell Bash where the name of a variable starts and ends (interestingly, this is the same syntax used in JavaScript/ES6 to inject variables and expressions in [template literals][2]).
So as it turns out, although Bash variables don't generally require curly brackets, they are required for arrays. In turn, this allows us to specify the index to access, e.g., `echo ${allThreads[1]}` returns the second element of the array. Not including brackets, e.g.,`echo $allThreads[1]`, leads Bash to treat `[1]` as a string and output it as such.
Yes, Bash arrays have odd syntax, but at least they are zero-indexed, unlike some other languages (I'm looking at you, `R`).
### Looping through arrays
Although in the examples above we used integer indices in our arrays, let's consider two occasions when that won't be the case: First, if we wanted the `$i`-th element of the array, where `$i` is a variable containing the index of interest, we can retrieve that element using: `echo ${allThreads[$i]}`. Second, to output all the elements of an array, we replace the numeric index with the `@` symbol (you can think of `@` as standing for `all`): `echo ${allThreads[@]}`.
#### Looping through array elements
With that in mind, let's loop through `$allThreads` and launch the pipeline for each value of `--threads`:
```
for t in ${allThreads[@]}; do
  ./pipeline --threads $t
done
```
#### Looping through array indices
Next, let's consider a slightly different approach. Rather than looping over array elements, we can loop over array indices:
```
for i in ${!allThreads[@]}; do
  ./pipeline --threads ${allThreads[$i]}
done
```
Let's break that down: As we saw above, `${allThreads[@]}` represents all the elements in our array. Adding an exclamation mark to make it `${!allThreads[@]}` will return the list of all array indices (in our case 0 to 7). In other words, the `for` loop is looping through all indices `$i` and reading the `$i`-th element from `$allThreads` to set the value of the `--threads` parameter.
This is much harsher on the eyes, so you may be wondering why I bother introducing it in the first place. That's because there are times where you need to know both the index and the value within a loop, e.g., if you want to ignore the first element of an array, using indices saves you from creating an additional variable that you then increment inside the loop.
### Populating arrays
So far, we've been able to launch the pipeline for each `--threads` of interest. Now, let's assume the output to our pipeline is the runtime in seconds. We would like to capture that output at each iteration and save it in another array so we can do various manipulations with it at the end.
#### Some useful syntax
But before diving into the code, we need to introduce some more syntax. First, we need to be able to retrieve the output of a Bash command. To do so, use the following syntax: `output=$( ./my_script.sh )`, which will store the output of our commands into the variable `$output`.
The second bit of syntax we need is how to append the value we just retrieved to an array. The syntax to do that will look familiar:
```
myArray+=( "newElement1" "newElement2" )
```
#### The parameter sweep
Putting everything together, here is our script for launching our parameter sweep:
```
allThreads=(1 2 4 8 16 32 64 128)
allRuntimes=()
for t in ${allThreads[@]}; do
  runtime=$(./pipeline --threads $t)
  allRuntimes+=( $runtime )
done
```
And voilà!
### What else you got?
In this article, we covered the scenario of using arrays for parameter sweeps. But I promise there are more reasons to use Bash arrays—here are two more examples.
#### Log alerting
In this scenario, your app is divided into modules, each with its own log file. We can write a cron job script to email the right person when there are signs of trouble in certain modules:``
```
# List of logs and who should be notified of issues
logPaths=("api.log" "auth.log" "jenkins.log" "data.log")
logEmails=("jay@email" "emma@email" "jon@email" "sophia@email")
# Look for signs of trouble in each log
for i in ${!logPaths[@]};
do
  log=${logPaths[$i]}
  stakeholder=${logEmails[$i]}
  numErrors=$( tail -n 100 "$log" | grep "ERROR" | wc -l )
  # Warn stakeholders if recently saw > 5 errors
  if [[ "$numErrors" -gt 5 ]];
  then
    emailRecipient="$stakeholder"
    emailSubject="WARNING: ${log} showing unusual levels of errors"
    emailBody="${numErrors} errors found in log ${log}"
    echo "$emailBody" | mailx -s "$emailSubject" "$emailRecipient"
  fi
done
```
#### API queries
Say you want to generate some analytics about which users comment the most on your Medium posts. Since we don't have direct database access, SQL is out of the question, but we can use APIs!
To avoid getting into a long discussion about API authentication and tokens, we'll instead use [JSONPlaceholder][3], a public-facing API testing service, as our endpoint. Once we query each post and retrieve the emails of everyone who commented, we can append those emails to our results array:
```
endpoint="https://jsonplaceholder.typicode.com/comments"
allEmails=()
# Query first 10 posts
for postId in {1..10};
do
  # Make API call to fetch emails of this posts's commenters
  response=$(curl "${endpoint}?postId=${postId}")
  # Use jq to parse the JSON response into an array
  allEmails+=( $( jq '.[].email' <<< "$response" ) )
done
```
Note here that I'm using the [`jq` tool][4] to parse JSON from the command line. The syntax of `jq` is beyond the scope of this article, but I highly recommend you look into it.
As you might imagine, there are countless other scenarios in which using Bash arrays can help, and I hope the examples outlined in this article have given you some food for thought. If you have other examples to share from your own work, please leave a comment below.
### But wait, there's more!
Since we covered quite a bit of array syntax in this article, here's a summary of what we covered, along with some more advanced tricks we did not cover:
Syntax Result `arr=()` Create an empty array `arr=(1 2 3)` Initialize array `${arr[2]}` Retrieve third element `${arr[@]}` Retrieve all elements `${!arr[@]}` Retrieve array indices `${#arr[@]}` Calculate array size `arr[0]=3` Overwrite 1st element `arr+=(4)` Append value(s) `str=$(ls)` Save `ls` output as a string `arr=( $(ls) )` Save `ls` output as an array of files `${arr[@]:s:n}` Retrieve elements at indices `n` to `s+n`
### One last thought
As we've discovered, Bash arrays sure have strange syntax, but I hope this article convinced you that they are extremely powerful. Once you get the hang of the syntax, you'll find yourself using Bash arrays quite often.
#### Bash or Python?
Which begs the question: When should you use Bash arrays instead of other scripting languages such as Python?
To me, it all boils down to dependencies—if you can solve the problem at hand using only calls to command-line tools, you might as well use Bash. But for times when your script is part of a larger Python project, you might as well use Python.
For example, we could have turned to Python to implement the parameter sweep, but we would have ended up just writing a wrapper around Bash:
```
import subprocess
all_threads = [1, 2, 4, 8, 16, 32, 64, 128]
all_runtimes = []
# Launch pipeline on each number of threads
for t in all_threads:
  cmd = './pipeline --threads {}'.format(t)
  # Use the subprocess module to fetch the return output
  p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
  output = p.communicate()[0]
  all_runtimes.append(output)
```
Since there's no getting around the command line in this example, using Bash directly is preferable.
#### Time for a shameless plug
If you enjoyed this article, there's more where that came from! [Register here to attend OSCON][5], where I'll be presenting the live-coding workshop [You Don't Know Bash][6] on July 17, 2018. No slides, no clickers—just you and me typing away at the command line, exploring the wondrous world of Bash.
This article originally appeared on [Medium][7] and is republished with permission.
--------------------------------------------------------------------------------
via: https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
作者:[Robert Aboukhalil][a]
选题:[lujun9972](https://github.com/lujun9972)
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://opensource.com/users/robertaboukhalil
[1]:https://opensource.com/article/17/7/bash-prompt-tips-and-tricks
[2]:https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
[3]:https://github.com/typicode/jsonplaceholder
[4]:https://stedolan.github.io/jq/
[5]:https://conferences.oreilly.com/oscon/oscon-or
[6]:https://conferences.oreilly.com/oscon/oscon-or/public/schedule/detail/67166
[7]:https://medium.com/@robaboukhalil/the-weird-wondrous-world-of-bash-arrays-a86e5adf2c69

View File

@ -0,0 +1,291 @@
你不知道的 Bash关于 Bash 数组的介绍
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/programming-code-keyboard-laptop.png?itok=pGfEfu2S)
尽管软件工程师常常使用命令行来进行各种开发,但命令行中的数组似乎总是一个模糊的东西(虽然没有正则操作符 `=~` 那么复杂隐晦)。除开隐晦和有疑问的语法,[Bash][1] 数组其实是非常有用的。
### 稍等,这是为什么?
写 Bash 相关的东西很难,但如果是写一篇像手册那样注重怪异语法的文章,就会非常简单。不过请放心,这篇文章的目的就是让你不用去读该死的使用手册。
#### 真实(通常是有用的)示例
为了这个目的,想象一下真实世界的场景以及 Bash 是怎么帮忙的:你正在公司里面主导一个新工作,评估并优化内部数据管线的运行时间。首先,你要做个参数扫描分析来评估管线使用线程的状况。简单起见,我们把这个管道当作一个编译好的 C++ 黑盒子,这里面我们能够调整的唯一的参数是用于处理数据的线程数量:`./pipeline --threads 4`。
### 基础
我们将要测试的 `--threads` 参数:
```
allThreads=(1 2 4 8 16 32 64 128)
```
我们首先要做的事是定义一个数组,用来容纳我们想要测试的参数:
本例中所有元素都是数字但参数并不一定是数字Bash 中的 数组可以容纳数字和字符串,比如 `myArray=(1 2 "three" 4 "five")` 就是个有效的表达式。就像 Bash 中其它的变量一样,确保赋值符号两边没有空格。否则 Bash 将会把变量名当作程序来执行,把 `=` 当作程序的第一个参数。
现在我们初始化了数组,让我们解析它其中的一些元素。仅仅输入 `echo $allThreads` ,你能发现,它只会输出第一个元素。
要理解这个产生的原因,需要回到上一步,回顾我们一般是怎么在 Bash 中输出 变量。考虑以下场景:
```
type="article"
echo "Found 42 $type"
```
假如我们得到的变量 `$type` 是一个单词,我们想要添加在句子结尾一个 `s`。我们无法直接把 `s` 加到 `$type` 里面,因为这会把它变成另一个变量,`$types`。尽管我们可以利用像 `echo "Found 42 "$type"s"` 这样的代码形变,但解决这个问题的最好方法是用一个花括号:`echo "Found 42 ${type}s"`,这让我们能够告诉 Bash 变量名的起止位置有趣的是JavaScript/ES6 在 [template literals][2] 中注入变量和表达式的语法和这里是一样的)
事实上,尽管 Bash 变量一般不用花括号,但在数组中需要用到花括号。这反而允许我们指定要访问的索引,例如 `echo ${allThreads[1]}` 返回的是数组中的第二个元素。如果不写花括号,比如 `echo $allThreads[1]`,会导致 Bash 把 `[1]` 当作字符串然后输出。
是的Bash 数组的语法很怪,但是至少他们是从 0 开始索引的,不像有些语言(说的就是你,`R` 语言)。
### 遍历数组
上面的例子中我们直接用整数作为数组的索引,我们现在考虑两种其他情况:第一,如果想要数组中的第 `$i` 个元素,这里 `$i` 是一个代表索引的变量,我们可以这样 `echo ${allThreads[$i]}` 解析这个元素。第二,要输出一个数组的所有元素,我们把数字索引换成 `@` 符号(你可以把 `@` 当作表示 `all` 的符号):`echo ${allThreads[@]}`。
#### 遍历数组元素
记住上面讲过的,我们遍历 `$allThreads` 数组,把每个值当作 `--threads` 参数启动 pipeline
```
for t in ${allThreads[@]}; do
  ./pipeline --threads $t
done
```
#### 遍历数组索引
接下来,考虑一个稍稍不同的方法。不是遍历所有的数组元素,我们可以遍历所有的索引:
```
for i in ${!allThreads[@]}; do
  ./pipeline --threads ${allThreads[$i]}
done
```
一步一步看:如之前所见,`${allThreads[@]}` 表示数组中的所有元素。前面加了个感叹号,变成 `${!allThreads[@]}`,这会返回数组索引列表(这里是 0 到 7。换句话说。`for` 循环就遍历所有的索引 `$i` 并从 `$allThreads` 中读取第 `$i` 个元素,当作 `--threads` 选项的参数。
这看上去很辣眼睛,你可能奇怪为什么我要一开始就讲这个。这是因为有时候在循环中需要同时获得索引和对应的值,例如,如果你想要忽视数组中的第一个元素,使用索引避免创建要在循环中累加的额外变量。
### 填充数组
到目前为止,我们已经能够用给定的 `--threads` 选项启动 pipeline 了。现在假设按秒计时的运行时间输出到 pipeline。我们想要捕捉每个迭代的输出然后把它保存在另一个数组中因此我们最终可以随心所欲的操作它。
#### 一些有用的语法
在深入代码前,我们要多介绍一些语法。首先,我们要能解析 Bash 命令的输出。用这个语法可以做到:`output=$( ./my_script.sh )`,这会把命令的输出存储到变量 `$output` 中。
我们需要的第二个语法是如何把我们刚刚解析的值添加到数组中。完成这个任务的语法看起来很熟悉:
```
myArray+=( "newElement1" "newElement2" )
```
#### 参数扫描
万事具备,执行参数扫描的脚步如下:
```
allThreads=(1 2 4 8 16 32 64 128)
allRuntimes=()
for t in ${allThreads[@]}; do
  runtime=$(./pipeline --threads $t)
  allRuntimes+=( $runtime )
done
```
就是这个了!
### 还有什么能做的?
这篇文章中,我们讲过使用数组进行参数扫描的场景。我担保有很多理由要使用 Bash 数组,这里就有两个例子:
#### 日志警告
本场景中,把应用分成几个模块,每一个都有它自己的日志文件。我们可以编写一个 cron 任务脚本,当某个模块中出现问题标志时向特定的人发送邮件:
```
# 日志列表,发生问题时应该通知的人
logPaths=("api.log" "auth.log" "jenkins.log" "data.log")
logEmails=("jay@email" "emma@email" "jon@email" "sophia@email")
# 在每个日志中查找问题标志
for i in ${!logPaths[@]};
do
  log=${logPaths[$i]}
  stakeholder=${logEmails[$i]}
  numErrors=$( tail -n 100 "$log" | grep "ERROR" | wc -l )
# 如果近期发现超过 5 个错误,就警告负责人
  if [[ "$numErrors" -gt 5 ]];
  then
    emailRecipient="$stakeholder"
    emailSubject="WARNING: ${log} showing unusual levels of errors"
    emailBody="${numErrors} errors found in log ${log}"
    echo "$emailBody" | mailx -s "$emailSubject" "$emailRecipient"
  fi
done
```
#### API 查询
如果你想要生成一些分析数据,分析你的 Medium 帖子中用户评论最多的。由于我们无法直接访问数据库,毫无疑问要用 SQL但我们可以用 APIs
为了避免陷入关于 API 授权和令牌的冗长讨论,我们将会使用 [JSONPlaceholder][3] 作为我们的目的,这是一个面向公众的测试服务 API。一旦我们查询每个帖子解析出评论者的邮箱我们就可以把这些邮箱添加到我们的结果数组里
```
endpoint="https://jsonplaceholder.typicode.com/comments"
allEmails=()
# 查询前 10 个帖子
for postId in {1..10};
do
# 执行 API 调用,获取该帖子评论者的邮箱
  response=$(curl "${endpoint}?postId=${postId}")
# 使用 jq 把 JSON 响应解析成数组
  allEmails+=( $( jq '.[].email' <<< "$response" ) )
done
```
注意这里我是用 [`jq` 工具][4] 从命令行里解析 JSON 数据。关于 `jq` 的语法超出了本文的范围,但我强烈建议你了解它。
你可能已经想到,使用 Bash 数组在数不胜数的场景中很有帮助,我希望这篇文章中的示例可以给你思维的启发。如果你从自己的工作中找到其它的例子想要分享出来,请在帖子下方评论。
### 请等等,还有很多东西!
由于我们在本文讲了很多数组语法,这里是关于我们讲到内容的总结,包含一些还没讲到的高级技巧:
| 语法 | 效果 |
|:--|:--|
| `arr=()` | 创建一个空数组 |
| `arr=(1 2 3)` | 初始化数组 |
| `${arr[2]}` | 解析第三个元素 |
| `${arr[@]}` | 解析所有元素 |
| `${!arr[@]}` | 解析数组索引 |
| `${#arr[@]}` | 计算数组长度 |
| `arr[0]=3` | 重写第 1 个元素 |
| `arr+=(4)` | 添加值 |
| `str=$(ls)` | 把 `ls` 输出保存到字符串 |
| `arr=( $(ls) )` | 把 `ls` 输出的文件保存到数组里 |
| `${arr[@]:s:n}` | 解析索引在 `n``s+n` 之间的元素|
>译者注: `${arr[@]:s:n}` 应该是解析索引在 `s``s+n-1` 之间的元素
### 最后一点思考
正如我们所见Bash 数组的语法很奇怪,但我希望这篇文章让你相信它们很有用。只要你理解了这些语法,你会发现以后会经常使用 Bash 数组。
#### Bash 还是 Python
问题来了:什么时候该用 Bash 数组而不是其他的脚本语法,比如 Python
对我而言,完全取决于需求——如果你可以只需要调用命令行工具就能立马解决问题,你也可以用 Bash。但有些时候当你的脚本属于一个更大的 Python 项目时,你也可以用 Python。
比如,我们可以用 Python 来实现参数扫描,但我们只用编写一个 Bash 的包装:
```
import subprocess
all_threads = [1, 2, 4, 8, 16, 32, 64, 128]
all_runtimes = []
# 用不同的线程数字启动 pipeline
for t in all_threads:
  cmd = './pipeline --threads {}'.format(t)
# 使用子线程模块获得返回的输出
  p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True)
  output = p.communicate()[0]
  all_runtimes.append(output)
```
由于本例中没法避免使用命令行,所以可以优先使用 Bash。
#### 羞耻的宣传时间
如果你喜欢这篇文章,这里还有很多类似的文章! [在此注册,加入 OSCON][5]2018 年 7 月 17 号我会在这做一个主题为 [你不知道的 Bash][6] 的在线编码研讨会。没有幻灯片,不需要门票,只有你和我在命令行里面敲代码,探索 Bash 中的奇妙世界。
本文章由 [Medium] 首发,再发布时已获得授权。
--------------------------------------------------------------------------------
via: https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
作者:[Robert Aboukhalil][a]
选题:[lujun9972](https://github.com/lujun9972)
译者:[BriFuture](https://github.com/BriFuture)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://opensource.com/users/robertaboukhalil
[1]:https://opensource.com/article/17/7/bash-prompt-tips-and-tricks
[2]:https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
[3]:https://github.com/typicode/jsonplaceholder
[4]:https://stedolan.github.io/jq/
[5]:https://conferences.oreilly.com/oscon/oscon-or
[6]:https://conferences.oreilly.com/oscon/oscon-or/public/schedule/detail/67166
[7]:https://medium.com/@robaboukhalil/the-weird-wondrous-world-of-bash-arrays-a86e5adf2c69