mirror of
https://github.com/LCTT/TranslateProject.git
synced 2024-12-29 21:41:00 +08:00
Merge pull request #14893 from wxy/20190315-How-To-Parse-And-Pretty-Print-JSON-With-Linux-Commandline-Tools
TSL&PRF:20190315 How To Parse And Pretty Print JSON With Linux Commandline Tools
This commit is contained in:
commit
e7ae2cb0e1
@ -1,264 +0,0 @@
[#]: collector: "lujun9972"
[#]: translator: "wxy"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
[#]: subject: "How To Parse And Pretty Print JSON With Linux Commandline Tools"
[#]: via: "https://www.ostechnix.com/how-to-parse-and-pretty-print-json-with-linux-commandline-tools/"
[#]: author: "EDITOR https://www.ostechnix.com/author/editor/"

How To Parse And Pretty Print JSON With Linux Commandline Tools
======

**JSON** is a lightweight, language-independent data format that is easy to integrate with most programming languages and, when properly formatted, easy for humans to read. JSON stands for **J**ava**S**cript **O**bject **N**otation. It originated in JavaScript and was first used mainly to exchange data between servers and browsers, but it is now used in many fields, including embedded systems. Here we’re going to parse and pretty print JSON with command line tools on Linux, which is extremely useful for inspecting large JSON data or manipulating it in shell scripts.

### What is pretty printing?

JSON is designed to be reasonably human readable. In most cases, however, JSON data is stored on a single line, often without even a trailing newline character.

Obviously that’s not very convenient for reading and editing manually.

That’s where pretty printing is useful. The name is self-explanatory: the JSON text is re-formatted with line breaks and indentation so that it is more legible to humans. This is known as **JSON pretty printing**.

### Parse And Pretty Print JSON With Linux Commandline Tools

JSON data can be parsed with command line text processors like **awk**, **sed** and **grep**; JSON.awk, for instance, is an awk script written for exactly that. However, there are also dedicated tools for the same purpose:

1. **jq** or **jshon**, JSON parsers for the shell. Both are quite useful.

2. Shell scripts like **JSON.sh** or **jsonv.sh**, which parse JSON in the bash, zsh or dash shell.

3. **JSON.awk**, a JSON parser written as an awk script.

4. Python modules like **json.tool**.

5. **underscore-cli**, based on Node.js and JavaScript.

In this tutorial I’m focusing only on **jq**, a powerful JSON parser for the shell with advanced filtering and scripting capabilities.

### JSON pretty printing

JSON data may arrive as a single line that is nearly illegible to humans; pretty printing makes it readable.

**Example:** **jsonip.com** returns your external IP address in JSON format. Fetch it with **curl** or **wget**, like below.

```
$ wget -cq http://jsonip.com/ -O -
```

The actual data looks like this:

```
{"ip":"111.222.333.444","about":"/about","Pro!":"http://getjsonip.com"}
```

Now pretty print it with jq:

```
$ wget -cq http://jsonip.com/ -O - | jq '.'
```

After filtering the result with jq, it should look like this:

```
{
  "ip": "111.222.333.444",
  "about": "/about",
  "Pro!": "http://getjsonip.com"
}
```
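Because the live response from jsonip.com differs for every caller, the round trip is easier to try offline. The sketch below assumes jq is installed and uses a made-up one-line object (the address comes from a documentation-only range) in place of the real response. It pretty prints the value, then compacts it again with the `-c` option.

```shell
# Made-up single-line input standing in for the jsonip.com response.
compact='{"ip":"203.0.113.7","about":"/about"}'

# Pretty print: the '.' filter passes the input through unchanged and
# jq re-emits it with line breaks and two-space indentation.
pretty=$(printf '%s' "$compact" | jq '.')
printf '%s\n' "$pretty"

# Compact again: -c writes each JSON value on a single line.
roundtrip=$(printf '%s\n' "$pretty" | jq -c '.')
printf '%s\n' "$roundtrip"
```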
The same thing can be done with Python’s **json.tool** module. Here is an example:

```
$ cat anything.json | python -m json.tool
```
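json.tool also reads from stdin when no file name is given, and it can sort object keys. A small sketch, assuming a `python3` interpreter (3.5 or newer, for `--sort-keys`) is on the PATH; the input object is made up for illustration.

```shell
# Made-up input with keys deliberately out of alphabetical order.
unsorted='{"b":1,"a":[1,2]}'

# json.tool reads stdin when no file is given; --sort-keys orders object
# keys alphabetically, which helps when comparing two JSON documents.
sorted=$(printf '%s' "$unsorted" | python3 -m json.tool --sort-keys)
printf '%s\n' "$sorted"
```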
This Python-based solution should be fine for most users, but it’s of little use where Python is not pre-installed or cannot be installed, as on embedded systems.

However, the json.tool module has a distinct advantage: it’s cross-platform, so you can use it seamlessly on Windows, Linux or macOS.

### How to parse JSON with jq

First, you need to install jq. It’s already packaged by most GNU/Linux distributions, so install it with the respective package manager.

On Arch Linux:

```
$ sudo pacman -S jq
```

On Debian, Ubuntu, Linux Mint:

```
$ sudo apt-get install jq
```

On Fedora:

```
$ sudo dnf install jq
```

On openSUSE:

```
$ sudo zypper install jq
```

For other operating systems or platforms, see the [official installation instructions][1].

**Basic filters and identifiers of jq**

jq can read JSON data either from **stdin** or from a **file**. Use whichever suits the situation.

The single symbol **.** is the most basic filter, the identity filter: it passes its input through unchanged, so running jq with just **.** pretty prints the input JSON. Filters of the form **.foo**, which look up a key in an object, are called **object identifier-index** filters.

**Single quotes** – You don’t always have to use single quotes. But if you’re combining several filters in a single line, or the filter contains characters that are special to the shell, such as **|**, **[** or spaces, then you must use them.

**Double quotes** – You have to enclose any key containing special characters like **@**, **#** or **$** within double quotes, and wrap the whole filter in single quotes so the shell leaves those double quotes alone, like this: **jq '.foo."@bar"'**

**Raw data print** – If you need only the final parsed data, without the enclosing double quotes, use the -r flag with the jq command, like this: **jq -r .foo.bar**.

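The quoting rules and the -r flag can be tried on a small object. This is only a sketch: it assumes jq is installed, and the input is a made-up, shortened version of the jsonip.com response whose key `Pro!` contains a special character.

```shell
# Made-up input; the key "Pro!" contains a character that needs quoting.
json='{"ip":"203.0.113.7","Pro!":"http://getjsonip.com"}'

# The key is double-quoted inside the filter, and the whole filter is
# single-quoted so the shell passes the double quotes through to jq.
quoted=$(printf '%s' "$json" | jq '."Pro!"')

# -r prints the string itself, without the surrounding JSON quotes.
raw=$(printf '%s' "$json" | jq -r '."Pro!"')

printf '%s\n%s\n' "$quoted" "$raw"
```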
**Parsing specific data**

To filter out a specific part of JSON, you first need to look at the data hierarchy of the pretty printed JSON.

An example of JSON data, from Wikipedia:

```
{
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "phoneNumber": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "fax",
      "number": "646 555-4567"
    }
  ],
  "gender": {
    "type": "male"
  }
}
```
I’m going to use this JSON data as the example in this tutorial, saved as **sample.json**.

Let’s say I want to filter out the address from the sample.json file. The command would be:

```
$ jq .address sample.json
```

**Sample output:**

```
{
  "streetAddress": "21 2nd Street",
  "city": "New York",
  "state": "NY",
  "postalCode": "10021"
}
```

Again, let’s say I want just the postal code. Then I have to add another **object identifier-index**, i.e. another filter:

```
$ cat sample.json | jq .address.postalCode
```
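To see what the nested lookup returns, and what a misspelled key does, here is a minimal sketch. It assumes jq and `mktemp` are available and writes a reduced copy of the address object to a temporary file rather than relying on sample.json being present.

```shell
# Reduced copy of the address object from sample.json, in a temp file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{"address": {"city": "New York", "postalCode": "10021"}}
EOF

exact=$(jq '.address.postalCode' "$tmp")   # key spelled exactly as stored
wrong=$(jq '.address.postalcode' "$tmp")   # lowercase "c": no such key

# Prints "10021" (with JSON quotes) and then null.
printf '%s\n%s\n' "$exact" "$wrong"
rm -f "$tmp"
```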
Also note that **filters are case sensitive**: you have to use exactly the same string as the key to get meaningful output instead of null.

**Parsing elements from JSON array**

The elements of a JSON array are enclosed within square brackets, and arrays are quite versatile: they can hold values of any JSON type.

To parse elements from an array, you have to use the **[]** identifier along with the other object identifier-index filters.

In this sample JSON data, the phone numbers are stored inside an array. To get all the contents of this array, use just the empty brackets, like this example. The filter is single-quoted because square brackets are glob characters in the shell, and some shells, such as zsh with its default settings, reject the unquoted form.

```
$ jq '.phoneNumber[]' sample.json
```

If you want just the first element of the array, use the array index, which starts at 0: **[0]** for the first item, **[1]** for the next, and so on, incrementing by one each step.

```
$ jq '.phoneNumber[0]' sample.json
```
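Indexing can be combined with further filters. The following sketch, which assumes jq is installed, keeps only the phoneNumber array from the sample data in a shell variable, counts its elements, reads a field of the second element, and shows that an index past the end yields null rather than an error.

```shell
# Only the phoneNumber array from the sample data, kept on one line.
phones='{"phoneNumber":[{"type":"home","number":"212 555-1234"},{"type":"fax","number":"646 555-4567"}]}'

count=$(printf '%s' "$phones" | jq '.phoneNumber | length')      # number of elements
second=$(printf '%s' "$phones" | jq -r '.phoneNumber[1].number') # field of 2nd element
missing=$(printf '%s' "$phones" | jq '.phoneNumber[5]')          # index out of range

printf '%s\n%s\n%s\n' "$count" "$second" "$missing"
```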
**Scripting examples**

Let’s say I want only the number for home, not the entire JSON array. This is where scripting within the jq command comes in handy.

```
$ cat sample.json | jq -r '.phoneNumber[] | select(.type == "home") | .number'
```
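In a shell script the wanted type usually lives in a variable instead of being hard-coded. One way to pass it in is jq’s `--arg` option, sketched here under the assumption that jq is installed; the variable names `wanted` and `t` are arbitrary choices for this example.

```shell
# Only the phoneNumber array from the sample data, kept on one line.
phones='{"phoneNumber":[{"type":"home","number":"212 555-1234"},{"type":"fax","number":"646 555-4567"}]}'

wanted=fax

# --arg binds the shell value to the jq variable $t as a string, so the
# filter itself can stay inside single quotes.
number=$(printf '%s' "$phones" |
  jq -r --arg t "$wanted" '.phoneNumber[] | select(.type == $t) | .number')

printf '%s\n' "$number"
```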
Here I first pipe the result of one filter to another, then use the select function to keep only the elements of a particular type, and finally pipe that result to another filter that extracts the number.

Explaining every type of jq filter and scripting is beyond the scope and purpose of this tutorial. Reading the jq manual is highly recommended for a better understanding.

--------------------------------------------------------------------------------

via: https://www.ostechnix.com/how-to-parse-and-pretty-print-json-with-linux-commandline-tools/

Author: [EDITOR][a]
Topic selection: [lujun9972][b]
Translator: [translator ID](https://github.com/译者ID)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]: https://www.ostechnix.com/author/editor/
[b]: https://github.com/lujun9972
[1]: https://stedolan.github.io/jq/download/
@ -0,0 +1,228 @@
[#]: collector: "lujun9972"
[#]: translator: "wxy"
[#]: reviewer: "wxy"
[#]: publisher: " "
[#]: url: " "
[#]: subject: "How To Parse And Pretty Print JSON With Linux Commandline Tools"
[#]: via: "https://www.ostechnix.com/how-to-parse-and-pretty-print-json-with-linux-commandline-tools/"
[#]: author: "EDITOR https://www.ostechnix.com/author/editor/"

How to Parse and Pretty Print JSON with Linux Command Line Tools
======

![](https://www.ostechnix.com/wp-content/uploads/2019/03/json-720x340.png)

JSON is a lightweight, language-independent data storage format that is easy to integrate with most programming languages and also easy for humans to understand, provided, of course, that it is properly formatted. The word JSON stands for **J**ava**S**cript **O**bject **N**otation. Although the name starts with JavaScript and the format is mainly used to exchange data between servers and browsers, it is now used in many fields, including embedded systems. Here we will parse and pretty print JSON with command line tools on Linux, which is very useful for handling large JSON data in shell scripts.

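### What is pretty printing?

JSON data is structured to be fairly human readable. In most cases, however, JSON data is stored on a single line, without even a line-ending character.

Obviously, that is not very convenient for reading and editing by hand.

This is where pretty printing comes in. The name speaks for itself: it re-formats the JSON text so that it is clearer for people to read. This is known as **JSON pretty printing**.
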
### Parse and pretty print JSON with Linux command line tools

JSON data can be parsed with command line text processors such as `awk`, `sed` and `grep`. In fact, `JSON.awk` is an awk script that does just that. However, there are also dedicated tools for the same purpose.

1. `jq` or `jshon`, JSON parsers for the shell; both are very useful.
2. Shell scripts such as `JSON.sh` or `jsonv.sh`, for parsing JSON in the bash, zsh or dash shell.
3. `JSON.awk`, a JSON parser written as an awk script.
4. Python modules such as `json.tool`.
5. `underscore-cli`, based on Node.js and JavaScript.

In this tutorial I focus only on `jq`, a very powerful JSON parser for the shell with advanced filtering and scripting capabilities.

### JSON pretty printing

JSON data may sit on a single line, which makes it hard for people to read; JSON pretty printing exists to make it reasonably readable.

**Example:** data from `jsonip.com`. Use the `curl` or `wget` tool to get your external IP address in JSON format, as shown below.

```
$ wget -cq http://jsonip.com/ -O -
```

The actual data looks something like this:

```
{"ip":"111.222.333.444","about":"/about","Pro!":"http://getjsonip.com"}
```

Now pretty print it with `jq`:

```
$ wget -cq http://jsonip.com/ -O - | jq '.'
```

After the result has been filtered through `jq`, it should look something like this:

```
{
  "ip": "111.222.333.444",
  "about": "/about",
  "Pro!": "http://getjsonip.com"
}
```
The same can be done with the Python `json.tool` module. Here is an example:

```
$ cat anything.json | python -m json.tool
```

This Python-based solution should be fine for most users, but it does not work where Python is not pre-installed or cannot be installed, for example on embedded systems.

However, the `json.tool` Python module has a clear advantage: it is cross-platform, so you can use it seamlessly on Windows, Linux or macOS.

### How to parse JSON with jq

First, you need to install `jq`. It is already packaged by most GNU/Linux distributions, so install it with the respective package manager command.

On Arch Linux:

```
$ sudo pacman -S jq
```

On Debian, Ubuntu, Linux Mint:

```
$ sudo apt-get install jq
```

On Fedora:

```
$ sudo dnf install jq
```

On openSUSE:

```
$ sudo zypper install jq
```

For other operating systems or platforms, see the [official installation instructions][1].

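#### Basic filters and identifiers of jq

`jq` can read JSON data from `STDIN` or from a file. Use whichever suits the situation.

The single symbol `.` is the most basic filter, the identity filter: it passes its input through unchanged, so running `jq` with just `.` essentially pretty prints the input JSON file. Filters of the form `.foo`, which look up a key in an object, are called **object identifier-index** filters.

- **Single quotes**: you do not always have to use single quotes. But if you combine several filters on one line, or the filter contains characters that are special to the shell, you must use them.
- **Double quotes**: you must enclose any key containing special characters such as `@`, `#` or `$` in double quotes, with the whole filter in single quotes, for example `jq '.foo."@bar"'`.
- **Raw data print**: if you need only the final parsed data, not enclosed in double quotes, use the `jq` command with the `-r` flag, like this: `jq -r .foo.bar`.
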
#### Parsing specific data

To filter out a specific part of the JSON, you need to understand the data hierarchy of the pretty printed JSON file.

An example of JSON data, from Wikipedia:

```
{
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "phoneNumber": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "fax",
      "number": "646 555-4567"
    }
  ],
  "gender": {
    "type": "male"
  }
}
```
I will use this JSON data as the example in this tutorial, saved as `sample.json`.

Suppose I want to filter out the address from the `sample.json` file. The command would be:

```
$ jq .address sample.json
```

Sample output:

```
{
  "streetAddress": "21 2nd Street",
  "city": "New York",
  "state": "NY",
  "postalCode": "10021"
}
```

Next, suppose I want the postal code; then I have to add another **object identifier-index**, that is, another filter.

```
$ cat sample.json | jq .address.postalCode
```
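Inside a shell script it is often more useful to know whether a key exists at all than to print its value. A possible approach, sketched here on the assumption that jq is installed, is the `-e` option, which makes jq exit with a non-zero status when the last output is null or false. The one-line input is a reduced, made-up copy of the address object.

```shell
# Reduced copy of the address object from sample.json.
doc='{"address":{"postalCode":"10021"}}'

# -e sets a non-zero exit status when the last output is null or false,
# so a misspelled key can be detected by the calling script.
if printf '%s' "$doc" | jq -e '.address.postalCode' >/dev/null; then
  found=yes
else
  found=no
fi

if printf '%s' "$doc" | jq -e '.address.postalcode' >/dev/null; then
  typo=yes
else
  typo=no
fi

printf 'postalCode: %s, postalcode: %s\n' "$found" "$typo"
```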
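Also note that **filters are case sensitive**: you must use exactly the same string to get meaningful output, otherwise you get null.

#### Parsing elements from a JSON array

The elements of a JSON array are enclosed in square brackets, and arrays are undoubtedly very versatile.

To parse elements from an array, you have to use the `[]` identifier together with other object identifier-index filters.

In this sample JSON data the phone numbers are stored in an array. To get everything in the array, just use the brackets, as in this example:

```
$ jq '.phoneNumber[]' sample.json
```

If you only want the first element of the array, use the array index, which starts at `0`: `[0]` for the first item, increasing by 1 for each following item.

```
$ jq '.phoneNumber[0]' sample.json
```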
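The empty brackets can also be followed directly by a field name to pull one field out of every element. A short sketch, assuming jq is installed and using only the phoneNumber array from the sample data:

```shell
# Only the phoneNumber array from the sample data, kept on one line.
phones='{"phoneNumber":[{"type":"home","number":"212 555-1234"},{"type":"fax","number":"646 555-4567"}]}'

# [] iterates over every element; .number is then applied to each one,
# and -r prints the strings without JSON quotes, one per line.
numbers=$(printf '%s' "$phones" | jq -r '.phoneNumber[].number')
printf '%s\n' "$numbers"
```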

#### Scripting examples

Suppose I want only the home phone number rather than the whole JSON array. This is where scripting within the `jq` command comes in handy.

```
$ cat sample.json | jq -r '.phoneNumber[] | select(.type == "home") | .number'
```
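Filters can also build new strings from several fields. The sketch below, which assumes jq is installed, uses jq’s string interpolation syntax `\(...)` to print one labelled line per phone number; the `type: number` layout is just an illustrative choice.

```shell
# Only the phoneNumber array from the sample data, kept on one line.
phones='{"phoneNumber":[{"type":"home","number":"212 555-1234"},{"type":"fax","number":"646 555-4567"}]}'

# Inside a jq string, \(expr) is replaced by the value of expr; the
# single quotes keep the shell from interpreting the backslashes.
labels=$(printf '%s' "$phones" | jq -r '.phoneNumber[] | "\(.type): \(.number)"')
printf '%s\n' "$labels"
```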
First, I pipe the result of one filter to another, then use `select` to pick out a particular type of data, and pipe the result to yet another filter.

Explaining every kind of `jq` filter and scripting is beyond the scope and purpose of this tutorial. Reading the `jq` manual, listed below, is highly recommended for a better understanding.

Resources:

- https://stedolan.github.io/jq/manual/
- http://www.compciv.org/recipes/cli/jq-for-parsing-json/
- https://lzone.de/cheat-sheet/jq

--------------------------------------------------------------------------------

via: https://www.ostechnix.com/how-to-parse-and-pretty-print-json-with-linux-commandline-tools/

Author: [EDITOR][a]
Topic selection: [lujun9972][b]
Translator: [wxy](https://github.com/wxy)
Proofreader: [wxy](https://github.com/wxy)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]: https://www.ostechnix.com/author/editor/
[b]: https://github.com/lujun9972
[1]: https://stedolan.github.io/jq/download/