mirror of
https://github.com/LCTT/TranslateProject.git
synced 2025-01-13 22:30:37 +08:00
Update and rename sources/tech/20210625 Use Python to parse configuration files.md to translated/tech/20210625 Use Python to parse configuration files.md
This commit is contained in:
parent
5e6b03e34e
commit
1f733eef89
@@ -1,205 +0,0 @@
[#]: subject: (Use Python to parse configuration files)
[#]: via: (https://opensource.com/article/21/6/parse-configuration-files-python)
[#]: author: (Moshe Zadka https://opensource.com/users/moshez)
[#]: collector: (lujun9972)
[#]: translator: (zepoch)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )

Use Python to parse configuration files
======
The first step is choosing a configuration format: INI, JSON, YAML, or TOML.

![Python programming language logo with question marks][1]

Sometimes, a program needs so many parameters that passing them all as command-line arguments or environment variables is neither pleasant nor feasible. In those cases, you will want to use a configuration file.

There are several popular formats for configuration files. Among them are the venerable (although occasionally under-defined) `INI` format, the popular but sometimes hard-to-write-by-hand `JSON` format, the extensive yet occasionally surprising `YAML` format, and the newest addition, `TOML`, which many people have not heard of yet.

Your first task is to choose a format and then to document that choice. With this easy part out of the way, it is time to parse the configuration.

It is sometimes a good idea to have a class that corresponds to the "abstract" data in the configuration. Because the code here does nothing else with the configuration, building such a class is the simplest way to show the parsing logic.

Imagine the configuration for a file processor: it includes an input directory, an output directory, and which files to pick up.

The abstract definition for the configuration class might look something like:

```
from __future__ import annotations

# List is referenced in the annotations below.
from typing import List

import attr

@attr.frozen
class Configuration:
    @attr.frozen
    class Files:
        input_dir: str
        output_dir: str
    files: Files
    @attr.frozen
    class Parameters:
        patterns: List[str]
    parameters: Parameters
```

To make the format-specific code simpler, you will also write a function to parse this class out of dictionaries. Note that the function assumes the configuration keys use dashes (`input-dir`), while the Python attributes use underscores (`input_dir`). This kind of discrepancy is not uncommon.

```
def configuration_from_dict(details):
    files = Configuration.Files(
        input_dir=details["files"]["input-dir"],
        output_dir=details["files"]["output-dir"],
    )
    parameters = Configuration.Parameters(
        patterns=details["parameters"]["patterns"]
    )
    return Configuration(
        files=files,
        parameters=parameters,
    )
```

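To see the dictionary-to-class step in action, here is a minimal sketch. It assumes a stand-in built with the standard library's `dataclasses` instead of `attrs`, and flattens the nested classes for brevity, so it runs without installing anything. The `example` dictionary is made up for illustration.

```python
from dataclasses import dataclass
from typing import List


# Stand-in for the attrs-based class above (an assumption of this sketch).
@dataclass(frozen=True)
class Files:
    input_dir: str
    output_dir: str


@dataclass(frozen=True)
class Parameters:
    patterns: List[str]


@dataclass(frozen=True)
class Configuration:
    files: Files
    parameters: Parameters


def configuration_from_dict(details):
    # The configuration keys use dashes; the Python attributes use underscores.
    files = Files(
        input_dir=details["files"]["input-dir"],
        output_dir=details["files"]["output-dir"],
    )
    parameters = Parameters(patterns=details["parameters"]["patterns"])
    return Configuration(files=files, parameters=parameters)


# Hypothetical input, shaped like the parsed output of any of the formats.
example = {
    "files": {"input-dir": "inputs", "output-dir": "outputs"},
    "parameters": {"patterns": ["*.txt", "*.md"]},
}
config = configuration_from_dict(example)
print(config.files.input_dir)
print(config.parameters.patterns)
```

Every format-specific parser then only has to produce a dictionary of this shape.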
### JSON

JSON (JavaScript Object Notation) is a JavaScript-like format.

Here is an example configuration in JSON format:

```
json_config = """
{
  "files": {
    "input-dir": "inputs",
    "output-dir": "outputs"
  },
  "parameters": {
    "patterns": [
      "*.txt",
      "*.md"
    ]
  }
}
"""
```

The parsing logic parses the JSON into Python's built-in data structures (dictionaries, lists, strings) using the `json` module and then creates the class from the dictionary:

```
import json

def configuration_from_json(data):
    parsed = json.loads(data)
    return configuration_from_dict(parsed)
```

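Because JSON is easy to get wrong when written by hand, it can help to report where parsing failed. The following is a sketch only: `parse_json_details` is a hypothetical helper, not part of the article's code. It relies on the `lineno`, `colno`, and `msg` attributes of `json.JSONDecodeError`.

```python
import json


def parse_json_details(text):
    """Parse JSON text, reporting where hand-written input went wrong."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        # lineno, colno and msg are attributes of json.JSONDecodeError.
        raise ValueError(
            f"invalid JSON at line {error.lineno}, "
            f"column {error.colno}: {error.msg}"
        ) from error


good = parse_json_details('{"files": {"input-dir": "inputs"}}')

# A trailing comma is a common slip when writing JSON by hand.
try:
    parse_json_details('{"patterns": ["*.txt", "*.md",]}')
except ValueError as error:
    message = str(error)
    print(message)
```

The exact wording of the parser's message varies between Python versions, so only the line and column are worth relying on.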
### INI

The INI format, originally popular on Windows, became a de facto configuration standard.

Here is the same configuration as an INI file:

```
ini_config="""
[files]
input-dir = inputs
output-dir = outputs

[parameters]
patterns = ['*.txt', '*.md']
"""
```

Python can parse it using the built-in `configparser` module. The parser behaves as a `dict`-like object, but every value it returns is a string, so the `patterns` entry has to be converted into a list before the result is passed to `configuration_from_dict`:

```
import ast
import configparser

def configuration_from_ini(data):
    parser = configparser.ConfigParser()
    parser.read_string(data)
    # Every INI value is a string; convert the list literal explicitly.
    details = {name: dict(parser[name]) for name in parser.sections()}
    patterns = details["parameters"]["patterns"]
    details["parameters"]["patterns"] = ast.literal_eval(patterns)
    return configuration_from_dict(details)
```

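`configparser` hands back every value as a string, so the `patterns` option in the INI example arrives as the text `['*.txt', '*.md']` rather than as a list. One possible way to convert it is the `converters` argument of `ConfigParser`. This is a sketch under that assumption, not the article's method, and the choice of `ast.literal_eval` as the converter is also an assumption.

```python
import ast
import configparser

ini_config = """
[files]
input-dir = inputs
output-dir = outputs

[parameters]
patterns = ['*.txt', '*.md']
"""

# Registering a converter named "list" adds a getlist() method to the
# parser and to each section proxy.
parser = configparser.ConfigParser(converters={"list": ast.literal_eval})
parser.read_string(ini_config)

raw = parser["parameters"]["patterns"]  # still a string
patterns = parser["parameters"].getlist("patterns")  # a real list
print(type(raw).__name__, type(patterns).__name__)
```

`ast.literal_eval` only accepts Python literals, which makes it a safer choice than `eval` for values read from a file.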
### YAML

YAML (a recursive acronym for "YAML Ain't Markup Language"; originally "Yet Another Markup Language") is a superset of JSON that is designed to be easier to write by hand. It accomplishes this, in part, by having a long specification.

Here is the same configuration in YAML:

```
yaml_config = """
files:
  input-dir: inputs
  output-dir: outputs
parameters:
  patterns:
  - '*.txt'
  - '*.md'
"""
```

For Python to parse this, you will need to install a third-party module. The most popular is `PyYAML` (`pip install pyyaml`). The YAML parser also returns built-in Python data types that can be passed to `configuration_from_dict`. `yaml.safe_load` accepts either a string or an open file, so the configuration text can be passed in directly.

```
import yaml

def configuration_from_yaml(data):
    # safe_load accepts a string or an open file.
    parsed = yaml.safe_load(data)
    return configuration_from_dict(parsed)
```

### TOML

TOML (Tom's Obvious, Minimal Language) is designed to be a lightweight alternative to YAML. The specification is shorter, and it is already popular in some places (for example, Rust's package manager, Cargo, uses it for package configuration).

Here is the same configuration in TOML:

```
toml_config = """
[files]
input-dir = "inputs"
output-dir = "outputs"

[parameters]
patterns = [ "*.txt", "*.md",]
"""
```

In order to parse TOML, you need to install a third-party package. The most popular one is called, simply, `toml`. Like YAML and JSON, it returns basic Python data types.

```
import toml

def configuration_from_toml(data):
    parsed = toml.loads(data)
    return configuration_from_dict(parsed)
```

### Summary

Choosing a configuration format is a subtle tradeoff. However, once you make the decision, Python can parse most of the popular formats using a handful of lines of code.

--------------------------------------------------------------------------------

via: https://opensource.com/article/21/6/parse-configuration-files-python

Author: [Moshe Zadka][a]
Topic selection: [lujun9972][b]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/).

[a]: https://opensource.com/users/moshez
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/python_programming_question.png?itok=cOeJW-8r (Python programming language logo with question marks)
@@ -0,0 +1,205 @@
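[#]: subject: "Use Python to parse configuration files"
[#]: via: "https://opensource.com/article/21/6/parse-configuration-files-python"
[#]: author: "Moshe Zadka https://opensource.com/users/moshez"
[#]: collector: "lujun9972"
[#]: translator: "zepoch"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "

Use Python to parse configuration files
======
The first step is choosing a configuration file format: INI, JSON, YAML, or TOML.

![Python programming language logo with question marks][1]

Sometimes, a program needs so many parameters that passing them all as command-line arguments or environment variables is neither pleasant nor feasible. In those cases, you will want to use a configuration file.

There are several popular configuration file formats. Among them are the venerable (although occasionally under-defined) `INI` format, the popular but sometimes hard-to-write-by-hand `JSON` format, the extensive yet occasionally surprising `YAML` format, and the newest addition, `TOML`, which many people have not heard of yet.

Your first task is to choose a format and then to document that choice. With this easy part out of the way, it is time to parse the configuration.

It is sometimes a good idea to have a class that corresponds to the "abstract" data in the configuration. Because the code here does nothing else with the configuration, building such a class is the simplest way to show the parsing logic.

Imagine the configuration for a file processor: it includes an input directory, an output directory, and which files to pick up.

The abstract definition for the configuration class might look something like:
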
```
from __future__ import annotations

# List is referenced in the annotations below.
from typing import List

import attr

@attr.frozen
class Configuration:
    @attr.frozen
    class Files:
        input_dir: str
        output_dir: str
    files: Files
    @attr.frozen
    class Parameters:
        patterns: List[str]
    parameters: Parameters
```

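The classes above are frozen, meaning their fields cannot be reassigned once an instance exists. As a rough illustration of what that buys, here is a sketch that assumes the standard library's `dataclasses` as a stand-in for `attrs`. The `Files` class and its values are illustrative only.

```python
import dataclasses


# Stand-in for the frozen attrs class (an assumption of this sketch).
@dataclasses.dataclass(frozen=True)
class Files:
    input_dir: str
    output_dir: str


files = Files(input_dir="inputs", output_dir="outputs")

try:
    files.input_dir = "elsewhere"
    changed = True
except dataclasses.FrozenInstanceError:
    # Frozen instances reject attribute assignment.
    changed = False

# To "modify" a frozen object, build an updated copy instead.
moved = dataclasses.replace(files, output_dir="archive")
print(changed, moved.output_dir)
```

With `attrs`, `attr.evolve` plays the role that `dataclasses.replace` plays here. A parsed configuration that cannot change underneath the rest of the program is easier to reason about.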
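To make the format-specific code simpler, you will also write a function to parse this class out of dictionaries. Note that the function assumes the configuration keys use dashes (`input-dir`), while the Python attributes use underscores (`input_dir`). This kind of discrepancy is not uncommon.
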
```
def configuration_from_dict(details):
    files = Configuration.Files(
        input_dir=details["files"]["input-dir"],
        output_dir=details["files"]["output-dir"],
    )
    parameters = Configuration.Parameters(
        patterns=details["parameters"]["patterns"]
    )
    return Configuration(
        files=files,
        parameters=parameters,
    )
```

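Indexing with `details["files"]["input-dir"]` raises a bare `KeyError` when an entry is missing, for instance when someone writes `input_dir` with an underscore. One option is to check the parsed dictionary first. This is a sketch only: `REQUIRED_KEYS` and `missing_keys` are hypothetical names, not part of the article's code.

```python
# Hypothetical table of the entries the configuration must contain.
REQUIRED_KEYS = {
    "files": ("input-dir", "output-dir"),
    "parameters": ("patterns",),
}


def missing_keys(details):
    """Return dotted names of required entries absent from a parsed configuration."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        if section not in details:
            # The whole section is absent; report it once.
            missing.append(section)
            continue
        for key in keys:
            if key not in details[section]:
                missing.append(f"{section}.{key}")
    return missing


complete = {
    "files": {"input-dir": "inputs", "output-dir": "outputs"},
    "parameters": {"patterns": ["*.txt", "*.md"]},
}
incomplete = {"files": {"input_dir": "inputs"}}  # underscore instead of dash

print(missing_keys(complete))
print(missing_keys(incomplete))
```

Running such a check before building the class lets the program report every missing entry at once instead of stopping at the first one.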
### JSON

JSON (JavaScript Object Notation) is a JavaScript-like format.

Here is an example configuration in JSON format:

```
json_config = """
{
  "files": {
    "input-dir": "inputs",
    "output-dir": "outputs"
  },
  "parameters": {
    "patterns": [
      "*.txt",
      "*.md"
    ]
  }
}
"""
```

The parsing logic uses the `json` module to parse the JSON into Python's built-in data structures (dictionaries, lists, strings) and then creates the class from the dictionary:

```
import json

def configuration_from_json(data):
    parsed = json.loads(data)
    return configuration_from_dict(parsed)
```

### INI

The INI format, originally popular on Windows, became a de facto configuration standard.

Here is the same configuration as an INI file:

```
ini_config="""
[files]
input-dir = inputs
output-dir = outputs

[parameters]
patterns = ['*.txt', '*.md']
"""
```

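Python can parse it using the built-in `configparser` module. The parser behaves as a `dict`-like object, but every value it returns is a string, so the `patterns` entry has to be converted into a list before the result is passed to `configuration_from_dict`:

```
import ast
import configparser

def configuration_from_ini(data):
    parser = configparser.ConfigParser()
    parser.read_string(data)
    # Every INI value is a string; convert the list literal explicitly.
    details = {name: dict(parser[name]) for name in parser.sections()}
    patterns = details["parameters"]["patterns"]
    details["parameters"]["patterns"] = ast.literal_eval(patterns)
    return configuration_from_dict(details)
```
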
### YAML

YAML (a recursive acronym for "YAML Ain't Markup Language"; originally "Yet Another Markup Language") is a superset of JSON that is designed to be easier to write by hand. It accomplishes this, in part, by having a long specification.

Here is the same configuration in YAML:

```
yaml_config = """
files:
  input-dir: inputs
  output-dir: outputs
parameters:
  patterns:
  - '*.txt'
  - '*.md'
"""
```

For Python to parse this, you will need to install a third-party module. The most popular is `PyYAML` (`pip install pyyaml`). The YAML parser also returns built-in Python data types that can be passed to `configuration_from_dict`. `yaml.safe_load` accepts either a string or an open file, so the configuration text can be passed in directly.

```
import yaml

def configuration_from_yaml(data):
    # safe_load accepts a string or an open file.
    parsed = yaml.safe_load(data)
    return configuration_from_dict(parsed)
```

### TOML

TOML (Tom's Obvious, Minimal Language) is designed to be a lightweight alternative to YAML. The specification is shorter, and it is already popular in some places (for example, Rust's package manager, Cargo, uses it for package configuration).

Here is the same configuration in TOML:

```
toml_config = """
[files]
input-dir = "inputs"
output-dir = "outputs"

[parameters]
patterns = [ "*.txt", "*.md",]
"""
```

In order to parse TOML, you need to install a third-party package. The most popular one is called, simply, `toml`. Like YAML and JSON, it returns basic Python data types.

```
import toml

def configuration_from_toml(data):
    parsed = toml.loads(data)
    return configuration_from_dict(parsed)
```

### Summary

Choosing a configuration format is a subtle tradeoff. However, once you make the decision, Python can parse most of the popular formats using a handful of lines of code.

--------------------------------------------------------------------------------

via: https://opensource.com/article/21/6/parse-configuration-files-python

Author: [Moshe Zadka][a]
Topic selection: [lujun9972][b]
Translator: [zepoch](https://github.com/zepoch)
Proofreader: [Proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/).

[a]: https://opensource.com/users/moshez
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/python_programming_question.png?itok=cOeJW-8r "Python programming language logo with question marks"