Update and rename sources/tech/20210625 Use Python to parse configuration files.md to translated/tech/20210625 Use Python to parse configuration files.md

This commit is contained in:
zEpoch 2021-07-01 19:24:30 +08:00 committed by GitHub
parent 5e6b03e34e
commit 1f733eef89
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 205 additions and 205 deletions

View File

@ -1,205 +0,0 @@
[#]: subject: (Use Python to parse configuration files)
[#]: via: (https://opensource.com/article/21/6/parse-configuration-files-python)
[#]: author: (Moshe Zadka https://opensource.com/users/moshez)
[#]: collector: (lujun9972)
[#]: translator: (zepoch)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
Use Python to parse configuration files
======
The first step is choosing a configuration format: INI, JSON, YAML, or
TOML.
![Python programming language logo with question marks][1]
Sometimes, a program needs enough parameters that putting them all as command-line arguments or environment variables is not pleasant nor feasible. In those cases, you will want to use a configuration file.
There are several popular formats for configuration files. Among them are the venerable (although occasionally under-defined) `INI` format, the popular but sometimes hard to write by hand `JSON` format, the extensive yet occasionally surprising in details `YAML` format, and the newest addition, `TOML`, which many people have not heard of yet.
Your first task is to choose a format and then to document that choice. With this easy part out of the way, it is time to parse the configuration.
It is sometimes a good idea to have a class that corresponds to the "abstract" data in the configuration. Because this code will do nothing with the configuration, this is the simplest way to show parsing logic.
Imagine the configuration for a file processor: it includes an input directory, an output directory, and which files to pick up.
The abstract definition for the configuration class might look something like:
```
`from __future__ import annotations`[/code] [code]
import attr
@attr.frozen
class Configuration:
    @attr.frozen
    class Files:
        input_dir: str
        output_dir: str
    files: Files
    @attr.frozen
    class Parameters:
        patterns: List[str]
    parameters: Parameters
```
To make the format-specific code simpler, you will also write a function to parse this class out of dictionaries. Note that this assumes the configuration will use dashes, not underscores. This kind of discrepancy is not uncommon.
```
def configuration_from_dict(details):
    files = Configuration.Files(
        input_dir=details["files"]["input-dir"],
        output_dir=details["files"]["output-dir"],
    )
    parameters = Configuration.Paraneters(
        patterns=details["parameters"]["patterns"]
    )
    return Configuration(
        files=files,
        parameters=parameters,
    )
```
### JSON
JSON (JavaScript Object Notation) is a JavaScript-like format.
Here is an example configuration in JSON format:
```
json_config = """
{
    "files": {
        "input-dir": "inputs",
        "output-dir": "outputs"
    },
    "parameters": {
        "patterns": [
            "*.txt",
            "*.md"
        ]
    }
}
"""
```
The parsing logic parses the JSON into Python's built-in data structures (dictionaries, lists, strings) using the `json` module and then creates the class from the dictionary:
```
import json
def configuration_from_json(data):
    parsed = json.loads(data)
    return configuration_from_dict(parsed)
```
### INI
The INI format, originally popular on Windows, became a de facto configuration standard.
Here is the same configuration as an INI:
```
ini_config="""
[files]
input-dir = inputs
output-dir = outputs
[parameters]
patterns = ['*.txt', '*.md']
"""
```
Python can parse it using the built-in `configparser` module. The parser behaves as a `dict`-like object, so it can be passed directly to `configuration_from_dict`:
```
import configparser
def configuration_from_ini(data):
    parser = configparser.ConfigParser()
    parser.read_string(data)
    return configuration_from_dict(parser)
```
### YAML
YAML (Yet Another Markup Language) is an extension of JSON that is designed to be easier to write by hand. It accomplishes this, in part, by having a long specification.
Here is the same configuration in YAML:
```
yaml_config = """
files:
  input-dir: inputs
  output-dir: outputs
parameters:
  patterns:
  - '*.txt'
  - '*.md'
"""
```
For Python to parse this, you will need to install a third-party module. The most popular is `PyYAML` (`pip install pyyaml`). The YAML parser also returns built-in Python data types that can be passed to `configuration_from_dict`. However, the YAML parser expects a stream, so you need to convert the string into a stream.
```
import io
import yaml
def configuration_from_yaml(data):
    fp = io.StringIO(data)
    parsed = yaml.safe_load(fp)
    return configuration_from_dict(parsed)
```
### TOML
TOML (Tom's Own Markup Language) is designed to be a lightweight alternative to YAML. The specification is shorter, and it is already popular in some places (for example, Rust's package manager, Cargo, uses it for package configuration).
Here is the same configuration as a TOML:
```
toml_config = """
[files]
input-dir = "inputs"
output-dir = "outputs"
[parameters]
patterns = [ "*.txt", "*.md",]
"""
```
In order to parse TOML, you need to install a third-party package. The most popular one is called, simply, `toml`. Like YAML and JSON, it returns basic Python data types.
```
import toml
def configuration_from_toml(data):
    parsed = toml.loads(data)
    return configuration_from_dict(parsed)
```
### Summary
Choosing a configuration format is a subtle tradeoff. However, once you make the decision, Python can parse most of the popular formats using a handful of lines of code.
--------------------------------------------------------------------------------
via: https://opensource.com/article/21/6/parse-configuration-files-python
作者:[Moshe Zadka][a]
选题:[lujun9972][b]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/moshez
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/python_programming_question.png?itok=cOeJW-8r (Python programming language logo with question marks)

View File

@ -0,0 +1,205 @@
[#]: subject: "Use Python to parse configuration files"
[#]: via: "https://opensource.com/article/21/6/parse-configuration-files-python"
[#]: author: "Moshe Zadka https://opensource.com/users/moshez"
[#]: collector: "lujun9972"
[#]: translator: "zepoch"
[#]: reviewer: " "
[#]: publisher: " "
[#]: url: " "
使用Python解析配置文件
======
第一步是选择配置文件的格式INI、JSON、YAML 或 TOML。
![Python programming language logo with question marks][1]
有时,程序需要足够的参数,将它们全部作为命令行参数或环境变量既不让人愉快也不可行。 在这些情况下,您将需要使用配置文件。
有几种流行的配置文件格式。 其中包括古老的(虽然偶尔定义不足)`INI`格式,虽然流行但有时难以手写的`JSON`格式,广泛但偶尔在细节方面令人惊讶的`YAML`格式,以及最新添加的`TOML `,很多人还没有听说过。
您的首要任务是选择一种格式,然后记录该选择。 解决了这个简单的部分之后就是时候解析配置了。
有时,在配置中拥有一个与“抽象“数据相对应的类是一个不错的想法。 因为这段代码不会对配置做任何事情,所以这是展示解析逻辑最简单的方式。
想象一下文件处理器的配置:它包括一个输入目录、一个输出目录和要提取的文件。
配置类的抽象定义可能类似于:
```
`from __future__ import annotations`[/code] [code]
import attr
@attr.frozen
class Configuration:
@attr.frozen
class Files:
input_dir: str
output_dir: str
files: Files
@attr.frozen
class Parameters:
patterns: List[str]
parameters: Parameters
```
为了使特定于格式的代码更简单,您还将编写一个函数来从字典中解析此类。 请注意,这假设配置将使用破折号,而不是下划线。 这种差异并不少见。
```
def configuration_from_dict(details):
files = Configuration.Files(
input_dir=details["files"]["input-dir"],
output_dir=details["files"]["output-dir"],
)
parameters = Configuration.Paraneters(
patterns=details["parameters"]["patterns"]
)
return Configuration(
files=files,
parameters=parameters,
)
```
### JSON
JSONJavaScript Object Notation是一种类似于 JavaScript 的格式。
以下是 JSON 格式的示例配置:
```
json_config = """
{
"files": {
"input-dir": "inputs",
"output-dir": "outputs"
},
"parameters": {
"patterns": [
"*.txt",
"*.md"
]
}
}
"""
```
解析逻辑使用 `json` 模块将 JSON 解析为 Python 的内置数据结构(字典、列表、字符串),然后从字典中创建类:
```
import json
def configuration_from_json(data):
parsed = json.loads(data)
return configuration_from_dict(parsed)
```
### INI
INI 格式,最初只在 Windows 上流行,之后成为配置标准格式。
这是与 INI 相同的配置:
```
ini_config="""
[files]
input-dir = inputs
output-dir = outputs
[parameters]
patterns = ['*.txt', '*.md']
"""
```
Python 可以使用内置的 `configparser` 模块解析它。解析器充当类似 `dict` 的对象,因此可以直接传递给 `configuration_from_dict`
```
import configparser
def configuration_from_ini(data):
parser = configparser.ConfigParser()
parser.read_string(data)
return configuration_from_dict(parser)
```
### YAML
YAML (Yet Another Markup Language) 是 JSON 的扩展,旨在更易于手动编写。 部分 YAML 需要通过具有较长的规范来实现这一点。
以下是 YAML 中的相同配置:
```
yaml_config = """
files:
input-dir: inputs
output-dir: outputs
parameters:
patterns:
- '*.txt'
- '*.md'
"""
```
要让 Python 解析它,您需要安装第三方模块。 最受欢迎的是`PyYAML``pip install pyyaml`)。 YAML 解析器还返回可以传递给 `configuration_from_dict` 的内置 Python 数据类型。 但是YAML 解析器需要一个字节流,因此您需要将字符串转换为字节流。
```
import io
import yaml
def configuration_from_yaml(data):
fp = io.StringIO(data)
parsed = yaml.safe_load(fp)
return configuration_from_dict(parsed)
```
### TOML
TOMLTom's Own Markup Language旨在成为 YAML 的轻量级替代品。 规范比较短,已经在一些地方流行了(比如 Rust 的包管理器 Cargo 就用它来进行包配置)。
这是与 TOML 相同的配置:
```
toml_config = """
[files]
input-dir = "inputs"
output-dir = "outputs"
[parameters]
patterns = [ "*.txt", "*.md",]
"""
```
为了解析 TOML您需要安装第三方包。 最流行的一种被简单地称为 `toml`。 与 YAML 和 JSON 一样,它返回基本的 Python 数据类型。
```
import toml
def configuration_from_toml(data):
parsed = toml.loads(data)
return configuration_from_dict(parsed)
```
### Summary总结
选择配置格式是一种微妙的权衡。 但是一旦您做出决定Python 就可以使用少量代码来解析大多数流行的格式。
--------------------------------------------------------------------------------
via: https://opensource.com/article/21/6/parse-configuration-files-python
作者:[Moshe Zadka][a]
选题:[lujun9972][b]
译者:[zepoch](https://github.com/zepoch)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/moshez
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/python_programming_question.png?itok=cOeJW-8r "Python programming language logo with question marks"