Merge pull request #26268 from hanszhao80/master

[Submit translation][tech]20210101 Djinn- A Code Generator and Templating Language Inspired by Jinja2
六开箱 2022-06-29 10:36:42 +08:00 committed by GitHub
commit 93ad76aebc
2 changed files with 265 additions and 261 deletions


@@ -1,261 +0,0 @@
[#]: collector: (lujun9972)
[#]: translator: (hanszhao80)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Djinn: A Code Generator and Templating Language Inspired by Jinja2)
[#]: via: (https://theartofmachinery.com/2021/01/01/djinn.html)
[#]: author: (Simon Arneaud https://theartofmachinery.com)
Djinn: A Code Generator and Templating Language Inspired by Jinja2
======
Code generators can be useful tools. I sometimes use the command line version of [Jinja2][1] to generate highly redundant config files and other text files, but it's feature-limited for transforming data. Obviously the author of Jinja2 thinks differently, but I wanted something like list comprehensions or D's composable range algorithms.
I decided to make a tool that's like Jinja2, but lets me generate complex files by transforming data with range algorithms. The idea was dead simple: a templating language that gets rewritten directly to D code. That way it supports everything D does, simply because it _is_ D. I wanted a standalone code generator, but thanks to [D's `mixin` feature][2], the same templating language works as an embedded templating language (for HTML in a web app, for example). (For more on that trick, see [this post about translating Brainfuck to D to machine code all at compile time using `mixin`s][3].)
As usual, [it's on GitLab][4]. [The examples in this post can be found there, too.][5]
### Hello world example
Here's an example to demonstrate the idea:
```
Hello [= retro("dlrow") ]!
[: enum one = 1; :]
1 + 1 = [= one + one ]
```
`[= some_expression ]` is like `{{ some_expression }}` in Jinja2, and it renders a value to the output. `[: some_statement; :]` is like `{% some_statement %}` and causes full code statements to be executed. I changed the syntax because D also uses curly braces a lot, and mixing the two made templates hard to read. (There are also special non-D directives, like `include`, that get wrapped in `[<` and `>]`.)
If you save the above to a file called `hello.txt.dj` and run the `djinn` command line tool against it, you'll get a file called `hello.txt` containing what you might guess:
```
Hello world!
1 + 1 = 2
```
If you've used Jinja2, you might be wondering what happened to the second line. Djinn has a special rule that simplifies formatting and whitespace handling: if a source line contains `[:` statements or `[<` directives but doesn't contain any non-whitespace output, the whole line is ignored for output purposes. Blank lines are still rendered.
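To make the mapping concrete, here is a hand-written D program that builds the same two lines. It is only a sketch, assuming each `[= ... ]` turns into a formatted write and each `[: ... :]` is pasted through as a statement; the function name `render` is made up for illustration, and the code Djinn actually generates may look quite different.

```d
import std.format : format;
import std.range : retro;
import std.stdio : write;

// Hypothetical hand-written equivalent of hello.txt.dj.
string render()
{
    string output;

    // Literal text plus the value of `[= retro("dlrow") ]`.
    output ~= format("Hello %s!\n", retro("dlrow"));

    // `[: enum one = 1; :]` is a statement: it takes effect but writes
    // nothing, which is why its source line leaves no trace in the output.
    enum one = 1;

    // Literal text plus the value of `[= one + one ]`.
    output ~= format("1 + 1 = %s\n", one + one);
    return output;
}

void main()
{
    write(render());
}
```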
### Generating data
Okay, now for something a bit more practical: generating CSV data.
```
x,f(x)
[: import std.mathspecial;
foreach (x; iota(-1.0, 1.0, 0.1)) :]
[= "%0.1f,%g", x, normalDistribution(x) ]
```
A `[=` and `]` pair can contain multiple expressions separated by commas. If the first expression is a double-quoted string, it's interpreted as a [format string][6]. Here's the output:
```
x,f(x)
-1.0,0.158655
-0.9,0.18406
-0.8,0.211855
-0.7,0.241964
-0.6,0.274253
-0.5,0.308538
-0.4,0.344578
-0.3,0.382089
-0.2,0.42074
-0.1,0.460172
0.0,0.5
0.1,0.539828
0.2,0.57926
0.3,0.617911
0.4,0.655422
0.5,0.691462
0.6,0.725747
0.7,0.758036
0.8,0.788145
0.9,0.81594
```
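For comparison, roughly the same table can be produced by a plain D program. This is a sketch rather than Djinn's generated code: the helper name `csvRow` is an assumption, but the format string and the library calls (`iota`, `normalDistribution`) are the ones used in the template.

```d
import std.format : format;
import std.mathspecial : normalDistribution;
import std.range : iota;
import std.stdio : writeln;

// One CSV row, built with the same format string as the template.
string csvRow(double x)
{
    return format("%0.1f,%g", x, normalDistribution(x));
}

void main()
{
    // Header line, which is literal text in the template.
    writeln("x,f(x)");

    // The body of the `foreach` in the template is the line that follows it.
    foreach (x; iota(-1.0, 1.0, 0.1))
        writeln(csvRow(x));
}
```

The template version is shorter mainly because literal text needs no `writeln` calls and no quoting.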
### Making an image
This example is just for the heck of it. [The classic netpbm image library defined a bunch of image formats][7], some of which are text-based. For example, here's an image of a 3x3 cross:
```
P2 # identifier for Portable GrayMap
3 3 # width and height
7 # value for pure white (0 is black)
7 0 7
0 0 0
7 0 7
```
You can save the above text to a file named something like `cross.pgm` and many image tools will understand it. Here's some Djinn code that generates a [Mandelbrot set][8] fractal in the same format:
```
[:
import std.complex;
enum W = 640;
enum H = 480;
enum kMaxIter = 20;
ubyte mb(uint x, uint y)
{
const c = complex(3.0 * (x - W / 1.5) / W, 2.0 * (y - H / 2.0) / H);
auto z = complex(0.0);
ubyte ret = kMaxIter;
while (abs(z) <= 2 && --ret) z = z * z + c;
return ret;
}
:]
P2
[= W ] [= H ]
[= kMaxIter ]
[: foreach (y; 0..H) :]
[= "%(%s %)", iota(W).map!(x => mb(x, y)) ]
```
The resulting file is about 800kB, but it compresses nicely as a PNG:
```
$ # Converting with GraphicsMagick
$ gm convert mandelbrot.pgm mandelbrot.png
```
And here it is:
![][9]
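The quoted file size is consistent with a rough count. `mb` returns values from 0 to 19, so every pixel is written as one or two digits, and I assume exactly one separator byte (a space or a newline) follows each pixel. Ignoring the short header, that gives two to three bytes per pixel:

\[ 640 \times 480 = 307{,}200 \text{ pixels}, \qquad 2 \times 307{,}200 \approx 614\,\text{kB} \;\le\; \text{file size} \;\le\; 3 \times 307{,}200 \approx 922\,\text{kB} \]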
### Solving a puzzle
Here's a puzzle:
![][10]
The 5x5 grid needs to be filled in with numbers from 1 to 5, using each number once in each row, and once in each column. (I.e., to make a 5x5 Latin square.) The numbers in neighbouring cells must also satisfy the inequalities indicated by any `>` greater-than signs.
[I used linear programming (LP) some months ago.][11] LP problems are systems of continuous variables with linear constraints. This time I'll use mixed integer linear programming (MILP), which generalises LP by also allowing integer-constrained variables. It turns out that's enough to be NP-complete, and MILP happens to be reasonably good for modelling this puzzle.
In that previous post, I used the Julia library JuMP to help spec the problem. This time I'll use the [CPLEX text-based format][12], which is supported by several LP and MILP solvers (and can be easily converted to other formats by off-the-shelf tools if needed). Here's the LP from the previous post in CPLEX format:
```
Minimize
obj: v
Subject To
ptotal: pr + pp + ps = 1
rock: 4 ps - 5 pp - v <= 0
paper: 5 pr - 8 ps - v <= 0
scissors: 8 pp - 4 pr - v <= 0
Bounds
0 <= pr <= 1
0 <= pp <= 1
0 <= ps <= 1
End
```
CPLEX format is nice to read, but non-trivial problems take a lot of variables and constraints to model, making it painful and error-prone to write out manually. There are domain-specific languages like [ZIMPL][13] for speccing MILPs and LPs in a high-level way. They're pretty cool for many problems, but ultimately they're not as expressive as a general-purpose language with a good library like JuMP, or as a code generator with D.
I'll model the puzzle using two sets of variables: \(v_{r,c}\) and \(i_{r,c,v}\). \(v_{r,c}\) will hold the value (1-5) of the cell at row \(r\) and column \(c\). \(i_{r,c,v}\) will be an indicator binary that's 1 if the cell at row \(r\) and column \(c\) has value \(v\), and 0 otherwise. These two sets of variables are redundant representations of the grid, but the first representation makes it easier to model the inequality constraints, while the second representation makes it easier to model the uniqueness constraints. I just need to add some extra constraints to force the two representations to be consistent. But first, let's start with the basic constraint that each cell must have exactly one value. Mathematically, that means all the indicators for a given row and column must be 0, except for one that is 1. That can be enforced by this equation:
\[i_{r,c,1} + i_{r,c,2} + i_{r,c,3} + i_{r,c,4} + i_{r,c,5} = 1\]
The CPLEX constraints for all rows and columns can be generated with this Djinn code:
```
\ Cell has one value
[:
foreach (r; iota(N))
foreach (c; iota(N))
:]
[= "%-(%s + %)", vs.map!(v => ivar(r, c, v)) ] = 1
[::]
```
`ivar()` is a helper function that gives us the string identifier for an \(i\) variable, and `vs` stores the numbers 1-5 for convenience. The constraints for uniqueness within rows and columns are exactly the same, but iterating over the other two dimensions of \(i\).
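As a sketch of what "iterating over the other two dimensions" means, the D functions below build one cell constraint and one row-uniqueness constraint as strings. The naming scheme inside `ivar` and the function names `cellConstraint` and `rowConstraint` are assumptions made for illustration; the helper in the repo may differ.

```d
import std.algorithm : map;
import std.conv : text;
import std.format : format;
import std.range : iota;
import std.stdio : writeln;

enum N = 5; // grid size

// Hypothetical naming scheme for indicator variables (an assumption).
string ivar(int r, int c, int v)
{
    return text("i", r, c, v);
}

// Cell (r, c) holds exactly one value: sum the indicators over v.
string cellConstraint(int r, int c)
{
    // "%-(...%)" formats a range with " + " between elements;
    // the "-" flag keeps the strings unquoted.
    return format("%-(%s + %) = 1", iota(1, N + 1).map!(v => ivar(r, c, v)));
}

// Value v appears exactly once in row r: same shape, summed over c.
string rowConstraint(int r, int v)
{
    return format("%-(%s + %) = 1", iota(N).map!(c => ivar(r, c, v)));
}

void main()
{
    writeln(cellConstraint(0, 0)); // i001 + i002 + i003 + i004 + i005 = 1
    writeln(rowConstraint(0, 1));  // i001 + i011 + i021 + i031 + i041 = 1
}
```

The column constraint would follow the same pattern, with the sum running over `r` for a fixed column and value.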
To make the \(i\) vars consistent with the \(v\) vars, we need constraints like this (remember, only one of the \(i\) vars is non-zero):
\[i_{r,c,1} + 2i_{r,c,2} + 3i_{r,c,3} + 4i_{r,c,4} + 5i_{r,c,5} = v_{r,c}\]
CPLEX requires all variables to be on the left, so the Djinn code looks like this:
```
\ Link i vars with v vars
[:
foreach (r; iota(N))
foreach (c; iota(N))
:]
[= "%-(%s + %)", vs.map!(v => text(v, ' ', ivar(r, c, v))) ] - [= vvar(r,c) ] = 0
[::]
```
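For a single cell, the line this template emits can be reproduced with a small D function. As before, this is a sketch under assumptions: `ivar` uses a made-up naming scheme and `linkConstraint` is a name chosen for illustration, while `vvar` is assumed to produce names like `v00`, matching the variable names that appear in the solver output.

```d
import std.algorithm : map;
import std.conv : text;
import std.format : format;
import std.range : iota;
import std.stdio : writeln;

enum N = 5; // grid size

// Hypothetical naming schemes for the two sets of variables.
string ivar(int r, int c, int v) { return text("i", r, c, v); }
string vvar(int r, int c) { return text("v", r, c); }

// Weighted sum of the indicators minus the cell value, with every
// variable moved to the left-hand side.
string linkConstraint(int r, int c)
{
    return format("%-(%s + %) - %s = 0",
        iota(1, N + 1).map!(v => text(v, ' ', ivar(r, c, v))),
        vvar(r, c));
}

void main()
{
    // 1 i001 + 2 i002 + 3 i003 + 4 i004 + 5 i005 - v00 = 0
    writeln(linkConstraint(0, 0));
}
```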
The constraints for the neighbouring cell inequalities and for the bottom left corner being 4 are all trivial to write. All that's left is to declare the indicator variables to be binary, and set the bounds for the \(v\) vars. All up, there are 150 variables and 111 constraints, plus bounds for the variables. [You can see the full code in the repo.][14]
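The totals can be partly checked by counting. Each of the four constraint families described so far (one value per cell, row uniqueness, column uniqueness, and the link between the two representations) contributes one constraint per pair of indices, that is \(5 \times 5 = 25\):

\[ \underbrace{25}_{v_{r,c}} + \underbrace{125}_{i_{r,c,v}} = 150 \text{ variables}, \qquad \underbrace{25}_{\text{cell}} + \underbrace{25}_{\text{row}} + \underbrace{25}_{\text{column}} + \underbrace{25}_{\text{link}} = 100 \text{ constraints} \]

That would leave 11 of the 111 constraints for the inequality signs and the corner clue, assuming the clue is written as a constraint rather than as a bound.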
The [GNU Linear Programming Kit][15] has a command line tool that can solve this CPLEX MILP. Unfortunately, its output is a big dump of everything, so I used awk to pull out what's needed:
```
$ time glpsol --lp inequality.lp -o /dev/stdout | awk '/v[0-9][0-9]/ { print $2, $4 }' | sort
v00 1
v01 3
v02 2
v03 5
v04 4
v10 2
v11 5
v12 4
v13 1
v14 3
v20 3
v21 1
v22 5
v23 4
v24 2
v30 5
v31 4
v32 3
v33 2
v34 1
v40 4
v41 2
v42 1
v43 3
v44 5
real 0m0.114s
user 0m0.106s
sys 0m0.005s
```
Here's the solution written out in the original grid:
![][16]
These examples are just for playing around, but I'm sure you get the idea. The `README.md` for the Djinn repo is itself generated using a Djinn template, by the way.
As I said, Djinn can also be used as a compile-time templating language embedded inside D code. I primarily wanted a code generator, but that's a bonus thanks to D's metaprogramming features.
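The D feature behind that bonus can be shown without Djinn itself. In the sketch below, `toPrintStatement` is a hypothetical stand-in for a template translator and is not part of Djinn's API: it runs at compile time, returns D source as a string, and `mixin` compiles that string in place.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a template translator: turns an expression
// into a D statement that prints it. Evaluated at compile time (CTFE)
// when used inside `mixin`.
string toPrintStatement(string expr)
{
    return "writeln(" ~ expr ~ ");";
}

void main()
{
    enum one = 1;

    // The generated string is compiled as if it had been typed here,
    // so it can refer to `one` from the enclosing scope.
    mixin(toPrintStatement("one + one"));
}
```

Because the generated code is ordinary D by the time it is compiled, an embedded template can use the variables and functions of the code around it.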
--------------------------------------------------------------------------------
via: https://theartofmachinery.com/2021/01/01/djinn.html
Author: [Simon Arneaud][a]
Topic selection: [lujun9972][b]
Translator: [hanszhao80](https://github.com/hanszhao80)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
[a]: https://theartofmachinery.com
[b]: https://github.com/lujun9972
[1]: https://jinja2docs.readthedocs.io/en/stable/
[2]: https://dlang.org/articles/mixin.html
[3]: https://theartofmachinery.com/2017/12/31/compile_time_brainfuck.html
[4]: https://gitlab.com/sarneaud/djinn
[5]: https://gitlab.com/sarneaud/djinn/-/tree/v0.1.0/examples
[6]: https://dlang.org/phobos/std_format.html#format-string
[7]: http://netpbm.sourceforge.net/doc/#formats
[8]: https://en.wikipedia.org/wiki/Mandelbrot_set
[9]: https://theartofmachinery.com/images/djinn/mandelbrot.png
[10]: https://theartofmachinery.com/images/djinn/inequality.svg
[11]: https://theartofmachinery.com/2020/05/21/glico_weighted_rock_paper_scissors.html
[12]: http://lpsolve.sourceforge.net/5.0/CPLEX-format.htm
[13]: https://zimpl.zib.de/
[14]: https://gitlab.com/sarneaud/djinn/-/tree/v0.1.0/examples/inequality.lp.dj
[15]: https://www.gnu.org/software/glpk/
[16]: https://theartofmachinery.com/images/djinn/inequality_solution.svg


@@ -0,0 +1,265 @@
[#]: collector: (lujun9972)
[#]: translator: (hanszhao80)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Djinn: A Code Generator and Templating Language Inspired by Jinja2)
[#]: via: (https://theartofmachinery.com/2021/01/01/djinn.html)
[#]: author: (Simon Arneaud https://theartofmachinery.com)
Djinn: A Code Generator and Templating Language Inspired by Jinja2
======
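Code generators are very useful tools. I sometimes use the command-line version of [Jinja2][1] to generate highly redundant config files and other text files, but it is limited when it comes to transforming data. The author of Jinja2 evidently sees things differently, but I wanted something like list comprehensions or D's composable range algorithms.
I decided to make a tool that is like Jinja2 but lets me generate complex files by transforming data with range algorithms. The idea was very simple: a templating language that gets rewritten directly to D code. That way it supports everything D can do, simply because it _is_ D. I wanted a standalone code generator, but thanks to [D's `mixin` feature][2], the same templating language also works as an embedded templating language (for HTML in a web app, for example). For more on that trick, see [this post about translating Brainfuck to D to machine code at compile time using `mixin`s][3].
As usual, [the source is on GitLab][4]. [The examples in this post can be found there, too.][5]
### Hello world example
Here's an example to demonstrate the idea: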
```
Hello [= retro("dlrow") ]!
[: enum one = 1; :]
1 + 1 = [= one + one ]
```
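`[= some_expression ]` is like `{{ some_expression }}` in Jinja2, and it renders a value to the output. `[: some_statement; :]` is like `{% some_statement %}` and executes full code statements. I changed the syntax because D also uses curly braces a lot, and mixing the two made templates hard to read. (There are also special non-D directives, like `include`, that are wrapped in `[<` and `>]`.)
If you save the above to a file called `hello.txt.dj` and run the `djinn` command-line tool against it, you'll get a file called `hello.txt` containing what you might guess: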
```
Hello world!
1 + 1 = 2
```
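If you've used Jinja2, you might be wondering what happened to the second line. Djinn has a special rule that simplifies formatting and whitespace handling: if a source line contains `[:` statements or `[<` directives but doesn't contain any non-whitespace output, the whole line is omitted from the output. Blank lines are still rendered as they are.
### Generating data
Okay, now for something a bit more practical: generating CSV data.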
```
x,f(x)
[: import std.mathspecial;
foreach (x; iota(-1.0, 1.0, 0.1)) :]
[= "%0.1f,%g", x, normalDistribution(x) ]
```
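A `[=` and `]` pair can contain multiple expressions separated by commas. If the first expression is a double-quoted string, it's interpreted as a [format string][6]. Here's the output: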
```
x,f(x)
-1.0,0.158655
-0.9,0.18406
-0.8,0.211855
-0.7,0.241964
-0.6,0.274253
-0.5,0.308538
-0.4,0.344578
-0.3,0.382089
-0.2,0.42074
-0.1,0.460172
0.0,0.5
0.1,0.539828
0.2,0.57926
0.3,0.617911
0.4,0.655422
0.5,0.691462
0.6,0.725747
0.7,0.758036
0.8,0.788145
0.9,0.81594
```
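### Making an image
This example shows how an image can be generated. [The classic Netpbm image library defined a bunch of image formats][7], some of which are text-based. For example, here's an image of a 3x3 cross: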
```
P2 # identifier for Portable GrayMap
3 3 # width and height
7 # value for pure white (0 is black)
7 0 7
0 0 0
7 0 7
```
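You can save the above text to a file named something like `cross.pgm`, and many image tools will know how to parse it. Here's some Djinn code that generates a [Mandelbrot set][8] fractal in the same format: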
```
[:
import std.complex;
enum W = 640;
enum H = 480;
enum kMaxIter = 20;
ubyte mb(uint x, uint y)
{
const c = complex(3.0 * (x - W / 1.5) / W, 2.0 * (y - H / 2.0) / H);
auto z = complex(0.0);
ubyte ret = kMaxIter;
while (abs(z) <= 2 && --ret) z = z * z + c;
return ret;
}
:]
P2
[= W ] [= H ]
[= kMaxIter ]
[: foreach (y; 0..H) :]
[= "%(%s %)", iota(W).map!(x => mb(x, y)) ]
```
The resulting file is about 800 kB, but it compresses nicely as a PNG:
```
$ # Converting with GraphicsMagick
$ gm convert mandelbrot.pgm mandelbrot.png
```
Here's the result:
![][9]
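### Solving a puzzle
Here's a puzzle:
![][10]
The 5x5 grid needs to be filled in with the numbers 1 to 5, using each number exactly once in each row and once in each column (that is, to make a 5x5 Latin square). The numbers in neighbouring cells must also satisfy the inequalities indicated by any `>` greater-than signs.
[I used linear programming (LP) some months ago.][11] LP problems are systems of continuous variables with linear constraints. This time I'll use mixed integer linear programming (MILP), which generalises LP by also allowing integer-constrained variables. It turns out that's enough to make it NP-complete, and MILP happens to be reasonably good for modelling this puzzle.
In that previous post, I used the Julia library JuMP to help specify the problem. This time I'll use the [CPLEX text-based format][12], which is supported by several LP and MILP solvers (and can be easily converted to other formats with off-the-shelf tools if needed). Here's the LP from the previous post in CPLEX format: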
```
Minimize
obj: v
Subject To
ptotal: pr + pp + ps = 1
rock: 4 ps - 5 pp - v <= 0
paper: 5 pr - 8 ps - v <= 0
scissors: 8 pp - 4 pr - v <= 0
Bounds
0 <= pr <= 1
0 <= pp <= 1
0 <= ps <= 1
End
```
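CPLEX format is easy to read, but non-trivial problems take a lot of variables and constraints to model, which makes writing them out by hand painful and error-prone. There are domain-specific languages like [ZIMPL][13] for describing MILPs and LPs in a high-level way. They're pretty cool for many problems, but ultimately they're not as expressive as a general-purpose language with a good library like JuMP, or as a code generator with D.
I'll model the puzzle using two sets of variables: `v_{r,c}` and `i_{r,c,v}`. `v_{r,c}` will hold the value (1 to 5) of the cell at row r and column c. `i_{r,c,v}` will be a binary indicator that's 1 if the cell at row r and column c has value v, and 0 otherwise. These two sets of variables are redundant representations of the grid, but the first representation makes it easier to model the inequality constraints, while the second makes it easier to model the uniqueness constraints. I just need to add some extra constraints to force the two representations to be consistent. But first, let's start with the basic constraint that each cell must have exactly one value. Mathematically, that means all the indicators for a given row and column must be 0, except for one that is 1. That can be enforced by this equation: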
```
[i_{r,c,1} + i_{r,c,2} + i_{r,c,3} + i_{r,c,4} + i_{r,c,5} = 1]
```
The CPLEX constraints for all rows and columns can be generated with this Djinn code:
```
\ Cell has one value
[:
foreach (r; iota(N))
foreach (c; iota(N))
:]
[= "%-(%s + %)", vs.map!(v => ivar(r, c, v)) ] = 1
[::]
```
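`ivar()` is a helper function that gives us the string identifier for an i variable, and `vs` stores the numbers 1 to 5 for convenience. The constraints for uniqueness within rows and columns are exactly the same, but iterate over the other two dimensions of i.
To make the i variables consistent with the v variables, we need constraints like this (remember, only one of the i variables is non-zero):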
```
[i_{r,c,1} + 2i_{r,c,2} + 3i_{r,c,3} + 4i_{r,c,4} + 5i_{r,c,5} = v_{r,c}]
```
CPLEX requires all variables to be on the left, so the Djinn code looks like this:
```
\ Link i vars with v vars
[:
foreach (r; iota(N))
foreach (c; iota(N))
:]
[= "%-(%s + %)", vs.map!(v => text(v, ' ', ivar(r, c, v))) ] - [= vvar(r,c) ] = 0
[::]
```
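The constraints for the neighbouring-cell inequalities and for the bottom-left corner being 4 are all trivial to write. All that's left is to declare the indicator variables to be binary and set the bounds for the v variables. All up, there are 150 variables and 111 constraints, plus bounds for the variables. [You can see the full code in the repo.][14]
The [GNU Linear Programming Kit][15] has a command-line tool that can solve this CPLEX MILP. Unfortunately, its output is a big dump of everything, so I used awk to pull out what's needed: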
```
$ time glpsol --lp inequality.lp -o /dev/stdout | awk '/v[0-9][0-9]/ { print $2, $4 }' | sort
v00 1
v01 3
v02 2
v03 5
v04 4
v10 2
v11 5
v12 4
v13 1
v14 3
v20 3
v21 1
v22 5
v23 4
v24 2
v30 5
v31 4
v32 3
v33 2
v34 1
v40 4
v41 2
v42 1
v43 3
v44 5
real 0m0.114s
user 0m0.106s
sys 0m0.005s
```
Here's the solution written out in the original grid:
![][16]
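These examples are just for playing around, but I'm sure you get the idea. By the way, the `README.md` for the Djinn repo is itself generated using a Djinn template.
As I said, Djinn can also be used as a compile-time templating language embedded inside D code. I primarily wanted a code generator, but that's a bonus thanks to D's metaprogramming features.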
--------------------------------------------------------------------------------
via: https://theartofmachinery.com/2021/01/01/djinn.html
Author: [Simon Arneaud][a]
Topic selection: [lujun9972][b]
Translator: [hanszhao80](https://github.com/hanszhao80)
Proofreader: [proofreader ID](https://github.com/校对者ID)
This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)
[a]: https://theartofmachinery.com
[b]: https://github.com/lujun9972
[1]: https://jinja2docs.readthedocs.io/en/stable/
[2]: https://dlang.org/articles/mixin.html
[3]: https://theartofmachinery.com/2017/12/31/compile_time_brainfuck.html
[4]: https://gitlab.com/sarneaud/djinn
[5]: https://gitlab.com/sarneaud/djinn/-/tree/v0.1.0/examples
[6]: https://dlang.org/phobos/std_format.html#format-string
[7]: http://netpbm.sourceforge.net/doc/#formats
[8]: https://en.wikipedia.org/wiki/Mandelbrot_set
[9]: https://theartofmachinery.com/images/djinn/mandelbrot.png
[10]: https://theartofmachinery.com/images/djinn/inequality.svg
[11]: https://theartofmachinery.com/2020/05/21/glico_weighted_rock_paper_scissors.html
[12]: http://lpsolve.sourceforge.net/5.0/CPLEX-format.htm
[13]: https://zimpl.zib.de/
[14]: https://gitlab.com/sarneaud/djinn/-/tree/v0.1.0/examples/inequality.lp.dj
[15]: https://www.gnu.org/software/glpk/
[16]: https://theartofmachinery.com/images/djinn/inequality_solution.svg