Translated

This commit is contained in:
aREversez 2022-07-30 19:09:41 +08:00 committed by GitHub
parent 2f2cfd89ae
commit 91f21ac17b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -7,116 +7,116 @@
[#]: publisher: " " [#]: publisher: " "
[#]: url: " " [#]: url: " "
How Much of a Genius-Level Move Was Using Binary Space Partitioning in Doom? 在《毁灭战士》中应用二叉空间分割技术是何等天才之举?
====== ======
In 1993, id Software released the first-person shooter _Doom_, which quickly became a phenomenon. The game is now considered one of the most influential games of all time. 1993 年,游戏开发公司 id Software 发行了一款第一人称射击游戏 _《毁灭战士》_,游戏一经发行迅速爆火。在今天看来,《毁灭战士》可谓有史以来最具影响力的游戏之一。
A decade after _Doom_s release, in 2003, journalist David Kushner published a book about id Software called _Masters of Doom_, which has since become the canonical account of _Doom_s creation. I read _Masters of Doom_ a few years ago and dont remember much of it now, but there was one story in the book about lead programmer John Carmack that has stuck with me. This is a loose gloss of the story (see below for the full details), but essentially, early in the development of _Doom_, Carmack realized that the 3D renderer he had written for the game slowed to a crawl when trying to render certain levels. This was unacceptable, because _Doom_ was supposed to be action-packed and frenetic. So Carmack, realizing the problem with his renderer was fundamental enough that he would need to find a better rendering algorithm, started reading research papers. He eventually implemented a technique called “binary space partitioning,” never before used in a video game, that dramatically sped up the _Doom_ engine. _《毁灭战士》_ 发行之后的第十年2003 年),记者大卫·库什纳出版了一本关于 id Software 的书,书名为 _《Doom 启示录》_,后被奉为记录 _Doom_ 创作史的典范读物。几年前我曾读过这本书,如今内容已记得不太真切了,但是书中有一个关于 id Software 首席程序员约翰·卡马克的故事,我印象特别深刻。这里只对故事做粗略描述(具体情节请往下阅读)。实际上,早在 _《毁灭战士》_ 开发前期,卡马克就发现自己为这款游戏编写的 3D 渲染器在渲染某些关卡时,出现速度缓慢的问题。对于 _《毁灭战士》_ 这一对动感和速度有着相当高要求的射击游戏来说,这是一个非常严重的问题。意识到了这一问题的严重性,卡马克需要一个更加有效的渲染算法,于是他开始阅读相关文献。最后,他采用“二叉空间分割”技术,极大地提升了《毁灭战士》游戏引擎的运行速度,而这项技术此前从未用于电子游戏当中。
That story about Carmack applying cutting-edge academic research to video games has always impressed me. It is my explanation for why Carmack has become such a legendary figure. He deserves to be known as the archetypal genius video game programmer for all sorts of reasons, but this episode with the academic papers and the binary space partitioning is the justification I think of first. 一直以来,我对这个故事的印象十分深刻。卡马克将学术前沿研究运用于电子游戏之中,我觉得这正是他之所以成为传奇人物的原因。无论从哪个角度来看,卡马克都应该是电子游戏行业中人尽皆知的典型天才程序员,只不过上面这个故事是我最先能够想到的理由。
Obviously, the story is impressive because “binary space partitioning” sounds like it would be a difficult thing to just read about and implement yourself. Ive long assumed that what Carmack did was a clever intellectual leap, but because Ive never understood what binary space partitioning is or how novel a technique it was when Carmack decided to use it, Ive never known for sure. On a spectrum from Homer Simpson to Albert Einstein, how much of a genius-level move was it really for Carmack to add binary space partitioning to _Doom_? 显而易见,“二叉空间分割”这个术语听起来就是难度相当高的课题,能够自行阅读文献并将其付诸实施实属不易,所以这个故事给我留下了深刻的印象。我一直认为卡马克的做法十分具有创见性,不过由于我既不懂二叉空间分割到底是怎样的一项技术,也不晓得这项技术在当时究竟有多么革新,所以我也并不确定自己的观点是否正确。如果按照从荷马·辛普森到爱因斯坦的顺序为天才列出一套级别体系,那么卡马克将二叉空间分割技术运用于 _《毁灭战士》_ 的做法究竟属于什么级别的天才之举呢?
Ive also wondered where binary space partitioning first came from and how the idea found its way to Carmack. So this post is about John Carmack and _Doom_, but it is also about the history of a data structure: the binary space partitioning tree (or BSP tree). It turns out that the BSP tree, rather interestingly, and like so many things in computer science, has its origins in research conducted for the military. 同时,我也在想,二叉空间分割这个概念最初是从哪儿来的,又是怎样吸引到卡马克的?因此,本篇文章不仅仅会讲述约翰·卡马克和 _《毁灭战士》_ 的故事也会探讨二叉空间分割树BSP 树数据结构的发展历史。有意思的是BSP 树和计算机科学领域其他许多技术一样,最初都起源于军事研究领域。
Thats right: E1M1, the first level of _Doom_, was brought to you by the US Air Force. 没错_《毁灭战士》_ 的第一关卡 E1M1 就受到了美国空军的启发。
### The VSD Problem ### VSD 难题
The BSP tree is a solution to one of the thorniest problems in computer graphics. In order to render a three-dimensional scene, a renderer has to figure out, given a particular viewpoint, what can be seen and what cannot be seen. This is not especially challenging if you have lots of time, but a respectable real-time game engine needs to figure out what can be seen and what cannot be seen at least 30 times a second. BSP 树是计算机图形领域最具挑战性难题的解决方案之一。举个例子,为了渲染出三维场景,渲染器必须能够区分可见物体和不可见物体。如果渲染时间比较充足,这一要求也算不上大问题;但是理想来说,实时游戏引擎在 1 秒内至少需要完成 30 次区分任务。
This problem is sometimes called the problem of visible surface determination. Michael Abrash, a programmer who worked with Carmack on _Quake_ (id Softwares follow-up to _Doom_), wrote about the VSD problem in his famous _Graphics Programming Black Book_: 这一问题有时被称为可见表面检测问题。当时,卡马克与迈克尔·亚伯拉什携手合作,一起开发 _《雷神之锤》_id Software 继 _《毁灭战士》_ 之后开发的游戏)。关于可见表面检测问题,亚伯拉什在自己的著作 _《图形程序开发人员指南》_ 中写道:
> I want to talk about what is, in my opinion, the toughest 3-D problem of all: visible surface determination (drawing the proper surface at each pixel), and its close relative, culling (discarding non-visible polygons as quickly as possible, a way of accelerating visible surface determination). In the interests of brevity, Ill use the abbreviation VSD to mean both visible surface determination and culling from now on. > 我想探讨一下在我看来 3D 中最棘手的一个问题:可见表面检测问题(在每个像素点上绘制合适的表面)以及与之密切相关的隐面消除问题(迅速去除不可见的多边形,用于加快可见表面检测速度)。简略起见,我将在下文采用缩写 VSD 来表示可见表面检测和隐面消除。
> Why do I think VSD is the toughest 3-D challenge? Although rasterization issues such as texture mapping are fascinating and important, they are tasks of relatively finite scope, and are being moved into hardware as 3-D accelerators appear; also, they only scale with increases in screen resolution, which are relatively modest. > 为什么我会认为 VSD 是 3D 中最棘手的问题呢?尽管纹理映射等光栅化问题更让人感兴趣而且也更重要,但是相对而言,这些问题比较容易解决。随着 3D 加速器的出现,它们逐渐被归为硬件范围内的问题。同时,它们只受到屏幕分辨率的影响,而这种影响相对较为稳定。
> In contrast, VSD is an open-ended problem, and there are dozens of approaches currently in use. Even more significantly, the performance of VSD, done in an unsophisticated fashion, scales directly with scene complexity, which tends to increase as a square or cube function, so this very rapidly becomes the limiting factor in rendering realistic worlds.[1][1] > 相反VSD 却像是一个无底洞,目前应对方案也有很多。但实际上,在采用简单的方法处理 VSD 时其性能会直接受到场景复杂程度的影响而场景的复杂程度通常会以平方级或立方级的形式增大。所以在渲染过程中VSD 很快就会成为制约因素。[1][1]
Abrash was writing about the difficulty of the VSD problem in the late 90s, years after _Doom_ had proved that regular people wanted to be able to play graphically intensive games on their home computers. In the early 90s, when id Software first began publishing games, the games had to be programmed to run efficiently on computers not designed to run them, computers meant for word processing, spreadsheet applications, and little else. To make this work, especially for the few 3D games that id Software published before _Doom_, id Software had to be creative. In these games, the design of all the levels was constrained in such a way that the VSD problem was easier to solve. _《毁灭战士》_ 这款游戏告诉我们,普通人盼望着能用自家电脑玩很吃图形配置的游戏。之后数年,即上个世纪九十年代,亚伯拉什讨论了复杂的 VSD 问题。同时代早期id Software 成立后发行了一些游戏。尽管当时的计算机还只是用来处理文字与表格或者执行其他任务未尝想过要在上面运行游戏id Software 必须对发行的游戏进行编程,使其能在计算机上流畅运行。为了实现这一飞跃,尤其是为了能让在 _《毁灭战士》_ 之前发行的少数 3D 游戏在电脑上运行id Software 必须做出革新。在这些游戏中,所有的关卡在设计时都施加了一定的限制,以便更容易解决 VSD 问题。
For example, in _Wolfenstein 3D_, the game id Software released just prior to _Doom_, every level is made from walls that are axis-aligned. In other words, in the Wolfenstein universe, you can have north-south walls or west-east walls, but nothing else. Walls can also only be placed at fixed intervals on a grid—all hallways are either one grid square wide, or two grid squares wide, etc., but never 2.5 grid squares wide. Though this meant that the id Software team could only design levels that all looked somewhat the same, it made Carmacks job of writing a renderer for _Wolfenstein_ much simpler. 例如,在 _《毁灭战士》_ 之前id Software 发行了 _《德军总部 3D》_,该游戏的每一个关卡都是由与坐标轴平齐的墙壁组成。换言之,在《德军总部 3D》的游戏画面里你看到的只有南北方向或者东西方向的墙壁。在游戏中墙壁与墙壁之间有着固定的间隔所有过道的宽度或是一个方格或是两个方格等等但绝不会出现 2.5 个方格。如此一来,尽管 id Software 团队只能设计出外观十分相似的关卡,但这也让卡马克为 _《德军总部 3D》_ 编写渲染器的工作简单了不少。
The _Wolfenstein_ renderer solved the VSD problem by “marching” rays into the virtual world from the screen. Usually a renderer that uses rays is a “raycasting” renderer—these renderers are often slow, because solving the VSD problem in a raycaster involves finding the first intersection between a ray and something in your world, which in the general case requires lots of number crunching. But in _Wolfenstein_, because all the walls are aligned with the grid, the only location a ray can possibly intersect a wall is at the grid lines. So all the renderer needs to do is check each of those intersection points. If the renderer starts by checking the intersection point nearest to the players viewpoint, then checks the next nearest, and so on, and stops when it encounters the first wall, the VSD problem has been solved in an almost trivial way. A ray is just marched forward from each pixel until it hits something, which works because the marching is so cheap in terms of CPU cycles. And actually, since all walls are the same height, it is only necessary to march a single ray for every _column_ of pixels. 通过将屏幕上的光线“齐射”入虚拟游戏世界_《德军总部》_ 的渲染器解决了 VSD 问题。通常来说,使用光线的渲染器叫做“光线投射”渲染器。这种渲染器的速度一般较慢,因为解决内部的 VSD 问题涉及到在光线和游戏中的物体之间找到第一个交点,这通常需要进行大量的计算。但在 _《德军总部》_由于所有的墙壁都与网格平齐所以光线与墙壁相交的位置只能在网格线上。如此一来渲染器只需逐个检查这些交点即可。如果渲染器先从离玩家视角最近的交点开始检查接着检查下一个最近的交点以此类推最后遇到第一面墙壁时停止检查。这样VSD 问题便轻而易举地得到了解决。光线从每一个像素点向前投射,与画面物体接触时停止运动,这一方法是可行的。因为结合 CPU 周期来看,投射的成本很低。事实上,由于每面墙壁高度相同,因此针对同列的像素点,投射的光线只需一条。
This rendering shortcut made _Wolfenstein_ fast enough to run on underpowered home PCs in the era before dedicated graphics cards. But this approach would not work for _Doom_, since the id team had decided that their new game would feature novel things like diagonal walls, stairs, and ceilings of different heights. Ray marching was no longer viable, so Carmack wrote a different kind of renderer. Whereas the _Wolfenstein_ renderer, with its ray for every column of pixels, is an “image-first” renderer, the _Doom_ renderer is an “object-first” renderer. This means that rather than iterating through the pixels on screen and figuring out what color they should be, the _Doom_ renderer iterates through the objects in a scene and projects each onto the screen in turn. 尽管当时还没有专业的图形显卡_《德军总部》_ 凭借这一取巧之法得以在配置较低的个人电脑上正常运行起来。然而,这个办法并不适用于 _《毁灭战士》_。id Software 为这款新游戏增添了许多新元素——倾斜的墙面、楼梯以及高低不一的天花板。光线投射的办法自然也就不好用了于是卡马克编写出了一个新的渲染器。_《德军总部》_ 的渲染器关注的是图像,将光线投射到屏幕像素表示的列上,而 _《毁灭战士》_ 关注的则是物体。换句话说_《毁灭战士》_ 的渲染器会记录游戏场景中的所有物体,继而将其投射到屏幕当中;而非记录屏幕上的像素点,判断每个像素点的颜色。
In an object-first renderer, one easy way to solve the VSD problem is to use a z-buffer. Each time you project an object onto the screen, for each pixel you want to draw to, you do a check. If the part of the object you want to draw is closer to the player than what was already drawn to the pixel, then you can overwrite what is there. Otherwise you have to leave the pixel as is. This approach is simple, but a z-buffer requires a lot of memory, and the renderer may still expend a lot of CPU cycles projecting level geometry that is never going to be seen by the player. 对于强调物体的渲染器来说,可以使用 Z 缓冲器来解决 VSD 问题,比较简单。每次将物体投射到屏幕上时,需要对每个用于绘制的像素点进行检查。如果你想绘制出的物体的部分和已经绘制在目标像素点上的物体相比更加接近玩家,可以将其覆盖。否则,必须保持像素不变。尽管办法很简单,但是 Z 缓冲器对内存的要求较高,投射关卡几何图形时渲染器仍会消耗大量的 CPU 资源,虽然玩家无法看到这些几何图形。
In the early 1990s, there was an additional drawback to the z-buffer approach: On IBM-compatible PCs, which used a video adapter system called VGA, writing to the output frame buffer was an expensive operation. So time spent drawing pixels that would only get overwritten later tanked the performance of your renderer. 在 20 世纪 90 年代,使用 Z 缓冲器的方法还存在着其他缺陷:与 IBM 公司机器兼容的个人电脑搭载了显示适配器系统 VGA在这类电脑上将图像写入帧缓冲器的成本非常之高。因此消耗在绘制像素点上的大量时间可能会让渲染器崩溃而且像素点经过绘制后只能在后期才能覆盖。
Since writing to the frame buffer was so expensive, the ideal renderer was one that started by drawing the objects closest to the player, then the objects just beyond those objects, and so on, until every pixel on screen had been written to. At that point the renderer would know to stop, saving all the time it might have spent considering far-away objects that the player cannot see. But ordering the objects in a scene this way, from closest to farthest, is tantamount to solving the VSD problem. Once again, the question is: What can be seen by the player? 考虑到将图像写入帧缓冲器的成本非常之高,理想的渲染器需要首先绘制离玩家最近的物体,接着是比较近的物体,以此类推,直到屏幕上每个像素点都写入了信息。这时,渲染器会停止运行,大幅缩短远处不可见物体的渲染时间。这种由近及远对物体进行排序的方法也可以解决 VSD 问题。那么问题又来了:什么才是玩家可以看到的?
Initially, Carmack tried to solve this problem by relying on the layout of _Doom_s levels. His renderer started by drawing the walls of the room currently occupied by the player, then flooded out into neighboring rooms to draw the walls in those rooms that could be seen from the current room. Provided that every room was convex, this solved the VSD issue. Rooms that were not convex could be split into convex “sectors.” You can see how this rendering technique might have looked if run at extra-slow speed [in this video][2], where YouTuber Bisqwit demonstrates a renderer of his own that works according to the same general algorithm. This algorithm was successfully used in Duke Nukem 3D, released three years after _Doom_, when CPUs were more powerful. But, in 1993, running on the hardware then available, the _Doom_ renderer that used this algorithm struggled with complicated levels—particularly when sectors were nested inside of each other, which was the only way to create something like a circular pit of stairs. A circular pit of stairs led to lots of repeated recursive descents into a sector that had already been drawn, strangling the game engines speed. 最初,卡马克打算依靠 _《毁灭战士》_ 的关卡布局来解决 VSD 问题。首先用渲染器绘制出玩家目前所在房间的墙壁,之后玩家冲进隔壁房间,再绘制出隔壁房间的墙壁。由于每个房间互不遮挡,这一办法也能解决 VSD 问题。而互相遮挡的房间可以分割成若干互不遮挡的“区域”。在 YouTube 上的一个 [视频][2] 中Bisqwit 展示了自己制作出来的使用了相同算法的渲染器。可以看到,如果玩家移动速度非常慢,便能一睹渲染的具体过程。这一算法同样运用到了《毁灭公爵 3D》当中这款游戏在 _《毁灭战士》_ 推出三年之后发行,当时 CPU 的性能也更加强大了。1993 年,尽管在硬件上已经可以运行游戏了,但是使用这一算法的 _《毁灭战士》_ 渲染器在复杂的层级结构上依旧表现吃力,尤其是在房间分割出来的各部分相互嵌套的情况下。不巧的是,这类层级结构正是构造环形楼梯等物体的唯一办法。沿着环形楼梯走下去,直到走入已经绘制好的区域,由于这其中涉及多次循环下降运动,导致游戏引擎的运行速度大幅降低。
Around the time that the id team realized that the _Doom_ game engine might be too slow, id Software was asked to port _Wolfenstein 3D_ to the Super Nintendo. The Super Nintendo was even less powerful than the IBM-compatible PCs of the day, and it turned out that the ray-marching _Wolfenstein_ renderer, simple as it was, didnt run fast enough on the Super Nintendo hardware. So Carmack began looking for a better algorithm. It was actually for the Super Nintendo port of _Wolfenstein_ that Carmack first researched and implemented binary space partitioning. In _Wolfenstein_, this was relatively straightforward because all the walls were axis-aligned; in _Doom_, it would be more complex. But Carmack realized that BSP trees would solve _Doom_s speed problems too. 在 id Software 团队意识到 _《毁灭战士》_ 游戏引擎的速度可能过慢时,公司还面临着其他任务:将 _《德军总部 3D》_ 移植到超级任天堂游戏机(简称“超任”)上。那时,超任的性能比兼容 IBM 公司机器的个人电脑还要差。结果表明,尽管光线投射渲染器非常简单,但是想要在超任上快速运行是不可能的。于是,卡马克着手研究更为高效的算法。事实上,也正是为了顺利将 _《德军总部》_ 移植到超任,卡马克首次研究了二叉空间分割技术,并将其付诸应用。由于 _《德军总部》_ 的墙壁与坐标轴平齐,所以二叉空间分割技术应用起来也比较简单直接;但是 _《毁灭战士》_ 的情况则比较复杂。不过,卡马克发现,二叉空间分割树同样可以用来解决 _《毁灭战士》_ 速度过慢的问题。
### Binary Space Partitioning ### 二叉空间分割
Binary space partitioning makes the VSD problem easier to solve by splitting a 3D scene into parts ahead of time. For now, you just need to grasp why splitting a scene is useful: If you draw a line (really a plane in 3D) across your scene, and you know which side of the line the player or camera viewpoint is on, then you also know that nothing on the other side of the line can obstruct something on the viewpoints side of the line. If you repeat this process many times, you end up with a 3D scene split into many sections, which wouldnt be an improvement on the original scene except now you know more about how different parts of the scene can obstruct each other. 二叉空间分割会提前将 3D 场景分割为若干部分,借以解决 VSD 问题。讲到这里,你需要先了解一下为什么分割场景可以奏效:如果你在场景上画条线(对应 3D 空间里的一个平面),你就可以指出玩家或者摄像机视角在这条线的哪一侧,在这条线另一侧的物体无法遮挡玩家所在一侧的物体。如果多次重复这一操作,该 3D 场景最终会被分割为多个区域。最后,你要明白场景中不同的部分是会相互遮挡的,这样才能理解为什么这些区域可以起到优化原来场景的作用。
The first people to write about dividing a 3D scene like this were researchers trying to establish for the US Air Force whether computer graphics were sufficiently advanced to use in flight simulators. They released their findings in a 1969 report called “Study for Applying Computer-Generated Images to Visual Simulation.” The report concluded that computer graphics could be used to train pilots, but also warned that the implementation would be complicated by the VSD problem: 首次阐述上述 3D 场景分割的是美国空军的研究员他们曾尝试向美国空军证明计算机图形已经非常先进可以应用到飞行模拟器领域。1969 年,他们将研究发现发表在一份题为《计算机生成图像在图形仿真中的应用研究》的报告中。该报告的总结部分指出,计算机图形可用于训练飞行员,但其实际应用可能会受制于 VSD 问题:
> One of the most significant problems that must be faced in the real-time computation of images is the priority, or hidden-line, problem. In our everyday visual perception of our surroundings, it is a problem that nature solves with trivial ease; a point of an opaque object obscures all other points that lie along the same line of sight and are more distant. In the computer, the task is formidable. The computations required to resolve priority in the general case grow exponentially with the complexity of the environment, and soon they surpass the computing load associated with finding the perspective images of the objects.[2][3] > 实时图像处理需要解决的一个关键问题就是优先级问题或者隐藏线问题。在我们平时用眼睛观察外界时,大自然替我们轻易地解决了这一问题:注视不透明物体上一点时,同一视觉方向的物体以及距离较远的物体就会变得模糊。但在计算机中,这项任务却非常困难。图像处理通常需要解决优先级问题,随着环境复杂程度的增加,计算量会呈指数级增长,随即就会超过绘制物体透视图所需得计算负载。[2][3]
One solution these researchers mention, which according to them was earlier used in a project for NASA, is based on creating what I am going to call an “occlusion matrix.” The researchers point out that a plane dividing a scene in two can be used to resolve “any priority conflict” between objects on opposite sides of the plane. In general you might have to add these planes explicitly to your scene, but with certain kinds of geometry you can just rely on the faces of the objects you already have. They give the example in the figure below, where \\(p_1\\), \\(p_2\\), and \\(p_3\\) are the separating planes. If the camera viewpoint is on the forward or “true” side of one of these planes, then \\(p_i\\) evaluates to 1. The matrix shows the relationships between the three objects based on the three dividing planes and the location of the camera viewpoint—if object \\(a_i\\) obscures object \\(a_j\\), then entry \\(a_{ij}\\) in the matrix will be a 1. 他们在报告中提出了一项基于构造“遮挡矩阵”的方案,这一方案早些时候曾被应用于 NASA 的项目当中。研究员指出,平面将场景一分为二,可用来解决平面两侧物体之间存在的“任何优先级问题”】。通常情况下,可能需要将实实在在的平面添加到场景中,但是有了几何图形,只需借助几何物体的表面即可。他们举了一个例子,如下图:\\(p_1\\)、\\(p_2\\) 以及 \\(p_3\\) 是三个不同的平面,如果摄像机视角位于其中一个平面的前方或“正”面,\\(p_i\\) 的值就等于 1。这种矩阵展示出基于三个不同平面和摄像机视角位置的三个物体之间的关系——如果物体 \\(a_i\\) 遮挡了物体 \\(a_j\\),那么 \\(a_{ij}\\) 在此矩阵中的数值等于 1。
![][4] ![][4]
The researchers propose that this matrix could be implemented in hardware and re-evaluated every frame. Basically the matrix would act as a big switch or a kind of pre-built z-buffer. When drawing a given object, no video would be output for the parts of the object when a 1 exists in the objects column and the corresponding row object is also being drawn. 研究员指出,这种矩阵可以应用到硬件中,对每一帧进行重新评估。该矩阵基本上可以用作大型交换器,或者一种预置的 Z 缓冲器。在绘制给定的物体时,如果在物体所在列上得出数值 1并且所在行已经在绘制中那么物体被遮挡的部分就不会绘制出来。
The major drawback with this matrix approach is that to represent a scene with \\(n\\) objects you need a matrix of size \\(n^2\\). So the researchers go on to explore whether it would be feasible to represent the occlusion matrix as a “priority list” instead, which would only be of size \\(n\\) and would establish an order in which objects should be drawn. They immediately note that for certain scenes like the one in the figure above no ordering can be made (since there is an occlusion cycle), so they spend a lot of time laying out the mathematical distinction between “proper” and “improper” scenes. Eventually they conclude that, at least for “proper” scenes—and it should be easy enough for a scene designer to avoid “improper” cases—a priority list could be generated. But they leave the list generation as an exercise for the reader. It seems the primary contribution of this 1969 study was to point out that it should be possible to use partitioning planes to order objects in a scene for rendering, at least _in theory_. 不过,该矩阵方法的主要缺点在于,为了在场景中表示出 \\(n\\) 个物体,需要将矩阵的尺寸调整为 \\(n^2\\)。于是,研究员们继续深入,探究使用遮挡矩阵作为“优先级顺序表”的可行性。遮挡矩阵的尺寸还是 \\(n\\),可确定物体绘制的顺序。他们随即发现,诸如上图此类场景根本无法确定顺序(因为它存在循环阻塞的现象)。因此,他们不遗余力,讲明“合适”与“不合适”场景之间在数学方面的区别。最后,他们得出了一个结论:在“合适的”场景下,优先级顺序表是可以制作出来的;而对场景设计师来说,避免设计出“不合适”的场景也不是一件难事。但是,他们并没有说明如何生成顺序表。可以说,这份研究的首要贡献在于提出了可以采用平面分割的方法,对场景中的物体进行渲染排顺。至少,这在 _理论上_ 是可行的。
It was not until 1980 that a paper, titled “On Visible Surface Generation by A Priori Tree Structures,” demonstrated a concrete algorithm to accomplish this. The 1980 paper, written by Henry Fuchs, Zvi Kedem, and Bruce Naylor, introduced the BSP tree. The authors say that their novel data structure is “an alternative solution to an approach first utilized a decade ago but due to a few difficulties, not widely exploited”—here referring to the approach taken in the 1969 Air Force study.[3][5] A BSP tree, once constructed, can easily be used to provide a priority ordering for objects in the scene. 直到 1980 年,一份题为《基于优先级树结构的可见表面生成》的论文提出了解决该问题的具体算法。在这份论文中,作者亨利·福克斯、泽维·凯德姆以及布鲁斯·内勒介绍了 BSP 树。他们指出这种新的数据结构“可以替代十年前首次使用但由于一些问题未得到广泛发展的方案”(此处即前文 1969 年美国空军相关研究中的方案)。[3][5] BSP 树一经生成,即可用于确定场景中物体的优先级顺序。
Fuchs, Kedem, and Naylor give a pretty readable explanation of how a BSP tree works, but let me see if I can provide a less formal but more concise one. 三人在论文中详细明了地解释了 BSP 树的工作原理。在本文,我将尝试使用更加通俗具体的语言,介绍给大家。
You begin by picking one polygon in your scene and making the plane in which the polygon lies your partitioning plane. That one polygon also ends up as the root node in your tree. The remaining polygons in your scene will be on one side or the other of your root partitioning plane. The polygons on the “forward” side or in the “forward” half-space of your plane end up in the left subtree of your root node, while the polygons on the “back” side or in the “back” half-space of your plane end up in the right subtree. You then repeat this process recursively, picking a polygon from your left and right subtrees to be the new partitioning planes for their respective half-spaces, which generates further half-spaces and further sub-trees. You stop when you run out of polygons. 首先,在场景中选定一个多边形,将该多边形所在的平面作为分割平面。同时,该多边形充当树的根节点。场景中剩下的多边形会分散在分割平面的两侧。位于分割表面“前方”或者与分割平面相交后位于“前”半部分的多边形落在了根节点左侧的左子树上;位于分割表面“后方”或者与分割平面相交后位于“后”半部分的多边形落在了右子树上。接着,递归重复这一过程:在左子树和右子树上各选定一个多边形,作为各自空间新的分割平面,继而二分出来更多的子空间和子树。等到全部的多边形均选定之后,二叉空间分割也就结束了。
Say you want to render the geometry in your scene from back-to-front. (This is known as the “painters algorithm,” since it means that polygons further from the camera will get drawn over by polygons closer to the camera, producing a correct rendering.) To achieve this, all you have to do is an in-order traversal of the BSP tree, where the decision to render the left or right subtree of any node first is determined by whether the camera viewpoint is in either the forward or back half-space relative to the partitioning plane associated with the node. So at each node in the tree, you render all the polygons on the “far” side of the plane first, then the polygon in the partitioning plane, then all the polygons on the “near” side of the plane—”far” and “near” being relative to the camera viewpoint. This solves the VSD problem because, as we learned several paragraphs back, the polygons on the far side of the partitioning plane cannot obstruct anything on the near side. 由后向前将场景中的几何图形进行渲染就是所谓的“画家算法”。因为在绘制时,距离摄像机较远的多边形会被距离摄像机较近的多边形所覆盖,借此正确进行渲染任务。如果想要实现这一算法,必须按中序遍历 BSP 树,左右子树的渲染顺序由摄像机视角与节点所在分割平面的位置关系决定的。因此,针对树上的每个节点,首先渲染距离分割平面较“远”一侧的所有多边形,接着是位于平面上的多边形,最后是距离平面较“近”一侧的所有多边形——“远”与“近”相对于摄像机视角而言。根据前文,距离分割平面较远一侧的多边形无法遮挡近侧的物体,所以这种方法可以解决 VSD 问题。
The following diagram shows the construction and traversal of a BSP tree representing a simple 2D scene. In 2D, the partitioning planes are instead partitioning lines, but the basic idea is the same in a more complicated 3D scene. 下图表示一个简单的 2D 场景的 BSP 树的构造与遍历过程。在 2D 中,分割平面变成了分割线,但就基本原理而言,与复杂的 3D 场景并无二质。
![][6] _Step One: The root partitioning line along wall D splits the remaining geometry into two sets._ ![][6] _第一步:根分割线落在 D 墙上,将剩下的几何图形分为两组。_
![][7] _Step Two: The half-spaces on either side of D are split again. Wall C is the only wall in its half-space so no split is needed. Wall B forms the new partitioning line in its half-space. Wall A must be split into two walls since it crosses the partitioning line._ ![][7] _第二步:继续分割位于 D 墙两侧的空间。C 墙是其中一侧的唯一一堵墙壁因此无需再分。另一侧B 墙形成新的分割平面。因为 A 墙与新的分割平面相交,所以必须将其分割为两堵墙。_
![][8] _A back-to-front ordering of the walls relative to the viewpoint in the top-right corner, useful for implementing the painters algorithm. This is just an in-order traversal of the tree._ ![][8] _第三步:参照右上方视角,由后向前对墙壁进行排序,对执行画家算法很有帮助。这就是树的中序遍历过程。_
The really neat thing about a BSP tree, which Fuchs, Kedem, and Naylor stress several times, is that it only has to be constructed once. This is somewhat surprising, but the same BSP tree can be used to render a scene no matter where the camera viewpoint is. The BSP tree remains valid as long as the polygons in the scene dont move. This is why the BSP tree is so useful for real-time rendering—all the hard work that goes into constructing the tree can be done beforehand rather than during rendering. 福克斯、凯德姆以及内勒多次强调了 BSP 树的优势:无需重复构建。可能有些难以置信,但实际上无论摄像机视角位于何处,场景的渲染只需一棵 BSP 树。只要场景中的多边形没有移动BSP 树就不会失效。因此BSP 树在实时渲染任务中非常实用——构建树时的所有艰巨任务都可以在渲染工作开展之前完成。
One issue that Fuchs, Kedem, and Naylor say needs further exploration is the question of what makes a “good” BSP tree. The quality of your BSP tree will depend on which polygons you decide to use to establish your partitioning planes. I skipped over this earlier, but if you partition using a plane that intersects other polygons, then in order for the BSP algorithm to work, you have to split the intersected polygons in two, so that one part can go in one half-space and the other part in the other half-space. If this happens a lot, then building a BSP tree will dramatically increase the number of polygons in your scene. 同时,三人也提到了一项需要进一步深入研究的问题:究竟怎样才能构建出一棵“高质量的” BSP 树BSP 树的质量取决于用作分割平面的多边形的选择。我在前文跳过了这一问题,不过如果用作分割平面的多边形与其他多边形相交,那么为了避免 BSP 算法失效必须将相交的多边形一分为二这样两部分就可以分在不同的空间。但是如果这种现象反复出现BSP 树的构建势必会大幅增加场景中多边形的数量。
Bruce Naylor, one of the authors of the 1980 paper, would later write about this problem in his 1993 paper, “Constructing Good Partitioning Trees.” According to John Romero, one of Carmacks fellow id Software co-founders, this paper was one of the papers that Carmack read when he was trying to implement BSP trees in _Doom_.[4][9] 内勒后来在其 1993 年的论文《<ruby>构建高质量的分割树<rt>Constructing Good Partitioning Trees</rt></ruby>》中提及这一问题。与卡马克一同建立 id Software 的约翰·罗梅洛指出,这篇论文是卡马克在 _《毁灭战士》_ 中引入 BSP 树时读到的论文之一。[4][9]
### BSP Trees in Doom ### 《毁灭战士》中的 BSP 树
Remember that, in his first draft of the _Doom_ renderer, Carmack had been trying to establish a rendering order for level geometry by “flooding” the renderer out from the players current room into neighboring rooms. BSP trees were a better way to establish this ordering because they avoided the issue where the renderer found itself visiting the same room (or sector) multiple times, wasting CPU cycles. 别忘了,卡马克首次为 _《毁灭战士》_ 设计渲染器时通过让渲染器渲染玩家所在房间之外的临近房间试图为关卡几何图形建立一套渲染顺序。对此BSP 树是个不错的选择因为在玩家进入之前的房间区域BSP 树能够避免让渲染器重复劳动,从而节省 CPU 资源。
“Adding BSP trees to _Doom_” meant, in practice, adding a BSP tree generator to the _Doom_ level editor. When a level in _Doom_ was complete, a BSP tree was generated from the level geometry. According to Fabien Sanglard, the generation process could take as long as eight seconds for a single level and 11 minutes for all the levels in the original _Doom_.[5][10] The generation process was lengthy in part because Carmacks BSP generation algorithm tries to search for a “good” BSP tree using various heuristics. An eight-second delay would have been unforgivable at runtime, but it was not long to wait when done offline, especially considering the performance gains the BSP trees brought to the renderer. The generated BSP tree for a single level would have then ended up as part of the level data loaded into the game when it starts. 实际上,“将 BSP 树引入 _《毁灭战士》_”意味着将 BSP 树生成器引入 _《毁灭战士》_ 的关卡编辑器中。_《毁灭战士》_ 的关卡制作完成之时BSP 树就会在关卡几何图形的基础上生成。根据程序员法比安·桑格勒德的说法,在原版 _《毁灭战士》_ 中,一个关卡的 BSP 树生成时间需要 8 秒,全部关卡合计共需 11 分钟 [5][10]。之所以生成时间较长,部分原因在于卡马克所用的 BSP 生成算法,该算法尝试使用各种启发式方法找出“高质量” BSP 树。在运行时8 秒的延时可能让人无法接受;但是在线下等 8 秒,时间并不算长,尤其是考虑到 BSP 树提升了渲染器的性能。每个关卡生成的 BSP 树将在游戏启动时作为关卡数据载入。
Carmack put a spin on the BSP tree algorithm outlined in the 1980 paper, because once _Doom_ is started and the BSP tree for the current level is read into memory, the renderer uses the BSP tree to draw objects front-to-back rather than back-to-front. In the 1980 paper, Fuchs, Kedem, and Naylor show how a BSP tree can be used to implement the back-to-front painters algorithm, but the painters algorithm involves a lot of over-drawing that would have been expensive on an IBM-compatible PC. So the _Doom_ renderer instead starts with the geometry closer to the player, draws that first, then draws the geometry farther away. This reverse ordering is easy to achieve using a BSP tree, since you can just make the opposite traversal decision at each node in the tree. To ensure that the farther-away geometry is not drawn over the closer geometry, the _Doom_ renderer uses a kind of implicit z-buffer that provides much of the benefit of a z-buffer with a much smaller memory footprint. There is one array that keeps track of occlusion in the horizontal dimension, and another two arrays that keep track of occlusion in the vertical dimension from the top and bottom of the screen. The _Doom_ renderer can get away with not using an actual z-buffer because _Doom_ is not technically a fully 3D game. The cheaper data structures work because certain things never appear in _Doom_: The horizontal occlusion array works because there are no sloping walls, and the vertical occlusion arrays work because no walls have, say, two windows, one above the other. 卡马克非常赞赏 1980 年论文中提出的 BSP 树算法,因为在 _《毁灭战士》_ 开始运行时,当前关卡的 BSP 树就会读取到内存中,渲染器通过 BSP 树由前向后绘制物体,而非由后向前进行绘制。福克斯、凯德姆以及内勒在那篇论文中演示了 BSP 树可用于执行由后向前的画家算法,但是画家算法会造成许多重复的绘制任务,对于与 IBM 机器兼容的个人电脑来说负担较大。因此_毁灭战士_ 的渲染器换了个方向,首先绘制距离玩家较近的图形,之后再绘制离玩家较远的。采用这种相反的顺序,更有利于 BSP 树的应用因为在树的每个节点都可以进行反向遍历。为了避免绘制出来的远处图形遮挡到近处的图形_《毁灭战士》_ 的渲染器使用了一种内置的 Z 缓冲器,这种缓冲器不仅具备普通 Z 缓冲器的优势而且对内存的要求也较低。Z 缓冲器有两组数组一组记录水平方向的遮挡关系另一组自屏幕由上及下记录垂直方向的遮挡关系。_《毁灭战士》_ 的渲染器就算不使用真正的 Z 缓冲器也无伤大雅,因为从技术上来看它并不是真正的 3D 游戏。BSP 树数据结构的成本虽然不高,但却能够起作用,其原因在于 _《毁灭战士》_ 不会发生以下问题:水平方向的遮挡数组能够运行,是因为该游戏中没有倾斜的墙体;垂直方向的遮挡数组能够运行,是因为该游戏不存在有着一上一下两扇窗户的墙体。
The only other tricky issue left is how to incorporate _Doom_s moving characters into the static level geometry drawn with the aid of the BSP tree. The enemies in _Doom_ cannot be a part of the BSP tree because they move; the BSP tree only works for geometry that never moves. So the _Doom_ renderer draws the static level geometry first, keeping track of the segments of the screen that were drawn to (with yet another memory-efficient data structure). It then draws the enemies in back-to-front order, clipping them against the segments of the screen that occlude them. This process is not as optimal as rendering using the BSP tree, but because there are usually fewer enemies visible than there is level geometry in a level, speed isnt as much of an issue here. 剩下比较棘手的问题是如何将 _《毁灭战士》_ 中处于运动中的角色融入到借助 BSP 树绘制的静止的关卡几何图形中。该游戏中的敌人不可能纳入 BSP 树之中,因为他们会移动,而 BSP 树只对静止的几何形状起作用。所以渲染器首先绘制静止的关卡几何图形,同时与另一个内存使用效率较高的数据结构协作,记录屏幕上分割出来用于绘制的区域。之后,渲染器按照由后往前的顺序绘制敌人,并消除被屏幕上的区域遮挡住的敌人。这一过程与使用 BSP 树进行渲染相比,效果稍差一些。但是由于关卡中能看到的敌人的数量少于几何图形的数量,所以速度问题并没有那么重要。
Using BSP trees in _Doom_ was a major win. Obviously it is pretty neat that Carmack was able to figure out that BSP trees were the perfect solution to his problem. But was it a _genius_-level move? 将 BSP 树应用到 _《毁灭战士》_ 中可谓一大成功。卡马克能够想到 BSP 树是解决 VSD 问题的最佳方案,无疑非常高明。但是这可以称得上是天才之举吗?
In his excellent book about the _Doom_ game engine, Fabien Sanglard quotes John Romero saying that Bruce Naylors paper, “Constructing Good Partitioning Trees,” was mostly about using BSP trees to cull backfaces from 3D models.[6][11] According to Romero, Carmack thought the algorithm could still be useful for _Doom_, so he went ahead and implemented it. This description is quite flattering to Carmack—it implies he saw that BSP trees could be useful for real-time video games when other people were still using the technique to render static scenes. There is a similarly flattering story in _Masters of Doom_: Kushner suggests that Carmack read Naylors paper and asked himself, “what if you could use a BSP to create not just one 3D image but an entire virtual world?”[7][12] 桑格勒德在其关于 _《毁灭战士》_ 游戏引擎的书中引用了罗梅洛的话:内勒的论文《构建高质量的分割树》主要讲述使用 BSP 树消除 3D 模型的背面。[6][11] 根据罗梅洛所言,卡马克认为这种算法对 _《毁灭战士》_ 依然有效,所以他放手一试,将 BSP 技术应用到了该游戏中。不过这话说得有些奉承的意味——意在暗示卡马克在别人仍然使用 BSP 树渲染静止的场景时,发现该技术可以用于实时游戏领域。在 _《Doom 启示录》_ 也有给卡马克戴高帽的故事。该书作者库什纳认为,卡马克在阅读内勒的论文之后,问了自己,“如果使用 BSP 技术创造一整个虚拟世界,而不仅仅是一张 3D 图像,会怎么样呢?” [7][12]。
This framing ignores the history of the BSP tree. When those US Air Force researchers first realized that partitioning a scene might help speed up rendering, they were interested in speeding up _real-time_ rendering, because they were, after all, trying to create a flight simulator. The flight simulator example comes up again in the 1980 BSP paper. Fuchs, Kedem, and Naylor talk about how a BSP tree would be useful in a flight simulator that pilots use to practice landing at the same airport over and over again. Since the airport geometry never changes, the BSP tree can be generated just once. Clearly what they have in mind is a real-time simulation. In the introduction to their paper, they even motivate their research by talking about how real-time graphics systems must be able to create an image in at least 1/30th of a second. 这些“片面之词”忽视了 BSP 树的发展历史。当美国空军研究人员开始意识到场景分割可能会加快渲染任务的时候,他们就对提升 _实时_ 渲染的速度产生了兴趣毕竟他们当时想要开发出飞行模拟器。1980 年,同样的案例再次出现在了福克斯等人的论文中,他们探讨了 BSP 树如何应用于飞行模拟器中帮助飞行员进行训练重复将飞机降至同一空港。由于空港的地形不会发生改变BSP 树只需生成一次,即可一劳永逸。很明显,他们考虑的是实时模拟。在论文的引言部分,福克斯等人还谈到实时图形系统必须在至少 1/30 秒内生成一张图像,由此介绍了他们的研究动机。
So Carmack was not the first person to think of using BSP trees in a real-time graphics simulation. Of course, its one thing to anticipate that BSP trees might be used this way and another thing to actually do it. But even in the implementation Carmack may have had more guidance than is commonly assumed. The [Wikipedia page about BSP trees][13], at least as of this writing, suggests that Carmack consulted a 1991 paper by Chen and Gordon as well as a 1990 textbook called _Computer Graphics: Principles and Practice_. Though no citation is provided for this claim, it is probably true. The 1991 Chen and Gordon paper outlines a front-to-back rendering approach using BSP trees that is basically the same approach taken by _Doom_, right down to what Ive called the “implicit z-buffer” data structure that prevents farther polygons being drawn over nearer polygons. The textbook provides a great overview of BSP trees and some pseudocode both for building a tree and for displaying one. (Ive been able to skim through the 1990 edition thanks to my wonderful university library.) _Computer Graphics: Principles and Practice_ is a classic text in computer graphics, so Carmack might well have owned it. 因此,卡马克不是第一个想到在实时图形模拟中应用 BSP 树的人。诚然设想与付诸实践是两码事。但是即使在实施的过程中卡马克受到的帮助与指导可比人们想象中的要多得多。至少是到这篇文章写成之时BSP 树的 [维基百科词条][13] 页面显示,卡马克参考了 1991 年陈和戈登的一篇论文以及 1990 年的一本教材 _《计算机图形学原理及实践》_。尽管该页面并未提供引用信息,但是基本上不会出错。陈和戈登的论文介绍了运用 BSP 树由前向后的渲染方法,这种方法与 _《毁灭战士》_ 用到的方法基本一致还包括我称之为“内置缓冲器”的数据结构可用于防止远处的图形在绘制时遮挡近处的图形。_《计算机图形学原理及实践》_ 详细介绍了 BSP 树以及一些构建并展示 BSP 树的伪代码(非常感谢我大学的图书馆,让我能够一睹这本教材的 1990 年的版本)。因为这本书是计算机图形学的经典之作,所以卡马克很可能也有一本。
Still, Carmack found himself faced with a novel problem—”How can we make a first-person shooter run on a computer with a CPU that cant even do floating-point operations?”—did his research, and proved that BSP trees are a useful data structure for real-time video games. I still think that is an impressive feat, even if the BSP tree had first been invented a decade prior and was pretty well theorized by the time Carmack read about it. Perhaps the accomplishment that we should really celebrate is the _Doom_ game engine as a whole, which is a seriously nifty piece of work. Ive mentioned it once already, but Fabien Sanglards book about the _Doom_ game engine (_Game Engine Black Book: DOOM_) is an excellent overview of all the different clever components of the game engine and how they fit together. We shouldnt forget that the VSD problem was just one of many problems that Carmack had to solve to make the _Doom_ engine work. That he was able, on top of everything else, to read about and implement a complicated data structure unknown to most programmers speaks volumes about his technical expertise and his drive to perfect his craft. 然而,卡马克发现自己遇到一个新问题:如何让第一人称射击游戏在一台 CPU 甚至都无法进行浮点操作的电脑上运行呢?通过调查研究,他证明了 BSP 树的数据结构非常适用于实时游戏渲染。尽管 BSP 树早已提出,而且到了卡马克的时代,相关理论已经非常成熟了,但我始终认为,卡马克的做法可谓惊人之壮举。也许,得到人们称誉的应该是整个 _《毁灭战士》_ 的游戏引擎,毕竟它的确非常精致。我在前文也提及过,但是桑格勒德的 _《游戏引擎黑皮书:毁灭战士》_ 很好地讲解了这款游戏引擎的非凡之处以及这些优势相互契合之法。要明白VSD 问题只是卡马克在编写 _《毁灭战士》_ 游戏引擎时需要解决的诸多问题之一。不得不说,面对不为大多数程序员所知的复杂的数据结构,卡马克能够查阅相关文献,将其付诸实践,仅此一点就足以说明其技术之精湛、匠心之独到。
_If you enjoyed this post, more like it come out every four weeks! Follow [@TwoBitHistory][14] on Twitter or subscribe to the [RSS feed][15] to make sure you know when a new post is out._ _如果你喜欢这篇文章,欢迎关注推特 [@TwoBitHistory][14],也可通过 [RSS feed][15] 订阅,获取最新文章(每四周更新一篇)。_
_Previously on TwoBitHistory…_ _TwoBitHistory 文章回顾……_
> I've wanted to learn more about GNU Readline for a while, so I thought I'd turn that into a new blog post. Includes a few fun facts from an email exchange with Chet Ramey, who maintains Readline (and Bash):<https://t.co/wnXeuyjgMx> > 我曾想花上一段时间深入了解 GNU Readline所以我以此为主题写了一篇新博客包括在与 Readline 库(以及 Bash的维护者切特·雷米通过邮件交流时了解到的一些趣事<https://t.co/wnXeuyjgMx>
> >
> — TwoBitHistory (@TwoBitHistory) [August 22, 2019][16] > — TwoBitHistory (@TwoBitHistory) [2019 年 8 月 22 日][16]
1. Michael Abrash, “Michael Abrashs Graphics Programming Black Book,” James Gregory, accessed November 6, 2019, <http://www.jagregory.com/abrash-black-book/#chapter-64-quakes-visible-surface-determination>. [↩︎][17] 1. Michael Abrash, “Michael Abrashs Graphics Programming Black Book,” James Gregory, accessed November 6, 2019, <http://www.jagregory.com/abrash-black-book/#chapter-64-quakes-visible-surface-determination>. [↩︎][17]