Merge pull request #14150 from chen-ni/master

Submit translation
HuanCheng Bai 2019-06-18 13:38:12 +08:00 committed by GitHub
commit f8bf01fd56
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 139 additions and 145 deletions


@ -1,145 +0,0 @@
[#]: collector: (lujun9972)
[#]: translator: (chen-ni)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (A short primer on assemblers, compilers, and interpreters)
[#]: via: (https://opensource.com/article/19/5/primer-assemblers-compilers-interpreters)
[#]: author: (Erik O'Shaughnessy https://opensource.com/users/jnyjny/users/shawnhcorey/users/jnyjny/users/jnyjny)
A short primer on assemblers, compilers, and interpreters
======
A gentle introduction to the historical evolution of programming
practices.
![keyboard with connected dots][1]
In the early days of computing, hardware was expensive and programmers were cheap. Programmers were so cheap, in fact, that they weren't even called "programmers"; they were usually mathematicians or electrical engineers. Early computers were used to solve complex mathematical problems quickly, so mathematicians were a natural fit for the job of "programming."
### What is a program?
First, a little background. Computers can't do anything by themselves, so they require programs to drive their behavior. Programs can be thought of as very detailed recipes that take an input and produce an output. The steps in the recipe are composed of instructions that operate on data. While that sounds complicated, you probably know how this statement works:
```
1 + 2 = 3
```
The plus sign is the "instruction," while the numbers 1 and 2 are the data. Mathematically, the equal sign indicates that both sides of an equation are "equivalent"; most computer languages, however, use some variant of the equals sign to mean "assignment." If a computer were executing that statement, it would store the result of the addition (the "3") somewhere in memory.
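As a small illustration of the difference between assignment and equality, here is how the two look in Python. This is only a sketch; the names `result` and `is_equal` are chosen for the example and do not come from the article.

```python
# "=" assigns: the sum is computed and stored under the name `result`.
result = 1 + 2

# "==" compares: it asks whether both sides are equal and yields True or False.
is_equal = (result == 3)

print(result, is_equal)
```

Other languages spell these two operations differently, but most keep them distinct in some way.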
Computers know how to do math with numbers and move data around the machine's memory hierarchy. I won't say too much about memory except that it generally comes in two different flavors: fast/small and slow/big. CPU registers are very fast, very small and act as scratch pads. Main memory is typically very big and not nearly as fast as register memory. CPUs shuffle the data they are working with from main memory to registers and back again while a program executes.
### Assemblers
Computers were very expensive and people were cheap. Programmers spent endless hours translating hand-written math into computer instructions that the computer could execute. The very first computers had terrible user interfaces, some only consisting of toggle switches on the front panel. The switches represented 1s and 0s in a single "word" of memory. The programmer would configure a word, indicate where to store it, and commit the word to memory. It was time-consuming and error-prone.
![Programmers operate the ENIAC computer][2]
_Programmers [Betty Jean Jennings][3] (left) and [Fran Bilas][4] (right) operate [ENIAC's][5] main control panel._
Eventually, an [electrical engineer][6] decided his time wasn't cheap and wrote a program that took a "recipe" expressed in terms people could read as its input and produced a computer-readable version as its output. This was the first "assembler," and it was very controversial. The people who owned the expensive machines didn't want to "waste" compute time on a task that people were already doing, albeit slowly and with errors. Over time, people came to appreciate the speed and accuracy of the assembler versus a hand-assembled program, and the amount of "real work" done with the computer increased.
While assembler programs were a big step up from toggling bit patterns into the front panel of a machine, they were still pretty specialized. The addition example above might have looked something like this:
```
01 MOV R0, 1
02 MOV R1, 2
03 ADD R0, R1, R2
04 MOV R0, 64
05 STO R2, R0
```
Each line is a computer instruction, beginning with a shorthand name of the instruction followed by the data the instruction works on. This little program will first "move" the value 1 into a register called R0, then 2 into register R1. Line 03 adds the contents of registers R0 and R1 and stores the resulting value into register R2. Finally, line 04 places the target address, 64, in R0 (which is free to reuse because the sum now lives in R2), and line 05 stores the contents of R2 at that main-memory address. Managing where data is stored in memory is one of the most time-consuming and error-prone parts of writing computer programs.
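To make the walk-through concrete, the five instructions can be run on a toy machine. The following Python sketch is an assumption-laden illustration, not a model of any real CPU: the function `run`, the three-register layout, the 128-word memory, and the tuple encoding (destination register first for `MOV`) are all invented for this example.

```python
def run(program):
    """Execute a list of (opcode, operand, ...) tuples on a toy machine."""
    registers = {"R0": 0, "R1": 0, "R2": 0}   # fast, tiny scratch pads
    memory = [0] * 128                        # slower, larger main memory

    for op, *args in program:
        if op == "MOV":                       # MOV dest_register, constant
            dest, value = args
            registers[dest] = value
        elif op == "ADD":                     # ADD src_a, src_b, dest_register
            a, b, dest = args
            registers[dest] = registers[a] + registers[b]
        elif op == "STO":                     # STO src_register, address_register
            src, addr = args
            memory[registers[addr]] = registers[src]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers, memory


# The addition program, one tuple per instruction.
program = [
    ("MOV", "R0", 1),
    ("MOV", "R1", 2),
    ("ADD", "R0", "R1", "R2"),
    ("MOV", "R0", 64),          # reuse R0 to hold the target address
    ("STO", "R2", "R0"),        # write the sum to memory[64]
]

registers, memory = run(program)
print(memory[64])
```

After the run, `memory[64]` holds 3 and `R0` holds the address 64, which shows why the programmer has to keep track of what each register currently means.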
### Compilers
Assembly was much better than writing computer instructions by hand; however, early programmers yearned to write programs like they were accustomed to writing mathematical formulae. This drove the development of higher-level compiled languages, some of which are historical footnotes and others are still in use today. [ALGO][7] is one such footnote, while real problems continue to be solved today with languages like [Fortran][8] and [C][9].
![Genealogy tree of ALGO and Fortran][10]
Genealogy tree of ALGO and Fortran programming languages
The introduction of these "high-level" languages allowed programmers to write their programs in simpler terms. In the C language, our addition assembly program would be written:
```
int x;
x = 1 + 2;
```
The first statement describes a piece of memory the program will use. In this case, the memory should be the size of an integer and its name is **x**. The second statement is the addition, although written "backward." A C programmer would read that as "x is assigned the result of one plus two." Notice the programmer doesn't need to say where to put **x** in memory, as the compiler takes care of that.
A new type of program called a "compiler" would turn the program written in a high-level language into an assembly language version and then run it through the assembler to produce a machine-readable version of the program. This composition of programs is often called a "toolchain," in that one program's output is sent directly to another program's input.
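The first stage of such a toolchain can be sketched in a few lines of Python. This is a deliberately tiny, hypothetical compiler: `compile_assignment` handles only statements of the form `name = a + b`, the mnemonics are written destination-first, and the starting address 64 is an arbitrary assumption.

```python
def compile_assignment(source, symbol_table, next_free_address=64):
    """Translate 'name = a + b' into toy assembly lines.

    The mnemonics and the starting address are illustrative assumptions.
    """
    name, expression = (part.strip() for part in source.rstrip(";").split("="))
    left, right = (int(term) for term in expression.split("+"))

    # The compiler, not the programmer, decides where the variable lives.
    if name not in symbol_table:
        symbol_table[name] = next_free_address + len(symbol_table)
    address = symbol_table[name]

    return [
        f"MOV R0, {left}",
        f"MOV R1, {right}",
        "ADD R0, R1, R2",
        f"MOV R0, {address}",
        "STO R2, R0",
    ]


symbols = {}
assembly = compile_assignment("x = 1 + 2;", symbols)
print("\n".join(assembly))   # text that would be handed to the assembler
print(symbols)               # where the compiler chose to put x
```

The returned lines are plain text, which is what makes chaining possible: an assembler could take them as its input without knowing anything about the high-level language.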
The huge advantage of compiled languages over assembly language programs was porting from one computer model or brand to another. In the early days of computing, there was an explosion of different types of computing hardware from companies like IBM, Digital Equipment Corporation, Texas Instruments, UNIVAC, Hewlett Packard, and others. None of these computers shared much in common besides needing to be plugged into an electrical power supply. Memory and CPU architectures differed wildly, and it often took man-years to translate programs from one computer to another.
With high-level languages, the compiler toolchain only had to be ported to the new platform. Once the compiler was available, high-level language programs could be recompiled for a new computer with little or no modification. Compilation of high-level languages was truly revolutionary.
![IBM PC XT][11]
The IBM PC XT, released in 1983, is an early example of the decreasing cost of hardware.
Life became very good for programmers. It was much easier to express the problems they wanted to solve using high-level languages. The cost of computer hardware was falling dramatically due to advances in semiconductors and the invention of the integrated circuit. Computers were getting faster and more capable, as well as much less expensive. At some point, possibly in the late '80s, there was an inversion and programmers became more expensive than the hardware they used.
### Interpreters
Over time, a new programming model rose where a special program called an "interpreter" would read a program and turn it into computer instructions to be executed immediately. The interpreter takes the program as input and interprets it into an intermediate form, much like a compiler. Unlike a compiler, the interpreter then executes the intermediate form of the program. This happens every time an interpreted program runs, whereas a compiled program is compiled just one time and the computer executes the machine instructions "as written."
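The translate-then-execute cycle can be sketched with Python's standard `ast` module, which parses source text into a syntax tree that serves here as the intermediate form. The functions `evaluate` and `interpret` are hypothetical helpers written for this example, and they support only single assignments built from numbers and `+`, `-`, `*`.

```python
import ast
import operator

# Map syntax-tree operator nodes to the functions that carry them out.
OPERATORS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def evaluate(node):
    """Walk the intermediate form (a syntax tree) and compute its value."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        return OPERATORS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported expression")


def interpret(source, variables):
    """Parse and execute one 'name = expression' statement."""
    tree = ast.parse(source)          # translation happens on every run
    statement = tree.body[0]
    if not isinstance(statement, ast.Assign) or len(statement.targets) != 1:
        raise ValueError("expected a single assignment")
    target = statement.targets[0]
    if not isinstance(target, ast.Name):
        raise ValueError("expected a simple variable name")
    variables[target.id] = evaluate(statement.value)
    return variables


print(interpret("x = 1 + 2", {}))
```

Every call to `interpret` parses the source again before executing it, which is the repeated cost that a compiled program pays only once.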
As a side note, when people say "interpreted programs are slow," this is the main source of the perceived lack of performance. Modern computers are so amazingly capable that most people can't tell the difference between compiled and interpreted programs.
Interpreted programs, sometimes called "scripts," are even easier to port to different hardware platforms. Because the script doesn't contain any machine-specific instructions, a single version of a program can run on many different computers without changes. The catch, of course, is the interpreter must be ported to the new machine to make that possible.
One example of a very popular interpreted language is [Perl][12]. A complete Perl expression of our addition problem would be:
```
$x = 1 + 2
```
While it looks and acts much like the C version, it lacks the variable declaration statement. There are other differences (which are beyond the scope of this article), but you can see that we can write a computer program that is very close to how a mathematician would write it by hand with pencil and paper.
### Virtual Machines
The latest craze in programming models is the virtual machine, often abbreviated as VM. There are two flavors of virtual machine: system virtual machines and process virtual machines. Both types of VMs provide a level of abstraction from the "real" computing hardware, though they have different scopes. A system virtual machine is software that offers a substitute for the physical hardware, while a process virtual machine is designed to execute a program in a system-independent manner. So in this case, a process virtual machine (virtual machine from here on) is similar in scope to an interpreter in that a program is first compiled into an intermediate form before the virtual machine executes it.
The main difference between an interpreter and a virtual machine is the virtual machine implements an idealized CPU accessed through its virtual instruction set. This abstraction makes it possible to write front-end language tools that compile programs written in different languages and target the virtual machine. Probably the most popular and well-known virtual machine is the Java Virtual Machine (JVM). The JVM was initially only for the Java programming language back in the 1990s, but it now hosts [many][13] popular computer languages: Scala, Jython, JRuby, Clojure, and Kotlin to list just a few. There are other examples that may not be common knowledge. I only recently learned that my favorite language, [Python][14], is not an interpreted language, but a [language hosted on a virtual machine][15]!
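The idea of several front ends targeting one virtual instruction set can be sketched as follows. Everything here is hypothetical and far simpler than the JVM or Python's own bytecode: the three opcodes (`PUSH`, `ADD`, `STORE`), the stack-based design, and the two miniature source syntaxes were made up for this example.

```python
def run_bytecode(bytecode):
    """Execute instructions for an idealized stack-based CPU."""
    stack, variables = [], {}
    for opcode, argument in bytecode:
        if opcode == "PUSH":
            stack.append(argument)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "STORE":
            variables[argument] = stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return variables


def compile_infix(source):
    """Front end for statements such as 'x = 1 + 2'."""
    name, expression = (part.strip() for part in source.split("="))
    left, right = (int(term) for term in expression.split("+"))
    return [("PUSH", left), ("PUSH", right), ("ADD", None), ("STORE", name)]


def compile_postfix(source):
    """Front end for a different surface syntax, such as '1 2 + =x'."""
    bytecode = []
    for token in source.split():
        if token.isdigit():
            bytecode.append(("PUSH", int(token)))
        elif token == "+":
            bytecode.append(("ADD", None))
        elif token.startswith("="):
            bytecode.append(("STORE", token[1:]))
        else:
            raise ValueError(f"unknown token: {token}")
    return bytecode


# Two different "languages" compile to the same virtual instructions.
print(run_bytecode(compile_infix("x = 1 + 2")))
print(run_bytecode(compile_postfix("1 2 + =x")))
```

Because both front ends emit the same instruction list, `run_bytecode` never needs to know which source language a program was written in; porting it to a new machine would carry both languages along.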
Virtual machines continue the historical trend of reducing the amount of platform-specific knowledge a programmer needs to express their problem in a language that supports their domain-specific needs.
### That's a wrap
I hope you enjoy this primer on some of the less visible parts of software. Are there other topics you want me to dive into next? Let me know in the comments.
* * *
_This article was originally published on [PyBites][16] and is reprinted with permission._
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/5/primer-assemblers-compilers-interpreters
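Author: [Erik O'Shaughnessy][a]
Topic selection: [lujun9972][b]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)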
[a]: https://opensource.com/users/jnyjny/users/shawnhcorey/users/jnyjny/users/jnyjny
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/programming_keyboard_coding.png?itok=E0Vvam7A (keyboard with connected dots)
[2]: https://opensource.com/sites/default/files/uploads/two_women_operating_eniac.gif (Programmers operate the ENIAC computer)
[3]: https://en.wikipedia.org/wiki/Jean_Bartik (Jean Bartik)
[4]: https://en.wikipedia.org/wiki/Frances_Spence (Frances Spence)
[5]: https://en.wikipedia.org/wiki/ENIAC
[6]: https://en.wikipedia.org/wiki/Nathaniel_Rochester_%28computer_scientist%29
[7]: https://en.wikipedia.org/wiki/ALGO
[8]: https://en.wikipedia.org/wiki/Fortran
[9]: https://en.wikipedia.org/wiki/C_(programming_language)
[10]: https://opensource.com/sites/default/files/uploads/algolfortran_family-by-borkowski.png (Genealogy tree of ALGO and Fortran)
[11]: https://opensource.com/sites/default/files/uploads/639px-ibm_px_xt_color.jpg (IBM PC XT)
[12]: www.perl.org
[13]: https://en.wikipedia.org/wiki/List_of_JVM_languages
[14]: /resources/python
[15]: https://opensource.com/article/18/4/introduction-python-bytecode
[16]: https://pybit.es/python-interpreters.html


@ -0,0 +1,139 @@
[#]: collector: (lujun9972)
[#]: translator: (chen-ni)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (A short primer on assemblers, compilers, and interpreters)
[#]: via: (https://opensource.com/article/19/5/primer-assemblers-compilers-interpreters)
[#]: author: (Erik O'Shaughnessy https://opensource.com/users/jnyjny/users/shawnhcorey/users/jnyjny/users/jnyjny)
A short primer on assemblers, compilers, and interpreters
======
A brief introduction to the historical evolution of programming practices.
![keyboard with connected dots][1]
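In the early days of computing, hardware was very expensive and programmers were comparatively cheap. These cheap programmers did not even hold the title "programmer"; the role was usually filled by mathematicians or electrical engineers. Early computers were used to solve complex mathematical problems quickly, so mathematicians were a natural fit for "programming" work.
### What is a program?
First, some background. Computers cannot do anything on their own; everything they do has to be directed by a program. You can think of a program as a very precise recipe that reads an input and produces the corresponding output. The steps in the recipe are made up of instructions that operate on data. That sounds a little complicated, but you probably know what the following statement means: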
```
1 + 2 = 3
```
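The plus sign is the "instruction," and the numbers 1 and 2 are the data. In mathematics, the equals sign means that the two sides of the equation are "equivalent," but in most programming languages the equals sign, or some variant of it, means "assignment." If a computer executed the statement above, it would store the result of the addition (the "3") somewhere in memory.
Computers know how to do arithmetic with numbers and how to move data around the memory hierarchy. I won't go into detail about memory here; you only need to know that it generally falls into two categories: fast and small, or slow and large. CPU registers are very fast to read and write but very small, and act as a scratch pad. Main memory is usually very large, but much slower than registers. While a program runs, the CPU continually moves the data it needs from main memory into registers and then writes the results back to main memory.
### Assemblers
Computers were expensive then, and labor was cheap. Programmers spent a great deal of time translating handwritten mathematical expressions into instructions the computer could execute. The earliest computers had terrible user interfaces; some had nothing more than toggle switches on the front panel. The switches represented the 0s and 1s in a single "word" of memory. The programmer would configure a word, choose where to store it, and then commit the word to memory. The process was both time-consuming and error-prone.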
![Programmers operate the ENIAC computer][2]
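_Programmers [Betty Jean Jennings][3] (left) and [Fran Bilas][4] (right) operate the main control panel of [ENIAC][5]._
Eventually an [electrical engineer][6] decided that his time was valuable and wrote a program that converted a human-readable, recipe-like program into a version the computer could read. This was the first "assembler," and it caused considerable controversy at the time. The owners of these expensive machines did not want to waste computing resources on a task that people were already doing, however slow and error-prone their work was. Over time, though, people found that an assembler beat hand-written machine code in both speed and accuracy, and the amount of "real work" the computer completed increased.
Although assemblers were a big step forward from flipping bits on a machine's front panel, this style of programming was still highly specialized. In assembly language, the addition example above would look roughly like this: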
```
01 MOV R0, 1
02 MOV R1, 2
03 ADD R0, R1, R2
04 MOV R0, 64
05 STO R2, R0
```
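Each line is a computer instruction: an abbreviated instruction name followed by the data the instruction operates on. This small program first "moves" the value 1 into register R0 and then moves 2 into register R1. Line 03 adds the values in registers R0 and R1 and stores the result in register R2. Finally, lines 04 and 05 determine where in main memory the result should be placed (here, address 64). Managing where data is stored in memory is one of the most time-consuming and error-prone parts of programming.
### Compilers
Assemblers were far better than writing computer instructions by hand, but early programmers still longed to write programs the way they were used to writing mathematical formulas. This need drove the development of higher-level compiled languages, some of which are now history while others are still in use today. [ALGO][7], for example, has become history, but languages such as [Fortran][8] and [C][9] continue to solve real problems.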
![Genealogy tree of ALGO and Fortran][10]
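Genealogy tree of the ALGO and Fortran programming languages
These "high-level" languages let programmers write programs in a simpler way. In C, our addition program becomes: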
```
int x;
x = 1 + 2;
```
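The first statement describes a piece of memory the program will use. In this example, the memory should be the size of an integer and is named **x**. The second statement is the addition, although it is written "backward." A C programmer would read it as "x is assigned the result of one plus two." Note that the programmer does not need to decide where in memory to store **x**; that job is left to the compiler.
This new kind of program, called a "compiler," converts a program written in a high-level language into assembly language and then uses the assembler to convert the assembly language into a machine-readable program. Such a combination of programs is often called a "toolchain," because the output of one program becomes the input of the next.
The advantage of compiled languages over assembly language shows when moving from one computer to another of a different model or brand. In the early years of computing, many companies, including IBM, Digital Equipment Corporation, Texas Instruments, UNIVAC, and Hewlett-Packard, were experimenting with different kinds of computer hardware. Apart from needing to be plugged into a power supply, these computers had little in common. Their memory and CPU architectures differed considerably, and translating a program from one computer to another often took years of work.
With high-level languages, only the compiler toolchain had to be ported to the new platform. Once a compiler was available, programs written in a high-level language could be recompiled on the new computer with, at most, minor modifications. Compilation of high-level languages was a truly revolutionary achievement.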
![IBM PC XT][11]
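The IBM PC XT, released in 1983, is an early example of falling hardware prices.
Life improved greatly for programmers. Expressing the problems they wanted to solve in a high-level language made things much easier. Thanks to advances in semiconductor technology and the invention of the integrated circuit, the price of computer hardware dropped sharply. Computers became faster and more capable, as well as much cheaper. At some point (perhaps the late 1980s), things flipped, and programmers became more valuable than the hardware they used.
### Interpreters
Over time, a new way of programming emerged. A special program called an "interpreter" could turn a program directly into computer instructions that were executed immediately. Much like a compiler, the interpreter reads the program and converts it into an intermediate form. Unlike a compiler, however, the interpreter executes that intermediate form directly. An interpreted program goes through this process every time it runs, whereas a compiled program is compiled only once, after which the computer simply executes the compiled machine instructions.
Incidentally, this is why people perceive interpreted programs as slow. Modern computers are so remarkably powerful, however, that most people cannot tell the difference between compiled and interpreted programs.
Interpreted programs (sometimes called "scripts") are even easier to port to different hardware platforms. Because a script contains no machine-specific instructions, a single version of the program can run unmodified on many different computers. Of course, the interpreter itself must first be ported to the new machine.
One very popular interpreted language is [Perl][12]. A complete Perl expression of our addition problem looks like this: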
```
$x = 1 + 2
```
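Although this program looks much like the C version and behaves in much the same way, it lacks the variable declaration statement. There are other differences as well (which are beyond the scope of this article), but you can already see that the way we write computer programs has come very close to how a mathematician would write a mathematical expression by hand with pencil and paper.
### Virtual machines
The latest trend in programming models is the virtual machine (often abbreviated VM). Virtual machines come in two kinds: system virtual machines and process virtual machines. Both provide a level of abstraction from the "real" computing hardware, but their scopes differ. A system virtual machine is software that provides a substitute for the physical hardware, while a process virtual machine is designed to execute a program in a system-independent manner. So in this case, a process virtual machine (from here on, "virtual machine" refers to this kind) is similar in scope to an interpreter, because the program is first compiled into an intermediate form, which the virtual machine then executes.
The main difference between a virtual machine and an interpreter is that a virtual machine creates a virtual CPU together with a virtual instruction set. With this layer of abstraction, we can write front-end tools that compile programs written in different languages into programs the virtual machine accepts. Perhaps the most popular and best-known virtual machine is the Java Virtual Machine (JVM). In the 1990s the JVM initially supported only the Java language, but today it runs [many][13] popular programming languages, including Scala, Jython, JRuby, Clojure, and Kotlin. There are other, less common examples that I won't cover here. I only recently learned that my favorite language, Python, is not an interpreted language but a [language that runs on a virtual machine][15]!
Virtual machines continue a historical trend: programmers need less and less knowledge of a specific computing platform when they use a domain-specific language to solve their problems.
### That's a wrap
I hope you enjoyed this short introduction to how software works behind the scenes. Are there other topics you would like me to discuss next? Let me know in the comments.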
* * *
_This article was originally published on [PyBites][16] and is reprinted with permission._
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/5/primer-assemblers-compilers-interpreters
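Author: [Erik O'Shaughnessy][a]
Topic selection: [lujun9972][b]
Translator: [Translator ID](https://github.com/chen-ni)
Proofreader: [Proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)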
[a]: https://opensource.com/users/jnyjny/users/shawnhcorey/users/jnyjny/users/jnyjny
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/programming_keyboard_coding.png?itok=E0Vvam7A (keyboard with connected dots)
[2]: https://opensource.com/sites/default/files/uploads/two_women_operating_eniac.gif (Programmers operate the ENIAC computer)
[3]: https://en.wikipedia.org/wiki/Jean_Bartik (Jean Bartik)
[4]: https://en.wikipedia.org/wiki/Frances_Spence (Frances Spence)
[5]: https://en.wikipedia.org/wiki/ENIAC
[6]: https://en.wikipedia.org/wiki/Nathaniel_Rochester_%28computer_scientist%29
[7]: https://en.wikipedia.org/wiki/ALGO
[8]: https://en.wikipedia.org/wiki/Fortran
[9]: https://en.wikipedia.org/wiki/C_(programming_language)
[10]: https://opensource.com/sites/default/files/uploads/algolfortran_family-by-borkowski.png (Genealogy tree of ALGO and Fortran)
[11]: https://opensource.com/sites/default/files/uploads/639px-ibm_px_xt_color.jpg (IBM PC XT)
[12]: www.perl.org
[13]: https://en.wikipedia.org/wiki/List_of_JVM_languages
[14]: /resources/python
[15]: https://opensource.com/article/18/4/introduction-python-bytecode
[16]: https://pybit.es/python-interpreters.html