Translated by cposture (#4181)

* Translated by cposture

* Translating by cposture

* translating partly

* translating partly 75

* Translated by cposture

* Translated by cposture
陈家启 2016-07-13 16:41:30 +08:00 committed by Ezio
parent a0bab457e9
commit 4cd961ced3
2 changed files with 228 additions and 227 deletions

@ -1,227 +0,0 @@
Translating by cposture 2016.07.09
Create Your Own Shell in Python : Part I
I'm curious to know how a shell (like bash, csh, etc.) works internally. So, I implemented one called yosh (Your Own SHell) in Python to satisfy my own curiosity. The concepts I explain in this article can be applied to other languages as well.
(Note: You can find the source code used in this blog post [here](https://github.com/supasate/yosh). I distribute it under the MIT license.)
Let's start.
### Step 0: Project Structure
For this project, I use the following project structure.
```
yosh_project
|-- yosh
   |-- __init__.py
   |-- shell.py
```
`yosh_project` is the root project folder (you can also name it just `yosh`).
`yosh` is the package folder, and `__init__.py` makes it a package with the same name as the package folder. (If you don't write Python, just ignore it.)
`shell.py` is our main shell file.
### Step 1: Shell Loop
When you start a shell, it shows a command prompt and waits for your command input. After it receives the command and executes it (the details will be explained later), your shell goes back to the wait loop for your next command.
In `shell.py`, we start with a simple main function that calls the `shell_loop()` function, as follows:
```
def shell_loop():
    # Start the loop here
    pass


def main():
    shell_loop()


if __name__ == "__main__":
    main()
```
Then, in our `shell_loop()`, we use a status flag to indicate whether the loop should continue or stop. At the beginning of each iteration, our shell shows a command prompt and waits to read command input.
```
import sys

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop():
    status = SHELL_STATUS_RUN

    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        sys.stdout.write('> ')
        sys.stdout.flush()

        # Read command input
        cmd = sys.stdin.readline()
```
After that, we tokenize the command input and execute it (we'll implement the `tokenize` and `execute` functions soon).
Therefore, our `shell_loop()` will be the following.
```
import sys

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop():
    status = SHELL_STATUS_RUN

    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        sys.stdout.write('> ')
        sys.stdout.flush()

        # Read command input
        cmd = sys.stdin.readline()

        # Tokenize the command input
        cmd_tokens = tokenize(cmd)

        # Execute the command and retrieve new status
        status = execute(cmd_tokens)
```
That's all of our shell loop. If we start our shell with `python shell.py`, it shows the command prompt. However, it throws an error if we type a command and hit enter, because we haven't defined the `tokenize` function yet.
To exit the shell, try ctrl-c. I will explain how to exit gracefully later.
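
One detail the loop above leaves open is end of input (for example ctrl-d), where `readline()` returns an empty string. The following is a minimal sketch of one way to stop the loop in that case, not the author's implementation. The `stdin`, `stdout`, and `handle` parameters and the `record` callback are assumptions added here so the loop can be exercised with in-memory streams.

```python
import io

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop(stdin, stdout, handle):
    """Prompt, read one line, and pass it to `handle` until told to stop."""
    status = SHELL_STATUS_RUN
    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        stdout.write('> ')
        stdout.flush()

        # Read command input
        line = stdin.readline()
        if line == '':
            # readline() returns an empty string only at end of input;
            # a blank line typed by the user would be '\n' instead
            status = SHELL_STATUS_STOP
        else:
            status = handle(line)


# Hypothetical stand-in for tokenize + execute: just record each command
seen = []


def record(line):
    seen.append(line.strip())
    return SHELL_STATUS_RUN


fake_in = io.StringIO('ls -a\nmkdir test_dir\n')
fake_out = io.StringIO()
shell_loop(fake_in, fake_out, record)
```

Passing the streams in as arguments is only a convenience for trying the loop without a terminal; calling `shell_loop(sys.stdin, sys.stdout, ...)` would give the interactive behaviour described above.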
### Step 2: Tokenization
When a user types a command in our shell and hits enter, the command input is one long string containing both the command name and its arguments. Therefore, we have to tokenize it (split the string into several tokens).
It seems simple at first glance. We might use `cmd.split()` to separate the input by spaces. It works well for a command like `ls -a my_folder` because it splits the command into a list `['ls', '-a', 'my_folder']`, which we can use easily.
However, there are cases where arguments are quoted with single or double quotes, like `echo "Hello World"` or `echo 'Hello World'`. If we use `cmd.split()`, we get a list of 3 tokens `['echo', '"Hello', 'World"']` instead of 2 tokens `['echo', 'Hello World']`.
Fortunately, Python provides a library called `shlex` that splits like a charm. (Note: we could also use regular expressions, but that's not the main point of this article.)
```
import sys
import shlex

...

def tokenize(string):
    return shlex.split(string)

...
```
Then, we will send these tokens to the execution process.
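
The difference between the two splitting approaches can be checked directly. This is a small sketch using only the standard library; `safe_tokenize` is a hypothetical helper added here to show that `shlex.split` raises `ValueError` on an unterminated quote, which the shell loop would otherwise not catch.

```python
import shlex

cmd = 'echo "Hello World"\n'

# Naive whitespace split keeps the quote characters and breaks the argument
naive = cmd.split()

# shlex.split understands shell-style quoting
tokens = shlex.split(cmd)
single = shlex.split("echo 'Hello World'\n")


def safe_tokenize(string):
    """Hypothetical wrapper: return no tokens if the quoting is unbalanced."""
    try:
        return shlex.split(string)
    except ValueError:
        # Raised by shlex for input such as: echo "oops
        return []


broken = safe_tokenize('echo "oops\n')
```

Returning an empty list on bad input is just one possible choice; a real shell might instead print an error or keep reading more lines until the quote is closed.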
### Step 3: Execution
This is the core and fun part of a shell. What happens when a shell executes `mkdir test_dir`? (Note: `mkdir` is a program that is executed with the argument `test_dir` to create a directory named `test_dir`.)
The first function involved in this step is `execvp`. Before I explain what `execvp` does, let's see it in action.
```
import os

...

def execute(cmd_tokens):
    # Execute command
    os.execvp(cmd_tokens[0], cmd_tokens)

    # Return status indicating to wait for next command in shell_loop
    return SHELL_STATUS_RUN

...
```
Try running our shell again and enter the command `mkdir test_dir`, then hit enter.
The problem is that, after we hit enter, our shell exits instead of waiting for the next command. However, the directory is correctly created.
So, what does `execvp` really do?
`execvp` is a variant of the system call `exec`. The first argument is the program name. The `v` indicates that the second argument is a list of program arguments (a variable number of arguments). The `p` indicates that the `PATH` environment variable is used to search for the given program name. In our previous attempt, the `mkdir` program was found based on your `PATH` environment variable.
(There are other variants of `exec`, such as execv, execvpe, execl, execlp, and execlpe; you can google them for more information.)
`exec` replaces the memory of the calling process with the new program to be executed. In our case, our shell process's memory was replaced by the `mkdir` program. Then, `mkdir` became the main process and created the `test_dir` directory. Finally, the process exited.
The main point here is that our shell process had already been replaced by the `mkdir` process. That's the reason why our shell disappeared and did not wait for the next command.
Therefore, we need another system call to the rescue: `fork`.
`fork` allocates new memory and copies the current process into a new process. We call this new process the child process and the calling process the parent process. Then, the child process's memory is replaced by the exec'ed program. Therefore, our shell, which is the parent process, is safe from memory replacement.
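
To make the parent/child split concrete before adding `exec`, here is a minimal sketch, assuming a POSIX system where `os.fork` is available. The helper name `run_in_child` and the exit code `7` are made up for illustration.

```python
import os


def run_in_child(exit_code):
    """Fork once; the child exits with exit_code, the parent collects it."""
    pid = os.fork()

    if pid == 0:
        # Child process: fork returned 0 here.
        # os._exit ends the child immediately, without running the
        # parent's cleanup handlers a second time.
        os._exit(exit_code)

    # Parent process: fork returned the child's process id.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)


child_pid, code = run_in_child(7)
```

The same call to `os.fork()` returns twice, once in each process, and the `if pid == 0` test is the only thing that tells the two copies apart.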
Let's see our modified code.
```
...

def execute(cmd_tokens):
    # Fork a child shell process
    # If the current process is a child process, its `pid` is set to `0`
    # else the current process is a parent process and the value of `pid`
    # is the process id of its child process.
    pid = os.fork()

    if pid == 0:
        # Child process
        # Replace the child shell process with the program called with exec
        os.execvp(cmd_tokens[0], cmd_tokens)
    elif pid > 0:
        # Parent process
        while True:
            # Wait response status from its child process (identified with pid)
            wpid, status = os.waitpid(pid, 0)

            # Finish waiting if its child process exits normally
            # or is terminated by a signal
            if os.WIFEXITED(status) or os.WIFSIGNALED(status):
                break

    # Return status indicating to wait for next command in shell_loop
    return SHELL_STATUS_RUN

...
```
When the parent process calls `os.fork()`, you can imagine that all the source code is copied into a new child process. At this point, the parent and child processes see the same code and run in parallel.
If the running code belongs to the child process, `pid` will be `0`. Otherwise, the running code belongs to the parent process, and `pid` will be the process id of the child process.
When `os.execvp` is invoked in the child process, you can imagine that all the source code of the child process is replaced by the code of the program being called. However, the code of the parent process is not changed.
When the parent process finishes waiting for its child process to exit or be terminated, it returns the status indicating that the shell loop should continue.
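
One case the code above does not cover is a command that cannot be found: `os.execvp` then raises an `OSError` in the child, and the child would carry on running a second copy of the shell loop. The following is a hedged sketch of one way to guard against that, not the article's implementation. The name `run_command`, the error message format, and the exit codes `127` and `128 + signal` (conventions borrowed from common shells) are assumptions.

```python
import os
import sys


def run_command(cmd_tokens):
    """Fork, exec cmd_tokens in the child, and return the child's exit code."""
    if not cmd_tokens:
        # Nothing to run (for example, the user just pressed enter)
        return 0

    pid = os.fork()

    if pid == 0:
        # Child process
        try:
            os.execvp(cmd_tokens[0], cmd_tokens)
        except OSError as err:
            # Only reached when exec fails, e.g. the program is not on PATH
            sys.stderr.write('yosh: %s: %s\n' % (cmd_tokens[0], err.strerror))
            sys.stderr.flush()
        # Never fall back into the parent's shell loop from the child
        os._exit(127)

    # Parent process: wait for this specific child
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Child was terminated by a signal
    return 128 + os.WTERMSIG(status)


ok = run_command([sys.executable, '-c', 'import sys; sys.exit(5)'])
missing = run_command(['no-such-command-yosh-demo'])
empty = run_command([])
```

Returning the exit code instead of `SHELL_STATUS_RUN` is a design choice for this sketch only; the article's `execute()` could call such a helper and still return the loop status.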
### Run
Now, you can try running our shell and enter `mkdir test_dir2`. It should work properly. Our main shell process is still there and waits for the next command. Try `ls` and you will see the created directories.
However, there are some problems here.
First, try `cd test_dir2` and then `ls`. It's supposed to enter the directory `test_dir2`, which is an empty directory. However, you will see that the current directory was not changed to `test_dir2`.
Second, we still have no way to exit from our shell gracefully.
We will continue to solve these problems in [Part 2][1].
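
The `cd` behaviour follows from the fork model described earlier: anything run through `execute()` runs in a child process, and a child's working directory is separate from its parent's. A minimal sketch of that effect, assuming a POSIX system; the helper name `chdir_in_child` is hypothetical.

```python
import os
import tempfile


def chdir_in_child(path):
    """Change directory inside a forked child, then report the parent's cwd."""
    pid = os.fork()

    if pid == 0:
        # Child process: this changes only the child's working directory
        try:
            os.chdir(path)
        finally:
            os._exit(0)

    # Parent process: wait for the child, then look at our own cwd
    os.waitpid(pid, 0)
    return os.getcwd()


before = os.getcwd()
target = tempfile.mkdtemp()
after = chdir_in_child(target)
os.rmdir(target)
```

The parent's directory is unchanged after the child exits, which is why a shell has to change directory in its own process rather than in a forked one.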
--------------------------------------------------------------------------------
via: https://hackercollider.com/articles/2016/07/05/create-your-own-shell-in-python-part-1/
Author: [Supasate Choochaisri][a]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://disqus.com/by/supasate_choochaisri/
[1]: https://hackercollider.com/articles/2016/07/06/create-your-own-shell-in-python-part-2/

@ -0,0 +1,228 @@
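Create Your Own Shell in Python: Part I
==========================================
I'm curious to know how a shell (like bash, csh, etc.) works internally. To satisfy my own curiosity, I implemented one in Python called **yosh** (Your Own Shell). The concepts introduced in this article can be applied to other programming languages as well.
(Note: You can find the source code used in this blog post [here](https://github.com/supasate/yosh); it is released under the MIT license. I tested it with Python 2.7.10 and 3.4.3 on Mac OS X 10.11.5. It should also run in other Unix-like environments, such as Linux and Cygwin on Windows.)
Let's start.
### Step 0: Project Structure
For this project, I use the following project structure.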
```
yosh_project
|-- yosh
   |-- __init__.py
   |-- shell.py
```
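`yosh_project` is the root project folder (you can also simply name it `yosh`).
`yosh` is the package folder, and `__init__.py` makes it a package with the same name as the package folder. (If you don't write Python, just ignore it.)
`shell.py` is our main script file.
### Step 1: Shell Loop
When you start a shell, it shows a command prompt and waits for your command input. After it receives the command and executes it (explained in detail later), your shell goes back to the loop and waits for the next command.
In `shell.py`, we start with a simple main function that calls the `shell_loop()` function, as follows: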
```
def shell_loop():
    # Start the loop here
    pass


def main():
    shell_loop()


if __name__ == "__main__":
    main()
```
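Then, in `shell_loop()`, we use a status flag to indicate whether the loop should continue or stop. At the beginning of each iteration, our shell shows a command prompt and waits to read command input.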
```
import sys

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop():
    status = SHELL_STATUS_RUN

    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        sys.stdout.write('> ')
        sys.stdout.flush()

        # Read command input
        cmd = sys.stdin.readline()
```
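After that, we tokenize the command input and execute it (we'll implement the `tokenize` and `execute` functions soon).
Therefore, our `shell_loop()` looks like this: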
```
import sys

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop():
    status = SHELL_STATUS_RUN

    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        sys.stdout.write('> ')
        sys.stdout.flush()

        # Read command input
        cmd = sys.stdin.readline()

        # Tokenize the command input
        cmd_tokens = tokenize(cmd)

        # Execute the command and retrieve new status
        status = execute(cmd_tokens)
```
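That's our whole shell loop. If we start our shell with `python shell.py`, it shows the command prompt. However, if we type a command and press enter, it throws an error because we haven't defined the `tokenize` function yet.
To exit the shell, try ctrl-c. I will explain how to exit gracefully later.
### Step 2: Tokenization
When a user types a command in our shell and presses enter, the command input is one long string containing both the command name and its arguments. Therefore, we have to tokenize it (split the string into several tokens).
It seems simple at first glance. We might use `cmd.split()` to split the input on spaces. It works for a command like `ls -a my_folder` because it splits the command into a list `['ls', '-a', 'my_folder']`, which we can then handle easily.
However, there are cases where arguments are quoted with single or double quotes, like `echo "Hello World"` or `echo 'Hello World'`. If we use `cmd.split()`, we get a list of 3 tokens `['echo', '"Hello', 'World"']` instead of a list of 2 tokens `['echo', 'Hello World']`.
Fortunately, Python provides a library called `shlex` that splits commands like a charm. (Note: we could also use regular expressions, but that's not the main point of this article.)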
```
import sys
import shlex

...

def tokenize(string):
    return shlex.split(string)

...
```
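Then, we send these tokens to the execution process.
### Step 3: Execution
This is the core and fun part of a shell. What really happens when a shell executes `mkdir test_dir`? (Note: `mkdir` is a program that is executed with the argument `test_dir` to create a directory named `test_dir`.)
`execvp` is the first function involved in this step. Before we explain what `execvp` does, let's see it in action.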
```
import os

...

def execute(cmd_tokens):
    # Execute command
    os.execvp(cmd_tokens[0], cmd_tokens)

    # Return status indicating to wait for next command in shell_loop
    return SHELL_STATUS_RUN

...
```
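Try running our shell again and enter the command `mkdir test_dir`, then press enter.
The problem is that, after we press enter, our shell exits instead of waiting for the next command. However, the directory is created correctly.
So, what does `execvp` really do?
`execvp` is a variant of the system call `exec`. The first argument is the program name. The `v` indicates that the second argument is a list of program arguments (a variable number of arguments). The `p` indicates that the `PATH` environment variable is used to search for the given program name. In our previous attempt, the `mkdir` program was found based on our `PATH` environment variable.
(There are other variants of `exec`, such as execv, execvpe, execl, execlp, and execlpe; you can google them for more information.)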
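
To illustrate what the `p` adds, the sketch below does the `PATH` search by hand with `shutil.which` and then calls `os.execv`, which expects an already resolved path. This is an approximation for illustration under POSIX assumptions, not how `execvp` is implemented; the names `find_on_path` and `run_with_execv` and the exit codes `126` and `127` are assumptions.

```python
import os
import shutil
import sys


def find_on_path(program):
    """Return the full path of program found via PATH, or None."""
    return shutil.which(program)


def run_with_execv(argv):
    """Resolve argv[0] on PATH ourselves, then fork and execv it."""
    path = find_on_path(argv[0])
    if path is None:
        # Nothing on PATH with that name
        return 127

    pid = os.fork()
    if pid == 0:
        # Child process: execv does no PATH search, so pass the full path
        try:
            os.execv(path, argv)
        except OSError:
            pass
        # Only reached if execv failed
        os._exit(126)

    # Parent process
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


code = run_with_execv([sys.executable, '-c', 'import sys; sys.exit(3)'])
not_found = run_with_execv(['no-such-command-yosh-demo'])
```

With `os.execvp` the lookup and the exec happen in one call, which is why the article's `execute()` can pass `cmd_tokens[0]` directly.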
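`exec` replaces the memory of the calling process with the new program to be run. In our case, our shell process's memory was replaced by the `mkdir` program. Then, `mkdir` became the main process and created the `test_dir` directory. Finally, the process exited.
The main point here is that **our shell process had already been replaced by the `mkdir` process**. That's the reason why our shell disappeared and did not wait for the next command.
Therefore, we need another system call to the rescue: `fork`.
`fork` allocates new memory and copies the current process into a new process. We call this new process the **child process** and the calling process the **parent process**. Then, the child process's memory is replaced by the exec'ed program. Therefore, our shell, which is the parent process, is safe from memory replacement.
Let's see our modified code.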
```
...

def execute(cmd_tokens):
    # Fork a child shell process
    # If the current process is a child process, its `pid` is set to `0`
    # else the current process is a parent process and the value of `pid`
    # is the process id of its child process.
    pid = os.fork()

    if pid == 0:
        # Child process
        # Replace the child shell process with the program called with exec
        os.execvp(cmd_tokens[0], cmd_tokens)
    elif pid > 0:
        # Parent process
        while True:
            # Wait response status from its child process (identified with pid)
            wpid, status = os.waitpid(pid, 0)

            # Finish waiting if its child process exits normally
            # or is terminated by a signal
            if os.WIFEXITED(status) or os.WIFSIGNALED(status):
                break

    # Return status indicating to wait for next command in shell_loop
    return SHELL_STATUS_RUN

...
```
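When our parent process calls `os.fork()`, you can imagine that all the source code is copied into a new child process. At this point, the parent and child processes see the same code and run in parallel.
If the running code belongs to the child process, `pid` will be `0`. Otherwise, the running code belongs to the parent process, and `pid` will be the process id of the child process.
When `os.execvp` is invoked in the child process, you can imagine that all the source code of the child process is replaced by the code of the program being called. However, the code of the parent process is not changed.
When the parent process finishes waiting for its child process to exit or be terminated, it returns a status indicating that the shell loop should continue.
### Run
Now, you can try running our shell and enter `mkdir test_dir2`. It should work properly. Our main shell process is still there and waits for the next command. Try `ls` and you will see the created directories.
However, there are still some problems here.
First, try `cd test_dir2` and then `ls`. It's supposed to enter the empty directory `test_dir2`. However, you will see that the current directory was not changed to `test_dir2`.
Second, we still have no way to exit from our shell gracefully.
We will solve these problems in [Part 2][1].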
--------------------------------------------------------------------------------
via: https://hackercollider.com/articles/2016/07/05/create-your-own-shell-in-python-part-1/
Author: [Supasate Choochaisri][a]
Translator: [Translator ID](https://github.com/译者ID)
Proofreader: [Proofreader ID](https://github.com/校对者ID)
This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
[a]: https://disqus.com/by/supasate_choochaisri/
[1]: https://hackercollider.com/articles/2016/07/06/create-your-own-shell-in-python-part-2/