Translated tech/20170321 Writing a Linux Debugger Part 1 Setup.md

2025-03-21 02:10:11 +08:00 · 2017-04-12 20:54:49 +08:00 · 2017-04-12 20:54:49 +08:00 · ecd06ccdfd
commit ecd06ccdfd
parent a7b067d419
2 changed files with 236 additions and 237 deletions
--- a/sources/tech/20170321
+++ b/sources/tech/20170321
@@ -1,237 +0,0 @@
-ictlyh Translating
-Writing a Linux Debugger Part 1: Setup
-============================================================
-
-Anyone who has written more than a hello world program should have used a debugger at some point (if you haven’t, drop what you’re doing and learn how to use one). However, although these tools are in such widespread use, there aren’t a lot of resources which tell you how they work and how to write one[1][1], especially when compared to other toolchain technologies like compilers. In this post series we’ll learn what makes debuggers tick and write one for debugging Linux programs.
-
-We’ll support the following features:
-
-*   Launch, halt, and continue execution
-*   Set breakpoints on
-    *   Memory addresses
-    *   Source code lines
-    *   Function entry
-*   Read and write registers and memory
-*   Single stepping
-    *   Instruction
-    *   Step in
-    *   Step out
-    *   Step over
-*   Print current source location
-*   Print backtrace
-*   Print values of simple variables
-
-In the final part I’ll also outline how you could add the following to your debugger:
-
-*   Remote debugging
-*   Shared library and dynamic loading support
-*   Expression evaluation
-*   Multi-threaded debugging support
-
-I’ll be focusing on C and C++ for this project, but it should work just as well with any language which compiles down to machine code and outputs standard DWARF debug information (if you don’t know what that is yet, don’t worry, this will be covered soon). Additionally, my focus will be on just getting something up and running which works most of the time, so things like robust error handling will be eschewed in favour of simplicity.
-
-* * *
-
-### Series index
-
-These links will go live as the rest of the posts are released.
-
-1.  [Setup][2]
-2.  [Breakpoints][3]
-3.  Registers and memory
-4.  Elves and dwarves
-5.  Stepping, source and signals
-6.  Stepping on dwarves
-7.  Source-level breakpoints
-8.  Stack unwinding
-9.  Reading variables
-10.  Next steps
-
-* * *
-
-### Getting set up
-
-Before we jump into things, let’s get our environment set up. I’ll be using two dependencies in this tutorial: [Linenoise][4] for handling our command line input, and [libelfin][5] for parsing the debug information. You could use the more traditional libdwarf instead of libelfin, but the interface is nowhere near as nice, and libelfin also provides a mostly complete DWARF expression evaluator, which will save you a lot of time if you want to read variables. Make sure that you use the fbreg branch of my fork of libelfin, as it hacks on some extra support for reading variables on x86.
-
-Once you’ve either installed these on your system, or got them building as dependencies with whatever build system you prefer, it’s time to get started. I just set them to build along with the rest of my code in my CMake files.
-
-* * *
-
-### Launching the executable
-
-Before we actually debug anything, we’ll need to launch the debuggee program. We’ll do this with the classic fork/exec pattern.
-
-```
-#include <iostream>
-#include <unistd.h>
-
-int main(int argc, char* argv[]) {
-    if (argc < 2) {
-        std::cerr << "Program name not specified\n";
-        return -1;
-    }
-
-    auto prog = argv[1];
-
-    auto pid = fork();
-    if (pid == 0) {
-        //we're in the child process
-        //execute debuggee
-
-    }
-    else if (pid >= 1)  {
-        //we're in the parent process
-        //execute debugger
-    }
-}
-```
-
-We call `fork` and this causes our program to split into two processes. If we are in the child process, `fork` returns `0`, and if we are in the parent process, it returns the process ID of the child process.
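
The three possible outcomes of `fork` can be sketched as follows. This is only an illustration under stated assumptions: `fork_and_reap` is a hypothetical helper that is not part of the tutorial's code, and it also handles the `-1` failure case, which the snippet above leaves out.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the article): fork, let the child exit
// with `code`, and return the exit status the parent observes.
// Returns -1 if fork or waitpid fails, or if the child did not exit normally.
int fork_and_reap(int code) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;      // fork failed; no child process was created
    }
    if (pid == 0) {
        _exit(code);    // child: fork returned 0
    }
    // parent: fork returned the child's process ID
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `fork_and_reap(7)` runs the child branch in a new process and the parent branch in the caller, so the caller should see the child's exit code come back.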
-
-If we’re in the child process, we want to replace whatever we’re currently executing with the program we want to debug.
-
-```
-   ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);
-   execl(prog, prog, nullptr);
-```
-
-Here we have our first encounter with `ptrace`, which is going to become our best friend when writing our debugger. `ptrace` allows us to observe and control the execution of another process by reading registers, reading memory, single stepping and more. The API is very ugly; it’s a single function which you provide with an enumerator value for what you want to do, and then some arguments which will either be used or ignored depending on which value you supply. The signature looks like this:
-
-```
-long ptrace(enum __ptrace_request request, pid_t pid,
-            void *addr, void *data);
-```
-
-`request` is what we would like to do to the traced process; `pid` is the process ID of the traced process; `addr` is a memory address, which is used in some calls to designate an address in the tracee; and `data` is some request-specific resource. The return value often gives error information, so you probably want to check that in your real code; I’m just omitting it for brevity. You can have a look at the man pages for more information.
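
Checking the return value, which the tutorial omits for brevity, might look roughly like this. It is a minimal sketch assuming glibc on Linux (where the `__ptrace_request` enum from the signature above is available); `checked_ptrace` is a hypothetical wrapper name, not something the article defines.

```cpp
#include <sys/ptrace.h>
#include <sys/types.h>
#include <cerrno>
#include <system_error>

// Hypothetical wrapper (assumes glibc's __ptrace_request enum):
// forwards to ptrace and throws std::system_error on failure.
long checked_ptrace(__ptrace_request request, pid_t pid,
                    void* addr, void* data) {
    // The PEEK requests can legitimately return -1 as data, so errno is
    // cleared first and inspected afterwards to tell the cases apart.
    errno = 0;
    long result = ptrace(request, pid, addr, data);
    if (result == -1 && errno != 0) {
        throw std::system_error(errno, std::generic_category(), "ptrace");
    }
    return result;
}
```

With a wrapper like this, a request aimed at a process that is not being traced surfaces as an exception instead of being silently ignored.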
-
-The request we send in the above code, `PTRACE_TRACEME`, indicates that this process should allow its parent to trace it. All of the other arguments are ignored, because API design isn’t important /s.
-
-Next, we call `execl`, which is one of the many `exec` flavours. We execute the given program, passing the name of it as a command-line argument and a `nullptr` to terminate the list. You can pass any other arguments needed to execute your program here if you like.
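
If the debuggee needs extra arguments, one option is to switch from `execl` to `execv`, which takes a null-terminated array instead of a variadic list. The sketch below only shows how such an array could be built; `make_argv` is a hypothetical helper and the switch to `execv` is a suggestion of this example, not what the tutorial does.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build a null-terminated argv array that points
// into `args`. The returned pointers are only valid while `args` is
// alive and unmodified.
std::vector<char*> make_argv(std::vector<std::string>& args) {
    std::vector<char*> argv;
    argv.reserve(args.size() + 1);
    for (auto& arg : args) {
        argv.push_back(&arg[0]);   // pointer to the string's own buffer
    }
    argv.push_back(nullptr);       // exec expects a terminating null pointer
    return argv;
}
```

In the child branch this could be used as `execv(argv[0], argv.data())` after the `PTRACE_TRACEME` request.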
-
-After we’ve done this, we’re finished with the child process; we’ll just let it keep running until we’re finished with it.
-
-* * *
-
-### Adding our debugger loop
-
-Now that we’ve launched the child process, we want to be able to interact with it. For this, we’ll create a `debugger` class, give it a loop for listening to user input, and launch that from our parent fork of our `main` function.
-
-```
-else if (pid >= 1)  {
-    //parent
-    debugger dbg{prog, pid};
-    dbg.run();
-}
-```
-
-```
-class debugger {
-public:
-    debugger (std::string prog_name, pid_t pid)
-        : m_prog_name{std::move(prog_name)}, m_pid{pid} {}
-
-    void run();
-
-private:
-    std::string m_prog_name;
-    pid_t m_pid;
-};
-```
-
-In our `run` function, we need to wait until the child process has finished launching, then just keep on getting input from linenoise until we get an EOF (ctrl+d).
-
-```
-void debugger::run() {
-    int wait_status;
-    auto options = 0;
-    waitpid(m_pid, &wait_status, options);
-
-    char* line = nullptr;
-    while((line = linenoise("minidbg> ")) != nullptr) {
-        handle_command(line);
-        linenoiseHistoryAdd(line);
-        linenoiseFree(line);
-    }
-}
-```
-
-When the traced process is launched, it will be sent a `SIGTRAP` signal, which is a trace or breakpoint trap. We can wait until this signal is sent using the `waitpid` function.
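
The status that `waitpid` fills in can be decoded with the standard `W*` macros. The sketch below is an illustration under assumptions: to stay independent of `ptrace`, the child stops itself with `SIGSTOP` and the parent waits with `WUNTRACED`, which stands in for the `SIGTRAP` stop a traced process would report. Both function names are hypothetical.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: turn a waitpid status into a readable description.
std::string describe_wait_status(int status) {
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    if (WIFSTOPPED(status))
        return "stopped by signal " + std::to_string(WSTOPSIG(status));
    return "unknown status";
}

// Stand-in for the tracee's initial stop: the child stops itself with
// SIGSTOP and the parent observes the stop via WUNTRACED (no ptrace used).
std::string observe_stopped_child() {
    pid_t pid = fork();
    if (pid == -1) return "fork failed";
    if (pid == 0) {
        raise(SIGSTOP);            // child: stop until resumed or killed
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, WUNTRACED) == -1) return "waitpid failed";
    std::string description = describe_wait_status(status);
    kill(pid, SIGKILL);            // clean up the stopped child
    waitpid(pid, &status, 0);      // reap it
    return description;
}
```

In the debugger itself, the same check after the first `waitpid` would be expected to report a stop with `WSTOPSIG(status) == SIGTRAP`.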
-
-After we know the process is ready to be debugged, we listen for user input. The `linenoise` function takes a prompt to display and handles user input by itself. This means we get a nice command line with history and navigation commands without doing much work at all. When we get the input, we give the command to a `handle_command` function which we’ll write shortly, then we add this command to the linenoise history and free the resource.
-
-* * *
-
-### Handling input
-
-Our commands will follow a similar format to gdb and lldb. To continue the program, a user will type `continue` or `cont` or even just `c`. If they want to set a breakpoint on an address, they’ll write `break 0xDEADBEEF`, where `0xDEADBEEF` is the desired address in hexadecimal format. Let’s add support for these commands.
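
Parsing the hexadecimal address argument of a command like `break 0xDEADBEEF` could be done along these lines. This is a sketch, not the tutorial's implementation (breakpoints are covered in the next part); `parse_address` is a hypothetical name.

```cpp
#include <cctype>
#include <cstdint>
#include <exception>
#include <string>

// Hypothetical helper: parse a hexadecimal address such as "0xDEADBEEF".
// Returns true and stores the value in `out` only if the whole string
// is a valid hex number; the "0x" prefix is optional.
bool parse_address(const std::string& text, std::uintptr_t& out) {
    // Reject empty input, leading whitespace and signs up front.
    if (text.empty() || !std::isxdigit(static_cast<unsigned char>(text[0]))) {
        return false;
    }
    try {
        std::size_t consumed = 0;
        unsigned long long value = std::stoull(text, &consumed, 16);
        if (consumed != text.size()) {
            return false;          // trailing garbage such as "0x12zz"
        }
        out = static_cast<std::uintptr_t>(value);
        return true;
    } catch (const std::exception&) {
        return false;              // out of range or not a number
    }
}
```

Rejecting partially parsed input means a typo in the address produces an error message rather than a breakpoint at the wrong place.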
-
-```
-void debugger::handle_command(const std::string& line) {
-    auto args = split(line,' ');
-    if (args.empty()) return; //ignore empty input
-    auto command = args[0];
-
-    if (is_prefix(command, "continue")) {
-        continue_execution();
-    }
-    else {
-        std::cerr << "Unknown command\n";
-    }
-}
-```
-
-`split` and `is_prefix` are a couple of small helper functions:
-
-```
-std::vector<std::string> split(const std::string &s, char delimiter) {
-    std::vector<std::string> out{};
-    std::stringstream ss {s};
-    std::string item;
-
-    while (std::getline(ss,item,delimiter)) {
-        out.push_back(item);
-    }
-
-    return out;
-}
-
-bool is_prefix(const std::string& s, const std::string& of) {
-    if (s.size() > of.size()) return false;
-    return std::equal(s.begin(), s.end(), of.begin());
-}
-```
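
To see how the prefix matching behaves, including on empty input, here is a small self-contained sketch. It repeats the two helpers so that it compiles on its own; `matches_command` is a hypothetical wrapper added for illustration, and the empty-input guard is an assumption of this example.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::string> split(const std::string& s, char delimiter) {
    std::vector<std::string> out{};
    std::stringstream ss{s};
    std::string item;
    while (std::getline(ss, item, delimiter)) {
        out.push_back(item);
    }
    return out;
}

bool is_prefix(const std::string& s, const std::string& of) {
    if (s.size() > of.size()) return false;
    return std::equal(s.begin(), s.end(), of.begin());
}

// Hypothetical wrapper: true if the first word of `line` is a non-empty
// prefix of `name`. An empty string is a prefix of everything, so empty
// input is rejected explicitly.
bool matches_command(const std::string& line, const std::string& name) {
    auto args = split(line, ' ');
    if (args.empty() || args[0].empty()) return false;
    return is_prefix(args[0], name);
}
```

With this, `c`, `cont` and `continue` all match `"continue"`, while an empty line or `continued` does not.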
-
-We’ll add `continue_execution` to the `debugger` class.
-
-```
-void debugger::continue_execution() {
-    ptrace(PTRACE_CONT, m_pid, nullptr, nullptr);
-
-    int wait_status;
-    auto options = 0;
-    waitpid(m_pid, &wait_status, options);
-}
-```
-
-For now our `continue_execution` function will just use `ptrace` to tell the process to continue, then `waitpid` until it’s signalled.
-
-* * *
-
-### Finishing up
-
-Now you should be able to compile some C or C++ program, run it through your debugger, see it halting on entry, and be able to continue execution from your debugger. In the next part we’ll learn how to get our debugger to set breakpoints. If you come across any issues, please let me know in the comments!
-
-You can find the code for this post [here][6].
-
--------------------------------------------------------------------------------
-
-via: http://blog.tartanllama.xyz/c++/2017/03/21/writing-a-linux-debugger-setup/
-
-Author: [Simon Brand][a]
-Translator: [译者ID](https://github.com/译者ID)
-Proofreader: [校对者ID](https://github.com/校对者ID)
-
-This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
-
-[a]:https://www.linkedin.com/in/simon-brand-36520857
-[1]:http://blog.tartanllama.xyz/c++/2017/03/21/writing-a-linux-debugger-setup/#fn:1
-[2]:http://blog.tartanllama.xyz/c++/2017/03/21/writing-a-linux-debugger-setup/
-[3]:http://blog.tartanllama.xyz/c++/2017/03/24/writing-a-linux-debugger-breakpoints/
-[4]:https://github.com/antirez/linenoise
-[5]:https://github.com/TartanLlama/libelfin/tree/fbreg
-[6]:https://github.com/TartanLlama/minidbg/tree/tut_setup
--- a/translated/tech/20170321
+++ b/translated/tech/20170321
@@ -0,0 +1,236 @@
+Writing a Linux Debugger Part 1: Setup
+============================================================
+
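+Anyone who has written more than a hello world program should have used a debugger at some point (if you haven’t, drop what you’re doing and learn how to use one). However, although these tools are in such widespread use, there aren’t many resources that explain how they work and how to write one[1][1], especially when compared to other toolchain technologies like compilers. In this post series we’ll learn what makes debuggers tick and write one for debugging Linux programs.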
+
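+We’ll support the following features:
+
+*	Launch, halt, and continue execution
+*	Set breakpoints on
+	*	Memory addresses
+	*	Source code lines
+	*	Function entry
+*	Read and write registers and memory
+*	Single stepping
+	*	Instruction
+	*	Step in
+	*	Step out
+	*	Step over
+*	Print current source location
+*	Print backtrace
+*	Print values of simple variables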
+
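+In the final part I’ll also outline how you could add the following to your debugger:
+
+*	Remote debugging
+*	Shared library and dynamic loading support
+*	Expression evaluation
+*	Multi-threaded debugging support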
+
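+I’ll be focusing on C and C++ for this project, but it should work just as well with any language which compiles down to machine code and outputs standard DWARF debug information (if you don’t know what that is yet, don’t worry, this will be covered soon). Additionally, my focus will be on just getting something up and running which works most of the time, so things like robust error handling will be left out in favour of simplicity.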
+
+* * *
+
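+### Series index
+
+These links will go live as the rest of the posts are released.
+
+1.	[Setup][2]
+2.	[Breakpoints][3]
+3.	Registers and memory
+4.	Elves and dwarves
+5.	Stepping, source and signals
+6.	Stepping on dwarves
+7.	Source-level breakpoints
+8.	Stack unwinding
+9.	Reading variables
+10.	Next steps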
+
+* * *
+
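+### Getting set up
+
+Before we jump into things, let’s get our environment set up. I’ll be using two dependencies in this tutorial: [Linenoise][4] for handling our command line input, and [libelfin][5] for parsing the debug information. You could use the more traditional libdwarf instead of libelfin, but the interface is nowhere near as nice, and libelfin also provides a mostly complete DWARF expression evaluator, which will save you a lot of time if you want to read variables. Make sure that you use the fbreg branch of my fork of libelfin, as it adds some extra support for reading variables on x86.
+
+Once you’ve either installed these on your system, or got them building as dependencies with whatever build system you prefer, it’s time to get started. I just set them to build along with the rest of my code in my CMake files.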
+
+* * *
+
+### Launching the executable
+
+Before we actually debug anything, we’ll need to launch the debuggee program. We’ll do this with the classic fork/exec pattern.
+
+```
+#include <iostream>
+#include <unistd.h>
+
+int main(int argc, char* argv[]) {
+    if (argc < 2) {
+        std::cerr << "Program name not specified\n";
+        return -1;
+    }
+
+    auto prog = argv[1];
+
+    auto pid = fork();
+    if (pid == 0) {
+        //we're in the child process
+        //execute debuggee
+
+    }
+    else if (pid >= 1)  {
+        //we're in the parent process
+        //execute debugger
+    }
+}
+```
+
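+We call `fork` and this causes our program to split into two processes. If we are in the child process, `fork` returns `0`, and if we are in the parent process, it returns the process ID of the child process.
+
+If we’re in the child process, we want to replace whatever we’re currently executing with the program we want to debug.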
+
+```
+   ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);
+   execl(prog, prog, nullptr);
+```
+
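+Here we have our first encounter with `ptrace`, which we will meet again and again while writing our debugger. `ptrace` allows us to observe and control the execution of another process by reading registers, reading memory, single stepping and more. The API is very ugly; it’s a single function which you provide with an enumerator value for what you want to do, and then some arguments which will either be used or ignored depending on which value you supply. The signature looks like this: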
+
+```
+long ptrace(enum __ptrace_request request, pid_t pid,
+            void *addr, void *data);
+```
+
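+`request` is what we would like to do to the traced process; `pid` is the process ID of the traced process; `addr` is a memory address, which is used in some calls to designate an address in the tracee; and `data` is some request-specific resource. The return value often gives error information, so you probably want to check that in your real code; I’m just omitting it for brevity. You can have a look at the man pages for more information about `ptrace`.
+
+The request we send in the above code, `PTRACE_TRACEME`, indicates that this process should allow its parent to trace it. All of the other arguments are ignored, because API design isn’t important /s.
+
+Next, we call `execl`, which is one of the many `exec` flavours. We execute the given program, passing the name of it as a command-line argument and a `nullptr` to terminate the list. You can pass any other arguments needed to execute your program here if you like.
+
+After we’ve done this, we’re finished with the child process; we’ll just let it keep running until we’re finished with it.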
+
+* * *
+
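+### Adding our debugger loop
+
+Now that we’ve launched the child process, we want to be able to interact with it. For this, we’ll create a `debugger` class, give it a loop for listening to user input, and launch that from the parent branch of our `main` function.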
+
+```
+else if (pid >= 1)  {
+    //parent
+    debugger dbg{prog, pid};
+    dbg.run();
+}
+```
+
+```
+class debugger {
+public:
+    debugger (std::string prog_name, pid_t pid)
+        : m_prog_name{std::move(prog_name)}, m_pid{pid} {}
+
+    void run();
+
+private:
+    std::string m_prog_name;
+    pid_t m_pid;
+};
+```
+
+In our `run` function, we need to wait until the child process has finished launching, then just keep on getting input from linenoise until we get an EOF (ctrl+d).
+
+```
+void debugger::run() {
+    int wait_status;
+    auto options = 0;
+    waitpid(m_pid, &wait_status, options);
+
+    char* line = nullptr;
+    while((line = linenoise("minidbg> ")) != nullptr) {
+        handle_command(line);
+        linenoiseHistoryAdd(line);
+        linenoiseFree(line);
+    }
+}
+```
+
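+When the traced process is launched, it will be sent a `SIGTRAP` signal, which is a trace or breakpoint trap. We can wait until this signal is sent using the `waitpid` function.
+
+After we know the process is ready to be debugged, we listen for user input. The `linenoise` function takes a prompt to display and handles user input by itself. This means we get a nice command line with history and navigation commands without doing much work at all. When we get the input, we give the command to a `handle_command` function which we’ll write shortly, then we add this command to the linenoise history and free the resource.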
+
+* * *
+
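+### Handling input
+
+Our commands will follow a similar format to gdb and lldb. To continue the program, a user will type `continue` or `cont` or even just `c`. If they want to set a breakpoint on an address, they’ll write `break 0xDEADBEEF`, where `0xDEADBEEF` is the desired address in hexadecimal format. Let’s add support for these commands.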
+
+```
+void debugger::handle_command(const std::string& line) {
+    auto args = split(line,' ');
+    if (args.empty()) return; //ignore empty input
+    auto command = args[0];
+
+    if (is_prefix(command, "continue")) {
+        continue_execution();
+    }
+    else {
+        std::cerr << "Unknown command\n";
+    }
+}
+```
+
+`split` and `is_prefix` are a couple of small helper functions:
+
+```
+std::vector<std::string> split(const std::string &s, char delimiter) {
+    std::vector<std::string> out{};
+    std::stringstream ss {s};
+    std::string item;
+
+    while (std::getline(ss,item,delimiter)) {
+        out.push_back(item);
+    }
+
+    return out;
+}
+
+bool is_prefix(const std::string& s, const std::string& of) {
+    if (s.size() > of.size()) return false;
+    return std::equal(s.begin(), s.end(), of.begin());
+}
+```
+
+We’ll add `continue_execution` to the `debugger` class.
+
+```
+void debugger::continue_execution() {
+    ptrace(PTRACE_CONT, m_pid, nullptr, nullptr);
+
+    int wait_status;
+    auto options = 0;
+    waitpid(m_pid, &wait_status, options);
+}
+```
+
+For now our `continue_execution` function will just use `ptrace` to tell the process to continue, then `waitpid` until it’s signalled.
+
+* * *
+
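+### Finishing up
+
+Now you should be able to compile some C or C++ program, run it through your debugger, see it halting on entry, and be able to continue execution from your debugger. In the next part we’ll learn how to get our debugger to set breakpoints. If you come across any issues, please let me know in the comments!
+
+You can find the code for this post [here][6].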
+
+--------------------------------------------------------------------------------
+
+via: http://blog.tartanllama.xyz/c++/2017/03/21/writing-a-linux-debugger-setup/
+
+Author: [Simon Brand][a]
+Translator: [ictlyh](https://github.com/ictlyh)
+Proofreader: [校对者ID](https://github.com/校对者ID)
+
+This article was originally translated by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux中国](https://linux.cn/)
+
+[a]:https://www.linkedin.com/in/simon-brand-36520857
+[1]:http://blog.tartanllama.xyz/c++/2017/03/21/writing-a-linux-debugger-setup/#fn:1
+[2]:http://blog.tartanllama.xyz/c++/2017/03/21/writing-a-linux-debugger-setup/
+[3]:http://blog.tartanllama.xyz/c++/2017/03/24/writing-a-linux-debugger-breakpoints/
+[4]:https://github.com/antirez/linenoise
+[5]:https://github.com/TartanLlama/libelfin/tree/fbreg
+[6]:https://github.com/TartanLlama/minidbg/tree/tut_setup