translated

This commit is contained in:
geekpi 2017-09-21 08:51:20 +08:00
parent c75841cc25
commit 5e0802428f
2 changed files with 150 additions and 151 deletions

View File

@ -1,151 +0,0 @@
translating---geekpi
Writing a Linux Debugger Part 10: Advanced topics
============================================================
Were finally here at the last post of the series! This time Ill be giving a high-level overview of some more advanced concepts in debugging: remote debugging, shared library support, expression evaluation, and multi-threaded support. These ideas are more complex to implement, so I wont walk through how to do so in detail, but Im happy to answer questions about these concepts if you have any.
* * *
### Series index
1. [Setup][1]
2. [Breakpoints][2]
3. [Registers and memory][3]
4. [Elves and dwarves][4]
5. [Source and signals][5]
6. [Source-level stepping][6]
7. [Source-level breakpoints][7]
8. [Stack unwinding][8]
9. [Handling variables][9]
10. [Advanced topics][10]
* * *
### Remote debugging
Remote debugging is very useful for embedded systems or debugging the effects of environment differences. It also sets a nice divide between the high-level debugger operations and the interaction with the operating system and hardware. In fact, debuggers like GDB and LLDB operate as remote debuggers even when debugging local programs. The general architecture is this:
![debugarch](https://blog.tartanllama.xyz/assets/debugarch.png)
The debugger is the component which we interact with through the command line. Maybe if youre using an IDE therell be another layer on top which communicates with the debugger through the  _machine interface_ . On the target machine (which may be the same as the host) there will be a  _debug stub_ , which in theory is a very small wrapper around the OS debug library which carries out all of your low-level debugging tasks like setting breakpoints on addresses. I say “in theory” because stubs are getting larger and larger these days. The LLDB debug stub on my machine is 7.6MB, for example. The debug stub communicates with the debugee process using some OS-specific features (in our case, `ptrace`), and with the debugger though some remote protocol.
The most common remote protocol for debugging is the GDB remote protocol. This is a text-based packet format for communicating commands and information between the debugger and debug stub. I wont go into detail about it, but you can read all you could want to know about it [here][11]. If you launch LLDB and execute the command `log enable gdb-remote packets` then youll get a trace of all packets sent through the remote protocol. On GDB you can write `set remotelogfile <file>` to do the same.
As a simple example, heres the packet to set a breakpoint:
```
$Z0,400570,1#43
```
`$` marks the start of the packet. `Z0` is the command to insert a memory breakpoint. `400570` and `1` are the argumets, where the former is the address to set a breakpoint on and the latter is a target-specific breakpoint kind specifier. Finally, the `#43` is a checksum to ensure that there was no data corruption.
The GDB remote protocol is very easy to extend for custom packets, which is very useful for implementing platform- or language-specific functionality.
* * *
### Shared library and dynamic loading support
The debugger needs to know what shared libraries have been loaded by the debuggee so that it can set breakpoints, get source-level information and symbols, etc. As well as finding libraries which have been dynamically linked against, the debugger must track libraries which are loaded at runtime through `dlopen`. To facilitate this, the dynamic linker maintains a  _rendezvous structure_ . This structure maintains a linked list of shared library descriptors, along with a pointer to a function which is called whenever the linked list is updated. This structure is stored where the `.dynamic` section of the ELF file is loaded, and is initialized before program execution.
A simple tracing algorithm is this:
* The tracer looks up the entry point of the program in the ELF header (or it could use the auxillary vector stored in `/proc/<pid>/aux`)
* The tracer places a breakpoint on the entry point of the program and begins execution.
* When the breakpoint is hit, the address of the rendezvous structure is found by looking up the load address of `.dynamic` in the ELF file.
* The rendezvous structure is examined to get the list of currently loaded libraries.
* A breakpoint is set on the linker update function.
* Whenever the breakpoint is hit, the list is updated.
* The tracer infinitely loops, continuing the program and waiting for a signal until the tracee signals that it has exited.
Ive written a small demonstration of these concepts, which you can find [here][12]. I can do a more detailed write up of this in the future if anyone is interested.
* * *
### Expression evaluation
Expression evaluation is a feature which lets users evaluate expressions in the original source language while debugging their application. For example, in LLDB or GDB you could execute `print foo()` to call the `foo` function and print the result.
Depending on how complex the expression is, there are a few different ways of evaluating it. If the expression is a simple identifier, then the debugger can look at the debug information, locate the variable and print out the value, just like we did in the last part of this series. If the expression is a bit more complex, then it may be possible to compile the code to an intermediate representation (IR) and interpret that to get the result. For example, for some expressions LLDB will use Clang to compile the expression to LLVM IR and interpret that. If the expression is even more complex, or requires calling some function, then the code might need to be JITted to the target and executed in the address space of the debuggee. This involves calling `mmap` to allocate some executable memory, then the compiled code is copied to this block and is executed. LLDB does this by using LLVMs JIT functionality.
If you want to know more about JIT compilation, Id highly recommend [Eli Benderskys posts on the subject][13].
* * *
### Multi-threaded debugging support
The debugger shown in this series only supports single threaded applications, but to debug most real-world applications, multi-threaded support is highly desirable. The simplest way to support this is to trace thread creation and parse the procfs to get the information you want.
The Linux threading library is called `pthreads`. When `pthread_create` is called, the library creates a new thread using the `clone` syscall, and we can trace this syscall with `ptrace` (assuming your kernel is older than 2.5.46). To do this, youll need to set some `ptrace` options after attaching to the debuggee:
```
ptrace(PTRACE_SETOPTIONS, m_pid, nullptr, PTRACE_O_TRACECLONE);
```
Now when `clone` is called, the process will be signaled with our old friend `SIGTRAP`. For the debugger in this series, you can add a case to `handle_sigtrap` which can handle the creation of the new thread:
```
case (SIGTRAP | (PTRACE_EVENT_CLONE << 8)):
//get the new thread ID
unsigned long event_message = 0;
ptrace(PTRACE_GETEVENTMSG, pid, nullptr, message);
//handle creation
//...
```
Once youve got that, you can look in `/proc/<pid>/task/` and read the memory maps and suchlike to get all the information you need.
GDB uses `libthread_db`, which provides a bunch of helper functions so that you dont need to do all the parsing and processing yourself. Setting up this library is pretty weird and I wont show how it works here, but you can go and read [this tutorial][14] if youd like to use it.
The most complex part of multithreaded support is modelling the thread state in the debugger, particularly if you want to support [non-stop mode][15] or some kind of heterogeneous debugging where you have more than just a CPU involved in your computation.
* * *
### The end!
Whew! This series took a long time to write, but I learned a lot in the process and I hope it was helpful. Get in touch on Twitter [@TartanLlama][16] or in the comments section if you want to chat about debugging or have any questions about the series. If there are any other debugging topics youd like to see covered then let me know and I might do a bonus post.
--------------------------------------------------------------------------------
via: https://blog.tartanllama.xyz/writing-a-linux-debugger-advanced-topics/
作者:[Simon Brand ][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://www.twitter.com/TartanLlama
[1]:https://blog.tartanllama.xyz/writing-a-linux-debugger-setup/
[2]:https://blog.tartanllama.xyz/writing-a-linux-debugger-breakpoints/
[3]:https://blog.tartanllama.xyz/writing-a-linux-debugger-registers/
[4]:https://blog.tartanllama.xyz/writing-a-linux-debugger-elf-dwarf/
[5]:https://blog.tartanllama.xyz/writing-a-linux-debugger-source-signal/
[6]:https://blog.tartanllama.xyz/writing-a-linux-debugger-dwarf-step/
[7]:https://blog.tartanllama.xyz/writing-a-linux-debugger-source-break/
[8]:https://blog.tartanllama.xyz/writing-a-linux-debugger-unwinding/
[9]:https://blog.tartanllama.xyz/writing-a-linux-debugger-variables/
[10]:https://blog.tartanllama.xyz/writing-a-linux-debugger-advanced-topics/
[11]:https://sourceware.org/gdb/onlinedocs/gdb/Remote-Protocol.html
[12]:https://github.com/TartanLlama/dltrace
[13]:http://eli.thegreenplace.net/tag/code-generation
[14]:http://timetobleed.com/notes-about-an-odd-esoteric-yet-incredibly-useful-library-libthread_db/
[15]:https://sourceware.org/gdb/onlinedocs/gdb/Non_002dStop-Mode.html
[16]:https://twitter.com/TartanLlama

View File

@ -0,0 +1,150 @@
开发一个 Linux 调试器(十):高级主题
============================================================
我们终于来到这个系列的最后一篇文章!这一次,我将对调试中的一些更高级的概念进行高层的概述:远程调试、共享库支持、表达式计算和多线程支持。这些想法实现起来比较复杂,所以我不会详细说明如何做,但是如果有的话,我很乐意回答有关这些概念的问题。
* * *
### 系列索引
1. [准备环境][1]
2. [断点][2]
3. [寄存器和内存][3]
4. [Elves 和 dwarves][4]
5. [源码和信号][5]
6. [源码层逐步执行][6]
7. [源码层断点][7]
8. [调用栈][8]
9. [处理变量][9]
10. [高级主题][10]
* * *
### 远程调试
远程调试对于嵌入式系统或不同环境的调试非常有用。它还在高级调试器操作和与操作系统和硬件的交互之间设置了一个很好的分界线。事实上,像 GDB 和 LLDB 这样的调试器即使在调试本地程序时也可以作为远程调试器运行。一般架构是这样的:
![debugarch](https://blog.tartanllama.xyz/assets/debugarch.png)
调试器是我们通过命令行交互的组件。也许如果你使用的是 IDE那么顶层中另一个层可以通过_机器接口_与调试器进行通信。在目标机器上可能与本机一样有一个 _debug stub_ ,它理论上是一个非常小的操作系统调试库的包装程序,它执行所有的低级调试任务,如在地址上设置断点。我说“在理论上”,因为如今 stub 变得越来越大。例如,我机器上的 LLDB debug stub 大小是 7.6MB。debug stub 通过一些使用特定于操作系统的功能(在我们的例子中是 “ptrace”和被调试进程以及通过远程协议的调试器通信。
最常见的远程调试协议是 GDB 远程协议。这是一种基于文本的数据包格式,用于在调试器和 debug
stub 之间传递命令和信息。我不会详细介绍它,但你可以在[这里][11]阅读你想知道的。如果你启动 LLDB 并执行命令 `log enable gdb-remote packets`,那么你将获得通过远程协议发送的所有数据包的跟踪。在 GDB 上,你可以用 `set remotelogfile <file>` 做同样的事情。
作为一个简单的例子,这是数据包设置断点:
```
$Z0,400570,1#43
```
`$` 标记数据包的开始。`Z0` 是插入内存断点的命令。`400570` 和 `1` 是参数,其中前者是设置断点的地址,后者是特定目标的断点类型说明符。最后,`#43` 是校验值,以确保数据没有损坏。
GDB 远程协议非常易于扩展自定义数据包,这对于实现平台或语言特定的功能非常有用。
* * *
### 共享库和动态加载支持
调试器需要知道调试程序加载了哪些共享库,以便它可以设置断点,获取源代码级别的信息和符号等。除查找被动态链接的库之外,调试器还必须跟踪在运行时通过 `dlopen` 加载的库。为了打到这个目的,动态链接器维护一个 _交会结构体_。该结构体维护共享描述符的链表以及指向每当更新链表时调用的函数的指针。这个结构存储在 ELF 文件的 `.dynamic` 段中,在程序执行之前被初始化。
一个简单的跟踪算法:
* 追踪程序在 ELF 头中查找程序的入口(或者可以使用存储在 `/proc/<pid>/aux` 中的辅助向量)
* 追踪程序在程序的入口处设置一个断点,并开始执行。
* 当到达断点时,通过在 ELF 文件中查找 `.dynamic` 的加载地址找到交汇结构体的地址。
* 检查交汇结构体以获取当前加载的库的列表。
* 链接器更新函数上设置断点
* 每当到达断点时,列表都会更新
* 追踪程序无限循环,继续执行程序并等待信号,直到追踪程序信号退出。
我给这些概念写了一个小例子,你可以在[这里][12]找到。如果有人有兴趣,我可以将来写得更详细一点。
* * *
### 表达式计算
表达式计算是程序的一项功能,允许用户在调试程序时对原始源语言中的表达式进行计算。例如,在 LLDB 或 GDB 中,可以执行 `print foo()` 来调用 `foo` 函数并打印结果。
根据表达的复杂程度,有几种不同的计算方法。如果表达式只是一个简单的标识符,那么调试器可以查看调试信息,找到变量并打印出该值,就像我们在本系列最后一部分中所做的那样。如果表达式有点复杂,则可能将代码编译成中间表达式 IR 并解释来获得结果。例如对于某些表达式LLDB 将使用 Clang 将表达式编译为 LLVM IR 并将其解释。如果表达式更复杂,或者需要调用某些函数,那么代码可能需要 JIT 到目标并在被调试者的地址空间中执行。这涉及到调用 `mmap` 来分配一些可执行内存然后将编译的代码复制到该块并执行。LLDB 通过使用 LLVM 的 JIT 功能来实现。
如果你想更多地了解 JIT 编译,我强烈推荐[ Eli Bendersky 关于这个主题的文章][13]。
* * *
### 多线程调试支持
本系列展示的调试器仅支持单线程应用程序,但是为了调试大多数真实程序,多线程支持是非常需要的。支持这一点的最简单的方法是跟踪线程创建并解析 procfs 以获取所需的信息。
Linux 线程库称为 `pthreads`。当调用 `pthread_create` 时,库会使用 `clone` 系统调用来创建一个新的线程,我们可以用 `ptrace` 跟踪这个系统调用(假设你的内核早于 2.5.46)。为此,你需要在连接到调试器之后设置一些 `ptrace` 选项:
```
ptrace(PTRACE_SETOPTIONS, m_pid, nullptr, PTRACE_O_TRACECLONE);
```
现在当 `clone` 被调用时,该进程将收到我们的老朋友 `SIGTRAP` 发出信号。对于本系列中的调试器,你可以将一个例子添加到 `handle_sigtrap` 来处理新线程的创建:
```
case (SIGTRAP | (PTRACE_EVENT_CLONE << 8)):
//get the new thread ID
unsigned long event_message = 0;
ptrace(PTRACE_GETEVENTMSG, pid, nullptr, message);
//handle creation
//...
```
一旦收到了,你可以看看 `/proc/<pid>/task/` 并查看内存映射之类来获得所需的所有信息。
GDB 使用 `libthread_db`,它提供了一堆帮助函数,这样你就不需要自己解析和处理。设置这个库很奇怪,我不会在这展示它如何工作,但如果你想使用它,你可以去阅读[这个教程][14]。
多线程支持中最复杂的部分是调试器中线程状态的建模,特别是如果你希望支持[不间断模式][15]或当你计算中涉及不止一个 CPU 的某种异构调试。
* * *
### 最后!
呼!这个系列花了很长时间才写完,但是我在这个过程中学到了很多东西,我希望它是有帮助的。如果你聊有关调试或本系列中的任何问题,请在 Twitter [@TartanLlama][16]或评论区联系我。如果你有想看到的其他任何调试主题,让我知道我或许会再发其他的文章。
--------------------------------------------------------------------------------
via: https://blog.tartanllama.xyz/writing-a-linux-debugger-advanced-topics/
作者:[Simon Brand ][a]
译者:[geekpi](https://github.com/geekpi)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://www.twitter.com/TartanLlama
[1]:https://blog.tartanllama.xyz/writing-a-linux-debugger-setup/
[2]:https://blog.tartanllama.xyz/writing-a-linux-debugger-breakpoints/
[3]:https://blog.tartanllama.xyz/writing-a-linux-debugger-registers/
[4]:https://blog.tartanllama.xyz/writing-a-linux-debugger-elf-dwarf/
[5]:https://blog.tartanllama.xyz/writing-a-linux-debugger-source-signal/
[6]:https://blog.tartanllama.xyz/writing-a-linux-debugger-dwarf-step/
[7]:https://blog.tartanllama.xyz/writing-a-linux-debugger-source-break/
[8]:https://blog.tartanllama.xyz/writing-a-linux-debugger-unwinding/
[9]:https://blog.tartanllama.xyz/writing-a-linux-debugger-variables/
[10]:https://blog.tartanllama.xyz/writing-a-linux-debugger-advanced-topics/
[11]:https://sourceware.org/gdb/onlinedocs/gdb/Remote-Protocol.html
[12]:https://github.com/TartanLlama/dltrace
[13]:http://eli.thegreenplace.net/tag/code-generation
[14]:http://timetobleed.com/notes-about-an-odd-esoteric-yet-incredibly-useful-library-libthread_db/
[15]:https://sourceware.org/gdb/onlinedocs/gdb/Non_002dStop-Mode.html
[16]:https://twitter.com/TartanLlama