mirror of
https://github.com/LCTT/TranslateProject.git
synced 2024-12-26 21:30:55 +08:00
commit
9de8e717d7
Writing a Linux Debugger Part 9: Handling variables
============================================================

Variables are sneaky. At one moment they’ll be happily sitting in registers, but as soon as you turn your head they’re spilled to the stack. Maybe the compiler completely throws them out of the window for the sake of optimization. Regardless of how often variables move around in memory, we need some way to track and manipulate them in our debugger. This post will teach you more about handling variables in your debugger and demonstrate a simple implementation using `libelfin`.

* * *

### Series index

1. [Setup][1]
2. [Breakpoints][2]
3. [Registers and memory][3]
4. [Elves and dwarves][4]
5. [Source and signals][5]
6. [Source-level stepping][6]
7. [Source-level breakpoints][7]
8. [Stack unwinding][8]
9. [Handling variables][9]
10. [Advanced topics][10]

* * *

Before you get started, make sure that the version of `libelfin` you are using is the [`fbreg` branch of my fork][11]. This contains some hacks to support getting the base of the current stack frame and evaluating location lists, neither of which is supported by vanilla `libelfin`. You might need to pass `-gdwarf-2` to GCC to get it to generate compatible DWARF information. But before we get into the implementation, I’ll give a more detailed description of how locations are encoded in DWARF 5, which is the most recent specification. If you want more information than what I write here, you can grab the standard from [here][12].

### DWARF locations

The location of a variable in memory at a given moment is encoded in the DWARF information using the `DW_AT_location` attribute. Location descriptions can be simple location descriptions, composite location descriptions, or location lists.

* Simple location descriptions describe the location of one contiguous piece (usually all) of an object. A simple location description may describe a location in addressable memory, or in a register, or the lack of a location (with or without a known value).
    * Example:
        * `DW_OP_fbreg -32`
        * A variable which is entirely stored at an offset of -32 bytes from the stack frame base
* Composite location descriptions describe an object in terms of pieces, each of which may be contained in part of a register or stored in a memory location unrelated to other pieces.
    * Example:
        * `DW_OP_reg3 DW_OP_piece 4 DW_OP_reg10 DW_OP_piece 2`
        * A variable whose first four bytes reside in register 3 and whose next two bytes reside in register 10.
* Location lists describe objects which have a limited lifetime or change location during their lifetime.
    * Example:
        * `<loclist with 3 entries follows>`
            * `[ 0]<lowpc=0x2e00><highpc=0x2e19>DW_OP_reg0`
            * `[ 1]<lowpc=0x2e19><highpc=0x2e3f>DW_OP_reg3`
            * `[ 2]<lowpc=0x2ec4><highpc=0x2ec7>DW_OP_reg2`
        * A variable whose location moves between registers depending on the current value of the program counter

The `DW_AT_location` attribute is encoded in one of three different ways, depending on the kind of location description. `exprloc`s encode simple and composite location descriptions. They consist of a byte length followed by a DWARF expression or location description. `loclist`s and `loclistptr`s encode location lists. They give indexes or offsets into the `.debug_loclists` section, which describes the actual location lists.

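To make the location list example above concrete, here is a minimal sketch of how a debugger could pick the active entry for a given program counter. The `loc_entry` struct and `find_location` function are hypothetical names of my own, not part of `libelfin`, and the model is simplified so that each entry only names a single register. It assumes the DWARF convention that each range includes `lowpc` and excludes `highpc`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of one location-list entry: the variable
// lives in DWARF register `reg` while lowpc <= pc < highpc.
struct loc_entry {
    std::uint64_t lowpc;
    std::uint64_t highpc;
    unsigned reg;
};

// Returns the entry that covers `pc`, or nullptr if the variable has no
// location at that point in the program.
const loc_entry* find_location(const std::vector<loc_entry>& list,
                               std::uint64_t pc) {
    for (const auto& entry : list) {
        if (pc >= entry.lowpc && pc < entry.highpc) {
            return &entry;
        }
    }
    return nullptr;
}

// The three entries from the location list example.
const std::vector<loc_entry> example_list{
    {0x2e00, 0x2e19, 0},
    {0x2e19, 0x2e3f, 3},
    {0x2ec4, 0x2ec7, 2},
};
```

With this list, a program counter of `0x2e19` falls into the second entry (register 3), while `0x2e50` matches nothing, which is how a debugger can tell that the variable is not available at that point.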
### DWARF Expressions

The actual location of the variables is computed using DWARF expressions. These consist of a series of operations which operate on a stack of values. There are an impressive number of DWARF operations available, so I won’t explain them all in detail. Instead I’ll give a few examples from each class of expression to give you a taste of what is available. Also, don’t get scared off by these; `libelfin` will take care of all of this complexity for us.

* Literal encodings
    * `DW_OP_lit0`, `DW_OP_lit1`, …, `DW_OP_lit31`
        * Push the literal value on to the stack
    * `DW_OP_addr <addr>`
        * Pushes the address operand on to the stack
    * `DW_OP_constu <unsigned>`
        * Pushes the unsigned value on to the stack
* Register values
    * `DW_OP_fbreg <offset>`
        * Pushes the address found by taking the base of the stack frame and adding the given signed offset
    * `DW_OP_breg0`, `DW_OP_breg1`, …, `DW_OP_breg31 <offset>`
        * Pushes the contents of the given register plus the given offset to the stack
* Stack operations
    * `DW_OP_dup`
        * Duplicate the value at the top of the stack
    * `DW_OP_deref`
        * Treats the top of the stack as a memory address, and replaces it with the contents of that address
* Arithmetic and logical operations
    * `DW_OP_and`
        * Pops the top two values from the stack and pushes back the bitwise `AND` of them
    * `DW_OP_plus`
        * Same as `DW_OP_and`, but adds the values
* Control flow operations
    * `DW_OP_le`, `DW_OP_eq`, `DW_OP_gt`, etc.
        * Pops the top two values, compares them, and pushes `1` if the condition is true and `0` otherwise
    * `DW_OP_bra <offset>`
        * Conditional branch: if the top of the stack is not `0`, skips back or forward in the expression by `offset`
* Type conversions
    * `DW_OP_convert <DIE offset>`
        * Converts value on the top of the stack to a different type, which is described by the DWARF information entry at the given offset
* Special operations
    * `DW_OP_nop`
        * Do nothing!

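To give a feel for how these operations combine, here is a toy stack machine that handles a handful of them. It is only a sketch: the `op`, `instr` and `evaluate` names are hypothetical, operands are stored already decoded rather than in DWARF's byte encoding, and the real evaluation in `libelfin` covers far more operations.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical miniature of a DWARF expression evaluator.
enum class op { lit, plus, and_, dup, fbreg };

struct instr {
    op code;
    std::int64_t operand; // literal value or frame-base offset; unused otherwise
};

std::uint64_t evaluate(const std::vector<instr>& expr, std::uint64_t frame_base) {
    std::vector<std::uint64_t> stack;
    for (const auto& in : expr) {
        switch (in.code) {
        case op::lit:
            stack.push_back(static_cast<std::uint64_t>(in.operand));
            break;
        case op::fbreg:
            // Frame base plus a signed offset: an address, not a loaded value.
            stack.push_back(frame_base + static_cast<std::uint64_t>(in.operand));
            break;
        case op::dup:
            assert(!stack.empty());
            stack.push_back(stack.back());
            break;
        case op::plus:
        case op::and_: {
            // Binary operations pop two values and push one result.
            assert(stack.size() >= 2);
            auto rhs = stack.back(); stack.pop_back();
            auto lhs = stack.back(); stack.pop_back();
            stack.push_back(in.code == op::plus ? lhs + rhs : (lhs & rhs));
            break;
        }
        }
    }
    assert(!stack.empty());
    return stack.back(); // the result is whatever is left on top
}
```

In this model, `DW_OP_fbreg -32` with a frame base of `0x1000` evaluates to the address `0xfe0`; reading the variable's value from that address is a separate step.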
### DWARF types

DWARF’s representation of types needs to be strong enough to give debugger users useful variable representations. Users most often want to be able to debug at the level of their application rather than at the level of their machine, and they need a good idea of what their variables are doing to achieve that.

DWARF types are encoded in DIEs along with the majority of the other debug information. They can have attributes to indicate their name, encoding, size, endianness, etc. A myriad of type tags are available to express pointers, arrays, structures, typedefs, and anything else you could see in a C or C++ program.

Take this simple structure as an example:

```
struct test{
    int i;
    float j;
    int k[42];
    test* next;
};
```

The parent DIE for this struct is this:

```
< 1><0x0000002a>    DW_TAG_structure_type
                      DW_AT_name                  "test"
                      DW_AT_byte_size             0x000000b8
                      DW_AT_decl_file             0x00000001 test.cpp
                      DW_AT_decl_line             0x00000001
```

The above says that we have a structure called `test` of size `0xb8`, declared at line `1` of `test.cpp`. There are then several child DIEs which describe the members.

```
< 2><0x00000032>      DW_TAG_member
                        DW_AT_name                  "i"
                        DW_AT_type                  <0x00000063>
                        DW_AT_decl_file             0x00000001 test.cpp
                        DW_AT_decl_line             0x00000002
                        DW_AT_data_member_location  0
< 2><0x0000003e>      DW_TAG_member
                        DW_AT_name                  "j"
                        DW_AT_type                  <0x0000006a>
                        DW_AT_decl_file             0x00000001 test.cpp
                        DW_AT_decl_line             0x00000003
                        DW_AT_data_member_location  4
< 2><0x0000004a>      DW_TAG_member
                        DW_AT_name                  "k"
                        DW_AT_type                  <0x00000071>
                        DW_AT_decl_file             0x00000001 test.cpp
                        DW_AT_decl_line             0x00000004
                        DW_AT_data_member_location  8
< 2><0x00000056>      DW_TAG_member
                        DW_AT_name                  "next"
                        DW_AT_type                  <0x00000084>
                        DW_AT_decl_file             0x00000001 test.cpp
                        DW_AT_decl_line             0x00000005
                        DW_AT_data_member_location  176(as signed = -80)
```

Each member has a name, a type (which is a DIE offset), a declaration file and line, and a byte offset into the structure where the member is located. The types which are pointed to come next.

```
< 1><0x00000063>    DW_TAG_base_type
                      DW_AT_name                  "int"
                      DW_AT_encoding              DW_ATE_signed
                      DW_AT_byte_size             0x00000004
< 1><0x0000006a>    DW_TAG_base_type
                      DW_AT_name                  "float"
                      DW_AT_encoding              DW_ATE_float
                      DW_AT_byte_size             0x00000004
< 1><0x00000071>    DW_TAG_array_type
                      DW_AT_type                  <0x00000063>
< 2><0x00000076>      DW_TAG_subrange_type
                        DW_AT_type                  <0x0000007d>
                        DW_AT_count                 0x0000002a
< 1><0x0000007d>    DW_TAG_base_type
                      DW_AT_name                  "sizetype"
                      DW_AT_byte_size             0x00000008
                      DW_AT_encoding              DW_ATE_unsigned
< 1><0x00000084>    DW_TAG_pointer_type
                      DW_AT_type                  <0x0000002a>
```

As you can see, `int` on my laptop is a 4-byte signed integer type, and `float` is a 4-byte float. The integer array type is defined by pointing to the `int` type as its element type and a `sizetype` (think `size_t`) as its index type, with `0x2a` (42) elements. The `test*` type is a `DW_TAG_pointer_type` which references the `test` DIE.

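You can cross-check the offsets in the dump against what the compiler reports. The sketch below rebuilds the same struct and collects each member's offset using `offsetof`; the `member_info` and `layout_of_test` names are hypothetical helpers for illustration. The values in the comments assume a typical 64-bit Linux target with 4-byte `int` and `float` and 8-byte pointers, as in the dump above; other platforms may differ. The `as signed = -80` note next to offset 176 appears to be the dump tool showing the same byte (`0xb0`) read as a signed number, not a real negative offset.

```cpp
#include <cstddef>
#include <vector>

struct test {
    int i;
    float j;
    int k[42];
    test* next;
};

// Hypothetical helper type: a member name paired with its byte offset.
struct member_info {
    const char* name;
    std::size_t offset;
};

// Collects the offsets that DW_AT_data_member_location should describe.
std::vector<member_info> layout_of_test() {
    return {
        {"i", offsetof(test, i)},       // 0 on the assumed target
        {"j", offsetof(test, j)},       // 4
        {"k", offsetof(test, k)},       // 8
        {"next", offsetof(test, next)}, // 176: 8 + 42 * 4
    };
}
```

On the assumed target `sizeof(test)` is `0xb8` (184) bytes, which matches `DW_AT_byte_size` in the parent DIE: the pointer at offset 176 takes up the final 8 bytes.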
* * *

### Implementing a simple variable reader

As mentioned, `libelfin` will deal with most of the complexity for us. However, it doesn’t implement all of the different methods for representing variable locations, and handling a lot of them in our code would get pretty complex. As such, I’ve chosen to only support `exprloc`s for now. Feel free to add support for more types of expression. If you’re really feeling brave, submit some patches to `libelfin` to help complete the necessary support!

Handling variables mostly comes down to locating the different parts in memory or registers; after that, reading or writing works the same way as you’ve seen before. I’ll only show you how to implement reading for the sake of simplicity.

First we need to tell `libelfin` how to read registers from our process. We do this by creating a class which inherits from `expr_context` and uses `ptrace` to handle everything:

```
#include <sys/ptrace.h>
#include <sys/user.h>   // user_regs_struct

class ptrace_expr_context : public dwarf::expr_context {
public:
    ptrace_expr_context (pid_t pid) : m_pid{pid} {}

    // Value of the given DWARF register in the traced process
    dwarf::taddr reg (unsigned regnum) override {
        return get_register_value_from_dwarf_register(m_pid, regnum);
    }

    // Current program counter of the traced process
    dwarf::taddr pc() override {
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, m_pid, nullptr, &regs);
        return regs.rip;
    }

    // Contents of the traced process's memory at the given address
    dwarf::taddr deref_size (dwarf::taddr address, unsigned size) override {
        //TODO take into account size
        return ptrace(PTRACE_PEEKDATA, m_pid, address, nullptr);
    }

private:
    pid_t m_pid;
};
```

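The `TODO` in `deref_size` is there because `PTRACE_PEEKDATA` always reads a whole word. One possible way to honour `size`, sketched here as a standalone helper, is to mask the word down to its lowest `size` bytes. The `truncate_to_size` name is hypothetical, and the sketch assumes a little-endian target such as x86-64, where the bytes at the lowest addresses are the least significant ones.

```cpp
#include <cstdint>

// Keeps only the lowest `size` bytes of `word`, zero-extending the rest.
// Hypothetical helper; assumes the word was read from a little-endian target.
std::uint64_t truncate_to_size(std::uint64_t word, unsigned size) {
    if (size >= sizeof(word)) {
        return word; // nothing to cut off, and avoids shifting by 64 bits
    }
    const std::uint64_t mask = (std::uint64_t{1} << (size * 8)) - 1;
    return word & mask;
}
```

Inside `deref_size`, this could be applied to the result of the `ptrace` call before returning it.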
The reading will be handled by a `read_variables` function in our `debugger` class:

```
void debugger::read_variables() {
    using namespace dwarf;

    auto func = get_function_from_pc(get_pc());

    //...
}
```

The first thing we do above is find the function which we’re currently in. Then we need to loop through the entries in that function, looking for variables:

```
    for (const auto& die : func) {
        if (die.tag == DW_TAG::variable) {
            //...
        }
    }
```

We get the location information by looking up the `DW_AT_location` entry in the DIE:

```
            auto loc_val = die[DW_AT::location];
```

Then we ensure that it’s an `exprloc` and ask `libelfin` to evaluate the expression for us:

```
            if (loc_val.get_type() == value::type::exprloc) {
                ptrace_expr_context context {m_pid};
                auto result = loc_val.as_exprloc().evaluate(&context);
```

Now that we’ve evaluated the expression, we need to read the contents of the variable. It could be in memory or a register, so we’ll handle both cases:

```
                switch (result.location_type) {
                case expr_result::type::address:
                {
                    // The variable lives in memory: read it from the evaluated address
                    auto value = read_memory(result.value);
                    std::cout << at_name(die) << " (0x" << std::hex << result.value << ") = "
                              << value << std::endl;
                    break;
                }

                case expr_result::type::reg:
                {
                    // The variable lives in a register: result.value is the DWARF register number
                    auto value = get_register_value_from_dwarf_register(m_pid, result.value);
                    std::cout << at_name(die) << " (reg " << result.value << ") = "
                              << value << std::endl;
                    break;
                }

                default:
                    throw std::runtime_error{"Unhandled variable location"};
                }
```

As you can see, I’ve simply printed out the value without interpreting it based on the type of the variable. Hopefully from this code you can see how you could support writing variables, or searching for variables with a given name.

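As a starting point for interpreting values, here is a sketch that formats a raw 64-bit word according to a base type's encoding and size, the two pieces of information that `DW_AT_encoding` and `DW_AT_byte_size` provide. The `encoding` enum and `format_value` function are hypothetical and not part of `libelfin`; only 4-byte and 8-byte types are handled, and a little-endian target is assumed.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for the DW_ATE_* encodings of a base type.
enum class encoding { signed_int, unsigned_int, floating_point };

std::string format_value(std::uint64_t raw, encoding enc, unsigned byte_size) {
    // When the word was read from a little-endian target, a 4-byte value
    // sits in its low 32 bits.
    const std::uint32_t low = static_cast<std::uint32_t>(raw);

    switch (enc) {
    case encoding::signed_int:
        if (byte_size == 4) {
            std::int32_t v;
            std::memcpy(&v, &low, sizeof v); // reinterpret the bits as signed
            return std::to_string(v);
        } else {
            std::int64_t v;
            std::memcpy(&v, &raw, sizeof v);
            return std::to_string(v);
        }
    case encoding::unsigned_int:
        return std::to_string(byte_size == 4 ? std::uint64_t{low} : raw);
    case encoding::floating_point:
        if (byte_size == 4) {
            float v;
            std::memcpy(&v, &low, sizeof v); // reinterpret the bits as a float
            return std::to_string(v);
        } else {
            double v;
            std::memcpy(&v, &raw, sizeof v);
            return std::to_string(v);
        }
    }
    return "<unsupported>";
}
```

For example, the raw word `0xffffffff` formats as `-1` for a 4-byte signed type but as `4294967295` for a 4-byte unsigned one, which is exactly the distinction the raw printout above cannot make.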
Finally we can add this to our command parser:

```
else if(is_prefix(command, "variables")) {
    read_variables();
}
```

### Testing it out

Write a few small functions which have some variables, compile them without optimizations and with debug info, then see if you can read the values of your variables. Try writing to the memory address where a variable is stored and see the behaviour of the program change.

* * *

Nine posts down, one to go! Next time I’ll be talking about some more advanced concepts which might interest you. For now you can find the code for this post [here][13].

--------------------------------------------------------------------------------

via: https://blog.tartanllama.xyz/writing-a-linux-debugger-variables/

Author: [Simon Brand][a]
Translator: [geekpi](https://github.com/geekpi)
Proofreader: [proofreader ID](https://github.com/校对者ID)

This article was originally compiled by [LCTT](https://github.com/LCTT/TranslateProject) and is proudly presented by [Linux China](https://linux.cn/)

[a]:https://www.twitter.com/TartanLlama
[1]:https://blog.tartanllama.xyz/writing-a-linux-debugger-setup/
[2]:https://blog.tartanllama.xyz/writing-a-linux-debugger-breakpoints/
[3]:https://blog.tartanllama.xyz/writing-a-linux-debugger-registers/
[4]:https://blog.tartanllama.xyz/writing-a-linux-debugger-elf-dwarf/
[5]:https://blog.tartanllama.xyz/writing-a-linux-debugger-source-signal/
[6]:https://blog.tartanllama.xyz/writing-a-linux-debugger-dwarf-step/
[7]:https://blog.tartanllama.xyz/writing-a-linux-debugger-source-break/
[8]:https://blog.tartanllama.xyz/writing-a-linux-debugger-unwinding/
[9]:https://blog.tartanllama.xyz/writing-a-linux-debugger-variables/
[10]:https://blog.tartanllama.xyz/writing-a-linux-debugger-advanced-topics/
[11]:https://github.com/TartanLlama/libelfin/tree/fbreg
[12]:http://dwarfstd.org/
[13]:https://github.com/TartanLlama/minidbg/tree/tut_variable