TranslateProject/sources/tech/20160810 How does gdb work.md
2018-03-04 16:38:21 +08:00

10 KiB
Raw Blame History

translating by ucasFL

How does gdb work?

Hello! Today I was working a bit on my ruby stacktrace project and I realized that now I know a couple of things about how gdb works internally.

Lately Ive been using gdb to look at Ruby programs, so were going to be running gdb on a Ruby program. This really means the Ruby interpreter. First, were going to print out the address of a global variable: ruby_current_thread:

getting a global variable

Heres how to get the address of the global ruby_current_thread:

$ sudo gdb -p 2983
(gdb) p & ruby_current_thread
$2 = (rb_thread_t **) 0x5598a9a8f7f0 <ruby_current_thread>

There are a few places a variable can live: on the heap, the stack, or in your programs text. Global variables are part of your program! You can think of them as being allocated at compile time, kind of. It turns out we can figure out the address of a global variable pretty easily! Lets see how gdb came up with 0x5598a9a8f7f0.

We can find the approximate region this variable lives in by looking at a cool file in /proc called /proc/$pid/maps.

$ sudo cat /proc/2983/maps | grep bin/ruby
5598a9605000-5598a9886000 r-xp 00000000 00:32 323508                     /home/bork/.rbenv/versions/2.1.6/bin/ruby
5598a9a86000-5598a9a8b000 r--p 00281000 00:32 323508                     /home/bork/.rbenv/versions/2.1.6/bin/ruby
5598a9a8b000-5598a9a8d000 rw-p 00286000 00:32 323508                     /home/bork/.rbenv/versions/2.1.6/bin/ruby

So! Theres this starting address 5598a9605000 Thatlike  0x5598a9a8f7f0, but different. How different? Well, heres what I get when I subtract them:

(gdb) p/x 0x5598a9a8f7f0 - 0x5598a9605000
$4 = 0x48a7f0

“Whats that number?”, you might ask? WELL. Lets look at the symbol tablefor our program with nm.

sudo nm /proc/2983/exe | grep ruby_current_thread
000000000048a7f0 b ruby_current_thread

Whats that we see? Could it be 0x48a7f0? Yes it is! So!! If we want to find the address of a global variable in our program, all we need to do is look up the name of the variable in the symbol table, and then add that to the start of the range in /proc/whatever/maps, and were done!

So now we know how gdb does that. But gdb does so much more!! Lets skip ahead to…

dereferencing pointers

(gdb) p ruby_current_thread
$1 = (rb_thread_t *) 0x5598ab3235b0

The next thing were going to do is dereference that ruby_current_threadpointer. We want to see whats in that address! To do that, gdb will run a bunch of system calls like this:

ptrace(PTRACE_PEEKTEXT, 2983, 0x5598a9a8f7f0, [0x5598ab3235b0]) = 0

You remember this address 0x5598a9a8f7f0? gdb is asking “hey, whats in that address exactly”? 2983 is the PID of the process were running gdb on. Its using the ptrace system call which is how gdb does everything.

Awesome! So we can dereference memory and figure out what bytes are at what memory addresses. Some useful gdb commands to know here are x/40w variable and x/40b variable which will display 40 words / bytes at a given address, respectively.

describing structs

The memory at an address looks like this. A bunch of bytes!

(gdb) x/40b ruby_current_thread
0x5598ab3235b0:	16	-90	55	-85	-104	85	0	0
0x5598ab3235b8:	32	47	50	-85	-104	85	0	0
0x5598ab3235c0:	16	-64	-55	115	-97	127	0	0
0x5598ab3235c8:	0	0	2	0	0	0	0	0
0x5598ab3235d0:	-96	-83	-39	115	-97	127	0	0

Thats useful, but not that useful! If you are a human like me and want to know what it MEANS, you need more. Like this:

(gdb) p *(ruby_current_thread)
$8 = {self = 94114195940880, vm = 0x5598ab322f20, stack = 0x7f9f73c9c010,
	stack_size = 131072, cfp = 0x7f9f73d9ada0, safe_level = 0,    raised_flag = 0,
	last_status = 8, state = 0, waiting_fd = -1, passed_block = 0x0,
	passed_bmethod_me = 0x0, passed_ci = 0x0,    top_self = 94114195612680,
	top_wrapper = 0, base_block = 0x0, root_lep = 0x0, root_svar = 8, thread_id =
	140322820187904,

GOODNESS. That is a lot more useful. How does gdb know that there are all these cool fields like stack_size? Enter DWARF. DWARF is a way to store extra debugging data about your program, so that debuggers like gdb can do their job better! Its generally stored as part of a binary. If I run dwarfdump on my Ruby binary, I get some output like this:

(Ive redacted it heavily to make it easier to understand)

DW_AT_name                  "rb_thread_struct"
DW_AT_byte_size             0x000003e8
DW_TAG_member
  DW_AT_name                  "self"
  DW_AT_type                  <0x00000579>
  DW_AT_data_member_location  DW_OP_plus_uconst 0
DW_TAG_member
  DW_AT_name                  "vm"
  DW_AT_type                  <0x0000270c>
  DW_AT_data_member_location  DW_OP_plus_uconst 8
DW_TAG_member
  DW_AT_name                  "stack"
  DW_AT_type                  <0x000006b3>
  DW_AT_data_member_location  DW_OP_plus_uconst 16
DW_TAG_member
  DW_AT_name                  "stack_size"
  DW_AT_type                  <0x00000031>
  DW_AT_data_member_location  DW_OP_plus_uconst 24
DW_TAG_member
  DW_AT_name                  "cfp"
  DW_AT_type                  <0x00002712>
  DW_AT_data_member_location  DW_OP_plus_uconst 32
DW_TAG_member
  DW_AT_name                  "safe_level"
  DW_AT_type                  <0x00000066>

So. The name of the type of ruby_current_thread is rb_thread_struct. It has size 0x3e8 (or 1000 bytes), and it has a bunch of member items. stack_size is one of them, at an offset of 24, and it has type 31. Whats 31? No worries! We can look that up in the DWARF info too!

< 1><0x00000031>    DW_TAG_typedef
                      DW_AT_name                  "size_t"
                      DW_AT_type                  <0x0000003c>
< 1><0x0000003c>    DW_TAG_base_type
                      DW_AT_byte_size             0x00000008
                      DW_AT_encoding              DW_ATE_unsigned
                      DW_AT_name                  "long unsigned int"

So! stack_size has type size_t, which means long unsigned int, and is 8 bytes. That means that we can read the stack size!

How that would break down, once we have the DWARF debugging data, is:

  1. Read the region of memory that ruby_current_thread is pointing to

  2. Add 24 bytes to get to stack_size

  3. Read 8 bytes (in little-endian format, since were on x86)

  4. Get the answer!

Which in this case is 131072 or 128 kb.

To me, this makes it a lot more obvious what debugging info is for – if we didnt have all this extra metadata about what all these variables meant, we would have no idea what the bytes at address 0x5598ab3235b0 meant.

This is also why you can install debug info for a program separately from your program gdb doesnt care where it gets the extra debug info from.

DWARF is confusing

Ive been reading a bunch of DWARF info recently. Right now Im using libdwarf which hasnt been the best experience the API is confusing, you initialize everything in a weird way, and its really slow (it takes 0.3 seconds to read all the debugging data out of my Ruby program which seems ridiculous). Ive been told that libdw from elfutils is better.

Also, I casually remarked that you can look at DW_AT_data_member_location to get the offset of a struct member! But I looked up on Stack Overflow how to actually do that and I got this answer. Basically you start with a check like:

dwarf_whatform(attrs[i], &form, &error);
    if (form == DW_FORM_data1 || form == DW_FORM_data2
        form == DW_FORM_data2 || form == DW_FORM_data4
        form == DW_FORM_data8 || form == DW_FORM_udata) {

and then it keeps GOING. Why are there 8 million different DW_FORM_data things I need to check for? What is happening? I have no idea.

Anyway my impression is that DWARF is a large and complicated standard (and possibly the libraries people use to generate DWARF are subtly incompatible?), but its what we have, so thats what we work with!

I think its really cool that I can write code that reads DWARF and my code actually mostly works. Except when it crashes. Im working on that.

unwinding stacktraces

In an earlier version of this post, I said that gdb unwinds stacktraces using libunwind. It turns out that this isnt true at all!

Someone whos worked on gdb a lot emailed me to say that they actually spent a ton of time figuring out how to unwind stacktraces so that they can do a better job than libunwind does. This means that if you get stopped in the middle of a weird program with less debug info than you might hope for thats done something strange with its stack, gdb will try to figure out where you are anyway. Thanks <3

other things gdb does

The few things Ive described here (reading memory, understanding DWARF to show you structs) arent everything gdb does just looking through Brendan Gregggdb example from yesterday, we see that gdb also knows how to

  • disassemble assembly

  • show you the contents of your registers

and in terms of manipulating your program, it can

  • set breakpoints and step through a program

  • modify memory (!! danger !!)

Knowing more about how gdb works makes me feel a lot more confident when using it! I used to get really confused because gdb kind of acts like a C REPL sometimes you type ruby_current_thread->cfp->iseq, and it feels like writing C code! But youre not really writing C at all, and it was easy for me to run into limitations in gdb and not understand why.

Knowing that its using DWARF to figure out the contents of the structs gives me a better mental model and have more correct expectations! Awesome.


via: https://jvns.ca/blog/2016/08/10/how-does-gdb-work/

作者: Julia Evans 译者:译者ID 校对:校对者ID

本文由 LCTT 原创编译,Linux中国 荣誉推出