TranslateProject/sources/tech/20180104 How does gdb call functions.md
2018-01-13 21:33:58 +08:00

11 KiB
Raw Blame History

How does gdb call functions?

(previous gdb posts: how does gdb work? (2016) and three things you can do with gdb (2014))

I discovered this week that you can call C functions from gdb! I thought this was cool because Id previously thought of gdb as mostly a read-only debugging tool.

I was really surprised by that (how does that WORK??). As I often do, I asked on Twitter how that even works, and I got a lot of really useful answers! My favorite answer was Evan Klitzkes example C code showing a way to do it. Code that  works  is very exciting!

I believe (through some stracing & experiments) that that example C code is different from how gdb actually calls functions, so Ill talk about what Ive figured out about what gdb does in this post and how Ive figured it out.

There is a lot I still dont know about how gdb calls functions, and very likely some things in here are wrong.

What does it mean to call a C function from gdb?

Before I get into how this works, lets talk quickly about why I found it surprising / nonobvious.

So, you have a running C program (the “target program”). You want to run a function from it. To do that, you need to basically:

  • pause the program (because it is already running code!)

  • find the address of the function you want to call (using the symbol table)

  • convince the program (the “target program”) to jump to that address

  • when the function returns, restore the instruction pointer and registers to what they were before

Using the symbol table to figure out the address of the function you want to call is pretty straightforward heres some sketchy (but working!) Rust code that Ive been using on Linux to do that. This code uses the elf crate. If I wanted to find the address of the foo function in PID 2345, Id run elf_symbol_value("/proc/2345/exe", "foo").

fn elf_symbol_value(file_name: &str, symbol_name: &str) -> Result<u64, Box<std::error::Error>> {
    // open the ELF file
    let file = elf::File::open_path(file_name).ok().ok_or("parse error")?;
    // loop over all the sections & symbols until you find the right one!
    let sections = &file.sections;
    for s in sections {
        for sym in file.get_symbols(&s).ok().ok_or("parse error")? {
            if sym.name == symbol_name {
                return Ok(sym.value);
            }
        }
    }
    None.ok_or("No symbol found")?
}

This wont totally work on its own, you also need to look at the memory maps of the file and add the symbol offset to the start of the place that file is mapped. But finding the memory maps isnt so hard, theyre in /proc/PID/maps.

Anyway, this is all to say that finding the address of the function to call seemed straightforward to me but that the rest of it (change the instruction pointer? restore the registers? what else?) didnt seem so obvious!

You cant just jump

I kind of said this already but you cant just find the address of the function you want to run and then jump to that address. I tried that in gdb (jump foo) and the program segfaulted. Makes sense!

How you can call C functions from gdb

First, lets see that this is possible. I wrote a tiny C program that sleeps for 1000 seconds and called it test.c:

#include <unistd.h>

int foo() {
    return 3;
}
int main() {
    sleep(1000);
}

Next, compile and run it:

$ gcc -o test  test.c
$ ./test

Finally, lets attach to the test program with gdb:

$ sudo gdb -p $(pgrep -f test)
(gdb) p foo()
$1 = 3
(gdb) quit

So I ran p foo() and it ran the function! Thats fun.

Why is this useful?

a few possible uses for this:

  • it lets you treat gdb a little bit like a C REPL, which is fun and I imagine could be useful for development

  • utility functions to display / navigate complex data structures quickly while debugging in gdb (thanks @invalidop)

  • set an arbitrary processs namespace while its running (featuring a not-so-surprising appearance from my colleague nelhage!)

  • probably more that I dont know about

How it works

I got a variety of useful answers on Twitter when I asked how calling functions from gdb works! A lot of them were like “well you get the address of the function from the symbol table” but that is not the whole story!!

One person pointed me to this nice 2 part series on how gdb works that theyd written: Debugging with the natives, part 1 and Debugging with the natives, part 2. Part 1 explains approximately how calling functions works (or could work figuring out what gdb actually does isnt trivial, but Ill try my best!).

The steps outlined there are:

  1. Stop the process

  2. Create a new stack frame (far away from the actual stack)

  3. Save all the registers

  4. Set the registers to the arguments you want to call your function with

  5. Set the stack pointer to the new stack frame

  6. Put a trap instruction somewhere in memory

  7. Set the return address to that trap instruction

  8. Set the instruction pointer register to the address of the function you want to call

  9. Start the process again!

Im not going to go through how gdb does all of these (I dont know!) but here are a few things Ive learned about the various pieces this evening.

Create a stack frame

If youre going to run a C function, most likely it needs a stack to store variables on! You definitely dont want it to clobber your current stack. Concretely before gdb calls your function (by setting the instruction pointer to it and letting it go), it needs to set the stack pointer to… something.

There was some speculation on Twitter about how this works:

i think it constructs a new stack frame for the call right on top of the stack where youre sitting!

and:

Are you certain it does that? It could allocate a pseudo stack, then temporarily change sp value to that location. You could try, put a breakpoint there and look at the sp register address, see if its contiguous to your current program register?

I did an experiment where (inside gdb) I ran:`

(gdb) p $rsp
$7 = (void *) 0x7ffea3d0bca8
(gdb) break foo
Breakpoint 1 at 0x40052a
(gdb) p foo()
Breakpoint 1, 0x000000000040052a in foo ()
(gdb) p $rsp
$8 = (void *) 0x7ffea3d0bc00

This seems in line with the “gdb constructs a new stack frame for the call right on top of the stack where youre sitting” theory, since the stack pointer ($rsp) goes from being ...bca8 to ..bc00 – stack pointers grow downward, so a bc00stack pointer is after a bca8 pointer. Interesting!

So it seems like gdb just creates the new stack frames right where you are. Thats a bit surprising to me!

change the instruction pointer

Lets see whether gdb changes the instruction pointer!

(gdb) p $rip
$1 = (void (*)()) 0x7fae7d29a2f0 <__nanosleep_nocancel+7>
(gdb) b foo
Breakpoint 1 at 0x40052a
(gdb) p foo()
Breakpoint 1, 0x000000000040052a in foo ()
(gdb) p $rip
$3 = (void (*)()) 0x40052a <foo+4>

It does! The instruction pointer changes from 0x7fae7d29a2f0 to 0x40052a (the address of the foo function).

I stared at the strace output and I still dont understand how it changes, but thats okay.

aside: how breakpoints are set!!

Above I wrote break foo. I straced gdb while running all of this and understood almost nothing but I found ONE THING that makes sense to me!!

Here are some of the system calls that gdb uses to set a breakpoint. Its really simple! It replaces one instruction with cc (which https://defuse.ca/online-x86-assembler.htm tells me means int3 which means send SIGTRAP), and then once the program is interrupted, it puts the instruction back the way it was.

I was putting a breakpoint on a function foo with the address 0x400528.

This PTRACE_POKEDATA is how gdb changes the code of running programs.

// change the 0x400528 instructions
25622 ptrace(PTRACE_PEEKTEXT, 25618, 0x400528, [0x5d00000003b8e589]) = 0
25622 ptrace(PTRACE_POKEDATA, 25618, 0x400528, 0x5d00000003cce589) = 0
// start the program running
25622 ptrace(PTRACE_CONT, 25618, 0x1, SIG_0) = 0
// get a signal when it hits the breakpoint
25622 ptrace(PTRACE_GETSIGINFO, 25618, NULL, {si_signo=SIGTRAP, si_code=SI_KERNEL, si_value={int=-1447215360, ptr=0x7ffda9bd3f00}}) = 0
// change the 0x400528 instructions back to what they were before
25622 ptrace(PTRACE_PEEKTEXT, 25618, 0x400528, [0x5d00000003cce589]) = 0
25622 ptrace(PTRACE_POKEDATA, 25618, 0x400528, 0x5d00000003b8e589) = 0

put a trap instruction somewhere

When gdb runs a function, it also puts trap instructions in a bunch of places! Heres one of them (per strace). Its basically replacing one instruction with cc (int3).

5908  ptrace(PTRACE_PEEKTEXT, 5810, 0x7f6fa7c0b260, [0x48f389fd89485355]) = 0
5908  ptrace(PTRACE_PEEKTEXT, 5810, 0x7f6fa7c0b260, [0x48f389fd89485355]) = 0
5908 ptrace(PTRACE_POKEDATA, 5810, 0x7f6fa7c0b260, 0x48f389fd894853cc) = 0

What0x7f6fa7c0b260? Well, I looked in the processs memory maps, and it turns its somewhere in /lib/x86_64-linux-gnu/libc-2.23.so. Thats weird! Why is gdb putting trap instructions in libc?

Well, lets see what function thats in. It turns out it__libc_siglongjmp. The other functions gdb is putting traps in are __longjmp____longjmp_chkdl_main, and _dl_close_worker.

Why? I dont know! Maybe for some reason when our function foo() returns, its calling longjmp, and that is how gdb gets control back? Im not sure.

how gdb calls functions is complicated!

Im going to stop there (its 1am!), but now I know a little more!

It seems like the answer to “how does gdb call a function?” is definitely not that simple. I found it interesting to try to figure a little bit of it out and hopefully you have too!

I still have a lot of unanswered questions about how exactly gdb does all of these things, but thats okay. I dont really need to know the details of how this works and Im happy to have a slightly improved understanding.


via: https://jvns.ca/blog/2018/01/04/how-does-gdb-call-functions/

作者:Julia Evans 译者:译者ID 校对:校对者ID

本文由 LCTT 原创编译,Linux中国 荣誉推出