mirror of
https://github.com/LCTT/TranslateProject.git
synced 2024-12-29 21:41:00 +08:00
335 lines
16 KiB
Markdown
335 lines
16 KiB
Markdown
|
[#]: subject: "Compiler optimization and its effect on debugger line information"
|
|||
|
[#]: via: "https://opensource.com/article/23/3/compiler-optimization-debugger-line-information"
|
|||
|
[#]: author: "Will Cohen https://opensource.com/users/wcohen"
|
|||
|
[#]: collector: "lkxed"
|
|||
|
[#]: translator: " "
|
|||
|
[#]: reviewer: " "
|
|||
|
[#]: publisher: " "
|
|||
|
[#]: url: " "
|
|||
|
|
|||
|
Compiler optimization and its effect on debugger line information
|
|||
|
======
|
|||
|
|
|||
|
In my [previous article][1], I described the DWARF information used to map regular and inlined functions between an executable binary and its source code. Functions can be dozens of lines, so you might like to know specifically where the processor is in your source code. The compiler includes information mapping between instructions and specific lines in the source code to provide a precise location. In this article, I describe line mapping information, and some of the issues caused by compiler optimizations.
|
|||
|
|
|||
|
Start with the same example code from the previous article:
|
|||
|
|
|||
|
```
|
|||
|
#include <stdlib.h>
|
|||
|
#include <stdio.h>
|
|||
|
|
|||
|
int a;
|
|||
|
double b;
|
|||
|
|
|||
|
int
|
|||
|
main(int argc, char* argv[])
|
|||
|
{
|
|||
|
a = atoi(argv[1]);
|
|||
|
b = atof(argv[2]);
|
|||
|
a = a + 1;
|
|||
|
b = b / 42.0;
|
|||
|
printf ("a = %d, b = %f\n", a, b);
|
|||
|
return 0;
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
The compiler only includes the line mapping information when the code is compiled with debugging information enabled (the `-g` option):
|
|||
|
|
|||
|
```
|
|||
|
$ gcc -O2 -g example.c -o example
|
|||
|
```
|
|||
|
|
|||
|
### Examining line number information
|
|||
|
|
|||
|
Line information is stored in a machine readable format, but human readable output can be generated with `llvm-objdump` or `odjdump`.
|
|||
|
|
|||
|
```
|
|||
|
$ llvm-objdump --line-numbers example
|
|||
|
```
|
|||
|
|
|||
|
For the main function, you get output listing the assembly code instruction with the file and line number associated with the instruction:
|
|||
|
|
|||
|
```
|
|||
|
0000000000401060 <main>:
|
|||
|
; main():
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:9
|
|||
|
401060: 53 pushq %rbx
|
|||
|
; /usr/include/stdlib.h:364
|
|||
|
401061: 48 8b 7e 08 movq 8(%rsi), %rdi
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:9
|
|||
|
401065: 48 89 f3 movq %rsi, %rbx
|
|||
|
; /usr/include/stdlib.h:364
|
|||
|
401068: ba 0a 00 00 00 movl $10, %edx
|
|||
|
40106d: 31 f6 xorl %esi, %esi
|
|||
|
40106f: e8 dc ff ff ff callq 0x401050 <strtol@plt>
|
|||
|
; /usr/include/bits/stdlib-float.h:27
|
|||
|
401074: 48 8b 7b 10 movq 16(%rbx), %rdi
|
|||
|
401078: 31 f6 xorl %esi, %esi
|
|||
|
; /usr/include/stdlib.h:364
|
|||
|
40107a: 89 05 c8 2f 00 00 movl %eax, 12232(%rip) # 0x404048 <a>
|
|||
|
; /usr/include/bits/stdlib-float.h:27
|
|||
|
401080: e8 ab ff ff ff callq 0x401030 <strtod@plt>
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:12
|
|||
|
401085: 8b 05 bd 2f 00 00 movl 12221(%rip), %eax # 0x404048 <a>
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:14
|
|||
|
40108b: bf 10 20 40 00 movl $4202512, %edi # imm = 0x402010
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:13
|
|||
|
401090: f2 0f 5e 05 88 0f 00 00 divsd 3976(%rip), %xmm0 # 0x402020 <__dso_handle+0x18>
|
|||
|
401098: f2 0f 11 05 a0 2f 00 00 movsd %xmm0, 12192(%rip) # 0x404040 <b>
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:12
|
|||
|
4010a0: 8d 70 01 leal 1(%rax), %esi
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:14
|
|||
|
4010a3: b8 01 00 00 00 movl $1, %eax
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:12
|
|||
|
4010a8: 89 35 9a 2f 00 00 movl %esi, 12186(%rip) # 0x404048 <a>
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:14
|
|||
|
4010ae: e8 8d ff ff ff callq 0x401040 <printf@plt>
|
|||
|
; /home/wcohen/present/202207youarehere/example.c:16
|
|||
|
4010b3: 31 c0 xorl %eax, %eax
|
|||
|
4010b5: 5b popq %rbx
|
|||
|
4010b6: c3
|
|||
|
```
|
|||
|
|
|||
|
The first instruction at `0x401060` maps to the original source code file `example.c` line 9, the opening `{` for the main function.
|
|||
|
|
|||
|
The next instruction `0x401061` maps to line 364 of `stdlib.h` line 364, the inlined `atoi` function. This is setting up one of the arguments to the later `strtol` call.
|
|||
|
|
|||
|
The instruction `0x401065` is also associated with the opening `{` of the main function.
|
|||
|
|
|||
|
Instructions `0x401068` and `0x40106d` set the remaining arguments for the `strtol` call that takes place at `0x40106f`. In this case, you can see that the compiler has reordered the instructions and causes some bouncing between line 9 of `example.c` and line 364, or the `stdlib.h` include file, as you step through the instructions on the debugger.
|
|||
|
|
|||
|
You can also see some mixing of instructions for lines 12, 13, and 14 from `example.c` in the output of `llvm-objdump` above. The compiler has moved the divide instructions (`0x40190`) for line 13 before some of the instructions for line 12 to hide the latency of the divide. As you step through the instructions in the debugger for this code, you see the debugger jump back and forth between lines rather than doing all the instructions from one line before moving on to the next line. Also notice as you step though that line 13 with the divide operation was not shown, but the divide definitely occurred to produce the output. You can see GDB bouncing between lines when stepping through the program's main function:
|
|||
|
|
|||
|
```
|
|||
|
(gdb) run 1 2
|
|||
|
Starting program: /home/wcohen/present/202207youarehere/example 1 2
|
|||
|
[Thread debugging using libthread_db enabled]
|
|||
|
Using host libthread_db library "/lib64/libthread_db.so.1".
|
|||
|
|
|||
|
Breakpoint 1, main (argc=3, argv=0x7fffffffdbe8) at /usr/include/stdlib.h:364
|
|||
|
364 return (int) strtol (__nptr, (char **) NULL, 10);
|
|||
|
(gdb) print $pc
|
|||
|
$10 = (void (*)()) 0x401060 <main>
|
|||
|
(gdb) next
|
|||
|
10 a = atoi(argv[1]);
|
|||
|
(gdb) print $pc
|
|||
|
$11 = (void (*)()) 0x401061 <main+1>
|
|||
|
(gdb) next
|
|||
|
11 b = atof(argv[2]);
|
|||
|
(gdb) print $pc
|
|||
|
$12 = (void (*)()) 0x401074 <main+20>
|
|||
|
(gdb) next
|
|||
|
10 a = atoi(argv[1]);
|
|||
|
(gdb) print $pc
|
|||
|
$13 = (void (*)()) 0x40107a <main+26>
|
|||
|
(gdb) next
|
|||
|
11 b = atof(argv[2]);
|
|||
|
(gdb) print $pc
|
|||
|
$14 = (void (*)()) 0x401080 <main+32>
|
|||
|
(gdb) next
|
|||
|
12 a = a + 1;
|
|||
|
(gdb) print $pc
|
|||
|
$15 = (void (*)()) 0x401085 <main+37>
|
|||
|
(gdb) next
|
|||
|
14 printf ("a = %d, b = %f\n", a, b);
|
|||
|
(gdb) print $pc
|
|||
|
$16 = (void (*)()) 0x4010ae <main+78>
|
|||
|
(gdb) next
|
|||
|
a = 2, b = 0.047619
|
|||
|
15 return 0;
|
|||
|
(gdb) print $pc
|
|||
|
$17 = (void (*)()) 0x4010b3 <main+83>
|
|||
|
```
|
|||
|
|
|||
|
With this simple example, you can see that the order of instructions does not match the original source code. When the program is running normally, you would never observe those changes. However, they are quite visible when using a debugger to step through the code. The boundaries between lines of code become blurred. This has other implications. When you decide to set a breakpoint to a line following a line with variable update, the compiler scheduler may have moved the variable after the location you expect the variable to be updated, and you don’t get the expected value for the variable at the breakpoint.
|
|||
|
|
|||
|
### Which of the instructions for a line get the breakpoint?
|
|||
|
|
|||
|
With the previous `example.c`, the compiler generated multiple instructions to implement individual lines of code. How does the debugger know which of those instructions should be the one that it places the breakpoint on? There’s an additional statement flag in the line information that marks the recommended locations to place the breakpoints. You can see those instructions marked with `S` in the column below `SBPE` in `eu-readelf --debug-dump=decodedline example`:
|
|||
|
|
|||
|
```
|
|||
|
DWARF section [31] '.debug_line' at offset 0x50fd:
|
|||
|
|
|||
|
CU [c] example.c
|
|||
|
line:col SBPE* disc isa op address (Statement Block Prologue Epilogue *End)
|
|||
|
/home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
|
|||
|
9:1 S 0 0 0 0x0000000000401060 <main>
|
|||
|
10:2 S 0 0 0 0x0000000000401060 <main>
|
|||
|
/usr/include/stdlib.h (mtime: 0, length: 0)
|
|||
|
362:1 S 0 0 0 0x0000000000401060 <main>
|
|||
|
364:3 S 0 0 0 0x0000000000401060 <main>
|
|||
|
/home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
|
|||
|
9:1 0 0 0 0x0000000000401060 <main>
|
|||
|
/usr/include/stdlib.h (mtime: 0, length: 0)
|
|||
|
364:16 0 0 0 0x0000000000401061 <main+0x1>
|
|||
|
364:16 0 0 0 0x0000000000401065 <main+0x5>
|
|||
|
/home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
|
|||
|
9:1 0 0 0 0x0000000000401065 <main+0x5>
|
|||
|
/usr/include/stdlib.h (mtime: 0, length: 0)
|
|||
|
364:16 0 0 0 0x0000000000401068 <main+0x8>
|
|||
|
364:16 0 0 0 0x000000000040106f <main+0xf>
|
|||
|
364:16 0 0 0 0x0000000000401074 <main+0x14>
|
|||
|
/usr/include/bits/stdlib-float.h (mtime: 0, length: 0)
|
|||
|
27:10 0 0 0 0x0000000000401074 <main+0x14>
|
|||
|
/usr/include/stdlib.h (mtime: 0, length: 0)
|
|||
|
364:10 0 0 0 0x000000000040107a <main+0x1a>
|
|||
|
/home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
|
|||
|
11:2 S 0 0 0 0x0000000000401080 <main+0x20>
|
|||
|
/usr/include/bits/stdlib-float.h (mtime: 0, length: 0)
|
|||
|
25:1 S 0 0 0 0x0000000000401080 <main+0x20>
|
|||
|
27:3 S 0 0 0 0x0000000000401080 <main+0x20>
|
|||
|
27:10 0 0 0 0x0000000000401080 <main+0x20>
|
|||
|
27:10 0 0 0 0x0000000000401085 <main+0x25>
|
|||
|
/home/wcohen/present/202207youarehere/example.c (mtime: 0, length: 0)
|
|||
|
12:2 S 0 0 0 0x0000000000401085 <main+0x25>
|
|||
|
12:8 0 0 0 0x0000000000401085 <main+0x25>
|
|||
|
14:2 0 0 0 0x000000000040108b <main+0x2b>
|
|||
|
13:8 0 0 0 0x0000000000401090 <main+0x30>
|
|||
|
13:4 0 0 0 0x0000000000401098 <main+0x38>
|
|||
|
12:8 0 0 0 0x00000000004010a0 <main+0x40>
|
|||
|
14:2 0 0 0 0x00000000004010a3 <main+0x43>
|
|||
|
12:4 0 0 0 0x00000000004010a8 <main+0x48>
|
|||
|
13:2 S 0 0 0 0x00000000004010ae <main+0x4e>
|
|||
|
14:2 S 0 0 0 0x00000000004010ae <main+0x4e>
|
|||
|
15:2 S 0 0 0 0x00000000004010b3 <main+0x53>
|
|||
|
16:1 0 0 0 0x00000000004010b3 <main+0x53>
|
|||
|
16:1 0 0 0 0x00000000004010b6 <main+0x56>
|
|||
|
16:1 * 0 0 0 0x00000000004010b6 <main+0x56>
|
|||
|
```
|
|||
|
|
|||
|
- Groups of instructions are delimited by the path to the source file for those instructions.
|
|||
|
- The left column contains the line number and column that the instruction maps back to, followed by the flags.
|
|||
|
- The hexadecimal number is the address of the instruction, followed by the offset into the function of the instruction.
|
|||
|
|
|||
|
If you look carefully at the output, you see that some instructions map back to multiple lines in the code. For example, `0x0000000000401060` maps to both line 9 and 10 of `example.c`. The same instruction also maps to lines 362 and 364 of `/usr/include/stdlib.h`. The mappings are not one-to-one. One line of source code may map to multiple instructions, and one instruction may map to multiple lines of code. When the debugger decides to print out a single line mapping for an instruction, it might not be the one that you expect.
|
|||
|
|
|||
|
### Merging and eliminating of lines
|
|||
|
|
|||
|
As you saw in the output of the detailed line mapping information, mappings are not one-to-one. There are cases where the compiler can eliminate instructions because they have no effect on the final result of the program. The compiler may also merge instructions from separate lines through optimizations, such as common subexpression elimination (CSE), and omit that the instruction could have come from more than one place in the code.
|
|||
|
|
|||
|
The following example was compiled on an x86_64 Fedora 36 machine, using GCC-12.2.1. Depending on the particular environment, you may not get the same results, because different versions of compilers may optimize the code differently.
|
|||
|
|
|||
|
Note the **if-else** statement in the code. Both have statements doing the same expensive divides. The compiler factors out the divide operation.
|
|||
|
|
|||
|
```
|
|||
|
#include <stdlib.h>
|
|||
|
#include <stdio.h>
|
|||
|
|
|||
|
int
|
|||
|
main(int argc, char* argv[])
|
|||
|
{
|
|||
|
int a,b,c;
|
|||
|
a = atoi(argv[1]);
|
|||
|
b = atoi(argv[2]);
|
|||
|
if (b) {
|
|||
|
c = 100/a;
|
|||
|
} else {
|
|||
|
c = 100/a;
|
|||
|
}
|
|||
|
printf ("a = %d, b = %d, c = %d\n", a, b, c);
|
|||
|
return 0;
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
Looking at `objdump -dl whichline`, you see one divide operation in the binary:
|
|||
|
|
|||
|
```
|
|||
|
/home/wcohen/present/202207youarehere/whichline.c:13
|
|||
|
401085: b8 64 00 00 00 mov $0x64,%eax
|
|||
|
40108a: f7 fb idiv %ebx
|
|||
|
```
|
|||
|
|
|||
|
Line 13 is one of the lines with a divide, but you might suspect that there are other line numbers associated with those addresses. Look at the output of `eu-readelf --debug-dump=decodedline whichline` to see whether there are other line numbers associated with those addresses.
|
|||
|
|
|||
|
Line 11, where the other divide occurs, is not in this list:
|
|||
|
|
|||
|
```
|
|||
|
/usr/include/stdlib.h (mtime: 0, length: 0)
|
|||
|
364:16 0 0 0 0x0000000000401082 <main+0x32>
|
|||
|
364:16 0 0 0 0x0000000000401085 <main+0x35>
|
|||
|
/home/wcohen/present/202207youarehere/whichline.c (mtime: 0, length: 0)
|
|||
|
10:2 S 0 0 0 0x0000000000401085 <main+0x35>
|
|||
|
13:3 S 0 0 0 0x0000000000401085 <main+0x35>
|
|||
|
15:2 S 0 0 0 0x0000000000401085 <main+0x35>
|
|||
|
13:5 0 0 0 0x0000000000401085 <main+0x35>
|
|||
|
```
|
|||
|
|
|||
|
If the results are unused, the compiler may completely eliminate generating code for some lines.
|
|||
|
|
|||
|
Consider the following example, where the **else** clause computes `c = 100 * a`, but does not use it:
|
|||
|
|
|||
|
```
|
|||
|
#include <stdlib.h>
|
|||
|
#include <stdio.h>
|
|||
|
|
|||
|
int
|
|||
|
main(int argc, char* argv[])
|
|||
|
{
|
|||
|
int a,b,c;
|
|||
|
a = atoi(argv[1]);
|
|||
|
b = atoi(argv[2]);
|
|||
|
if (b) {
|
|||
|
c = 100/a;
|
|||
|
printf ("a = %d, b = %d, c = %d\n", a, b, c);
|
|||
|
} else {
|
|||
|
c = 100 * a;
|
|||
|
printf ("a = %d, b = %d\n", a, b);
|
|||
|
}
|
|||
|
return 0;
|
|||
|
}
|
|||
|
```
|
|||
|
|
|||
|
Compile `eliminate.c` with GCC:
|
|||
|
|
|||
|
```
|
|||
|
$ gcc -O2 -g eliminate.c -o eliminate
|
|||
|
```
|
|||
|
|
|||
|
When looking through the output generated by `objdump -dl eliminate`, there’s no sign of the multiplication for `100 * a` (line 14) of `eliminate.c`. The compiler has determined that the value was not used and eliminated it.
|
|||
|
|
|||
|
When looking through the output of `objdump -dl eliminate`, there is no:
|
|||
|
|
|||
|
```
|
|||
|
/home/wcohen/present/202207youarehere/eliminate.c:14
|
|||
|
```
|
|||
|
|
|||
|
Maybe it’s hidden as one of the other views of line information. You can use `eu-readelf` with the `--debug-dump` option to get a complete view of the line information:
|
|||
|
|
|||
|
```
|
|||
|
$ eu-readelf --debug-dump=decodedline eliminate > eliminate.lines
|
|||
|
```
|
|||
|
|
|||
|
It turns out that GCC did record some mapping information. It seems that `0x4010a5` maps to the multiplication statement, in addition to the [printf][2] at line 15:
|
|||
|
|
|||
|
```
|
|||
|
/home/wcohen/present/202207youarehere/eliminate.c (mtime: 0, length: 0)
|
|||
|
…
|
|||
|
18:1 0 0 0 0x00000000004010a4 <main+0x54>
|
|||
|
14:3 S 0 0 0 0x00000000004010a5 <main+0x55>
|
|||
|
15:3 S 0 0 0 0x00000000004010a5 <main+0x55>
|
|||
|
15:3 0 0 0 0x00000000004010b0 <main+0x60>
|
|||
|
15:3 * 0 0 0 0x00000000004010b6 <main+0x66>
|
|||
|
```
|
|||
|
|
|||
|
### Optimization affects line information
|
|||
|
|
|||
|
The line information included into compiled binaries is helpful when pinpointing where in code the processor is. However, optimization can affect the line information, and what you see when debugging the code.
|
|||
|
|
|||
|
When using a debugger, expect that boundaries between lines of code are fuzzy, and the debugger is likely to bounce between them when stepping through the code. An instruction might map to multiple lines of source code, but the debugger may only reports one. The compiler may entirely eliminate instructions associated with a line of code, and it may or may not include line mapping information. The line information generated by the compiler is helpful, but keep in mind that something might be lost in translation.
|
|||
|
|
|||
|
--------------------------------------------------------------------------------
|
|||
|
|
|||
|
via: https://opensource.com/article/23/3/compiler-optimization-debugger-line-information
|
|||
|
|
|||
|
作者:[Will Cohen][a]
|
|||
|
选题:[lkxed][b]
|
|||
|
译者:[译者ID](https://github.com/译者ID)
|
|||
|
校对:[校对者ID](https://github.com/校对者ID)
|
|||
|
|
|||
|
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
|
|||
|
|
|||
|
[a]: https://opensource.com/users/wcohen
|
|||
|
[b]: https://github.com/lkxed/
|
|||
|
[1]: https://opensource.com/article/23/2/function-name
|
|||
|
[2]: https://opensource.com/article/20/8/printf
|