mirror of
https://github.com/mirror/tinycc.git
synced 2025-03-24 10:00:07 +08:00
trying to fix the bug of unclean FPU st(0)
Date: Mon, 8 Jun 2009 19:06:56 +0800
From: Soloist Deng <soloist.deng-gmail-com>
Subject: [Tinycc-devel] trying to fix the bug of unclean FPU st(0)

Hi all:

I am using tcc-0.9.25, and the FPU bug has caused me a lot of trouble. I read the source and tried to fix it. Below is my solution.

There are two places where the program generates `fstp %st(1)' (via `o(0xd9dd)'): vpop() in tccgen.c:689 and save_reg() in tccgen.c:210. We should first change both of them to `o(0xd8dd) // fstp %st(0)'. But these changes are not enough. Consider the following code:

    void foo()
    {
        double var = 2.7;
        var++;
    }

With only those two changes, tcc generates the following machine code:

    .text:08000000                 public foo
    .text:08000000 foo             proc near
    .text:08000000
    .text:08000000 var_18          = qword ptr -18h
    .text:08000000 var_10          = qword ptr -10h
    .text:08000000 var_8           = qword ptr -8
    .text:08000000
    .text:08000000                 push    ebp
    .text:08000001                 mov     ebp, esp
    .text:08000003                 sub     esp, 18h
    .text:08000009                 nop
    .text:0800000A                 fld     L_0
    .text:08000010                 fst     [ebp+var_8]
    .text:08000013                 fstp    st(0)
    .text:08000015                 fld     [ebp+var_8]
    .text:08000018                 fst     [ebp+var_10]
    .text:0800001B                 fstp    st(0)
    .text:0800001D                 fst     [ebp+var_18]
    .text:08000020                 fstp    st(0)
    .text:08000022                 fld     L_1
    .text:08000028                 fadd    [ebp+var_10]
    .text:0800002B                 fst     [ebp+var_8]
    .text:0800002E                 fstp    st(0)
    .text:08000030                 leave
    .text:08000031                 retn
    .text:08000031 foo             endp
    .text:08000031
    .text:08000031 _text           ends
    --------------------------------------------------
    .data:08000040 ; Segment type: Pure data
    .data:08000040 ; Segment permissions: Read/Write
    .data:08000040 ; Segment alignment '32byte' can not be represented in assembly
    .data:08000040 _data           segment page public 'DATA' use32
    .data:08000040                 assume cs:_data
    .data:08000040                 ;org 8000040h
    .data:08000040 L_0             dq 400599999999999Ah
    .data:08000048 L_1             dq 3FF0000000000000h
    .data:08000048 _data           ends

Please notice the code from 0800000A to 08000020:

    // double var = 2.7; load the constant into st(0)
    .text:0800000A                 fld     L_0
    // double var = 2.7; store st(0) to `var'
    .text:08000010                 fst     [ebp+var_8]
    // double var = 2.7; popping st(0) empties the floating register stack
    .text:08000013                 fstp    st(0)

After that, tcc calls `void inc(int post, int c)' in tccgen.c:2150 and produces 08000015 to 0800001B through the call chain (inc -> gv_dup):

    // load from `var' into st(0)
    .text:08000015                 fld     [ebp+var_8]
    // store st(0) to a temporary location
    .text:08000018                 fst     [ebp+var_10]
    // popping st(0) empties the floating register stack
    .text:0800001B                 fstp    st(0)

The call chain (gen_op('+') -> gen_opif('+') -> gen_opf('+') -> gv(rc=2) -> get_reg(rc=2) -> save_reg(r=3)) then produces 0800001D to 08000020:

    // store st(0) to a temporary location, but the floating stack is empty!
    .text:0800001D                 fst     [ebp+var_18]
    // pop st(0) from an already empty floating register stack
    .text:08000020                 fstp    st(0)

The `0800001D fst [ebp+var_18]' stores st(0) to a memory location, but st(0) is empty. That causes an FPU invalid operation exception (#IE).

Why does tcc do that? Please read `gv_dup', which is called by `inc', carefully. Notice these lines:

    (1) r = gv(rc);
    (2) r1 = get_reg(rc);
    (3) sv.r = r; sv.c.ul = 0;
    (4) load(r1, &sv); /* move r to r1 */
    (5) vdup(); /* duplicates value */
    (6) vtop->r = r1;

(1) makes vtop occupy TREG_ST0, so `r' is TREG_ST0.
(2) tries to get a free floating register, but tcc assumes there is only one, so it forces vtop out to memory and assigns TREG_ST0 to `r1'.
(4) does nothing, because `r' equals `r1'.
(5) duplicates vtop.
(6) then marks the new vtop as occupying TREG_ST0. This causes the problem: the old vtop has already been moved to memory, so the new duplicated vtop does not reside in TREG_ST0 either; it is in memory too. TREG_ST0 is not occupied; it is actually free.

`gen_op('+')' needs at least one operand in a register, so it incorrectly concludes that TREG_ST0 is occupied by vtop and produces instructions (0800001D and 08000020) to store it to a temporary memory location.

According to the code above, if `r' == `r1' it is impossible for the old vtop to still occupy the `r' register, and `load' does nothing in that case either. So `gv_dup' cannot guarantee that the old vtop is in one register and the new duplicated vtop is in another register at the same time. I changed (6) to

    if (r != r1)
    {
        vtop->r = r1;
    }

The newly generated machine code is then:

    .text:08000000                 push    ebp
    .text:08000001                 mov     ebp, esp
    .text:08000003                 sub     esp, 10h
    .text:08000009                 nop
    .text:0800000A                 fld     L_0
    .text:08000010                 fst     [ebp+var_8]
    .text:08000013                 fstp    st(0)
    .text:08000015                 fld     [ebp+var_8]
    .text:08000018                 fst     [ebp+var_10]
    .text:0800001B                 fstp    st(0)
    .text:0800001D                 fld     L_1
    .text:08000023                 fadd    [ebp+var_10]
    .text:08000026                 fst     [ebp+var_8]
    .text:08000029                 fstp    st(0)
    .text:0800002B                 leave
    .text:0800002C                 retn

It works well, and it leaves the floating register stack clean on return.

Finally, I would like to know whether there is any potential problem with this fix.

soloist
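
The bookkeeping mismatch described in the message can be reproduced with a toy model. The sketch below is not tcc code: `struct model`, `model_gv`, `model_get_reg`, `model_gv_dup` and `model_consistent` are hypothetical names, and the model assumes a single floating-point register (ST0), as the message says tcc does. It tracks where the compiler believes each value lives and how many values the x87 stack really holds.

```c
#include <assert.h>

enum loc { LOC_MEM, LOC_ST0 };

/* Hypothetical toy model, not tcc's real data structures. */
struct model {
    enum loc stack[4];  /* where the compiler believes each value lives */
    int top;            /* index of the top value (tcc's vtop) */
    int fpu_depth;      /* number of values really on the x87 stack */
};

/* Models gv(): make sure the top value is loaded into ST0. */
static int model_gv(struct model *m)
{
    if (m->stack[m->top] != LOC_ST0) {
        m->fpu_depth++;                 /* fld */
        m->stack[m->top] = LOC_ST0;
    }
    return LOC_ST0;
}

/* Models get_reg() with one FP register: spill whoever holds ST0. */
static int model_get_reg(struct model *m)
{
    for (int i = 0; i <= m->top; i++) {
        if (m->stack[i] == LOC_ST0) {
            m->stack[i] = LOC_MEM;      /* fst to a temporary, then fstp %st(0) */
            m->fpu_depth--;
        }
    }
    return LOC_ST0;
}

/* Models gv_dup(); apply_fix selects the patched step (6). */
static void model_gv_dup(struct model *m, int apply_fix)
{
    int r = model_gv(m);                /* (1) */
    int r1 = model_get_reg(m);          /* (2) */
    /* (3)/(4): load(r1, &sv) emits nothing here because r == r1 */
    assert(m->top + 1 < 4);
    m->stack[m->top + 1] = m->stack[m->top];  /* (5) vdup() */
    m->top++;
    if (!apply_fix || r != r1)
        m->stack[m->top] = (enum loc)r1;      /* (6) */
}

/* Returns 1 when the bookkeeping agrees with the real x87 stack depth. */
static int model_consistent(const struct model *m)
{
    int claimed = 0;
    for (int i = 0; i <= m->top; i++)
        claimed += (m->stack[i] == LOC_ST0);
    return claimed == m->fpu_depth;
}
```

Calling `model_gv_dup` with `apply_fix` set to 0 leaves the new top entry marked as living in ST0 while `fpu_depth` is 0, which is the state that later leads to a store from an empty st(0). With `apply_fix` set to 1, both entries stay in memory and the model remains consistent.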
parent a342bbadc8
commit c3701df16c
7
tccgen.c
@@ -207,7 +207,7 @@ void save_reg(int r)
 #if defined(TCC_TARGET_I386) || defined(TCC_TARGET_X86_64)
     /* x86 specific: need to pop fp register ST0 if saved */
     if (r == TREG_ST0) {
-        o(0xd9dd); /* fstp %st(1) */
+        o(0xd8dd); /* fstp %st(0) */
     }
 #endif
 #ifndef TCC_TARGET_X86_64
@@ -686,7 +686,7 @@ void vpop(void)
 #if defined(TCC_TARGET_I386) || defined(TCC_TARGET_X86_64)
     /* for x86, we need to pop the FP stack */
     if (v == TREG_ST0 && !nocode_wanted) {
-        o(0xd9dd); /* fstp %st(1) */
+        o(0xd8dd); /* fstp %st(0) */
     } else
 #endif
     if (v == VT_JMP || v == VT_JMPI) {
@@ -738,7 +738,8 @@ void gv_dup(void)
         load(r1, &sv); /* move r to r1 */
         vdup();
         /* duplicates value */
-        vtop->r = r1;
+        if (r != r1)
+            vtop->r = r1;
     }
 }
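
The constants in the first two hunks differ only in one byte, and the sketch below shows why that byte selects the register. It is a stand-alone model, not tcc code: `emit_bytes`, `fstp_operand`, `out_buf` and `out_len` are hypothetical names, and it assumes that the emitter writes its argument least significant byte first until the remaining value is zero, which matches the comments in the diff. The x87 encoding of `fstp %st(i)` is the two bytes DD, D8+i.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the code generator's output buffer. */
static unsigned char out_buf[16];
static size_t out_len;

/* Assumed emitter behaviour: write c low byte first, stop when c is zero. */
static void emit_bytes(unsigned int c)
{
    while (c) {
        assert(out_len < sizeof out_buf);
        out_buf[out_len++] = (unsigned char)(c & 0xff);
        c >>= 8;
    }
}

/* FSTP ST(i) is encoded as DD D8+i; return i, or -1 for anything else. */
static int fstp_operand(const unsigned char *p)
{
    if (p[0] != 0xdd || p[1] < 0xd8 || p[1] > 0xdf)
        return -1;
    return p[1] - 0xd8;
}
```

Under that assumption, `emit_bytes(0xd9dd)` produces DD D9, which `fstp_operand` decodes as operand 1 (`fstp %st(1)`), and `emit_bytes(0xd8dd)` produces DD D8, operand 0 (`fstp %st(0)`). The second form pops the top of the stack and discards it, while the first copies the top into the register below it before popping.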