trying to fix the bug of unclean FPU st(0)

mirror of https://github.com/mirror/tinycc.git synced 2025-03-24 10:00:07 +08:00

Date: Mon, 8 Jun 2009 19:06:56 +0800
From: Soloist Deng <soloist.deng-gmail-com>
Subject: [Tinycc-devel] trying to fix the bug of unclean FPU st(0)

Hi all:

   I  am using  tcc-0.9.25, and the FPU bug brought a big trouble to
me. I read the source and tried to fix it.
Below is my solution.

 There are two places where program(`o(0xd9dd)') will generates `fstp
%st(1)': vpop() in tccgen.c:689 and save_reg() in tccgen.c:210.
We should first change both of them to `o(0xd8dd) // fstp %st(0)'.
But these changes are not enough.  Let's check the following code.

void foo()
{
 double var = 2.7;
 var++;
}

Using  the changed tcc will generate following machine code:

.text:08000000                 public foo
.text:08000000 foo             proc near
.text:08000000
.text:08000000 var_18          = qword ptr -18h
.text:08000000 var_10          = qword ptr -10h
.text:08000000 var_8           = qword ptr -8
.text:08000000
.text:08000000                 push    ebp
.text:08000001                 mov     ebp, esp
.text:08000003                 sub     esp, 18h
.text:08000009                 nop
.text:0800000A                 fld     L_0
.text:08000010                 fst     [ebp+var_8]
.text:08000013                 fstp    st(0)
.text:08000015                 fld     [ebp+var_8]
.text:08000018                 fst     [ebp+var_10]
.text:0800001B                 fstp    st(0)
.text:0800001D                 fst     [ebp+var_18]
.text:08000020                 fstp    st(0)
.text:08000022                 fld     L_1
.text:08000028                 fadd    [ebp+var_10]
.text:0800002B                 fst     [ebp+var_8]
.text:0800002E                 fstp    st(0)
.text:08000030                 leave
.text:08000031                 retn
.text:08000031 foo             endp
.text:08000031
.text:08000031 _text           ends
--------------------------------------------------
.data:08000040 ; Segment type: Pure data
.data:08000040 ; Segment permissions: Read/Write
.data:08000040 ; Segment alignment '32byte' can not be represented in assembly
.data:08000040 _data           segment page public 'DATA' use32
.data:08000040                 assume cs:_data
.data:08000040                 ;org 8000040h
.data:08000040 L_0             dq 400599999999999Ah
.data:08000048 L_1             dq 3FF0000000000000h
.data:08000048 _data           ends

Please notice the code snippet from 0800000A  to 08000020
// double var = 2.7; load constant to st(0)
.text:0800000A                 fld     L_0
// double var = 2.7; store st(0) to `var'
.text:08000010                 fst     [ebp+var_8]
// double var = 2.7; poping st(0)  will empty the floating registers stack
.text:08000013                 fstp    st(0)

  After that ,tcc will call `void inc(int post, int c)" in
tccgen.c:2150, and produce 08000015 to 0800001B through the calling
chain (inc ->gv_dup)
// load from `var' to st(0)
.text:08000015                 fld     [ebp+var_8]
// store st(0) to a temporary location
.text:08000018                 fst     [ebp+var_10]
// poping st(0)  will empty the floating registers stack
.text:0800001B                 fstp    st(0)

  And the calling chain
(gen_op('+')->gen_opif('+')->gen_opf('+')->gv(rc=2)->get_reg(rc=2)->save_reg(r=3))
will produce 0800001D to 08000020 .
// store st(0) to a temporary location, but floating stack is empty!
.text:0800001D                 fst     [ebp+var_18]
// poping st(0)  will empty the floating registers stack
.text:08000020                 fstp    st(0)

   The `0800001D   fst     [ebp+var_18]' will store st(0) to a memory
location, but st(0) is empty. That will cause  FPU invalid operation
exception(#IE).
Why does tcc do that? Please read `gv_dup' called by `inc' carefully.
Notice these lines:

(1):        r = gv(rc);
(2):        r1 = get_reg(rc);
(3):        sv.r = r;
            sv.c.ul = 0;
(4)         load(r1, &sv); /* move r to r1 */
(5)         vdup();
            /* duplicates value */
(6)         vtop->r = r1;

 (1)  let the vtop occupy TREG_ST0, and `r' will be TREG_ST0.  (2)
try to get a free floating register,but tcc assume
there is only one, so it wil force vtop goto memory and assign `r1'
with TREG_ST0. When executing (3), it will do nothing
because `r' equals `r1'. (5) duplicates vtop.  Then (6) let the new
vtop occupy TREG_ST0, but this will cause problem
because the old vtop has been moved to memory, so the new duplicated
vtop does not reside in TREG_ST0 but also
in memory after that. TREG_ST0 is not occupied but freely availabe
now.   `gen_op('+')'  need at least one oprand in register,
so it will incorrectly think TREG_ST0 is occupied by vtop and produce
instructions(0800001D and 08000020) to store it to
a temporary memory location.

  According program above, if `r' == `r1' it is impossible for the old
vtop to still occupy the `r' register .  And `load' will do nothing
too at this condition.
So the `gv_dup' can not promise the semantics that old vtop in one
register and the new duplicated vtop in another register at the same
time.

  I changed (6) to
if (r != r1)
{
 vtop->r = r1;
}

  Then the new generated machine code will be :

.text:08000000                 push    ebp
.text:08000001                 mov     ebp, esp
.text:08000003                 sub     esp, 10h
.text:08000009                 nop
.text:0800000A                 fld     L_0
.text:08000010                 fst     [ebp+var_8]
.text:08000013                 fstp    st(0)
.text:08000015                 fld     [ebp+var_8]
.text:08000018                 fst     [ebp+var_10]
.text:0800001B                 fstp    st(0)
.text:0800001D                 fld     L_1
.text:08000023                 fadd    [ebp+var_10]
.text:08000026                 fst     [ebp+var_8]
.text:08000029                 fstp    st(0)
.text:0800002B                 leave
.text:0800002C                 retn

 It works well, and will clean the floating registers stack when return.
 Finally, I want to know there is any potential problem of this fixing ?

soloist

This commit is contained in:

Soloist Deng

2009-06-08 19:26:19 +02:00

committed by

grischka

parent a342bbadc8

commit c3701df16c

1 changed files with 4 additions and 3 deletions

									
										7

tccgen.c
									
										View File
										
				@ -207,7 +207,7 @@ void save_reg(int r)

				#if defined(TCC_TARGET_I386) || defined(TCC_TARGET_X86_64)

				                /* x86 specific: need to pop fp register ST0 if saved */

				                if (r == TREG_ST0) {

				                    o(0xd9dd); /* fstp %st(1) */

				                    o(0xd8dd); /* fstp %st(0) */

				                }

				#endif

				#ifndef TCC_TARGET_X86_64

				@ -686,7 +686,7 @@ void vpop(void)

				#if defined(TCC_TARGET_I386) || defined(TCC_TARGET_X86_64)

				    /* for x86, we need to pop the FP stack */

				    if (v == TREG_ST0 && !nocode_wanted) {

				        o(0xd9dd); /* fstp %st(1) */

				        o(0xd8dd); /* fstp %st(0) */

				    } else

				#endif

				    if (v == VT_JMP || v == VT_JMPI) {

				@ -738,7 +738,8 @@ void gv_dup(void)

				        load(r1, &sv); /* move r to r1 */

				        vdup();

				        /* duplicates value */

				        vtop->r = r1;

				        if (r != r1)

				            vtop->r = r1;

				    }

				}

trying to fix the bug of unclean FPU st(0)

7 tccgen.c Unescape Escape View File

7

tccgen.c

View File