Go to file
Soloist Deng c3701df16c trying to fix the bug of unclean FPU st(0)
Date: Mon, 8 Jun 2009 19:06:56 +0800
From: Soloist Deng <soloist.deng-gmail-com>
Subject: [Tinycc-devel] trying to fix the bug of unclean FPU st(0)

Hi all:

   I  am using  tcc-0.9.25, and the FPU bug brought a big trouble to
me. I read the source and tried to fix it.
Below is my solution.

 There are two places where program(`o(0xd9dd)') will generates `fstp
%st(1)': vpop() in tccgen.c:689 and save_reg() in tccgen.c:210.
We should first change both of them to `o(0xd8dd) // fstp %st(0)'.
But these changes are not enough.  Let's check the following code.

void foo()
{
 double var = 2.7;
 var++;
}

Using  the changed tcc will generate following machine code:

.text:08000000                 public foo
.text:08000000 foo             proc near
.text:08000000
.text:08000000 var_18          = qword ptr -18h
.text:08000000 var_10          = qword ptr -10h
.text:08000000 var_8           = qword ptr -8
.text:08000000
.text:08000000                 push    ebp
.text:08000001                 mov     ebp, esp
.text:08000003                 sub     esp, 18h
.text:08000009                 nop
.text:0800000A                 fld     L_0
.text:08000010                 fst     [ebp+var_8]
.text:08000013                 fstp    st(0)
.text:08000015                 fld     [ebp+var_8]
.text:08000018                 fst     [ebp+var_10]
.text:0800001B                 fstp    st(0)
.text:0800001D                 fst     [ebp+var_18]
.text:08000020                 fstp    st(0)
.text:08000022                 fld     L_1
.text:08000028                 fadd    [ebp+var_10]
.text:0800002B                 fst     [ebp+var_8]
.text:0800002E                 fstp    st(0)
.text:08000030                 leave
.text:08000031                 retn
.text:08000031 foo             endp
.text:08000031
.text:08000031 _text           ends
--------------------------------------------------
.data:08000040 ; Segment type: Pure data
.data:08000040 ; Segment permissions: Read/Write
.data:08000040 ; Segment alignment '32byte' can not be represented in assembly
.data:08000040 _data           segment page public 'DATA' use32
.data:08000040                 assume cs:_data
.data:08000040                 ;org 8000040h
.data:08000040 L_0             dq 400599999999999Ah
.data:08000048 L_1             dq 3FF0000000000000h
.data:08000048 _data           ends

Please notice the code snippet from 0800000A  to 08000020
// double var = 2.7; load constant to st(0)
.text:0800000A                 fld     L_0
// double var = 2.7; store st(0) to `var'
.text:08000010                 fst     [ebp+var_8]
// double var = 2.7; poping st(0)  will empty the floating registers stack
.text:08000013                 fstp    st(0)

  After that ,tcc will call `void inc(int post, int c)" in
tccgen.c:2150, and produce 08000015 to 0800001B through the calling
chain (inc ->gv_dup)
// load from `var' to st(0)
.text:08000015                 fld     [ebp+var_8]
// store st(0) to a temporary location
.text:08000018                 fst     [ebp+var_10]
// poping st(0)  will empty the floating registers stack
.text:0800001B                 fstp    st(0)

  And the calling chain
(gen_op('+')->gen_opif('+')->gen_opf('+')->gv(rc=2)->get_reg(rc=2)->save_reg(r=3))
will produce 0800001D to 08000020 .
// store st(0) to a temporary location, but floating stack is empty!
.text:0800001D                 fst     [ebp+var_18]
// poping st(0)  will empty the floating registers stack
.text:08000020                 fstp    st(0)

   The `0800001D   fst     [ebp+var_18]' will store st(0) to a memory
location, but st(0) is empty. That will cause  FPU invalid operation
exception(#IE).
Why does tcc do that? Please read `gv_dup' called by `inc' carefully.
Notice these lines:

(1):        r = gv(rc);
(2):        r1 = get_reg(rc);
(3):        sv.r = r;
            sv.c.ul = 0;
(4)         load(r1, &sv); /* move r to r1 */
(5)         vdup();
            /* duplicates value */
(6)         vtop->r = r1;

 (1)  let the vtop occupy TREG_ST0, and `r' will be TREG_ST0.  (2)
try to get a free floating register,but tcc assume
there is only one, so it wil force vtop goto memory and assign `r1'
with TREG_ST0. When executing (3), it will do nothing
because `r' equals `r1'. (5) duplicates vtop.  Then (6) let the new
vtop occupy TREG_ST0, but this will cause problem
because the old vtop has been moved to memory, so the new duplicated
vtop does not reside in TREG_ST0 but also
in memory after that. TREG_ST0 is not occupied but freely availabe
now.   `gen_op('+')'  need at least one oprand in register,
so it will incorrectly think TREG_ST0 is occupied by vtop and produce
instructions(0800001D and 08000020) to store it to
a temporary memory location.

  According program above, if `r' == `r1' it is impossible for the old
vtop to still occupy the `r' register .  And `load' will do nothing
too at this condition.
So the `gv_dup' can not promise the semantics that old vtop in one
register and the new duplicated vtop in another register at the same
time.

  I changed (6) to
if (r != r1)
{
 vtop->r = r1;
}

  Then the new generated machine code will be :

.text:08000000                 push    ebp
.text:08000001                 mov     ebp, esp
.text:08000003                 sub     esp, 10h
.text:08000009                 nop
.text:0800000A                 fld     L_0
.text:08000010                 fst     [ebp+var_8]
.text:08000013                 fstp    st(0)
.text:08000015                 fld     [ebp+var_8]
.text:08000018                 fst     [ebp+var_10]
.text:0800001B                 fstp    st(0)
.text:0800001D                 fld     L_1
.text:08000023                 fadd    [ebp+var_10]
.text:08000026                 fst     [ebp+var_8]
.text:08000029                 fstp    st(0)
.text:0800002B                 leave
.text:0800002C                 retn

 It works well, and will clean the floating registers stack when return.
 Finally, I want to know there is any potential problem of this fixing ?

soloist
2009-06-17 02:09:26 +02:00
examples fix makefiles etc for subdirs 2009-04-18 15:08:03 +02:00
include drop alloca #define 2009-05-16 22:30:13 +02:00
lib x86-64: Align return value of alloca by 16. 2009-06-11 08:33:41 +09:00
tests drop alloca #define 2009-05-16 22:30:13 +02:00
win32 drop alloca #define 2009-05-16 22:30:13 +02:00
.cvsignore update 2005-04-15 00:11:02 +00:00
arm-gen.c ARM: fix big immediate offset construction 2009-05-11 18:54:14 +02:00
c67-gen.c Set VT_LVAL_xxx flags for function arguments in gfunc_prolog (Daniel Glöckner) 2008-09-12 02:36:32 +02:00
Changelog ulibc: #define TCC_UCLIBC and load elf_interp 2009-05-16 22:29:40 +02:00
coff.h C67 COFF executable format support (TK) 2004-10-05 22:33:55 +00:00
configure fix makefiles etc for subdirs 2009-04-18 15:08:03 +02:00
COPYING changed license to LGPL 2003-05-24 14:18:56 +00:00
elf.h Imported several macros required by x86-64 2008-12-02 02:25:54 +01:00
i386-asm.c get rid of a warning and fix .bat 2008-03-25 21:05:48 +00:00
i386-asm.h push/pop fixes - added fxsave and fxrstor 2004-10-18 00:14:16 +00:00
i386-gen.c move some global variables into TCCState 2009-05-11 18:45:44 +02:00
il-gen.c Set VT_LVAL_xxx flags for function arguments in gfunc_prolog (Daniel Glöckner) 2008-09-12 02:36:32 +02:00
il-opcodes.h added CIL target 2002-02-10 16:14:03 +00:00
libtcc.c drop alloca #define 2009-05-16 22:30:13 +02:00
libtcc.h libtcc: add support to be build as DLL 2009-04-18 15:08:03 +02:00
Makefile x86-64: Add alloca. 2009-06-09 03:23:08 +09:00
README win32: readme.txt->tcc-win32.txt, update tcc-doc 2009-04-18 15:08:03 +02:00
stab.def added 2002-12-08 14:36:36 +00:00
stab.h added 2002-12-08 14:36:36 +00:00
tcc-doc.texi win32: readme.txt->tcc-win32.txt, update tcc-doc 2009-04-18 15:08:03 +02:00
tcc.c enable making tcc using libtcc 2009-05-11 18:46:02 +02:00
tcc.h fix build with msvc 2009-05-11 18:53:52 +02:00
tccasm.c Import more changesets from Rob Landley's fork (part 2) 2007-11-21 17:16:31 +00:00
tcccoff.c fix unused/uninitalized warnings 2009-05-11 18:46:39 +02:00
tccelf.c tccelf: accept BSS symbol with same name from other module 2009-06-17 02:08:54 +02:00
tccgen.c trying to fix the bug of unclean FPU st(0) 2009-06-17 02:09:26 +02:00
tccpe.c move some global variables into TCCState 2009-05-11 18:45:44 +02:00
tccpp.c fix "cached include" optimization 2009-05-11 18:55:16 +02:00
tcctok.h drop alloca #define 2009-05-16 22:30:13 +02:00
texi2pod.pl automatic man page generation from tcc-doc.texi 2003-05-18 18:11:06 +00:00
TODO Udated and cleaned up TODO. 2008-01-16 22:33:56 +00:00
VERSION update Changelog, bump version: 0.9.25 2009-05-11 19:01:26 +02:00
x86_64-gen.c Improve the test coverage: !val for float/double/long long f. 2009-04-18 15:08:01 +02:00

Tiny C Compiler - C Scripting Everywhere - The Smallest ANSI C compiler
-----------------------------------------------------------------------

Features:
--------

- SMALL! You can compile and execute C code everywhere, for example on
  rescue disks.

- FAST! tcc generates optimized x86 code. No byte code
  overhead. Compile, assemble and link about 7 times faster than 'gcc
  -O0'.

- UNLIMITED! Any C dynamic library can be used directly. TCC is
  heading torward full ISOC99 compliance. TCC can of course compile
  itself.

- SAFE! tcc includes an optional memory and bound checker. Bound
  checked code can be mixed freely with standard code.

- Compile and execute C source directly. No linking or assembly
  necessary. Full C preprocessor included. 

- C script supported : just add '#!/usr/local/bin/tcc -run' at the first
  line of your C source, and execute it directly from the command
  line.

Documentation:
-------------

1) Installation on a i386 Linux host (for Windows read tcc-win32.txt)

   ./configure
   make
   make test
   make install

By default, tcc is installed in /usr/local/bin.
./configure --help  shows configuration options.


2) Introduction

We assume here that you know ANSI C. Look at the example ex1.c to know
what the programs look like.

The include file <tcclib.h> can be used if you want a small basic libc
include support (especially useful for floppy disks). Of course, you
can also use standard headers, although they are slower to compile.

You can begin your C script with '#!/usr/local/bin/tcc -run' on the first
line and set its execute bits (chmod a+x your_script). Then, you can
launch the C code as a shell or perl script :-) The command line
arguments are put in 'argc' and 'argv' of the main functions, as in
ANSI C.

3) Examples

ex1.c: simplest example (hello world). Can also be launched directly
as a script: './ex1.c'.

ex2.c: more complicated example: find a number with the four
operations given a list of numbers (benchmark).

ex3.c: compute fibonacci numbers (benchmark).

ex4.c: more complicated: X11 program. Very complicated test in fact
because standard headers are being used !

ex5.c: 'hello world' with standard glibc headers.

tcc.c: TCC can of course compile itself. Used to check the code
generator.

tcctest.c: auto test for TCC which tests many subtle possible bugs. Used
when doing 'make test'.

4) Full Documentation

Please read tcc-doc.html to have all the features of TCC.

Additional information is available for the Windows port in tcc-win32.txt.

License:
-------

TCC is distributed under the GNU Lesser General Public License (see
COPYING file).

Fabrice Bellard.