Mirror of https://github.com/mirror/tinycc.git, synced 2025-01-19 05:30:07 +08:00
c3701df16c
Date: Mon, 8 Jun 2009 19:06:56 +0800
From: Soloist Deng <soloist.deng-gmail-com>
Subject: [Tinycc-devel] trying to fix the bug of unclean FPU st(0)

Hi all:

I am using tcc-0.9.25, and the FPU bug has caused me a lot of trouble. I read the source and tried to fix it. Below is my solution.

There are two places where the program generates `fstp %st(1)' (via `o(0xd9dd)'): vpop() in tccgen.c:689 and save_reg() in tccgen.c:210. We should first change both of them to `o(0xd8dd) // fstp %st(0)'.

But these changes are not enough. Let's check the following code.

    void foo()
    {
        double var = 2.7;
        var++;
    }

The changed tcc generates the following machine code:

    .text:08000000                 public foo
    .text:08000000 foo             proc near
    .text:08000000
    .text:08000000 var_18          = qword ptr -18h
    .text:08000000 var_10          = qword ptr -10h
    .text:08000000 var_8           = qword ptr -8
    .text:08000000
    .text:08000000                 push    ebp
    .text:08000001                 mov     ebp, esp
    .text:08000003                 sub     esp, 18h
    .text:08000009                 nop
    .text:0800000A                 fld     L_0
    .text:08000010                 fst     [ebp+var_8]
    .text:08000013                 fstp    st(0)
    .text:08000015                 fld     [ebp+var_8]
    .text:08000018                 fst     [ebp+var_10]
    .text:0800001B                 fstp    st(0)
    .text:0800001D                 fst     [ebp+var_18]
    .text:08000020                 fstp    st(0)
    .text:08000022                 fld     L_1
    .text:08000028                 fadd    [ebp+var_10]
    .text:0800002B                 fst     [ebp+var_8]
    .text:0800002E                 fstp    st(0)
    .text:08000030                 leave
    .text:08000031                 retn
    .text:08000031 foo             endp
    .text:08000031
    .text:08000031 _text           ends
    --------------------------------------------------
    .data:08000040 ; Segment type: Pure data
    .data:08000040 ; Segment permissions: Read/Write
    .data:08000040 ; Segment alignment '32byte' can not be represented in assembly
    .data:08000040 _data           segment page public 'DATA' use32
    .data:08000040                 assume cs:_data
    .data:08000040                 ;org 8000040h
    .data:08000040 L_0             dq 400599999999999Ah
    .data:08000048 L_1             dq 3FF0000000000000h
    .data:08000048 _data           ends

Please notice the code snippet from 0800000A to 08000020:

    // double var = 2.7; load constant to st(0)
    .text:0800000A                 fld     L_0
    // double var = 2.7; store st(0) to `var'
    .text:08000010                 fst     [ebp+var_8]
    // double var = 2.7; popping st(0) will empty the floating registers stack
    .text:08000013                 fstp    st(0)

After that, tcc calls `void inc(int post, int c)' in tccgen.c:2150 and produces 08000015 to 0800001B through the calling chain (inc -> gv_dup):

    // load from `var' to st(0)
    .text:08000015                 fld     [ebp+var_8]
    // store st(0) to a temporary location
    .text:08000018                 fst     [ebp+var_10]
    // popping st(0) will empty the floating registers stack
    .text:0800001B                 fstp    st(0)

And the calling chain (gen_op('+') -> gen_opif('+') -> gen_opf('+') -> gv(rc=2) -> get_reg(rc=2) -> save_reg(r=3)) produces 0800001D to 08000020:

    // store st(0) to a temporary location, but floating stack is empty!
    .text:0800001D                 fst     [ebp+var_18]
    // popping st(0) will empty the floating registers stack
    .text:08000020                 fstp    st(0)

The `0800001D fst [ebp+var_18]' stores st(0) to a memory location, but st(0) is empty. That causes an FPU invalid operation exception (#IE).

Why does tcc do that? Please read `gv_dup', called by `inc', carefully. Notice these lines:

    (1) r = gv(rc);
    (2) r1 = get_reg(rc);
    (3) sv.r = r; sv.c.ul = 0;
    (4) load(r1, &sv); /* move r to r1 */
    (5) vdup(); /* duplicates value */
    (6) vtop->r = r1;

(1) lets vtop occupy TREG_ST0, so `r' is TREG_ST0. (2) tries to get a free floating-point register, but tcc assumes there is only one, so it forces vtop into memory and assigns TREG_ST0 to `r1'. When executing (3) and (4), the load does nothing because `r' equals `r1'. (5) duplicates vtop. Then (6) marks the new vtop as occupying TREG_ST0, but this causes a problem: the old vtop has already been moved to memory, so the new duplicated vtop does not reside in TREG_ST0 either; it is in memory too. TREG_ST0 is in fact unoccupied and freely available. `gen_op('+')' needs at least one operand in a register, so it incorrectly thinks TREG_ST0 is occupied by vtop and produces instructions (0800001D and 08000020) to store it to a temporary memory location.
According to the code above, if `r' == `r1' it is impossible for the old vtop to still occupy the `r' register, and `load' does nothing in this case either. So `gv_dup' cannot guarantee that the old vtop is in one register and the new duplicated vtop is in another register at the same time. I changed (6) to

    if (r != r1)
    {
        vtop->r = r1;
    }

The newly generated machine code is then:

    .text:08000000                 push    ebp
    .text:08000001                 mov     ebp, esp
    .text:08000003                 sub     esp, 10h
    .text:08000009                 nop
    .text:0800000A                 fld     L_0
    .text:08000010                 fst     [ebp+var_8]
    .text:08000013                 fstp    st(0)
    .text:08000015                 fld     [ebp+var_8]
    .text:08000018                 fst     [ebp+var_10]
    .text:0800001B                 fstp    st(0)
    .text:0800001D                 fld     L_1
    .text:08000023                 fadd    [ebp+var_10]
    .text:08000026                 fst     [ebp+var_8]
    .text:08000029                 fstp    st(0)
    .text:0800002B                 leave
    .text:0800002C                 retn

It works well, and leaves the floating-point register stack clean on return.

Finally, I would like to know whether there is any potential problem with this fix.

soloist
File | ||
---|---|---|
examples | ||
include | ||
lib | ||
tests | ||
win32 | ||
.cvsignore | ||
arm-gen.c | ||
c67-gen.c | ||
Changelog | ||
coff.h | ||
configure | ||
COPYING | ||
elf.h | ||
i386-asm.c | ||
i386-asm.h | ||
i386-gen.c | ||
il-gen.c | ||
il-opcodes.h | ||
libtcc.c | ||
libtcc.h | ||
Makefile | ||
README | ||
stab.def | ||
stab.h | ||
tcc-doc.texi | ||
tcc.c | ||
tcc.h | ||
tccasm.c | ||
tcccoff.c | ||
tccelf.c | ||
tccgen.c | ||
tccpe.c | ||
tccpp.c | ||
tcctok.h | ||
texi2pod.pl | ||
TODO | ||
VERSION | ||
x86_64-gen.c |
Tiny C Compiler - C Scripting Everywhere - The Smallest ANSI C compiler
-----------------------------------------------------------------------

Features:
--------

- SMALL! You can compile and execute C code everywhere, for example on
  rescue disks.

- FAST! tcc generates optimized x86 code. No byte code overhead.
  Compile, assemble and link about 7 times faster than 'gcc -O0'.

- UNLIMITED! Any C dynamic library can be used directly. TCC is
  heading toward full ISO C99 compliance. TCC can of course compile
  itself.

- SAFE! tcc includes an optional memory and bound checker. Bound
  checked code can be mixed freely with standard code.

- Compile and execute C source directly. No linking or assembly
  necessary. Full C preprocessor included.

- C script supported: just add '#!/usr/local/bin/tcc -run' at the first
  line of your C source, and execute it directly from the command
  line.

Documentation:
-------------

1) Installation on an i386 Linux host (for Windows read tcc-win32.txt)

   ./configure
   make
   make test
   make install

By default, tcc is installed in /usr/local/bin.
./configure --help shows configuration options.

2) Introduction

We assume here that you know ANSI C. Look at the example ex1.c to know
what the programs look like.

The include file <tcclib.h> can be used if you want a small basic libc
include support (especially useful for floppy disks). Of course, you
can also use standard headers, although they are slower to compile.

You can begin your C script with '#!/usr/local/bin/tcc -run' on the first
line and set its execute bits (chmod a+x your_script). Then, you can
launch the C code as a shell or perl script :-) The command line
arguments are put in 'argc' and 'argv' of the main function, as in
ANSI C.

3) Examples

ex1.c: simplest example (hello world). Can also be launched directly
as a script: './ex1.c'.

ex2.c: more complicated example: find a number with the four
operations given a list of numbers (benchmark).

ex3.c: compute fibonacci numbers (benchmark).

ex4.c: more complicated: X11 program. Very complicated test in fact
because standard headers are being used!

ex5.c: 'hello world' with standard glibc headers.

tcc.c: TCC can of course compile itself. Used to check the code
generator.

tcctest.c: auto test for TCC which tests many subtle possible
bugs. Used when doing 'make test'.

4) Full Documentation

Please read tcc-doc.html to have all the features of TCC.

Additional information is available for the Windows port in
tcc-win32.txt.

License:
-------

TCC is distributed under the GNU Lesser General Public License (see
COPYING file).

Fabrice Bellard.