For vstack Fabrice used the trick to initialize vtop to &vstack[-1], so
that on first push, vtop becomes &vstack[0] and a value is also stored
there - everything works.
Except that when tcc is compiled with bounds-checking enabled, vstack - 1
returns INVALID_POINTER and oops...
Let's workaround it with artificial 1 vstack slot which will not be
used, but only serve as an indicator that pointing to &vstack[-1] is ok.
Now, tcc, after being self-compiled with -b works:
$ ./tcc -B. -o tccb -DONE_SOURCE -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" tcc.c -ldl
$ cd tests
$ ../tcc -B.. -run tcctest.c >1
$ ../tccb -B.. -run tcctest.c >2
$ diff -u 1 2
and note, tcc's compilation speed is not affected:
$ ./tcc -B. -bench -DONE_SOURCE -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -c tcc.c
before: 8270 idents, 47221 lines, 1527730 bytes, 0.152 s, 309800 lines/s, 10.0 MB/s
after: 8271 idents, 47221 lines, 1527733 bytes, 0.152 s, 310107 lines/s, 10.0 MB/s
But note, that `tcc -b -run tcc` is still broken - for example it crashes
on
$ cat x.c
double get100 () { return 100.0; }
$ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -run \
-DONE_SOURCE ./tcc.c -B. -c x.c
Runtime error: dereferencing invalid pointer
./tccpp.c:1953: at 0xa7beebdf parse_number() (included from ./libtcc.c, ./tcc.c)
./tccpp.c:3003: by 0xa7bf0708 next() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:4465: by 0xa7bfe348 block() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:4440: by 0xa7bfe212 block() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:5529: by 0xa7c01929 gen_function() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:5767: by 0xa7c02602 decl0() (included from ./libtcc.c, ./tcc.c)
that's because lib/bcheck.c runtime needs more fixes -- see next
patches.
Continuing d6072d37 (Add __builtin_frame_address(0)) implement
__builtin_frame_address for levels greater than zero, in order for
tinycc to be able to compile its own lib/bcheck.c after
cffb7af9 (lib/bcheck: Prevent __bound_local_new / __bound_local_delete
from being miscompiled).
I'm new to the internals, and used the most simple way to do it.
Generated code is not very good for levels >= 2, compare
gcc tcc
level=0 mov %ebp,%eax lea 0x0(%ebp),%eax
level=1 mov 0x0(%ebp),%eax mov 0x0(%ebp),%eax
level=2 mov 0x0(%ebp),%eax mov 0x0(%ebp),%eax
mov (%eax),%eax mov %eax,-0x10(%ebp)
mov -0x10(%ebp),%eax
mov (%eax),%eax
level=3 mov 0x0(%ebp),%eax mov 0x0(%ebp),%eax
mov (%eax),%eax mov (%eax),%ecx
mov (%eax),%eax mov (%ecx),%eax
But this is still an improvement and for bcheck we need level=1 for
which the code is good.
For the tests I had to force gcc use -O0 to not inline the functions.
And -fno-omit-frame-pointer just in case.
If someone knows how to improve the generated code - help is
appreciated.
Thanks,
Kirill
Cc: Michael Matz <matz@suse.de>
Cc: Shinichiro Hamaji <shinichiro.hamaji@gmail.com>
Currently, VLA are not forbidden for static variable. This leads to
problems even if for fixed-size array when the size expression uses the
ternary operator (cond ? then-value : else-value) because it is parsed
as a general expression which leads to code generated in this case.
This commit solve the problem by forbidding VLA for static variables.
Although not required for the fix, avoiding code generation when the
expression is constant would be a nice addition though.
The code for shifts is now similar to code for binary arithmetic operations,
except that only the first argument is considered, as required by the ISO C
standard.
On 2012-06-26 15:07:57 +0200, Vincent Lefevre wrote:
> ISO C99 TC3 says: [6.5.7#3] "The integer promotions are performed on
> each of the operands. The type of the result is that of the promoted
> left operand."
I've written a patch (attached). Now the shift problems no longer
occur with the testcase and with GNU MPFR's "make check".
--
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Sometimes the result of a comparison is not directly used in a jump,
but in arithmetic or further comparisons. If those further things
do a vswap() with the VT_CMP as current top, and then generate
instructions for the new top, this most probably destroys the flags
(e.g. if it's a bitfield load like in the example).
vswap() must do the same like vsetc() and not allow VT_CMP vtops
to be moved down.
This matters when sizeof is directly used in arithmetic,
ala "uintptr_t t; t &= -sizeof(long)" (for alignment). When sizeof
isn't size_t (as it's specified to be) this masking will truncate
the high bits of the uintptr_t object (if uintptr_t is larger than
uint).
This needs to be accepted:
typedef int foo;
void f (void) { foo: return; }
namespaces for labels and types are different. The problem is that
the block parser always tries to find a decl first and that routine
doesn't peek enough to detect this case. Needs some adjustments
to unget_tok() so that we can call it even when we already called
it once, but next() didn't come around restoring the buffer yet.
(It lazily does so not when the buffer becomes empty, but rather
when the next call detects that the buffer is empty, i.e. it requires
two next() calls until the unget buffer gets switched back).
Removes a premature optimization of char/short loads
rewriting the source type. It did so also for bitfield
loads, thereby removing all the shifts/maskings.
(cond ? 0 : ptr)->member wasn't handled correctly. If one arm
is a null pointer constant (which also can be a pointer) the result
type is that of the other arm.
This fixes a bug introduced in commit
8d107d9ffd
that produced wrong code because of interference between
0x10 bits VT_CONST and x86_64-gen.c:TREG_MEM
Also fully zero-pad long doubles on x86-64 to avoid random
bytes in output files which disturb file comparison.
This allows passing colon separated paths to
tcc_add_library_path
tcc_add_sysinclude_path
tcc_add_include_path
Also there are new configure variables
CONFIG_TCC_LIBPATH
CONFIG_TCC_SYSINCLUDE_PATHS
which define the lib/sysinclude paths all in one and can
be overridden from configure/make
For TCC_TARGET_PE semicolons (;) are used as separators
Also, \b in the path string is replaced by s->tcc_lib_path
(CONFIG_TCCDIR rsp. -B option)
Since no code should be generated outside a function, force expr_cond to
only consider constant expression when outside a function since the
generic code can generate some code.
Basically, with:
typedef __attribute__((aligned(16))) struct _xyz {
...
} xyz, *pxyz;
we want the struct aligned but not the pointer.
FIXME: This patch is a hack, waiting for someone in the knowledge
of correct __attribute__ semantics.
VLA inserts a call to alloca via enum TOK_alloca, but TOK_alloca
only exists on I386 and X86_64 targets. This patch just emits an
error at compile-time if someone tries to compile some VLA code
for a TOK_alloca-less target. The best solution might be to just
push the problem to link-time, since the existence-or-not of a
alloca implementation can only be determined by linking. It seems
like just declaring TOK_alloca unconditionally would achieve that,
but for now, this at least gets the cross compilers to build.
I don't know if it makes a difference to gen_op(TOK_PDIV) or not,
but logically the ptr1_is_vla test in TP's VLA patch seems out of
order, where the patch to fix it would be:
------------------------------------------------------------------
@@ -1581,15 +1581,15 @@ ST_FUNC void gen_op(int op)
u = pointed_size(&vtop[-1].type);
}
gen_opic(op);
+ if (ptr1_is_vla)
+ vswap();
/* set to integer type */
#ifdef TCC_TARGET_X86_64
vtop->type.t = VT_LLONG;
#else
vtop->type.t = VT_INT;
#endif
- if (ptr1_is_vla)
- vswap();
- else
+ if (!ptr1_is_vla)
vpushi(u);
gen_op(TOK_PDIV);
} else {
------------------------------------------------------------------
Instead of that patch, which increases the complexity of the code,
this one fixes the problem by just rolling back and retrying with
a simpler approach.
A VLA is not really an array, it's a pointer-to-an-array.
Making this explicit allows us to back out a few parts
of the original VLA patch and paves the way for the next
part of the fix, where a VLA will be stored on the runtime
stack as a pointer-to-an-array, rather than on the compile-
time stack as a Sym*.
test target in Makefile does not depend on tcc.
i'm not sure why, but i can think of at least one
good reason. in my local tree I have it modified
to do so, but somehow inadvertently reverted that
so when i did "make test" before committing, it
didn't actually test my changes. sorry.
previously, tcc would accept a prototype of a function returning
an array, but not giving those functions bodies nor calling them.
it seems that gcc has never supported them, so we should probably
just error out... but it's possible that someone already using
tcc includes some header that contains an unused prototype for
one, so let's continue to support that.
Add support for asm labels for variables, that is the ability to rename
a variable at assembly level with __asm__ ("newname") appended in
its declaration.
See http://gcc.gnu.org/onlinedocs/gcc-4.4.4/gcc/Asm-Labels.html for more
details.
- Fix function assembly label mechanism introduced in commit
9b09fc376e to only accept alternative
name for function declaration.
- merge the code with the one introduced in commit
264a103610.
- Don't memorize token for asm label but directly the asm label.
Error out on static function without file scope and give an explaination
to the user
This is a rewrite of e9406c09a3 but
considering problems raised about static local function pointers in
632ee5a540.
* Disable C99 VLA detection when alloca is unavailable and protect the
new reference to TOK_alloca in decl_initializer in order to compile
and run for architecture without working alloca.
Not all code of C99 VLA is commented as it would required many ifdef
stanza. Just the detection is commented so that VT_VLA is never set
any type and the C99 VLA code is compiled but never called. However
vpush_global_sym(&func_old_type, TOK_alloca) in decl_initializer needs
to be protected by an ifdef stanza as well because it uses TOK_alloca.
* include alloca and C99 VLA tests according to availability of
TOK_alloca instead of relying on the current architecture
Implement C99 Variable Length Arrays in tinycc:
- Support VLA with multiple level (nested vla)
- Update documentation with regards to VT_VLA
- Add a testsuite in tcctest.c
- add __builtin_va_arg_types to check how arguments were passed
- move most code of stdarg into libtcc1.c
- remove __builtin_malloc and __builtin_free
- add a test case based on the bug report
(http://www.mail-archive.com/tinycc-devel@nongnu.org/msg03036.html)
Add support for asm labels for functions, that is the ability to rename
a function at assembly level with __asm__ ("newname") appended in
function declaration.
This reverts commit 433ecdfc9d.
The patch breaks e.g. with
for ((i = 10); --i;);
In particular to check for a type decl. this is not sufficient:
if (tok < TOK_UIDENT) {
A future approach to c99 loop variables might instead use:
if (parse_btype(...)) {
plus refactor function decl() accordingly.
Unary expression can start with a parenthesis. Thus, the current test
to detect which sizeof form is being parsed is inaccurate. This patch
makes tcc able to handle things like sizeof (x)[1] where x is declared
as char x[5]; wich is a valid unary expression
Error out with an explicit message when trying to initialize a
character array with something that's not a literal (optionally
enclosed in braces) as per C99 6.7.8:14; thanks to Antti-Juhani
Kaijanaho <ajk@debian.org> who did all the work.
When storing structs with a memcpy call in vstore(),
so far a needless entry remaining on the vstack
sometimes resulted in an useless store generated by
save_regs() in gfunc_call() for the memcpy routine.
Date: Mon, 8 Jun 2009 19:06:56 +0800
From: Soloist Deng <soloist.deng-gmail-com>
Subject: [Tinycc-devel] trying to fix the bug of unclean FPU st(0)
Hi all:
I am using tcc-0.9.25, and the FPU bug brought a big trouble to
me. I read the source and tried to fix it.
Below is my solution.
There are two places where program(`o(0xd9dd)') will generates `fstp
%st(1)': vpop() in tccgen.c:689 and save_reg() in tccgen.c:210.
We should first change both of them to `o(0xd8dd) // fstp %st(0)'.
But these changes are not enough. Let's check the following code.
void foo()
{
double var = 2.7;
var++;
}
Using the changed tcc will generate following machine code:
.text:08000000 public foo
.text:08000000 foo proc near
.text:08000000
.text:08000000 var_18 = qword ptr -18h
.text:08000000 var_10 = qword ptr -10h
.text:08000000 var_8 = qword ptr -8
.text:08000000
.text:08000000 push ebp
.text:08000001 mov ebp, esp
.text:08000003 sub esp, 18h
.text:08000009 nop
.text:0800000A fld L_0
.text:08000010 fst [ebp+var_8]
.text:08000013 fstp st(0)
.text:08000015 fld [ebp+var_8]
.text:08000018 fst [ebp+var_10]
.text:0800001B fstp st(0)
.text:0800001D fst [ebp+var_18]
.text:08000020 fstp st(0)
.text:08000022 fld L_1
.text:08000028 fadd [ebp+var_10]
.text:0800002B fst [ebp+var_8]
.text:0800002E fstp st(0)
.text:08000030 leave
.text:08000031 retn
.text:08000031 foo endp
.text:08000031
.text:08000031 _text ends
--------------------------------------------------
.data:08000040 ; Segment type: Pure data
.data:08000040 ; Segment permissions: Read/Write
.data:08000040 ; Segment alignment '32byte' can not be represented in assembly
.data:08000040 _data segment page public 'DATA' use32
.data:08000040 assume cs:_data
.data:08000040 ;org 8000040h
.data:08000040 L_0 dq 400599999999999Ah
.data:08000048 L_1 dq 3FF0000000000000h
.data:08000048 _data ends
Please notice the code snippet from 0800000A to 08000020
// double var = 2.7; load constant to st(0)
.text:0800000A fld L_0
// double var = 2.7; store st(0) to `var'
.text:08000010 fst [ebp+var_8]
// double var = 2.7; poping st(0) will empty the floating registers stack
.text:08000013 fstp st(0)
After that ,tcc will call `void inc(int post, int c)" in
tccgen.c:2150, and produce 08000015 to 0800001B through the calling
chain (inc ->gv_dup)
// load from `var' to st(0)
.text:08000015 fld [ebp+var_8]
// store st(0) to a temporary location
.text:08000018 fst [ebp+var_10]
// poping st(0) will empty the floating registers stack
.text:0800001B fstp st(0)
And the calling chain
(gen_op('+')->gen_opif('+')->gen_opf('+')->gv(rc=2)->get_reg(rc=2)->save_reg(r=3))
will produce 0800001D to 08000020 .
// store st(0) to a temporary location, but floating stack is empty!
.text:0800001D fst [ebp+var_18]
// poping st(0) will empty the floating registers stack
.text:08000020 fstp st(0)
The `0800001D fst [ebp+var_18]' will store st(0) to a memory
location, but st(0) is empty. That will cause FPU invalid operation
exception(#IE).
Why does tcc do that? Please read `gv_dup' called by `inc' carefully.
Notice these lines:
(1): r = gv(rc);
(2): r1 = get_reg(rc);
(3): sv.r = r;
sv.c.ul = 0;
(4) load(r1, &sv); /* move r to r1 */
(5) vdup();
/* duplicates value */
(6) vtop->r = r1;
(1) let the vtop occupy TREG_ST0, and `r' will be TREG_ST0. (2)
try to get a free floating register,but tcc assume
there is only one, so it wil force vtop goto memory and assign `r1'
with TREG_ST0. When executing (3), it will do nothing
because `r' equals `r1'. (5) duplicates vtop. Then (6) let the new
vtop occupy TREG_ST0, but this will cause problem
because the old vtop has been moved to memory, so the new duplicated
vtop does not reside in TREG_ST0 but also
in memory after that. TREG_ST0 is not occupied but freely availabe
now. `gen_op('+')' need at least one oprand in register,
so it will incorrectly think TREG_ST0 is occupied by vtop and produce
instructions(0800001D and 08000020) to store it to
a temporary memory location.
According program above, if `r' == `r1' it is impossible for the old
vtop to still occupy the `r' register . And `load' will do nothing
too at this condition.
So the `gv_dup' can not promise the semantics that old vtop in one
register and the new duplicated vtop in another register at the same
time.
I changed (6) to
if (r != r1)
{
vtop->r = r1;
}
Then the new generated machine code will be :
.text:08000000 push ebp
.text:08000001 mov ebp, esp
.text:08000003 sub esp, 10h
.text:08000009 nop
.text:0800000A fld L_0
.text:08000010 fst [ebp+var_8]
.text:08000013 fstp st(0)
.text:08000015 fld [ebp+var_8]
.text:08000018 fst [ebp+var_10]
.text:0800001B fstp st(0)
.text:0800001D fld L_1
.text:08000023 fadd [ebp+var_10]
.text:08000026 fst [ebp+var_8]
.text:08000029 fstp st(0)
.text:0800002B leave
.text:0800002C retn
It works well, and will clean the floating registers stack when return.
Finally, I want to know there is any potential problem of this fixing ?
soloist