I expect that Linux-x86 is probably fine. All other architectures
except ARM are definitely broken since I haven't yet implemented
gfunc_sret for these, although replicating the current behaviour
should be straightforward.
Should fix some warnings wrt. access out of array bounds.
tccelf.c: fix "static function unused" warning
x86_64-gen.c: fix "ctype.ref uninitialzed" warning and cleanup
tcc-win32.txt: remove obsolete limitation notes.
This reverts commit 63193d1794.
Had some problems (_STATIC_ASSERT) and was too ugly anyway.
For retry, I'd suggest to implement a general function
static inline void memswap (void *p1, void* p2, size_t n);
and then use that. If you do so, please keep the original code
as comment.
This replaces commit 3d409b0889
- revert old fix in libtcc.c
- #include_next: look up the file in the include stack to see
if it is already included.
Also:
- streamline include code
- remove 'type' from struct CachedInclude (obsolete because we check
full filename anyway)
- remove inc_type & inc_filename from struct Bufferedfile (obsolete)
- fix bug with TOK_FLAG_ENDIF not being reset
- unrelated: get rid of an 'variable potentially uninitialized' warning
For vstack Fabrice used the trick to initialize vtop to &vstack[-1], so
that on first push, vtop becomes &vstack[0] and a value is also stored
there - everything works.
Except that when tcc is compiled with bounds-checking enabled, vstack - 1
returns INVALID_POINTER and oops...
Let's workaround it with artificial 1 vstack slot which will not be
used, but only serve as an indicator that pointing to &vstack[-1] is ok.
Now, tcc, after being self-compiled with -b works:
$ ./tcc -B. -o tccb -DONE_SOURCE -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" tcc.c -ldl
$ cd tests
$ ../tcc -B.. -run tcctest.c >1
$ ../tccb -B.. -run tcctest.c >2
$ diff -u 1 2
and note, tcc's compilation speed is not affected:
$ ./tcc -B. -bench -DONE_SOURCE -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -c tcc.c
before: 8270 idents, 47221 lines, 1527730 bytes, 0.152 s, 309800 lines/s, 10.0 MB/s
after: 8271 idents, 47221 lines, 1527733 bytes, 0.152 s, 310107 lines/s, 10.0 MB/s
But note, that `tcc -b -run tcc` is still broken - for example it crashes
on
$ cat x.c
double get100 () { return 100.0; }
$ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -run \
-DONE_SOURCE ./tcc.c -B. -c x.c
Runtime error: dereferencing invalid pointer
./tccpp.c:1953: at 0xa7beebdf parse_number() (included from ./libtcc.c, ./tcc.c)
./tccpp.c:3003: by 0xa7bf0708 next() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:4465: by 0xa7bfe348 block() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:4440: by 0xa7bfe212 block() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:5529: by 0xa7c01929 gen_function() (included from ./libtcc.c, ./tcc.c)
./tccgen.c:5767: by 0xa7c02602 decl0() (included from ./libtcc.c, ./tcc.c)
that's because lib/bcheck.c runtime needs more fixes -- see next
patches.
Continuing d6072d37 (Add __builtin_frame_address(0)) implement
__builtin_frame_address for levels greater than zero, in order for
tinycc to be able to compile its own lib/bcheck.c after
cffb7af9 (lib/bcheck: Prevent __bound_local_new / __bound_local_delete
from being miscompiled).
I'm new to the internals, and used the most simple way to do it.
Generated code is not very good for levels >= 2, compare
gcc tcc
level=0 mov %ebp,%eax lea 0x0(%ebp),%eax
level=1 mov 0x0(%ebp),%eax mov 0x0(%ebp),%eax
level=2 mov 0x0(%ebp),%eax mov 0x0(%ebp),%eax
mov (%eax),%eax mov %eax,-0x10(%ebp)
mov -0x10(%ebp),%eax
mov (%eax),%eax
level=3 mov 0x0(%ebp),%eax mov 0x0(%ebp),%eax
mov (%eax),%eax mov (%eax),%ecx
mov (%eax),%eax mov (%ecx),%eax
But this is still an improvement and for bcheck we need level=1 for
which the code is good.
For the tests I had to force gcc use -O0 to not inline the functions.
And -fno-omit-frame-pointer just in case.
If someone knows how to improve the generated code - help is
appreciated.
Thanks,
Kirill
Cc: Michael Matz <matz@suse.de>
Cc: Shinichiro Hamaji <shinichiro.hamaji@gmail.com>
Currently, VLA are not forbidden for static variable. This leads to
problems even if for fixed-size array when the size expression uses the
ternary operator (cond ? then-value : else-value) because it is parsed
as a general expression which leads to code generated in this case.
This commit solve the problem by forbidding VLA for static variables.
Although not required for the fix, avoiding code generation when the
expression is constant would be a nice addition though.
The code for shifts is now similar to code for binary arithmetic operations,
except that only the first argument is considered, as required by the ISO C
standard.
On 2012-06-26 15:07:57 +0200, Vincent Lefevre wrote:
> ISO C99 TC3 says: [6.5.7#3] "The integer promotions are performed on
> each of the operands. The type of the result is that of the promoted
> left operand."
I've written a patch (attached). Now the shift problems no longer
occur with the testcase and with GNU MPFR's "make check".
--
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
Sometimes the result of a comparison is not directly used in a jump,
but in arithmetic or further comparisons. If those further things
do a vswap() with the VT_CMP as current top, and then generate
instructions for the new top, this most probably destroys the flags
(e.g. if it's a bitfield load like in the example).
vswap() must do the same like vsetc() and not allow VT_CMP vtops
to be moved down.
This matters when sizeof is directly used in arithmetic,
ala "uintptr_t t; t &= -sizeof(long)" (for alignment). When sizeof
isn't size_t (as it's specified to be) this masking will truncate
the high bits of the uintptr_t object (if uintptr_t is larger than
uint).
This needs to be accepted:
typedef int foo;
void f (void) { foo: return; }
namespaces for labels and types are different. The problem is that
the block parser always tries to find a decl first and that routine
doesn't peek enough to detect this case. Needs some adjustments
to unget_tok() so that we can call it even when we already called
it once, but next() didn't come around restoring the buffer yet.
(It lazily does so not when the buffer becomes empty, but rather
when the next call detects that the buffer is empty, i.e. it requires
two next() calls until the unget buffer gets switched back).
Removes a premature optimization of char/short loads
rewriting the source type. It did so also for bitfield
loads, thereby removing all the shifts/maskings.
(cond ? 0 : ptr)->member wasn't handled correctly. If one arm
is a null pointer constant (which also can be a pointer) the result
type is that of the other arm.
This fixes a bug introduced in commit
8d107d9ffd
that produced wrong code because of interference between
0x10 bits VT_CONST and x86_64-gen.c:TREG_MEM
Also fully zero-pad long doubles on x86-64 to avoid random
bytes in output files which disturb file comparison.
This allows passing colon separated paths to
tcc_add_library_path
tcc_add_sysinclude_path
tcc_add_include_path
Also there are new configure variables
CONFIG_TCC_LIBPATH
CONFIG_TCC_SYSINCLUDE_PATHS
which define the lib/sysinclude paths all in one and can
be overridden from configure/make
For TCC_TARGET_PE semicolons (;) are used as separators
Also, \b in the path string is replaced by s->tcc_lib_path
(CONFIG_TCCDIR rsp. -B option)
Since no code should be generated outside a function, force expr_cond to
only consider constant expression when outside a function since the
generic code can generate some code.
Basically, with:
typedef __attribute__((aligned(16))) struct _xyz {
...
} xyz, *pxyz;
we want the struct aligned but not the pointer.
FIXME: This patch is a hack, waiting for someone in the knowledge
of correct __attribute__ semantics.
VLA inserts a call to alloca via enum TOK_alloca, but TOK_alloca
only exists on I386 and X86_64 targets. This patch just emits an
error at compile-time if someone tries to compile some VLA code
for a TOK_alloca-less target. The best solution might be to just
push the problem to link-time, since the existence-or-not of a
alloca implementation can only be determined by linking. It seems
like just declaring TOK_alloca unconditionally would achieve that,
but for now, this at least gets the cross compilers to build.
I don't know if it makes a difference to gen_op(TOK_PDIV) or not,
but logically the ptr1_is_vla test in TP's VLA patch seems out of
order, where the patch to fix it would be:
------------------------------------------------------------------
@@ -1581,15 +1581,15 @@ ST_FUNC void gen_op(int op)
u = pointed_size(&vtop[-1].type);
}
gen_opic(op);
+ if (ptr1_is_vla)
+ vswap();
/* set to integer type */
#ifdef TCC_TARGET_X86_64
vtop->type.t = VT_LLONG;
#else
vtop->type.t = VT_INT;
#endif
- if (ptr1_is_vla)
- vswap();
- else
+ if (!ptr1_is_vla)
vpushi(u);
gen_op(TOK_PDIV);
} else {
------------------------------------------------------------------
Instead of that patch, which increases the complexity of the code,
this one fixes the problem by just rolling back and retrying with
a simpler approach.
A VLA is not really an array, it's a pointer-to-an-array.
Making this explicit allows us to back out a few parts
of the original VLA patch and paves the way for the next
part of the fix, where a VLA will be stored on the runtime
stack as a pointer-to-an-array, rather than on the compile-
time stack as a Sym*.