Commit Graph

168 Commits

Author SHA1 Message Date
Philip
70a6c4601e VLA code: minor fix
Don't try to call get_flags() on the mob branch, where it's not defined.
2015-05-04 03:42:02 +00:00
seyko
c75d0deecf VLA code minor fix 2015-05-04 04:19:24 +03:00
seyko
999274ca90 a lot simpler VLA code
Author: Philip <pipcet@gmail.com>
    Our VLA code can be made a lot simpler (simple enough for
    even me to understand it) by giving up on the optimization idea, which
    is very tempting. There's a patch to do that attached, feel free to
    test and commit it if you like. (It passes all the tests, at least
2015-05-04 04:09:05 +03:00
Philip
2d3458363e fix another x86_64 ABI bug
The old code assumed that if an argument doesn't fit into the available
registers, none of the subsequent arguments do, either. But that's
wrong: passing 7 doubles, then a two-double struct, then another double
should generate code that passes the 9th argument in the 8th register
and the two-double struct on the stack. We now do so.

However, this patch does not yet fix the function calling code to do the
right thing in the same case.
2015-04-26 17:31:39 +00:00
Philip
8d44851d65 Fix zero-length struct/union test. Remove nonsensical test.
The comment suggests this was meant to detect unions, but in fact it
compared f->c, the union/struct size, against f->next->c, the first
element's offset.

This affected only zero-length structs/unions with a first (zero-length)
element, as in this code:

    struct u2 {
    };

    struct u {
      struct u2 u2;
    } u;

    struct u f(struct u x)
    {
      return x;
    }

However, such structures turned out to be broken anyway, as code like this
was generated for the above f:

0000000000000000 <f>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 81 ec 10 00 00 00    sub    $0x10,%rsp
   b:   66 0f d6 45 f8          movq   %xmm0,-0x8(%rbp)
  10:   66 0f 6e 45 f8          movd   -0x8(%rbp),%xmm0
  15:   e9 00 00 00 00          jmpq   1a <f+0x1a>
  1a:   c9                      leaveq
  1b:   c3                      retq
2015-04-25 19:25:23 +00:00
Philip
059aea5d35 fix a subtle x86-64 calling bug
I ran into an issue playing with tinycc, and tracked it down to a rather
weird assumption in the function calling code. This breaks only when
varargs and float/double arguments are combined, I think, and only when
calling GCC-generated (or non-TinyCC, at least) code. The problem is we
sometimes generate code like this:

804a468: 4c 89 d9 mov %r11,%rcx
804a46b: b8 01 00 00 00 mov $0x1,%eax
804a470: 48 8b 45 c0 mov -0x40(%rbp),%rax
804a474: 4c 8b 18 mov (%rax),%r11
804a477: 41 ff d3 callq *%r11

for a function call. Note how $eax is first set to the correct value,
then clobbered when we try to load the function pointer into R11. With
the patch, the code generated is:

804a468: 4c 89 d9 mov %r11,%rcx
804a46b: b8 01 00 00 00 mov $0x1,%eax
804a470: 4c 8b 5d c0 mov -0x40(%rbp),%r11
804a474: 4d 8b 1b mov (%r11),%r11
804a477: 41 ff d3 callq *%r11

which is correct.

This becomes an issue when get_reg(RC_INT) is modified not always to
return %rax after a save_regs(0), because then another register (%ecx,
say) is clobbered, and the function passed an invalid argument.

A rather convoluted test case that generates the above code is
included. Please note that the test will not cause a failure because
TinyCC code ignores the %rax argument, but it will cause incorrect
behavior when combined with GCC code, which might wrongly fail to save
XMM registers and cause data corruption.
2015-04-23 18:08:28 +00:00
Philip
aacf65bbfa Bugfix: 32-bit vs 64-bit bug in x86_64-gen.c:gcall_or_jmp
Verify an immediate value fits into 32 bits before jumping to it/calling
it with a 32-bit immediate operand. Without this fix, code along the
lines of

  ((int (*)(const char *, ...))140244834372944LL)("hi\n");

will fail mysteriously, even if that decimal constant is the correct
address for printf.

See https://github.com/pipcet/tinycc/tree/bugfix-1
2015-04-23 17:30:16 +00:00
seyko
559675b90a a bounds checking code for the ARCH=x86_64 2015-04-10 15:17:22 +03:00
Michael Matz
50899e30ab Fix stack overwrite on structure return
The common code to move a returned structure packed into
registers into memory on the caller side didn't take the
register size into account when allocating local storage,
so sometimes that lead to stack overwrites (e.g. in 73_arm64.c),
on x86_64.  This fixes it by generally making gfunc_sret also return
the register size.
2015-03-09 00:19:59 +01:00
seyko
2437ccdc76 A partial reverse for commit eda2c756ed
Author: Thomas Preud'homme <robotux@celest.fr>
	Date:   Tue Dec 31 23:51:20 2013 +0800

	Move logic for if (int value) to tccgen.c
	Move the logic to do a test of an integer value (ex if (0)) out of
	arch-specific code to tccgen.c to avoid code duplication. This also
        fixes test of long long value which was only testing the bottom half of
	such values on 32 bits architectures.

I don't understand why if () in gtst(i) was removed.
This patch allows to compile a linux kernel v.2.4.26
W/o this patch a tcc simply crashes.
2015-03-03 15:51:09 +03:00
grischka
6e0a658e96 win64: try to fix linkage
- revert to R_X86_64_PC32 for near calls on PE
- revert to s1->section_align set to zero by default

Untested. Compared to release_0_9_26 the pe-image looks back to
normal.  There are some differences in dissassembly (r10/r11 usage)
but maybe that's ok.
2014-06-24 22:09:12 -04:00
Michael Matz
a913ee6082 x86-64: Use correct ELF values
The x86-64 uses different segment alignment (2MB) and a different
start address.
2014-04-03 17:59:41 +02:00
Michael Matz
080ad7e62a x86-64: Add basic shared lib support
Initial support for shared libraries on x86-64.
2014-03-31 03:45:35 +02:00
Thomas Preud'homme
fdb3b10d06 Fix various errors uncovered by static analysis
Reported-by: Carlos Montiers <cmontiers@gmail.com>
2014-03-08 18:38:49 +08:00
Thomas Preud'homme
d0dae7f241 Ignore VT_DEFSIGN in load on x86-64 arch
This fixes commit b0b5165d16 for x86-64
targets.
2014-02-07 22:31:44 +08:00
grischka
3fe2a95d7f be stricter with aliasing
Refactoring (no logical changes):
- use memcpy in tccgen.c:ieee_finite(double d)
- use union to store attribute flags in Sym
Makefile: "CFLAGS+=-fno-strict-aliasing" basically not necessary
anymore but I left it for now because gcc sometimes behaves
unexpectedly without.

Also:
- configure: back to mode 100755
- tcc.h: remove unused variables tdata/tbss_section
- x86_64-gen.c: adjust gfunc_sret for prototype
2014-01-07 14:57:07 +01:00
Thomas Preud'homme
8efaa71190 Fix struct ret in variadic fct with ARM hardfloat
The procedure calling standard for ARM architecture mandate the use of
the base standard for variadic function. Therefore, hgen float aggregate
must be returned via stack when greater than 4 bytes and via core
registers else in case of variadic function.

This patch improve gfunc_sret() to take into account whether the
function is variadic or not and make use of gfunc_sret() return value to
determine whether to pass a structure via stack in gfunc_prolog(). It
also take advantage of knowing if a function is variadic or not move
float result value from VFP register to core register in gfunc_epilog().
2014-01-06 22:57:05 +08:00
Thomas Preud'homme
eda2c756ed Move logic for if (int value) to tccgen.c
Move the logic to do a test of an integer value (ex if (0)) out of
arch-specific code to tccgen.c to avoid code duplication. This also
fixes test of long long value which was only testing the bottom half of
such values on 32 bits architectures.
2014-01-04 21:10:05 +08:00
Thomas Preud'homme
e0e9a2a295 Report error on NaN comparison
Use comisd / fcompp for float comparison (except TOK_EQ and TOK_NE)
instead of ucomisd / fucompp to detect NaN comparison.

Thanks Vincent Lefèvre for the bug report and for also giving the
solution.
2014-01-03 10:19:38 +08:00
Thomas Preud'homme
59b8007f98 Always set *palign in classify_x86_64_arg
Set *palign for VT_BITFIELD and VT_ARRAY types in classify_x86_64_arg as
else you happen to have in *palign what was already there. This can
cause gfunc_call on !PE systems to consider an array as 16 bytes align
and trigger the assert if the previous argument was 16 bytes aligned.
2014-01-03 10:19:38 +08:00
Thomas Preud'homme
dcec8673f2 Add support for struct > 4B returned via registers
On ARM with hardfloat calling convention, structure containing 4 fields
or less of the same float type are returned via float registers. This
means that a structure can be returned in up to 4 double registers in a
structure is composed of 4 doubles. This commit adds support for return
of structures in several registers.
2013-11-22 09:27:15 +08:00
Thomas Preud'homme
385a86b000 Fix commit 0f5942c6b3 2013-10-01 17:11:44 +02:00
Thomas Preud'homme
0f5942c6b3 Avoid warnings with gcc 4.8 + default CFLAGS 2013-09-24 15:37:12 +02:00
Thomas Preud'homme
f6b50558fc Add support for load/store of _Bool value
Add support for loading _Bool value in i386, x86_64 and arm as well as
support for storing _Bool value on arm.
2013-06-14 16:19:51 +02:00
grischka
be1b6ba7b7 avoid "decl after statement" please
for compiling tcc with msc
2013-04-30 00:33:34 +02:00
James Lyon
41b3c7a507 Improved variable length array support.
VLA storage is now freed when it goes out of scope. This makes it
possible to use a VLA inside a loop without consuming an unlimited
amount of memory.

Combining VLAs with alloca() should work as in GCC - when a VLA is
freed, memory allocated by alloca() after the VLA was created is also
freed. There are some exceptions to this rule when using goto: if a VLA
is in scope at the goto, jumping to a label will reset the stack pointer
to where it was immediately after the last VLA was created prior to the
label, or to what it was before the first VLA was created if the label
is outside the scope of any VLA. This means that in some cases combining
alloca() and VLAs will free alloca() memory where GCC would not.
2013-04-27 22:58:52 +01:00
James Lyon
6ee366e765 Fixed x86-64 long double passing.
long double arguments require 16-byte alignment on the stack, which
requires adjustment when the the stack offset is not an evven number of
8-byte words.
2013-04-26 16:42:12 +01:00
James Lyon
1caee8ab3b Sorted out CMake on x86-64 and fixed silly XMM# bug introduced when working on Win64 stdargs.
I removed the XMM6/7 registers from the register list because they are not used
on Win64 however they are necessary for parameter passing on x86-64. I have now
restored them but not marked them with RC_FLOAT so they will not be used except
for parameter passing.
2013-04-25 22:30:53 +01:00
James Lyon
5c35ba66c5 64-bit tests now pass (well, nearly).
tcctest1-3 fail, but this appears to be due to bugs in GCC rather than TCC
(from manual inspection of the output).
2013-04-24 02:19:15 +01:00
James Lyon
cbce6d2bac Improved x86-64 XMM register argument passing.
Also made XMM0-7 available for use as temporary registers, since they
are not used by the ABI. I'd like to do the same with RSI and RDI but
that's trickier since they can be used by gv() as temporary registers
and there isn't a way to disable that.
2013-04-19 22:05:49 +01:00
James Lyon
946afd2343 Fixed problems with XMM1 use on Linux/x86-64.
All tests pass. I think I've caught all the cases assuming only XMM0 is
used. I expect that Win64 is horribly broken by this point though,
because I haven't altered it to cope with XMM1.
2013-04-19 18:33:30 +01:00
James Lyon
0e17671f72 Most x86-64 tests now work; only on error in test1-3.
I've had to introduce the XMM1 register to get the calling convention
to work properly, unfortunately this has broken a fair bit of code
which assumes that only XMM0 is used.
2013-04-19 15:33:16 +01:00
James Lyon
b961ba5396 Got test1-3 working on x86-64.
There are probably still issues on x86-64 I've missed.
I've added a few new tests to abitest, which fail (2x long long and 2x double
in a struct should be passed in registers).
2013-04-19 11:10:13 +01:00
James Lyon
55ea6d3fc1 x86-64 ABI fixes.
abitest now passes; however test1-3 fail in init_test. All other tests
pass. I need to re-test Win32 and Linux-x86.

I've added a dummy implementation of gfunc_sret to c67-gen.c so it
should now compile, and I think it should behave as before I created
gfunc_sret.
2013-04-19 00:46:49 +01:00
grischka
d6d7686b60 tcc.h: declare CValue.tab[LDOUBLE_SIZE/4]
Should fix some warnings wrt. access out of array bounds.

tccelf.c: fix "static function unused" warning
x86_64-gen.c: fix "ctype.ref uninitialzed" warning and cleanup
tcc-win32.txt: remove obsolete limitation notes.
2013-02-08 19:07:11 +01:00
Michael Matz
a42b029101 x86-64: Fix call saved register restore
Loads of VT_LLOCAL values (which effectively represent saved
addresses of lvalues) were done in VT_INT type, loosing the upper
32 bits.  Needs to be done in VT_PTR type.
2012-06-10 09:01:26 +02:00
Michael Matz
2daae0dc99 x86_64: Fix compares with NaNs.
Comparisons with unordered doubles was broken, NaNs always
compare unequal (and unordered) to everything, including
to itself.
2012-05-13 02:21:51 +02:00
Michael Matz
1d0a5c2515 x86_64: Fix segfault for global data
When offsetted addresses of global non-static data are computed
multiple times in the same statement the x86_64 backend uses
gen_gotpcrel with offset, which implements an add insn on the
register given.  load() uses the R member of the to-be-loaded
value, which doesn't yet have a reg assigned in all cases.

So use the register we're supposed to load the value into as
that register.
2012-04-18 20:57:13 +02:00
Michael Matz
86ac6b9bee x86_64: Fix indirection in struct paramaters
The first loop setting up struct arguments must not remove
elements from the vstack (via vtop--), as gen_reg needs them to
potentially evict some argument still held in registers to stack.

Swapping the arg in question to top (and back to its place) also
simplifies the vstore call itself, as not funny save/restore
or some "non-existing" stack elements need to be done.

Generally for a stack a vop-- operation conceptually clobbers
that element, so further references to it aren't allowed anymore.
2012-04-18 20:57:13 +02:00
grischka
ae191c3a61 x86_64: fix loading of LLOCAL floats
See also commit 9527c4949f

On x86_64 we need to extend the reg_classes array because load()
is called for (at least) R11 too, which was not part of reg_classes
previously.
2012-03-05 20:19:28 +01:00
grischka
8d107d9ffd win64: va_arg with structures 2011-07-14 19:24:53 +02:00
grischka
df4c0892f3 tccrun: win64: add unwind function table for dynamic code
This works only when tcc.exe is compiled using MSC.  MinGW does
something in the startup code that defeats it.
2011-07-14 19:09:49 +02:00
Shinichiro Hamaji
07fd82b411 Make alignments for struct arguments 8 bytes
The ABI (http://www.x86-64.org/documentation/abi.pdf) says
"The size of each argument gets rounded up to eightbytes"
2010-12-28 19:09:59 +09:00
Shinichiro Hamaji
9d347f8742 Probably wrong stack alignment for struct on Win64 2010-08-27 02:49:09 +09:00
Shinichiro Hamaji
1f6781f0ee Fix alignment around struct for SSE.
- Fix a wrong calculation for size of struct
- Handle cases where struct size isn't multple of 8
- Recover vstack after memcpy for pushing struct
- Add a float parameter for struct_assign_test1 to check SSE alignment
2010-08-27 02:32:19 +09:00
grischka
2341ee5142 tccpe: improve dllimport/export and use for tcc_add_symbol 2010-01-14 20:59:42 +01:00
grischka
0de95730ad build from multiple objects: fix other targets 2009-12-20 20:33:41 +01:00
grischka
b54862406e x86-64: fix gtst, back to only 5 regs for now 2009-12-20 20:33:21 +01:00
grischka
070b86a870 x86-64: use r8/r9 as generic integer registers 2009-12-20 02:19:51 +01:00
grischka
0e5c0ee045 x86-64: use r8,r9 as load/store registers 2009-12-20 01:54:39 +01:00
grischka
4a01eb09d8 use vpushv in some places 2009-12-20 01:54:38 +01:00
grischka
50b040ef83 win64: add tiny unwind data for setjmp/longjmp
This enables native unwind semantics with longjmp on
win64 by putting an entry into the .pdata section for
each compiled fuction.

Also, the function now use a fixed stack and store arguments
into X(%rsp) rather than using push.
2009-12-20 01:54:37 +01:00
grischka
88a3ccab9f allow tcc be build from separate objects
If you want that, run: make NOTALLINONE=1
2009-12-20 01:53:49 +01:00
grischka
94bf4d2c22 tccpe: improve dllimport 2009-12-19 22:16:21 +01:00
grischka
1308e8ebcf integrate x86_64-asm.c into i386-asm.c
Also, disable 16bit support for now as it causes bugs
in 32bit mode.  #define I386_ASM_16 if you want it.
2009-12-19 22:16:20 +01:00
grischka
dd3d4f7295 x86-64: fix udiv, add cqto instruction 2009-12-19 22:16:19 +01:00
Shinichiro Hamaji
5dadff3de5 x86-64: Fix stab debug information.
We need 32bit relocations for code and 64bit for debug info.
Introduce a new macro R_DATA_PTR to distinguish the two usages.
2009-08-24 13:30:03 +02:00
grischka
c998985c74 cleanup: constify some global data 2009-07-18 22:07:42 +02:00
grischka
bb5e0df79a x86-64: fix load() for const pointers: (void*)-2 2009-07-18 22:07:03 +02:00
grischka
fc977d56c9 x86-64: chkstk, alloca 2009-07-18 22:06:54 +02:00
grischka
459875796b pe32+ target: adjust x86_64-gen.c
- calling conventions are different:
  * only 4 registers
  * stack "scratch area" is always reserved
  * doubles are mirrored in normal registers
- no GOT or PIC there
2009-07-18 22:05:49 +02:00
Shinichiro Hamaji
0e239e2ba5 Improve the test coverage: !val for float/double/long long f. 2009-04-18 15:08:01 +02:00
Shinichiro Hamaji
fcf2e5981f x86-64: Combine buffers of sections before we call tcc_run().
- Now we can run tcc -run tcc.c successfully, though there are some bugs.
- Remove jmp_table and got_table and use text_section for got and plt entries.
- Combine buffers in tcc_relocate().
- Use R_X86_64_64 instead of R_X86_64_32 for R_DATA_32 (now the name R_DATA_32 is inappropriate...).
2009-04-18 15:08:01 +02:00
Shinichiro Hamaji
830b7533c9 Generate PIC code so that we can create shared objects properly.
- Add got_table in TCCState. This approach is naive and the distance between executable code and GOT can be longer than 32bit.
- Handle R_X86_64_GOTPCREL properly. We use got_table for TCC_OUTPUT_MEMORY case for now.
- Fix load() and store() so that they access global variables via GOT.
2009-04-18 15:08:01 +02:00
Shinichiro Hamaji
06fa15fb99 x86-64: Save RDX and RCX before we use them as function parameters.
When the function call is indirect, these registers may be broken to load a function pointer.
2009-04-18 15:07:09 +02:00
Shinichiro Hamaji
b8a32d8d40 Generate PIC for addresses of symbols. 2009-04-18 15:07:09 +02:00
Shinichiro Hamaji
62e73da612 A uint64 bug fix on x86-64
64bit unsigned literal was handled as 32bit integer.
Added a unittest to catch this.
2009-04-18 15:07:08 +02:00
Shinichiro Hamaji
0a9873aa22 Add support of x86-64.
Most change was done in #ifdef TCC_TARGET_X86_64. So, nothing should be broken by this change.

Summary of current status of x86-64 support:

- produces x86-64 object files and executables.
- the x86-64 code generator is based on x86's.
-- for long long integers, we use 64bit registers instead of tcc's generic implementation.
-- for float or double, we use SSE. SSE registers are not utilized well (we only use xmm0 and xmm1).
-- for long double, we use x87 FPU.
- passes make test.
- passes ./libtcc_test.
- can compile tcc.c. The compiled tcc can compile tcc.c, too. (there should be some bugs since the binary size of tcc2 and tcc3 is differ where tcc tcc.c -o tcc2 and tcc2 tcc.c -o tcc3)
- can compile links browser. It seems working.
- not tested well. I tested this work only on my linux box with few programs.
- calling convention of long-double-integer or struct is not exactly the same as GCC's x86-64 ABI.
- implementation of tcc -run is naive (tcc -run tcctest.c works, but tcc -run tcc.c doesn't work). Relocating 64bit addresses seems to be not as simple as 32bit environments.
- shared object support isn't unimplemented
- no bounds checker support
- some builtin functions such as __divdi3 aren't supported
2008-12-02 02:30:47 +01:00