On Linux 32: sizeof(long)=32 == sizeof(void *)=32
on Linux 64: sizeof(long)=64 == sizeof(void *)=64
on Windows 64: sizeof(long)=32 != sizeof(void *)=64
The common code to move a returned structure packed into
registers into memory on the caller side didn't take the
register size into account when allocating local storage,
so sometimes that lead to stack overwrites (e.g. in 73_arm64.c),
on x86_64. This fixes it by generally making gfunc_sret also return
the register size.
A test program:
/* result of the new version inroduced in 4ad186c5ef: t2a = 44100312 */
#include<stdio.h>
int main() {
int t1 = 176401255;
float f = 0.25f;
int t2a = (int)(t1 * f); // must be 44100313
int t2b = (int)(t1 * (float)0.25f);
printf("t2a=%d t2b=%d \n",t2a,t2b);
return 0;
}
Author: Thomas Preud'homme <robotux@celest.fr>
Date: Tue Dec 31 23:51:20 2013 +0800
Move logic for if (int value) to tccgen.c
Move the logic to do a test of an integer value (ex if (0)) out of
arch-specific code to tccgen.c to avoid code duplication. This also
fixes test of long long value which was only testing the bottom half of
such values on 32 bits architectures.
I don't understand why if () in gtst(i) was removed.
This patch allows to compile a linux kernel v.2.4.26
W/o this patch a tcc simply crashes.
Refactoring (no logical changes):
- use memcpy in tccgen.c:ieee_finite(double d)
- use union to store attribute flags in Sym
Makefile: "CFLAGS+=-fno-strict-aliasing" basically not necessary
anymore but I left it for now because gcc sometimes behaves
unexpectedly without.
Also:
- configure: back to mode 100755
- tcc.h: remove unused variables tdata/tbss_section
- x86_64-gen.c: adjust gfunc_sret for prototype
Variants __fixsfdi/__fixxfdi are not needed for now because
the value is converted to double always.
Also:
- remove __tcc_fpinit for unix as it seems redundant by the
__setfpucw call in the startup code
- avoid reference to s->runtime_main in cross compilers
- configure: fix --with-libgcc help
- tcctok.h: cleanup
The procedure calling standard for ARM architecture mandate the use of
the base standard for variadic function. Therefore, hgen float aggregate
must be returned via stack when greater than 4 bytes and via core
registers else in case of variadic function.
This patch improve gfunc_sret() to take into account whether the
function is variadic or not and make use of gfunc_sret() return value to
determine whether to pass a structure via stack in gfunc_prolog(). It
also take advantage of knowing if a function is variadic or not move
float result value from VFP register to core register in gfunc_epilog().
Move the logic to do a test of an integer value (ex if (0)) out of
arch-specific code to tccgen.c to avoid code duplication. This also
fixes test of long long value which was only testing the bottom half of
such values on 32 bits architectures.
Use comisd / fcompp for float comparison (except TOK_EQ and TOK_NE)
instead of ucomisd / fucompp to detect NaN comparison.
Thanks Vincent Lefèvre for the bug report and for also giving the
solution.
- avoid assumption "ret_align == register_size" which is
false for non-arm targets
- rename symbol "sret" to more descriptive "ret_nregs"
This fixes commit dcec8673f2
Also:
- remove multiple definitions in win32/include/math.h
On ARM with hardfloat calling convention, structure containing 4 fields
or less of the same float type are returned via float registers. This
means that a structure can be returned in up to 4 double registers in a
structure is composed of 4 doubles. This commit adds support for return
of structures in several registers.
- Use runtime function for conversion
- Also initialize fp with tcc -run on windows
This fixes a bug where
double x = 1.0;
double y = 1.0000000000000001;
double z = x < y ? 0 : sqrt (x*x - y*y);
caused a bad sqrt because rounding precision for the x < y comparison
was different to the one used within the sqrt function.
This also fixes a bug where
printf("%d, %d", (int)pow(10, 2), (int)pow(10, 2));
would print
100, 99
Unrelated:
win32: document relative include & lib lookup
win32: normalize_slashes: do not mirror silly gcc behavior
This reverts part of commit 8a81f9e103
winapi: add missing WINAPI decl. for some functions
VLA storage is now freed when it goes out of scope. This makes it
possible to use a VLA inside a loop without consuming an unlimited
amount of memory.
Combining VLAs with alloca() should work as in GCC - when a VLA is
freed, memory allocated by alloca() after the VLA was created is also
freed. There are some exceptions to this rule when using goto: if a VLA
is in scope at the goto, jumping to a label will reset the stack pointer
to where it was immediately after the last VLA was created prior to the
label, or to what it was before the first VLA was created if the label
is outside the scope of any VLA. This means that in some cases combining
alloca() and VLAs will free alloca() memory where GCC would not.
I expect that Linux-x86 is probably fine. All other architectures
except ARM are definitely broken since I haven't yet implemented
gfunc_sret for these, although replicating the current behaviour
should be straightforward.