Commit Graph

123 Commits

Author SHA1 Message Date
grischka
8227db3a23 jump optimizations
This unifies VT_CMP with VT_JMP(i) by using mostly VT_CMP
with both a positive and a negative jump target list.

Such we can delay putting the non-inverted or inverted jump
until we can see which one is nore suitable (in most cases).

example:
    if (a && b || c && d)
        e = 0;

before this patch:
   a:	8b 45 fc             	mov    0xfffffffc(%ebp),%eax
   d:	83 f8 00             	cmp    $0x0,%eax
  10:	0f 84 11 00 00 00    	je     27 <main+0x27>
  16:	8b 45 f8             	mov    0xfffffff8(%ebp),%eax
  19:	83 f8 00             	cmp    $0x0,%eax
  1c:	0f 84 05 00 00 00    	je     27 <main+0x27>
  22:	e9 22 00 00 00       	jmp    49 <main+0x49>
  27:	8b 45 f4             	mov    0xfffffff4(%ebp),%eax
  2a:	83 f8 00             	cmp    $0x0,%eax
  2d:	0f 84 11 00 00 00    	je     44 <main+0x44>
  33:	8b 45 f0             	mov    0xfffffff0(%ebp),%eax
  36:	83 f8 00             	cmp    $0x0,%eax
  39:	0f 84 05 00 00 00    	je     44 <main+0x44>
  3f:	e9 05 00 00 00       	jmp    49 <main+0x49>
  44:	e9 08 00 00 00       	jmp    51 <main+0x51>
  49:	b8 00 00 00 00       	mov    $0x0,%eax
  4e:	89 45 ec             	mov    %eax,0xffffffec(%ebp)
  51:   ...

with this patch:
   a:	8b 45 fc             	mov    0xfffffffc(%ebp),%eax
   d:	83 f8 00             	cmp    $0x0,%eax
  10:	0f 84 0c 00 00 00    	je     22 <main+0x22>
  16:	8b 45 f8             	mov    0xfffffff8(%ebp),%eax
  19:	83 f8 00             	cmp    $0x0,%eax
  1c:	0f 85 18 00 00 00    	jne    3a <main+0x3a>
  22:	8b 45 f4             	mov    0xfffffff4(%ebp),%eax
  25:	83 f8 00             	cmp    $0x0,%eax
  28:	0f 84 14 00 00 00    	je     42 <main+0x42>
  2e:	8b 45 f0             	mov    0xfffffff0(%ebp),%eax
  31:	83 f8 00             	cmp    $0x0,%eax
  34:	0f 84 08 00 00 00    	je     42 <main+0x42>
  3a:	b8 00 00 00 00       	mov    $0x0,%eax
  3f:	89 45 ec             	mov    %eax,0xffffffec(%ebp)
  42:   ...
2019-06-24 11:40:01 +02:00
grischka
1b57560502 nocode, noreturn
A more automatic approach to code suppression (aka. nocode_wanted)

The simple rules are:
- Clear 'nocode_wanted' at (im/explicit) label IF it was used
- Set 'nocode_wanted' after unconditional jumps

Also in order to test this then I did add the "function might
return no value" warning, and then to make that work again I
did add the __attribute__((noreturn)).

Also moved the look ahead label check into the type parser
to gain a little speed.
2019-06-24 11:40:01 +02:00
Michael Matz
c07e81b087 Tidy some code
the real difference is in decl0 where we can use external_sym
just fine also for function definitions, we don't have to use
external_global_sym.  Setting VT_EXTERN in external_sym isn't
necessary either (the type will have it set if necessary).
The rest is tidying: removing unused arguments and moving
some code around.
2019-04-18 03:42:23 +02:00
Michael Matz
749d19d70b Fix sub-int returns on x86-64 and i386
the ABIs (and other compilers) extend sub-int return values in the
caller.  TCC extends them in the callee.  For compatibility with
those other compilers we have extend them in the caller as well.
That introduces a useless double extension in pure TCC-compiled code,
but fixing that generally requires that the code generator of TCC would
understand sub-int types.  For the time being bite the bullet.
2019-02-10 18:27:41 +01:00
Pursuer
ecb90de4cc FIX:Revert commit 3f05d88d5b
The function should to be saved to stack in some cases (fastcall on i386, struct argument on arm etc.). But I neglected. So I revert this commit.
2019-01-12 01:54:24 +08:00
Pursuer
3f05d88d5b optimize the generated code when save_reg is required (2)
In gfunc_call, regisger will be saved before gcall_or_jmp. The register
stored the function will be saved too, though in some generator the SValue
of this function will be immediately poped after gcall_or_jmp, and no need to be saved. So I modify some generator to avoid save redundant SValue before gcall_or_jmp.
2019-01-11 02:32:12 +08:00
Michael Matz
671dcace82 Implement function alignment via attributes
which requires being able to emit an arbitrary number of NOP
instructions, which is also implemented here.  For x86 we
could emit other sequences but these are the easiest.
2018-04-06 23:02:42 +02:00
Larry Doolittle
1b6806e5bb Spelling fixes
Comments only, no change to functionality
2017-09-24 18:03:26 -07:00
Zhang Boyang
b39810ff78 Fix calling function pointers casted from intergers in DLL
The code generated for "((void (*)(void))0x12345678)()" will be a single "CALL 0x12345678" in previous code.
However, this will not work for DLLs, because "CALL imm" is PC related, DLL relocation will break the code.
This commit fixed the problem by forcing TCC generates indirect CALLs in this situation.
2017-09-09 21:11:56 +08:00
Zhang Boyang
02370acdc9 Fix AL/AX is not extended to EAX when calling indirectly
AL/AX should be extended to EAX when calling functions. However, the previous code did this only for direct calls, indirect calls were ignored.
New code also avoid redundant code when generating JMP instruction. (i.e. expanding code should be generated with CALL instruction only)
2017-09-09 21:01:42 +08:00
Zhang Boyang
b8fe8fc210 called function should pop the arguments when using fastcall 2017-08-21 19:38:11 +08:00
grischka
9ba76ac834 refactor sym & attributes
tcc.h:
* cleanup struct 'Sym'
* include some 'Attributes' into 'Sym'
* in turn get rid of VT_IM/EXPORT, VT_WEAK
* re-number VT_XXX flags
* replace some 'long' function args by 'int'

tccgen.c:
* refactor parse_btype()
2017-07-09 12:34:11 +02:00
grischka
9f79b62ec4 unsorted adjustments
- configure
  * use aarch64 instead of arm64

- Makefile
  * rename the custom include file to "config-extra.mak"
  * Also avoid "rm -r /*" if $(tccdir) is empty

- pp/Makefile
  * fix .expect generation with gcc

- tcc.h
  * cleanup #defines for _MSC_VER

- tccgen.c:
  * fix const-propagation for &,|
  * fix anonymous named struct (ms-extension) and enable
    -fms-extension by default

- i386-gen.c
  * clear VT_DEFSIGN

- x86_64-gen.c/win64:
  * fix passing structs in registers
  * fix alloca (need to keep "func_scratch" below each alloca area on stack)
    (This allows to compile a working gnu-make on win64)

- tccpp.c
  * alternative approach to 37999a4fbf
    This is to avoid some slowdown with ## token pasting.
  * get_tok_str() : return <eof> for TOK_EOF
  * -funsigned-char: apply to "string" literals as well

- tccpe/tools.c: -impdef: support both 32 and 64 bit dlls anyway
2017-07-09 12:07:40 +02:00
grischka
28435ec58c configure: --config-musl/-uClibc switch & misc cleanups
- configure:
  - add --config-uClibc,-musl switch and suggest to use
    it if uClibc/musl is detected
  - make warning options magic clang compatible
  - simplify (use $confvars instead of individual options)
- Revert "Remove some unused-parameter lint"
  7443db0d5f
  rather use -Wno-unused-parameter (or just not -Wextra)
- #ifdef functions that are unused on some targets
- tccgen.c: use PTR_SIZE==8 instead of (X86_64 || ARM64)
- tccpe.c: fix some warnings
- integrate dummy arm-asm better
2017-05-13 08:59:06 +02:00
Larry Doolittle
19d8b8a173 Spelling fixes in C comments only 2017-05-07 21:38:09 -07:00
grischka
a4a20360e9 fixes & cleanups
- tccgen.c/tcc.h: allow function declaration after use:
      int f() { return g(); }
      int g() { return 1; }
  may be a warning but not an error
  see also 76cb1144ef

- tccgen.c: redundant code related to inline functions removed
  (functions used anywhere have sym->c set automatically)

- tccgen.c: make 32bit llop non-equal test portable
  (probably not on C67)

- dynarray_add: change prototype to possibly avoid aliasing
  problems or at least warnings

- lib/alloca*.S: ".section .note.GNU-stack,"",%progbits" removed
  (has no effect)

- tccpe: set SizeOfCode field (for correct upx decompression)

- libtcc.c: fixed alternative -run invocation
      tcc "-run -lxxx ..." file.c
  (meant to load the library after file).
  Also supported now:
      tcc files ... options ... -run @ arguments ...
2017-02-13 18:23:43 +01:00
grischka
68666eee2a tccgen: factor out gfunc_return
Also:
- on windows i386 and x86-64, structures of size <= 8 are
  NOT returned in registers if size is not one of 1,2,4,8.
- cleanup: put all tv-push/pop/swap/rot into one place
2017-02-08 19:45:31 +01:00
grischka
3b84e61ead Revert "partial revert of the commit 4ad186c5ef61"
There seems nothing wrong.  With

    int t1 = 176401255;
    float f = 0.25;
    int t2 = t1 * f; // 176401255 * 0.25 = 44100313.75

according to the arithmetic conversion rules, the number
176401255 needs to be converted to float, and the compiler
can choose either the nearest higher or nearest lower
representable number "in an implementation-defined manner".

Which may be 176401248 or 176401264.  So as result both
44100312 and 44100313 are correct.

This reverts commit 664c19ad5e.
2017-02-05 14:30:19 +01:00
grischka
559ee1e940 i386-gen: fix USE_EBX
Restore ebx from *ebp because alloca might change esp.

Also disable USE_EBX for upcoming release.

Actually the benefit is less than one would expect, it
appears that tcc can't do much with more than 3 registers
except with extensive use of long longs where the disassembly
looks much prettier (and shorter also).

Also: tccgen/expr_cond() : fix wrong gv/save_regs order
2016-12-19 00:33:01 +01:00
grischka
f843cadb6b tccgen: nocode_wanted alternatively
tccgen.c: remove any 'nocode_wanted' checks, except in
- greloca(), disables output elf symbols and relocs
- get_reg(), will return just the first suitable reg)
- save_regs(), will do nothing

Some minor adjustments were made where nocode_wanted is set.

xxx-gen.c: disable code output directly where it happens
in functions:
- g(), output disabled
- gjmp(), will do nothing
- gtst(), dto.
2016-12-18 18:53:21 +01:00
Michael Matz
cd9514abc4 i386: Fix various testsuite issues
on 32bit long long support was sometimes broken.  This fixes
code-gen for long long values in switches, disables a x86-64 specific
testcase and avoid an undefined shift amount.  It comments out
a bitfield test involving long long bitfields > 32 bit; with GCC layout
they can straddle multiple words and code generation isn't prepared
for this.
2016-12-15 17:53:09 +01:00
Michael Matz
b5669a952b x86-64: relocation addend is 64bit
Some routines were using the wrong type (int) in passing addends,
truncating it.  This matters when bit 31 isn't set and the high
32 bits are set: the truncation would make it unsigned where in
reality it's signed (happen e.g. on the x86-64 with it's load
address at top-2GB).
2016-12-15 17:47:12 +01:00
Thomas Preud'homme
59391d5520 Fix relocs_info declaration in tcc.h
C standard specifies that array should be declared with a non null size
or with * for standard array. Declaration of relocs_info in tcc.h was
not respecting this rule. This commit add a R_NUM macro that maps to the
R_<ARCH>_NUM macros and declare relocs_info using it. This commit also
moves all linker-related macros from <arch>-gen.c files to <arch>-link.c
ones.
2016-12-05 20:51:10 +00:00
Thomas Preud'homme
1c811a4d1d Make build_got_entries more target independent
Factor most of common logic between targets in build_got_entries by
defining target specific info into structures in the backends.
2016-12-03 17:26:51 +00:00
Pavlas, Zdenek
cdf715a0b5 i386 + bcheck: fix __bound_local_new
With -b, this produces garbage. Code to call __bound_local_new
is put at wrong place, overwriting the regparam setup code.
Fix copied from x86_64-gen.c.

void __attribute__((regparm(3)))
fun(int unused)
{
  char local[1];
}
2016-11-09 01:04:45 -08:00
grischka
3054a76249 i386-gen: use EBX as 4th register
May be enabled/disabled by changing this line:
    #define USE_EBX 1
2016-10-19 19:22:15 +02:00
grischka
02642bc94c lib/libtcc1.c: cleanup
- remove #include dependencies from libtcc1.c
  for easier cross compilation
- clear_cache only on ARM
- error-message for mprotect failure
2016-10-19 19:21:36 +02:00
grischka
d9b7f018ce i386: do not 'lexpand' into registers necessarily
Previously, long longs were 'lexpand'ed into two registers
always.

Now, it expands
- constants into two constants (lo-part, hi-part)
- variables into two lvalues with offset+4 for the hi-part.

This makes long long operations look a bit nicer.

Also: don't apply i386 'inc/dec' optimization if carry
generation is wanted.
2016-10-16 19:04:40 +02:00
grischka
b691585785 tccgen: arm/i386: save_reg_upstack
tccgen.c:gv() when loading long long from lvalue, before
was saving all registers which caused problems in the arm
function call register parameter preparation, as with

    void foo(long long y, int x);
    int main(void)
    {
      unsigned int *xx[1], x;
      unsigned long long *yy[1], y;
      foo(**yy, **xx);
      return 0;
    }

Now only the modified register is saved if necessary,
as in this case where it is used to store the result
of the post-inc:

        long long *p, v, **pp;
        v = 1;
        p = &v;
        p[0]++;
        printf("another long long spill test : %lld\n", *p);

i386-gen.c :
- found a similar problem with TOK_UMULL caused by the
  vstack juggle in tccgen:gen_opl()
  (bug seen only when using EBX as 4th register)
2016-10-04 17:36:51 +02:00
Pavlas, Zdenek
e238e6521b gtst_addr(): short conditional jumps (i386, x86_64) 2016-09-30 07:33:20 -07:00
seyko
a37f8cfc80 short_call_convention patch from tcc bugzilla
BUGZILLA:
    interfacing with other compilers

    extend the return value to the whole register if necessary.
    visual studio and gcc do not always set the whole eax register
    when assigning the return value of a function.

    We've encountered wrong execution results on i386 platforms with an
    application that uses both code compiled with TCC and code compiled
    with other compilers (namely: Visual Studio on Windows, and GCC on
    Linux).

    When calling a function that returns an integer value shorter than 32
    bits, TCC reads the return value from the whole EAX register,
    although the code generated by the other compilers can only sets AL
    for 8 bit values or AX for 16 bits values, and the rest of EAX can be
    anything.

    We worked around this with the attached patch on i386 for the version
    0.9.26, but we did not look at other platforms to find if there are
    similar issues.
2016-05-15 21:10:06 +03:00
seyko
5ee097fce9 allow to compile tcc by pcc
* pcc have only __linux__ macro (and no __linux)
    * pcc don't have __clear_cache proc
2016-04-15 17:41:49 +03:00
Michael Matz
80343ab7d8 Fix assignment to/from volatile types
Code like this was broken:

   char volatile vi = i;

See testcase, happens in ideosyncratic legacy code sprinkling
volatile all over.
2016-03-26 17:57:22 +01:00
Edmund Grimley Evans
4ae626451e Bug fix for commit 553242c18a.
In gtst, vtop->c.i is not usually zero, but it is when compiling:

int f(void) { return 1 && 1 ? 1 : 1; }
2015-11-20 23:17:24 +00:00
Edmund Grimley Evans
553242c18a Replace pointer casts with calls to (read|write)(16|32|64)le.
This stops UBSan from giving runtime misaligned address errors
and might eventually allow building on a non-little-endian host.
2015-11-19 18:21:14 +00:00
Edmund Grimley Evans
569fba6db9 Merge the integer members of union CValue into "uint64_t i". 2015-11-17 19:09:35 +00:00
gus knight
ef3d38c5c9 Revert "fix-mixed-struct (patch by Pip Cet)"
This reverts commit 4e04f67c94. Requested by grischka.
2015-07-29 16:57:41 -04:00
gus knight
89ad24e7d6 Revert all of my changes to directories & codingstyle. 2015-07-29 16:57:12 -04:00
gus knight
47e06c6d4e Reorganize the source tree.
* Documentation is now in "docs".
 * Source code is now in "src".
 * Misc. fixes here and there so that everything still works.

I think I got everything in this commit, but I only tested this
on Linux (Make) and Windows (CMake), so I might've messed
something up on other platforms...
2015-07-27 16:03:25 -04:00
gus knight
41031221c8 Trim trailing spaces everywhere. 2015-07-27 12:43:40 -04:00
seyko
4e04f67c94 fix-mixed-struct (patch by Pip Cet)
Jsut for testing. It works for me (don't break anything)
    Small fixes for x86_64-gen.c in "tccpp: fix issues, add tests"
    are dropped in flavor of this patch.

    Pip Cet:

    Okay, here's a first patch that fixes the problem (but I've found
    another bug, yet unfixed, in the process), though it's not
    particularly pretty code (I tried hard to keep the changes to the
    minimum necessary). If we decide to actually get rid of VT_QLONG and
    VT_QFLOAT (please, can we?), there are some further simplifications in
    tccgen.c that might offset some of the cost of this patch.

    The idea is that an integer is no longer enough to describe how an
    argument is stored in registers. There are a number of possibilities
    (none, integer register, two integer registers, float register, two
    float registers, integer register plus float register, float register
    plus integer register), and instead of enumerating them I've
    introduced a RegArgs type that stores the offsets for each of our
    registers (for the other architectures, it's simply an int specifying
    the number of registers). If someone strongly prefers an enum, we
    could do that instead, but I believe this is a place where keeping
    things general is worth it, because this way it should be doable to
    add SSE or AVX support.

    There is one line in the patch that looks suspicious:

             } else {
                 addr = (addr + align - 1) & -align;
                 param_addr = addr;
                 addr += size;
    -            sse_param_index += reg_count;
             }
             break;

    However, this actually fixes one half of a bug we have when calling a
    function with eight double arguments "interrupted" by a two-double
    structure after the seventh double argument:

    f(double,double,double,double,double,double,double,struct { double
    x,y; },double);

    In this case, the last argument should be passed in %xmm7. This patch
    fixes the problem in gfunc_prolog, but not the corresponding problem
    in gfunc_call, which I'll try tackling next.
2015-05-14 07:32:24 +03:00
grischka
30df3189b1 tccpp: fix issues, add tests
* fix some macro expansion issues
* add some pp tests in tests/pp
* improved tcc -E output for better diff'ability
* remove -dD feature (quirky code, exotic feature,
  didn't work well)

Based partially on ideas / researches from PipCet

Some issues remain with VA_ARGS macros (if used in a
rather tricky way).

Also, to keep it simple, the pp doesn't automtically
add any extra spaces to separate tokens which otherwise
would form wrong tokens if re-read from tcc -E output
(such as '+' '=')  GCC does that, other compilers don't.

 * cleanups
  - #line 01 "file" / # 01 "file" processing
  - #pragma comment(lib,"foo")
  - tcc -E: forward some pragmas to output (pack, comment(lib))
  - fix macro parameter list parsing mess from
    a3fc543459
    a715d7143d
    (some coffee might help, next time ;)
  - introduce TOK_PPSTR - to have character constants as
    written in the file (similar to TOK_PPNUM)
  - allow '\' appear in macros
  - new functions begin/end_macro to:
      - fix switching macro levels during expansion
      - allow unget_tok to unget more than one tok
  - slight speedup by using bitflags in isidnum_table

Also:
  - x86_64.c : fix decl after statements
  - i386-gen,c : fix a vstack leak with VLA on windows
  - configure/Makefile : build on windows (MSYS) was broken
  - tcc_warning: fflush stderr to keep output order (win32)
2015-05-09 14:29:39 +02:00
seyko
999274ca90 a lot simpler VLA code
Author: Philip <pipcet@gmail.com>
    Our VLA code can be made a lot simpler (simple enough for
    even me to understand it) by giving up on the optimization idea, which
    is very tempting. There's a patch to do that attached, feel free to
    test and commit it if you like. (It passes all the tests, at least
2015-05-04 04:09:05 +03:00
seyko
acef4ff244 make a bound checking more compatible with Windows 64
On Linux 32:   sizeof(long)=32 == sizeof(void *)=32
    on Linux 64:   sizeof(long)=64 == sizeof(void *)=64
    on Windows 64: sizeof(long)=32 != sizeof(void *)=64
2015-03-26 07:47:45 +03:00
Michael Matz
50899e30ab Fix stack overwrite on structure return
The common code to move a returned structure packed into
registers into memory on the caller side didn't take the
register size into account when allocating local storage,
so sometimes that lead to stack overwrites (e.g. in 73_arm64.c),
on x86_64.  This fixes it by generally making gfunc_sret also return
the register size.
2015-03-09 00:19:59 +01:00
seyko
664c19ad5e partial revert of the commit 4ad186c5ef
A test program:

    /* result of the new version inroduced in 4ad186c5ef: t2a = 44100312 */
    #include<stdio.h>
    int main() {
	int t1 = 176401255;
	float f = 0.25f;
	int t2a = (int)(t1 * f); // must be 44100313
	int t2b = (int)(t1 * (float)0.25f);
	printf("t2a=%d t2b=%d \n",t2a,t2b);
	return 0;
    }
2015-03-04 16:25:51 +03:00
seyko
2437ccdc76 A partial reverse for commit eda2c756ed
Author: Thomas Preud'homme <robotux@celest.fr>
	Date:   Tue Dec 31 23:51:20 2013 +0800

	Move logic for if (int value) to tccgen.c
	Move the logic to do a test of an integer value (ex if (0)) out of
	arch-specific code to tccgen.c to avoid code duplication. This also
        fixes test of long long value which was only testing the bottom half of
	such values on 32 bits architectures.

I don't understand why if () in gtst(i) was removed.
This patch allows to compile a linux kernel v.2.4.26
W/o this patch a tcc simply crashes.
2015-03-03 15:51:09 +03:00
grischka
3fe2a95d7f be stricter with aliasing
Refactoring (no logical changes):
- use memcpy in tccgen.c:ieee_finite(double d)
- use union to store attribute flags in Sym
Makefile: "CFLAGS+=-fno-strict-aliasing" basically not necessary
anymore but I left it for now because gcc sometimes behaves
unexpectedly without.

Also:
- configure: back to mode 100755
- tcc.h: remove unused variables tdata/tbss_section
- x86_64-gen.c: adjust gfunc_sret for prototype
2014-01-07 14:57:07 +01:00
grischka
4ad186c5ef i386: use __fixdfdi instead of __tcc_cvt_ftol
Variants __fixsfdi/__fixxfdi are not needed for now because
the value is converted to double always.

Also:
- remove __tcc_fpinit for unix as it seems redundant by the
  __setfpucw call in the startup code
- avoid reference to s->runtime_main in cross compilers
- configure: fix --with-libgcc help
- tcctok.h: cleanup
2014-01-06 19:07:08 +01:00
Thomas Preud'homme
8efaa71190 Fix struct ret in variadic fct with ARM hardfloat
The procedure calling standard for ARM architecture mandate the use of
the base standard for variadic function. Therefore, hgen float aggregate
must be returned via stack when greater than 4 bytes and via core
registers else in case of variadic function.

This patch improve gfunc_sret() to take into account whether the
function is variadic or not and make use of gfunc_sret() return value to
determine whether to pass a structure via stack in gfunc_prolog(). It
also take advantage of knowing if a function is variadic or not move
float result value from VFP register to core register in gfunc_epilog().
2014-01-06 22:57:05 +08:00