Commit Graph

164 Commits

Author SHA1 Message Date
Thomas Preud'homme
6e56bb387d Fix preprocessor concat with empty arg 2014-04-12 16:11:42 +08:00
minux
b8eb7dd8e8 tcc.h: add ELF interpreter for DragonFly BSD. 2014-04-12 01:10:12 -04:00
Michael Matz
6a947d9d26 ELF: Remove traces of old RUNTIME_PLTGOT code
The last users of it went away, no use in keeping
this code.
2014-04-06 01:59:35 +02:00
Michael Matz
01c0419234 arm: Use proper PLT/GOT for -run.
Same as with x86_64, disable the runtime_plt_and_got hack
for -run on arm as well.  For that we need to handle several
relocations as (potentially) generating PLT slots as well.
Tested with mpfr-3.1.2 and gawk (both using --disable-shared),
there are two resp. five pre-existing problems, so no regressions.

This also works toward enabling real shared libs for arm,
but it's not there yet.
2014-04-06 01:50:35 +02:00
Michael Matz
9750d0b725 x86_64: Create proper PLT and GOT also for -run
This makes us use the normal PLT/GOT codepaths also for -run,
which formerly used an on-the-side blob for the jump tables.
For x86_64 only for now, arm coming up.
2014-04-06 00:30:22 +02:00
grischka
5879c854fb tccgen: x86_64: fix garbage in the SValue upper bits
This was going wrong (case TOK_LAND in unary: computed labels)
-        vset(&s->type, VT_CONST | VT_SYM, 0);
-        vtop->sym = s;

This does the right thing and is shorter:

+        vpushsym(&s->type, s);


Test case was:

    int main(int argc, char **argv)
    {
        int x;
        static void *label_return = &&lbl_return;
        printf("label_return = %p\n", label_return);
        goto *label_return; //<<<<< here segfault on linux X86_64 without the memset on vset
        printf("unreachable\n");
    lbl_return:
        return 0;
    }


Also::
- Rename "void* CValue.ptr" to more usable "addr_t ptr_offset"
  and start to use it in obvious cases.

- use __attribute__ ((noreturn)) only with gnu compiler

- Revert CValue memsets ("After several days searching ...")
  commit 4bc83ac393

Doesn't mean that the vsetX/vpush thingy isn't brittle and
there still might be bugs as to differences in how the CValue
union  was set and is then interpreted later on.

However the big memset hammer was just too slow (-3% overall).
2014-04-04 20:20:44 +02:00
Michael Matz
0bd1282059 x86-64: shared libs improvement
This correctly resolves local references to global functions from
shared libs to their PLT slot (instead of directly to the target
symbol), so that interposition works.

This is still not 100% conforming (executables don't export symbols
that are also defined in linked shared libs, as they must), but
normal shared lib situations work.
2014-03-31 05:36:12 +02:00
mingodad
5a5fee867a Add __attribute__ ((noreturn)) to tcc_error and expect functions.
This make use of static analysis tools like scan-build report less false positives.
2014-03-30 10:18:18 +01:00
grischka
0ac8aaab1b tccpp: reorder some tokens
... and make future reordering possibly easier

related to 9a6ee577f6
2014-03-29 19:37:26 +01:00
Thomas Preud'homme
aa561d7011 Simplify and fix GOT32 + PLT32 reloc commit
Introduce a new attribute to check the existence of a PLT entry for a
given symbol has the presence of an entry for that symbol in the dynsym
section is not proof that a PLT entry exists.

This fixes commit dc8ea93b13.
2014-03-26 23:13:28 +08:00
Thomas Preud'homme
b0b5165d16 Def signedness != signed != unsigned for char
When checking for exact compatibility between types (such as in
__builtin_types_compatible_p) consider the case of default signedness to
be incompatible with both of the explicit signedness for char. That is,
char is incompatible with signed char *and* unsigned char, no matter
what the default signedness for char is.
2014-02-06 21:40:22 +08:00
Thomas Preud'homme
b6247d1f3c Add support for runtime selection of float ABI 2014-01-08 15:00:52 +08:00
grischka
3fe2a95d7f be stricter with aliasing
Refactoring (no logical changes):
- use memcpy in tccgen.c:ieee_finite(double d)
- use union to store attribute flags in Sym
Makefile: "CFLAGS+=-fno-strict-aliasing" basically not necessary
anymore but I left it for now because gcc sometimes behaves
unexpectedly without.

Also:
- configure: back to mode 100755
- tcc.h: remove unused variables tdata/tbss_section
- x86_64-gen.c: adjust gfunc_sret for prototype
2014-01-07 14:57:07 +01:00
grischka
2bd0daabbe misc. fixes
- tccgen: error out for cast to void, as in
      void foo(void) { return 1; }
  This avoids an assertion failure in x86_64-gen.c, also.
  also fix tests2/03_struct.c accordingly

- Error: "memory full" - be more specific

- Makefiles: remove circular dependencies, lookup tcctest.c from VPATH

- tcc.h: cleanup lib, include, crt and libgcc search paths"
  avoid duplication or trailing slashes with no CONFIG_MULTIARCHDIR
  (as from 9382d6f1a0)

- tcc.h: remove ";{B}" from PE search path
  in ce5e12c2f9 James Lyon wrote:
  "... I'm not sure this is the right way to fix this problem."
  And the answer is: No, please. (copying libtcc1.a for tests instead)

- win32/build_tcc.bat: do not move away a versioned file
2014-01-06 19:56:26 +01:00
Thomas Preud'homme
8efaa71190 Fix struct ret in variadic fct with ARM hardfloat
The procedure calling standard for ARM architecture mandate the use of
the base standard for variadic function. Therefore, hgen float aggregate
must be returned via stack when greater than 4 bytes and via core
registers else in case of variadic function.

This patch improve gfunc_sret() to take into account whether the
function is variadic or not and make use of gfunc_sret() return value to
determine whether to pass a structure via stack in gfunc_prolog(). It
also take advantage of knowing if a function is variadic or not move
float result value from VFP register to core register in gfunc_epilog().
2014-01-06 22:57:05 +08:00
Thomas Preud'homme
a01d83d783 Don't enable bound check if libgcc is used
Bound check rely on some functions provided by libtcc. It should
therefore not be enabled when libgcc is used.
2014-01-06 11:26:09 +08:00
Ramsay Jones
d0c2f00df2 Fix CONFIG_TCC_SYSINCLUDEPATHS on !win32 systems
Commit 9382d6f1 ("Fix lib, include, crt and libgcc search paths",
07-09-2013) inadvertently included an initial empty entry to the
CONFIG_TCC_SYSINCLUDEPATHS variable (for non win32 targets). In
addition to an empty line in the 'tcc -vv' display, this leads
to the preprocessor attempting to read an include file from the
root of the filesystem (i.e. '/header.h').

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
2013-10-02 21:49:55 +02:00
Thomas Preud'homme
76cb1144ef Generate an error when a function is redefined
Use one more bit in AttributeDef to differenciate between declared
function (only its prototype is known) and defined function (its body is
also known). This allows to generate an error in cases like:

int f(){return 0;}
int f(){return 1;}
2013-09-16 14:48:33 +02:00
Thomas Preud'homme
9382d6f1a0 Fix lib, include, crt and libgcc search paths 2013-09-07 19:28:06 +02:00
grischka
73faaea227 i386-gen: preserve fp control word in gen_cvt_ftoi
- Use runtime function for conversion
- Also initialize fp with tcc -run on windows

This fixes a bug where
  double x = 1.0;
  double y = 1.0000000000000001;
  double z = x < y ? 0 : sqrt (x*x - y*y);
caused a bad sqrt because rounding precision for the x < y comparison
was different to the one used within the sqrt function.

This also fixes a bug where
  printf("%d, %d", (int)pow(10, 2), (int)pow(10, 2));
would print
  100, 99

Unrelated:
  win32: document relative include & lib lookup
  win32: normalize_slashes: do not mirror silly gcc behavior
  This reverts part of commit 8a81f9e103
  winapi: add missing WINAPI decl. for some functions
2013-08-28 22:55:05 +02:00
James Lyon
41b3c7a507 Improved variable length array support.
VLA storage is now freed when it goes out of scope. This makes it
possible to use a VLA inside a loop without consuming an unlimited
amount of memory.

Combining VLAs with alloca() should work as in GCC - when a VLA is
freed, memory allocated by alloca() after the VLA was created is also
freed. There are some exceptions to this rule when using goto: if a VLA
is in scope at the goto, jumping to a label will reset the stack pointer
to where it was immediately after the last VLA was created prior to the
label, or to what it was before the first VLA was created if the label
is outside the scope of any VLA. This means that in some cases combining
alloca() and VLAs will free alloca() memory where GCC would not.
2013-04-27 22:58:52 +01:00
James Lyon
946afd2343 Fixed problems with XMM1 use on Linux/x86-64.
All tests pass. I think I've caught all the cases assuming only XMM0 is
used. I expect that Win64 is horribly broken by this point though,
because I haven't altered it to cope with XMM1.
2013-04-19 18:33:30 +01:00
James Lyon
b961ba5396 Got test1-3 working on x86-64.
There are probably still issues on x86-64 I've missed.
I've added a few new tests to abitest, which fail (2x long long and 2x double
in a struct should be passed in registers).
2013-04-19 11:10:13 +01:00
James Lyon
55ea6d3fc1 x86-64 ABI fixes.
abitest now passes; however test1-3 fail in init_test. All other tests
pass. I need to re-test Win32 and Linux-x86.

I've added a dummy implementation of gfunc_sret to c67-gen.c so it
should now compile, and I think it should behave as before I created
gfunc_sret.
2013-04-19 00:46:49 +01:00
James Lyon
2bbfaf436f Tests in abitest.c now work on Win32.
I expect that Linux-x86 is probably fine. All other architectures
except ARM are definitely broken since I haven't yet implemented
gfunc_sret for these, although replicating the current behaviour
should be straightforward.
2013-04-18 17:27:34 +01:00
James Lyon
ce5e12c2f9 Added ABI compatibility tests with native compiler using libtcc.
Only one test so far, which fails on Windows (with MinGW as the native
compiler - I've tested the MinGW output against MSVC and it appears the
two are compatible).

I've also had to modify tcc.h so that tcc_set_lib_path can point to the
directory containing libtcc1.a on Windows to make the libtcc dependent
tests work. I'm not sure this is the right way to fix this problem.
2013-04-17 21:52:44 +01:00
Andrew Aladjev
0ad857c80e added CPATH, C_INCLUDE_PATH and LD_LIBRARY_PATH 2013-02-19 14:47:36 +03:00
Thomas Preud'homme
5d6cfe855a Fix GNU Hurd interpreter path 2013-02-18 11:53:00 +01:00
Thomas Preud'homme
e946c3583f Add support for KfreeBSD 64bits 2013-02-18 11:42:49 +01:00
Urs Janssen
0bdbd49eac add version number to manpage
avoid c++/c99 style comments in preprocessor directives
avoid leadings whitespaces in preprocessor directives
mention implemented variable length arrays in documentation
fixed ambiguous option in texi2html call (Austin English)
2013-02-17 00:48:51 +01:00
Urs Janssen
cec76c8b8a - document -dumpversion
- fixed a broken prototype
2013-02-15 12:48:33 +01:00
Thomas Preud'homme
0928761257 Revert "Don't search libgcc_s.so.1 on /lib64"
This reverts commit b9f089fc4a.
2013-02-14 23:52:11 +01:00
Thomas Preud'homme
b9f089fc4a Don't search libgcc_s.so.1 on /lib64
It seems libgcc_s.so.1 is systematically on /lib/ (whether
/lib/$triplet for multiarch systems or just /lib for other systems).
2013-02-14 18:05:55 +01:00
grischka
762a43877b configure: pass CONFIG_xxxDIR/PATH options via commandline
- except for CONFIG_SYSROOT and CONFIG_TCCDIR

Strictly neccessary it is only for CONFIG_MULTIARCHDIR
because otherwise if it's in config.h it is impossible to
leave it undefined.

But it is also nicer not to use these definitions for
cross-compilers.

- Also:
lib/Makefile : include ../Makefile for CFLAGS
lib/libtcc1.c : fix an issue compiling tcc with tcc on x64
2013-02-14 17:43:24 +01:00
grischka
944627c479 configure: cleanup
- add quotes: eval opt=\"$opt\"
- use $source_path/conftest.c for OOT build
- add fn_makelink() for OOT build
- do not check lddir etc. on Windows/MSYS
- formatting

config-print.c
- rename to conftest.c (for consistency)
- change option e to b
- change output from that from "yes" to "no"
- remove inttypes.h dependency
- simpify version output

Makefile:
- improve GCC warning flag checks

tcc.h:
- add back default CONFIG_LDDIR
- add default CONFIG_TCCDIR also (just for fun)

tccpp.c:
- fix Christian's last warning
  tccpp.c: In function ‘macro_subst’:
  tccpp.c:2803:12: warning: ‘*((void *)&cval+4)’ is used uninitialized
     in this function [-Wuninitialized]
  That the change fixes the warning doesn't make sense but anyway.

libtcc.c:
- tcc_error/warning: print correct source filename/line for
  token :paste: (also inline :asm:)

lddir and multiarch logic still needs fixing.
2013-02-14 06:53:07 +01:00
Thomas Preud'homme
f9ac201377 Detect multiarch triplet and lddir from ldd output 2013-02-13 20:14:13 +01:00
Thomas Preud'homme
f6cfaa6d25 Improve multiarch detection
* Detect multiarch at configure time
* Detect based on the place where crti.o is
* Define multiarch triplet in tcc.h
2013-02-13 17:03:30 +01:00
grischka
05108a3b0a libtcc: new LIBTCCAPI tcc_set_options(TCCState*, const char*str)
This replaces       -> use instead:
-----------------------------------
- tcc_set_linker    -> tcc_set_options(s, "-Wl,...");
- tcc_set_warning   -> tcc_set_options(s, "-W...");
- tcc_enable_debug  -> tcc_set_options(s, "-g");

parse_args is moved to libtcc.c (now tcc_parse_args).

Also some cleanups:
- reorder TCCState members
- add some comments here and there
- do not use argv's directly, make string copies
- use const char* in tcc_set_linker
- tccpe: use fd instead of fp

tested with -D MEM_DEBUG: 0 bytes left
2013-02-12 19:13:28 +01:00
grischka
8042121d74 tcc -vv/--print-search-dirs: print more info
tests/Makefile:
- print-search-dirs when 'hello' fails
- split off hello-run

win32/include/_mingw.h:
- fix for compatibility with mingw headers
  (While our headers in win32 are from mingw-64 and don't have
  the problem)

tiny_libmaker:
- don't use "dangerous" mktemp
2013-02-10 00:38:40 +01:00
grischka
d6d7686b60 tcc.h: declare CValue.tab[LDOUBLE_SIZE/4]
Should fix some warnings wrt. access out of array bounds.

tccelf.c: fix "static function unused" warning
x86_64-gen.c: fix "ctype.ref uninitialzed" warning and cleanup
tcc-win32.txt: remove obsolete limitation notes.
2013-02-08 19:07:11 +01:00
grischka
7a477d70ca lib/Makefile: use CC, add bcheck to libtcc1.a
Also:
- fix "make tcc_p" (profiling version)
- remove old gcc flags:
  -mpreferred-stack-boundary=2 -march=i386 -falign-functions=0
- remove test "hello" for Darwin (cannot compile to file)
2013-02-06 19:01:07 +01:00
grischka
82bcbd027f portability: fix void* <-> target address conversion confusion
- #define addr_t as ElfW(Addr)
- replace uplong by addr_t
- #define TCC_HAS_RUNTIME_PLTGOT and use it
2013-02-04 16:24:59 +01:00
grischka
3186455599 Makefile: allow CONFIG_LDDIR=lib64 configuration 2013-02-04 16:24:58 +01:00
grischka
263dc93cfa c67: remove global #define's for TRUE/FALSE/BOOL
Also use uppercase TRUE/FALSE instead of true/false
2013-02-04 16:24:56 +01:00
grischka
c5892fe4f5 Revert "Optimize vswap()"
This reverts commit 63193d1794.

Had some problems (_STATIC_ASSERT) and was too ugly anyway.
For retry, I'd suggest to implement a general function
    static inline void memswap (void *p1, void* p2, size_t n);
and then use that.  If you do so, please keep the original code
as comment.
2013-01-14 18:41:37 +01:00
Thomas Preud'homme
8c56b0cf90 Revert "Added what I call virtual io to tinycc this way we can make a monolitic executable or library that contains all needed to compile programs, truly tinycc portable."
This reverts commit 59e18aee0e.
tcc is being stabilized now in order to do a new release soon.
Therefore, such a change is not appropriate now.
2013-01-14 17:34:07 +01:00
mingodad
59e18aee0e Added what I call virtual io to tinycc this way we can make a monolitic executable or library that contains all needed to compile programs, truly tinycc portable.
Tested under linux exec the "mk-it" shell script and you'll end up with a portable tinycc executable that doesn't depend on anything else.
2013-01-11 00:04:38 +00:00
grischka
2358b378b3 tccpp: alternative fix for #include_next infinite loop bug
This replaces commit 3d409b0889

- revert old fix in libtcc.c
- #include_next: look up the file in the include stack to see
  if it is already included.
Also:
- streamline include code
- remove 'type' from struct CachedInclude (obsolete because we check
  full filename anyway)
- remove inc_type & inc_filename from struct Bufferedfile (obsolete)
- fix bug with TOK_FLAG_ENDIF not being reset
- unrelated: get rid of an 'variable potentially uninitialized' warning
2013-01-06 17:20:44 +01:00
Kirill Smelkov
63193d1794 Optimize vswap()
vswap() is called often enough and shows in profile and it was easy to
hand optimize swapping vtop[-1] and vtop[0] - instead of large (28 bytes
on i386) tmp variable and two memory to memory copies, let's swap areas
by longs through registers with streamlined assembly.

For

    $ ./tcc -B. -bench -DONE_SOURCE -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -c tcc.c

before:

 # Overhead      Command        Shared Object                                          Symbol
 # ........  ...........  ...................  ..............................................
 #
     15.19%          tcc  tcc                  [.] next_nomacro1
      5.19%          tcc  libc-2.13.so         [.] _int_malloc
      4.57%          tcc  tcc                  [.] next
      3.36%          tcc  tcc                  [.] tok_str_add2
      3.03%          tcc  tcc                  [.] macro_subst_tok
      2.93%          tcc  tcc                  [.] macro_subst
      2.53%          tcc  tcc                  [.] next_nomacro_spc
      2.49%          tcc  tcc                  [.] vswap
      2.36%          tcc  libc-2.13.so         [.] _int_free

       │    ST_FUNC void vswap(void)
       │    {
  1,96 │      push   %edi
  2,65 │      push   %esi
  1,08 │      sub    $0x20,%esp
       │        SValue tmp;
       │
       │        /* cannot let cpu flags if other instruction are generated. Also
       │           avoid leaving VT_JMP anywhere except on the top of the stack
       │           because it would complicate the code generator. */
       │        if (vtop >= vstack) {
  0,98 │      mov    0x8078cac,%eax
       │      cmp    $0x8078d3c,%eax
  1,18 │   ┌──jb     24
       │   │        int v = vtop->r & VT_VALMASK;
  1,08 │   │  mov    0x8(%eax),%edx
  0,78 │   │  and    $0x3f,%edx
       │   │        if (v == VT_CMP || (v & ~1) == VT_JMP)
  0,78 │   │  cmp    $0x33,%edx
  0,69 │   │↓ je     54
  0,59 │   │  and    $0xfffffffe,%edx
  0,49 │   │  cmp    $0x34,%edx
  0,29 │   │↓ je     54
       │   │            gv(RC_INT);
       │   │    }
       │   │    tmp = vtop[0];
  1,08 │24:└─→lea    0x4(%esp),%edi
  0,39 │      mov    $0x7,%ecx
       │      mov    %eax,%esi
 14,41 │      rep    movsl %ds:(%esi),%es:(%edi)
       │        vtop[0] = vtop[-1];
  9,51 │      lea    -0x1c(%eax),%esi
  1,96 │      mov    $0x7,%cl
       │      mov    %eax,%edi
 17,06 │      rep    movsl %ds:(%esi),%es:(%edi)
       │        vtop[-1] = tmp;
 10,20 │      mov    0x8078cac,%edi
  2,35 │      sub    $0x1c,%edi
  0,78 │      lea    0x4(%esp),%esi
       │      mov    $0x7,%cl
 15,20 │      rep    movsl %ds:(%esi),%es:(%edi)
       │    }
  9,90 │      add    $0x20,%esp
  2,25 │      pop    %esi
  1,67 │      pop    %edi
  0,69 │      ret

after:

 # Overhead      Command        Shared Object                                          Symbol
 # ........  ...........  ...................  ..............................................
 #
     15.27%          tcc  tcc                  [.] next_nomacro1
      5.08%          tcc  libc-2.13.so         [.] _int_malloc
      4.57%          tcc  tcc                  [.] next
      3.17%          tcc  tcc                  [.] tok_str_add2
      3.12%          tcc  tcc                  [.] macro_subst
      2.99%          tcc  tcc                  [.] macro_subst_tok
      2.43%          tcc  tcc                  [.] next_nomacro_spc
      2.32%          tcc  libc-2.13.so         [.] _int_free

      . . .

      0.71%          tcc  tcc                  [.] vswap

       │    ST_FUNC void vswap(void)
       │    {
  7,22 │      push   %eax
       │        /* cannot let cpu flags if other instruction are generated. Also
       │           avoid leaving VT_JMP anywhere except on the top of the stack
       │           because it would complicate the code generator. */
       │        if (vtop >= vstack) {
 11,34 │      mov    0x8078cac,%eax
  2,75 │      cmp    $0x8078d3c,%eax
  0,34 │   ┌──jb     20
       │   │        int v = vtop->r & VT_VALMASK;
  0,34 │   │  mov    0x8(%eax),%edx
  8,93 │   │  and    $0x3f,%edx
       │   │        if (v == VT_CMP || (v & ~1) == VT_JMP)
  2,06 │   │  cmp    $0x33,%edx
  2,41 │   │↓ je     74
  2,41 │   │  and    $0xfffffffe,%edx
  0,34 │   │  cmp    $0x34,%edx
  2,41 │   │↓ je     74
       │   │        vtopl[-1*VSIZEL + i] = tmpl;    \
       │   │      } do {} while (0)
       │   │
       │   │    VSWAPL(15); VSWAPL(14); VSWAPL(13); VSWAPL(12);
       │   │    VSWAPL(11); VSWAPL(10); VSWAPL( 9); VSWAPL( 8);
       │   │    VSWAPL( 7); VSWAPL( 6); VSWAPL( 5); VSWAPL( 4);
  2,06 │20:└─→mov    0x18(%eax),%edx
  1,37 │      mov    -0x4(%eax),%ecx
  2,06 │      mov    %ecx,0x18(%eax)
  1,37 │      mov    %edx,-0x4(%eax)
  2,06 │      mov    0x14(%eax),%edx
  2,06 │      mov    -0x8(%eax),%ecx
  2,41 │      mov    %ecx,0x14(%eax)
  3,09 │      mov    %edx,-0x8(%eax)
  3,09 │      mov    0x10(%eax),%edx
  1,72 │      mov    -0xc(%eax),%ecx
  2,75 │      mov    %ecx,0x10(%eax)
  1,72 │      mov    %edx,-0xc(%eax)
       │        VSWAPL( 3); VSWAPL( 2); VSWAPL( 1); VSWAPL( 0);
  2,41 │      mov    0xc(%eax),%edx
  2,41 │      mov    -0x10(%eax),%ecx
  2,41 │      mov    %ecx,0xc(%eax)
  0,69 │      mov    %edx,-0x10(%eax)
  1,72 │      mov    0x8(%eax),%edx
  0,69 │      mov    -0x14(%eax),%ecx
  1,03 │      mov    %ecx,0x8(%eax)
  1,37 │      mov    %edx,-0x14(%eax)
  1,37 │      mov    0x4(%eax),%edx
  0,69 │      mov    -0x18(%eax),%ecx
  3,09 │      mov    %ecx,0x4(%eax)
  2,06 │      mov    %edx,-0x18(%eax)
  1,37 │      mov    (%eax),%edx
  2,41 │      mov    -0x1c(%eax),%ecx
  1,37 │      mov    %ecx,(%eax)
  4,12 │      mov    %edx,-0x1c(%eax)
       │        }
       │
       │    #   undef VSWAPL
       │    #   undef VSIZEL
       │    }
  1,03 │      pop    %eax
  3,44 │      ret

Overal speedup:

    # best of 5 runs
    before: 8268 idents, 47203 lines, 1526763 bytes, 0.148 s, 319217 lines/s, 10.3 MB/s
    after:  8273 idents, 47231 lines, 1527685 bytes, 0.146 s, 324092 lines/s, 10.5 MB/s

Static ASSERT macro taken from CCAN's[1] build_assert[2] which is in
public domain.

[1] http://ccodearchive.net/
[2] http://git.ozlabs.org/?p=ccan;a=blob;f=ccan/build_assert/build_assert.h;h=24e59c44cd930173178ac9b6e101b0af64a879e9;hb=HEAD
2012-12-21 20:46:26 +04:00
Kirill Smelkov
8eb92e6052 Optimize cstr_reset() to only reset string to empty, not call free() and later malloc()
A CString could be reset to empty just setting its .size to 0.

If memory was already allocated, that would be remembered in
.data_allocated and .size_allocated and on consequent string
manipulations that memory will be used without immediate need to call
malloc().

For

    $ ./tcc -B. -bench -DONE_SOURCE -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\" -c tcc.c

after the patch malloc/free are called less often:

(tcc is run in loop; perf record -a sleep 10 && perf report)
before:

 # Overhead      Command       Shared Object                                      Symbol
 # ........  ...........  ..................  ..........................................
 #
     13.89%          tcc  tcc                 [.] next_nomacro1
      4.73%          tcc  libc-2.13.so        [.] _int_malloc
      4.39%          tcc  tcc                 [.] next
      2.94%          tcc  tcc                 [.] tok_str_add2
      2.78%          tcc  tcc                 [.] macro_subst_tok
      2.75%          tcc  libc-2.13.so        [.] free
      2.74%          tcc  tcc                 [.] macro_subst
      2.63%          tcc  libc-2.13.so        [.] _int_free
      2.28%          tcc  tcc                 [.] vswap
      2.24%          tcc  tcc                 [.] next_nomacro_spc
      2.06%          tcc  libc-2.13.so        [.] realloc
      2.00%          tcc  libc-2.13.so        [.] malloc
      1.99%          tcc  tcc                 [.] unary
      1.85%          tcc  libc-2.13.so        [.] __i686.get_pc_thunk.bx
      1.76%  kworker/0:1  [kernel.kallsyms]   [k] delay_tsc
      1.70%          tcc  tcc                 [.] next_nomacro
      1.62%          tcc  tcc                 [.] preprocess
      1.41%          tcc  libc-2.13.so        [.] __memcmp_ssse3
      1.38%          tcc  [kernel.kallsyms]   [k] memset
      1.10%          tcc  tcc                 [.] g
      1.06%          tcc  tcc                 [.] parse_btype
      1.05%          tcc  tcc                 [.] sym_push2
      1.04%          tcc  libc-2.13.so        [.] _int_realloc
      1.00%          tcc  libc-2.13.so        [.] malloc_consolidate

after:

 # Overhead      Command       Shared Object                                          Symbol
 # ........  ...........  ..................  ..............................................
 #
     15.26%          tcc  tcc                 [.] next_nomacro1
      5.07%          tcc  libc-2.13.so        [.] _int_malloc
      4.62%          tcc  tcc                 [.] next
      3.22%          tcc  tcc                 [.] tok_str_add2
      3.03%          tcc  tcc                 [.] macro_subst_tok
      3.02%          tcc  tcc                 [.] macro_subst
      2.59%          tcc  tcc                 [.] next_nomacro_spc
      2.44%          tcc  tcc                 [.] vswap
      2.39%          tcc  libc-2.13.so        [.] _int_free
      2.28%          tcc  libc-2.13.so        [.] free
      2.22%          tcc  tcc                 [.] unary
      2.07%          tcc  libc-2.13.so        [.] realloc
      1.97%          tcc  libc-2.13.so        [.] malloc
      1.70%          tcc  tcc                 [.] preprocess
      1.69%          tcc  libc-2.13.so        [.] __i686.get_pc_thunk.bx
      1.68%          tcc  tcc                 [.] next_nomacro
      1.59%          tcc  [kernel.kallsyms]   [k] memset
      1.55%          tcc  libc-2.13.so        [.] __memcmp_ssse3
      1.22%          tcc  tcc                 [.] parse_comment
      1.11%          tcc  tcc                 [.] g
      1.11%          tcc  tcc                 [.] sym_push2
      1.10%          tcc  tcc                 [.] parse_btype
      1.10%          tcc  libc-2.13.so        [.] _int_realloc
      1.06%          tcc  tcc                 [.] vsetc
      0.98%          tcc  libc-2.13.so        [.] malloc_consolidate

and this gains small speedup for tcc:

    # best of 5 runs
    before: 8268 idents, 47191 lines, 1526670 bytes, 0.153 s, 307997 lines/s, 10.0 MB/s
    after:  8268 idents, 47203 lines, 1526763 bytes, 0.148 s, 319217 lines/s, 10.3 MB/s
2012-12-21 20:46:26 +04:00