170 Commits

Author SHA1 Message Date
grischka
5e4d0718ff tcc -E -P10 : output all numbers as decimals
This may be used to preprocess Fabrice Bellard's initial revision
in this repository to demonstrate its capability to compile and
run itself (on i386 32-bit linux or windows).

Initial revision: 27f6e16bae

Also needed:
* an empty stdio.h
* a wrapper named tc.c with

  void expr(void);
  void decl(int);
  void next(void);
  #include "tcc.c"

* a hello.c such as

  int main()
  {
      printf("Hello World\n");
      return 0;
  }

All files must use Unix (LF-only) line endings.  Then ...

* preprocess the source
  $ tcc -E -P10 -I. tcc.c -o tc1.c
* compile the compiler
  $ tcc -w -I. tc.c -o tc -ldl
* run it to compile and
   run itself to compile and
    run itself to compile and
     run itself to compile and
      run hello.c
  $ ./tc tc1.c tc1.c tc1.c hello.c

--> Hello World

------------------------------------------------------
* On i386 Windows this may be added to the tc.c wrapper

  #ifdef _WIN32
  #include <windows.h>
  void *dlsym(int x, const char *func)
  {
      if (0 == strcmp(func, "dlsym"))
          return &dlsym;
      return GetProcAddress(LoadLibrary("msvcrt"), func);
  }
  #endif
2016-05-12 10:25:50 +02:00
Edmund Grimley Evans
f5f82abc99 Insert spaces between certain tokens when tcc is invoked with -E.
Insert a space when it is required to prevent mistokenisation of
the output, and also in a few cases where it is not strictly
required, imitating GCC's behaviour.
2016-05-09 19:27:31 +01:00
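Note on f5f82abc99: the idea of inserting a space only where two adjacent tokens would otherwise fuse can be sketched as below. This is a minimal illustration, not the code in tccpp.c; `needs_space` and the three rules it checks are assumptions chosen for the example, and the real rule set is larger.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: true for characters that can continue an
   identifier or a number. */
static int is_word_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Hypothetical check: returns 1 if printing `next` directly after `prev`
   would change how the output is tokenised. Only a few cases are covered. */
int needs_space(const char *prev, const char *next)
{
    char a, b;

    if (!*prev || !*next)
        return 0;
    a = prev[strlen(prev) - 1];
    b = next[0];

    if (is_word_char(a) && is_word_char(b))
        return 1;               /* "int" "x" would become "intx" */
    if ((a == '+' || a == '-') && a == b)
        return 1;               /* "+" "+" would become "++" */
    if (a == '/' && (b == '/' || b == '*'))
        return 1;               /* would start a comment */
    return 0;
}
```

A printer would call `needs_space(last_token_text, current_token_text)` before emitting each token and write a single space when it returns 1.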
seyko
75243f744c TOK_PPNUM in asm (Edmund Grimley Evans version) 2016-05-08 05:14:03 +03:00
grischka
fe845cf53d tccpp: cleanup options -dD -dM, remove -C
The lexer is for reading files, not for writing.

Also:
- macro_is_equal(): avoid crash if redefining __FILE__
2016-05-05 14:12:53 +02:00
Edmund Grimley Evans
a348513569 Revert 78e4ee5. 2016-05-04 20:27:39 +01:00
Edmund Grimley Evans
6015840583 Revert 3283c26 and a1c1390 in tccpp.c. 2016-05-04 20:14:39 +01:00
seyko
78e4ee55b7 PP_NUM in ASM mode
0xe+1 is parsed as 0xe +1 if (parse_flags & PARSE_FLAG_ASM_FILE)
        Helps to compile code such as:
        __asm__("mov $0xe" "+1", "%eax\n")
2016-05-04 16:54:40 +03:00
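Note on 78e4ee55b7: the reason 0xe+1 needs special handling is that the C preprocessing-number grammar lets `e+`, `e-`, `p+` and `p-` (in either case) continue a number, so the whole text is one pp-number in C mode. The scanner below is a sketch of that grammar only; `pp_number_len` is a hypothetical name and does not come from tccpp.c.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the preprocessing number starting at s,
   or 0 if s does not start one. Follows the C pp-number grammar:
   a digit (or '.' digit), then any run of digits, letters, '_', '.',
   and exponent pairs such as "e+" or "p-". */
size_t pp_number_len(const char *s)
{
    size_t i;

    if (isdigit((unsigned char)s[0]))
        i = 1;
    else if (s[0] == '.' && isdigit((unsigned char)s[1]))
        i = 2;
    else
        return 0;

    for (;;) {
        unsigned char c = (unsigned char)s[i];
        if ((c == 'e' || c == 'E' || c == 'p' || c == 'P')
            && (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;             /* exponent letter plus sign */
        else if (isalnum(c) || c == '_' || c == '.')
            i++;
        else
            break;
    }
    return i;
}
```

With this grammar `"0xe+1"` is consumed as a single token, while `"0xf+1"` stops after `0xf`, which is why only hex constants ending in `e` are affected.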
seyko
3283c26827 clearing "output space after TOK_PPNUM ..." 2016-05-01 16:36:19 +03:00
seyko
a1c139063b output space after TOK_PPNUM which is followed by '+' or '-'
* correct -E output for the case ++ + ++ concatenation
        do this only for strings expanded from a macro
        and only when tcc_state->output_type == TCC_OUTPUT_PREPROCESS
2016-05-01 05:43:57 +03:00
grischka
256078933c tccpp: macro subst fix
#define Y(x) Z(x)
#define X Y
return X(X(1));

was : return Z(Y(1));
now : return Z(Z(1));
2016-04-29 19:00:33 +02:00
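Note on 256078933c: the fixed expansion can be checked with a small program. In this sketch `Z` is given a body (an assumption, the commit message leaves it undefined) so that the result is observable; `expand_demo` is a hypothetical name.

```c
/* Z gets a body here only so the expansion produces a checkable value. */
#define Z(x) ((x) + 1)
#define Y(x) Z(x)
#define X Y

int expand_demo(void)
{
    /* With the fixed behaviour this expands to Z(Z(1)),
       that is ((((1) + 1)) + 1). */
    return X(X(1));
}
```

If the inner `X(1)` were left as `Y(1)`, as in the old behaviour, the translation unit would contain a call to an undeclared function `Y` instead of an arithmetic expression.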
seyko
b4125ba0c1 fix for the "Reduce allocations overhead"
No more traps when compiling tccboot
2016-04-22 20:32:15 +03:00
seyko
1f49441a27 .rept asm directive
and '.' alone is now a token in *.S (not an identifier)
    representing the current position in the code (PC).
2016-04-22 18:29:56 +03:00
seyko
8db7a0f7af Source and destination overlap in memcpy, cstr_cat (tccpp.c:322)
This code is from "Improve hash performance"
2016-04-22 18:21:09 +03:00
Vlad Vissoultchev
cdc16d428f Reduce allocations overhead
- uses new `TinyAlloc`-ators for small `TokenSym`, `CString` and
  `TokenString` instances
- conditional `TAL_DEBUG` for mem leaks and double frees detection
- on `TAL_DEBUG` collects allocation origin (file + line)
- conditional `TAL_INFO` for allocators stats (in release mode too)
- chain a new allocator twice current capacity on buffer exhaustion
2016-04-17 17:26:10 +03:00
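Note on cdc16d428f: the last bullet (chain a new allocator with twice the current capacity when the buffer is exhausted) can be sketched with a simple bump allocator. All names here (`Arena`, `arena_new`, `arena_alloc`, `arena_free_all`) are hypothetical, and the sketch omits the debug and statistics features the commit describes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump allocator; not the TinyAlloc code itself. */
typedef struct Arena {
    struct Arena *next;     /* chained arena, created on exhaustion */
    size_t cap, used;
    unsigned char *buf;
} Arena;

/* cap must be greater than zero. */
Arena *arena_new(size_t cap)
{
    Arena *a = malloc(sizeof *a);
    if (!a)
        return NULL;
    a->buf = malloc(cap);
    if (!a->buf) {
        free(a);
        return NULL;
    }
    a->next = NULL;
    a->cap = cap;
    a->used = 0;
    return a;
}

void *arena_alloc(Arena *a, size_t n)
{
    n = (n + 7) & ~(size_t)7;           /* keep 8-byte alignment */
    for (;;) {
        if (n <= a->cap - a->used) {
            void *p = a->buf + a->used;
            a->used += n;
            return p;
        }
        if (!a->next) {
            size_t cap = a->cap * 2;    /* twice the current capacity */
            while (cap < n)
                cap *= 2;
            a->next = arena_new(cap);
            if (!a->next)
                return NULL;
        }
        a = a->next;
    }
}

void arena_free_all(Arena *a)
{
    while (a) {
        Arena *next = a->next;
        free(a->buf);
        free(a);
        a = next;
    }
}
```

Small objects are carved out of one buffer and released together, which avoids a `malloc`/`free` pair per object.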
Vlad Vissoultchev
224236f57c Improve hash performance
- better `TOK_HASH_FUNC`
- increases `hash_ident` initial size to 16k (from 8k)
- `cstr_cat` uses single `realloc` + `memcpy`
- `cstr_cat` can append terminating zero
- `tok_str_realloc` initial size to 16 (from 8)
- `parse_define` uses static `tokstr_buf`
- `next` uses static `tokstr_buf`
- fixes two latent bugs (wrong deallocations in libtcc.c:482 and
  tccpp.c:2987)
2016-04-17 17:25:55 +03:00
seyko
587aacedf3 simplify -C printing
parse_print_line_comment() and parse_print_comment() are
    combined and simplified:
        * don't worry about speed with the -E option
        * don't handle strays in comments
            Do we need to handle strays in regular
            parse_line_comment() and
            parse_comment()?
2016-04-17 10:07:55 +03:00
seyko
c6dc756d4e preprocessor option -C (keep comments)
Modelled on the pcc -C option.
    Usual execution path and speed are not changed.
2016-04-15 17:15:11 +03:00
seyko
16cbca281f fix preprocessing *.S with ` ' chars in #comments
with a test program. Problem detected when trying to
    compile linux-2.4.37.9 with tcc.
2016-04-14 21:46:46 +03:00
seyko
5fb57bead4 fix for the "#pragma once" guard
gcc 3.4.6 doesn't understand "#if PATHCMP==stricmp"
    where "#define PATHCMP stricmp"
2016-04-14 21:39:34 +03:00
Vlad Vissoultchev
34feee0ed6 Move utility functions trimfront/back to tccpp.c
These are used in `libtcc.c` now and cannot remain in `tccpe.c`
2016-04-13 14:33:21 +03:00
Vlad Vissoultchev
b3782c3cf5 Better pragma once guard
This takes care of case-insensitive filenames (like on win32)
2016-04-13 11:18:40 +03:00
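Note on b3782c3cf5: on a case-insensitive file system the same header can be reached under differently cased names, so the guard has to compare paths without regard to case. The function below is a portable sketch of such a comparison; `path_equal_nocase` is a hypothetical name, and it handles ASCII case folding only.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if the two paths are equal when ASCII
   letter case is ignored, 0 otherwise. */
int path_equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* both strings must end at the same point */
}
```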
Vlad Vissoultchev
810a677d32 tccpp.c: Guard against ppfp being NULL
Missed these in e946eb2a41
2016-04-13 10:58:42 +03:00
seyko
6a49afb3ed correct version of "Identifiers can start and/or contain"
A problem was in TOK_ASMDIR_text:
    -    sprintf(sname, ".%s", get_tok_str(tok1, NULL));
    +    sprintf(sname, "%s", get_tok_str(tok1, NULL));
    When tok1 is '.text', then sname is '..text'
2016-04-13 10:23:46 +03:00
seyko
b5b3e89f9e Fix pragma once guard
From: Vlad Vissoultchev
    Date: Mon, 11 Apr 2016 01:26:32 +0300
    Subject: Fix pragma once guard when compiling multiple source files

    When compiling multiple source files directly to executable cached
    include files guard was incorrectly checked for TOK_once in ifndef_macro
    member.

    If two source files included the same header guarded by pragma once, then
    the second one erroneously skipped it as `cached_includes` is not cleared
    on second `tcc_compile`
2016-04-13 06:17:02 +03:00
seyko
131d776d66 revert of the 'Identifiers can start and/or contain'
When a tccboot kernel is compiled with
    'Identifiers can start and/or', the kernel doesn't start.
    It is hard to find what is wrong.

    PS: there was no test for identifiers in *.S with '.'
2016-04-13 03:52:07 +03:00
Vlad Vissoultchev
e946eb2a41 Implement -dM preprocessor option as in gcc
There was already support for -dD option but in contrast -dM dumps only `#define` directives w/o actual preprocessor output.

The original -dD output differs from gcc output by additional comment in front of `#define`s so this quirk is left for -dM as well.
2016-04-06 18:57:11 +03:00
seyko
effc7d9ed4 cleaning "Identifiers can start and/or contain"
a more logical algorithm for changing isidnum_table[]
2016-04-05 15:06:47 +03:00
seyko
983c40f58b compilation speed of the tccboot correction
we use the GNU extension "case 0x80 ... 0xFF" for tcc & gcc
    and perform the test
        if(c & 0x80)
    for other compilers
2016-04-05 13:38:53 +03:00
seyko
936819a1b9 utf8 in identifiers
done as in pcc
    (pcc.ludd.ltu.se/ftp/pub/pcc-docs/pcc-utf8-ver3.pdf)
    We treat all chars with the high bit set as alphabetic.
    This allows code like

    #include <stdio.h>
    int Lefèvre=2;
    int main() {
        printf("Lefèvre=%d\n",Lefèvre);
        return 0;
    }
2016-04-05 13:05:09 +03:00
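Note on 936819a1b9: treating every byte with the high bit set as alphabetic is enough to let UTF-8 encoded names pass through the lexer, since all bytes of a multi-byte UTF-8 sequence are 0x80 or above. A minimal sketch, with hypothetical names `is_ident_byte` and `ident_len`, and without any validation of the UTF-8 itself:

```c
#include <stddef.h>

/* Hypothetical classifier: ASCII letters, digits, '_' and any byte with
   the high bit set count as identifier bytes. */
int is_ident_byte(unsigned char c)
{
    if (c & 0x80)
        return 1;
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9') || c == '_';
}

/* Hypothetical helper: number of bytes in the identifier starting at s,
   or 0 if s does not start an identifier. */
size_t ident_len(const char *s)
{
    size_t n = 0;

    if (s[0] >= '0' && s[0] <= '9')
        return 0;       /* identifiers cannot start with a digit */
    while (is_ident_byte((unsigned char)s[n]))
        n++;
    return n;
}
```

The length is counted in bytes, so a name containing one two-byte character is one byte longer than its visible character count.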
seyko
5a704457e2 optimization of the previous patch
compilation speed of tccboot restored
    (the patch removes testing of parse_flags in the loop)
2016-04-05 11:19:09 +03:00
seyko
d3e85e80fd Identifiers can start and/or contain '.' in *.S
modified version of the old one which doesn't allow '.'
    in #define identifiers. This allows the following code
    in *.S to be preprocessed correctly

        #define SRC(y...)               \
        9999: y;                        \
        .section __ex_table, "a";       \
        .long 9999b, 6001f      ;       \
        // .previous

        SRC(1: movw (%esi), %bx)
        6001:

    A test included.
2016-04-05 10:43:50 +03:00
seyko
41785a0bf9 -fnormalize-inc-dirs
remove non-existent or duplicate directories from include paths
    if -fnormalize-inc-dirs is specified. This will help
    to compile current coreutils package
2016-04-03 11:42:15 +03:00
seyko
2bf43b5483 reverse of the "Identifiers can start and/or contain '.'"
- Identifiers can start and/or contain '.' in PARSE_FLAG_ASM_FILE
    - Move all GAS directives under TOK_ASMDIR prefix

    These patches break compilation of tccboot (linux 2.4.26
    kernel). A test.S which fails with these patches:

    #define SRC(y...) \
    9999: y; \
    .section __ex_table, "a"; \
    .long 9999b, 6001f<---->; \
    .previous

    SRC(1:<>movw (%esi), %bx<------>)
    // 029-test.S:7: error: macro 'SRC' used with too many args
2016-04-03 11:01:05 +03:00
Michael Matz
8fc5a6a2a4 Fix tokenization of TOK_DOTS
We really need to use PEEKC during tokenization so as to
skip line continuations automatically.
2016-03-24 15:58:32 +01:00
Vlad Vissoultchev
05ec6654a7 Identifiers can start and/or contain '.' in PARSE_FLAG_ASM_FILE
Including labels, directives and section names
2016-03-14 18:37:39 +02:00
Vlad Vissoultchev
17395ea507 tccpp.c: Fix failing PPTest 03 by reverting rogue modification in macro_arg_subst 2016-03-14 18:26:41 +02:00
Vlad Vissoultchev
32755dbea9 Migrate static STRING_MAX_SIZE buffers to CString instances for large macros expansion 2016-03-13 04:26:45 +02:00
Edmund Grimley Evans
1c2dfa1f4b Change the way struct CStrings are handled.
A CString used to be copied into a token string, which is an int array.
On a 64-bit architecture the pointers were misaligned, so ASan gave
lots of warnings. On a 64-bit architecture that required memory
accesses to be correctly aligned it would not work at all.

The CString is now included in CValue instead.
2015-11-26 12:40:50 +00:00
grischka
8dd1859176 tccpp: allow .. in token stream
for gas comments lonely on a line such as

    # .. more stuff

where tcc would try to parse .. as a preprocessor directive

See also: 0b3612631f
2015-11-20 18:25:00 +01:00
grischka
0b3612631f tccpp: cleanup #include_next
tcc_normalize_inc_dirs: normally no problem to be absolutely
gcc compatible as long as it can be done the tiny way.

This reverts to the state before recent related commits and
reimplements a (small) part of it to fix the reported problem.


Also: Revert "parsing "..." sequence"
c3975cf27c

	&& p[1] == '.'

is not a reliable way to lookahead
2015-11-20 12:05:55 +01:00
grischka
54cf57ab1a tccgen: asm_label cleanup
- avoid memory allocation by using its (int) token number
- avoid additional function parameter by using Attribute

Also: fix some strange looking error messages
2015-11-20 11:22:56 +01:00
Edmund Grimley Evans
569fba6db9 Merge the integer members of union CValue into "uint64_t i". 2015-11-17 19:09:35 +00:00
seyko
8fc9c79705 TOK_INCLUDE: fix for the "normalize inc dirs"
A case for the absolute path: prevent an error after opening
2015-11-06 02:50:36 +03:00
seyko
7cb921a44b TOK_INCLUDE: streamline
goto removed
2015-11-06 02:40:14 +03:00
seyko
a6276b7a78 normalize inc dirs, simplify include_next
include dirs are prepared as in gcc
    - for each duplicate path keep just the first one
    - remove each include_path that exists in sysinclude_paths

    include_next streamlined by introducing inc_path_index
    in the BufferedFile
2015-11-05 19:52:49 +03:00
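Note on a6276b7a78: the two normalization rules (keep only the first of each duplicate path, and drop any include path that also appears in the system include paths) can be sketched as an in-place filter. The names `list_contains` and `normalize_inc_dirs` and the plain string comparison are assumptions for this example; the commit's own code may compare paths differently.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if s occurs among the first n entries. */
static int list_contains(const char **list, int n, const char *s)
{
    int i;
    for (i = 0; i < n; i++)
        if (strcmp(list[i], s) == 0)
            return 1;
    return 0;
}

/* Hypothetical sketch: filters inc[] in place and returns the new count.
   Entries found in sys[] are removed; of duplicates, the first is kept. */
int normalize_inc_dirs(const char **inc, int n_inc,
                       const char **sys, int n_sys)
{
    int i, out = 0;

    for (i = 0; i < n_inc; i++) {
        if (list_contains(sys, n_sys, inc[i]))
            continue;           /* already a system include path */
        if (list_contains(inc, out, inc[i]))
            continue;           /* duplicate of an earlier kept entry */
        inc[out++] = inc[i];
    }
    return out;
}
```

Order is preserved, which matters because include search order determines which header wins.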
Edmund Grimley Evans
24308fd292 tccpp.c: In TOK_GET, add comment warning about illegal cast.
Also, in tok_str_add2, use memcpy instead of the illegal cast.

Unfortunately, I can't see an easy way of fixing the bug.
2015-11-04 20:27:54 +00:00
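Note on 24308fd292: copying a wider value into an int array with memcpy, rather than casting the int pointer to a wider pointer type, avoids both the alignment and the aliasing problem. The sketch below shows the pattern for a double; `tok_put_double` and `tok_get_double` are hypothetical names, not functions from tccpp.c.

```c
#include <string.h>

/* Hypothetical helper: stores d into buf starting at word index pos and
   returns the index of the next free word. */
int tok_put_double(int *buf, int pos, double d)
{
    memcpy(&buf[pos], &d, sizeof d);    /* no cast of int* to double* */
    return pos + (int)((sizeof d + sizeof(int) - 1) / sizeof(int));
}

/* Hypothetical helper: reads back a double stored by tok_put_double. */
double tok_get_double(const int *buf, int pos)
{
    double d;
    memcpy(&d, &buf[pos], sizeof d);
    return d;
}
```

The same pattern applies to reading the value back; the commit message notes that the read side in TOK_GET still used a cast at that point.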
Edmund Grimley Evans
20f0c179da tccpp.c: Define and use tok_last for checking if last token is space. 2015-11-04 20:25:26 +00:00
seyko
003c532bf3 fix for the #include_next, v4 (final)
This version looks right. Compared to the original
    algorithm:

    1) Loop breaking. We remember a start point after which
    we can try the next path. Do not search the include stack
    after this.

    2) But compare the next file path with the start point.
    Skip if it is the same. Remove "./" before comparing.

    PS: problems with compiling coreutils-8.24.51-8802e
    remain. There are error messages like:
    src/chgrp
        src/chown-core.c:42: multiple definition of `make_timespec'
        src/chgrp.c:42: first defined here
    A problem is in the lib/config.h
        #define _GL_INLINE_ extern inline // gcc
        #define _GL_INLINE_ inline        // tcc

    A long description from the lib/config.h
    * suppress extern inline with HP-UX cc, as it appears to be broken
    * suppress extern inline with Sun C in standards-conformance mode
    * suppress extern inline on configurations that mistakenly use
      'static inline' to implement functions or macros in standard
      C headers like <ctype.h>.

    GCC and Clang are excluded from this list. Why not tcc?
2015-10-20 07:32:53 +03:00
seyko
ad1c01f96c fix for the #include_next, v3
don't give an error and simply ignore the directive
  if we detect a loop of #include_next.

  With this approach coreutils-8.24.51-8802e
  compiles, but with errors:
  	lib/libcoreutils.a: error: 'xnmalloc' defined twice
	lib/libcoreutils.a: error: 'xnrealloc' defined twice
2015-10-19 17:55:26 +03:00
seyko
6b9490b6ff fix for the #include_next, v2
A more correct fix. This one doesn't break the old logic.
    But if the include file is not found, we search
    again with the new compare rule.

    A description of the problem:
    http://permalink.gmane.org/gmane.comp.compilers.tinycc.devel/2769
2015-10-17 15:48:10 +03:00