Commit Graph

2749 Commits

Author SHA1 Message Date
Michael Matz
c883d43ae5 fixes for deadlocks and dollars in asm
* tccelf code doesn't use USING_GLOBALS, hence must be called from
  outside the semaphore
* dollars aren't part of identifiers in asm mode
2020-06-21 01:45:35 +02:00
Michael Matz
18db8da7df Workaround old GCC viz indexed string literals
GCCs before gcc8 didn't deal with these where constants are
required.
2020-06-20 23:17:02 +02:00
Michael Matz
1711eb2ef3 macos: add overview comment 2020-06-20 22:25:08 +02:00
Michael Matz
d174af08f9 macos: Fix memtest
* free allocated memory
* fix weak symbols related to -run
* remove memtest from ignored tests for OSX
2020-06-20 22:17:08 +02:00
Michael Matz
ac86fb50cd macos: Tidy code 2020-06-20 22:17:08 +02:00
Michael Matz
c86760c1e9 macos: Read exported symbols from dylibs
up to now we simply assumed that any undefined symbols will be
provided by some shared libs on the cmdline.  Now we check this.
2020-06-20 22:17:08 +02:00
Michael Matz
fbfe6209be macos: Fix asm-c-connect-test
via some heavy-handed hackery in the ASM symbol handling in case
C symbols get a leading underscore (but ASM symbols do not).
But this is now like clang and GCC on Darwin work: asm symbols are
undecorated, C symbols get a _ prepended, so to connect both some
trickery is involved for the ASM symbols that don't have a _ prepended.
They must be included in the C symbol table (because that's what we use
to lookup also ASM labels), but they also must not disturb the normal
C symbol (which don't have the _ prepended), so they need some mangling.

A bit unsatisfying, but well.  So, add asm-c-connect-test to the working
ones for Darwin as well.
2020-06-20 22:17:08 +02:00
Michael Matz
3cf7bec12f macos: Enable all working tests
all except the below work now on MacOS, also as executable test,
not just with -run:
* dlltest - we don't support dylib generation (yet)
* memtest - tccmacho.c contains some leaks
* asm-c-connect-test - some confusion with underscores still
2020-06-20 22:17:08 +02:00
Michael Matz
f486b19f56 macos: bound_alloca symbol adjust
so that tests/boundtest.c links as well.
2020-06-20 22:17:08 +02:00
Michael Matz
f2e154e1e5 macos: Disable verbose linking output 2020-06-20 22:17:08 +02:00
Michael Matz
9ce5b4a691 macos: Adjust tcctest.c for clang
we need to disable or adjust some tests where clang behaves
slightly different from GCC:
* slight difference in __FILE__ behaviour
* difference (to less useful vs GCC) in computed #include
* difference in __builtin_constant_p
* attribute(weak, alias) isn't supported by clang on MacOS (though
  it could be, as Mach-O has the capabilities for this)
* the built-in assembler of clang is mediocre

this was all checked with
  Apple LLVM version 10.0.1 (clang-1001.0.46.4)
2020-06-20 22:17:08 +02:00
Michael Matz
4eff2b5f6a macos: more adjustments for OSX systems
* instead of /usr/include use the current SDK path as system
  include directory (/usr/include is empty with current tools)
  (this also removes the need to add these paths in individual
  Makefiles)
* define _DARWIN_C_SOURCE in tcc.h to get the full set of decls
  from system headers (e.g. vsnprintf), similar to _GNU_SOURCE
  (and don't define _ANSI_SOURCE in the main Makefile anymore)
* tests/tests2/Makefile: remove the -w flag, it's added when necessary
  in the rules generating the .expect files
2020-06-20 22:17:06 +02:00
Michael Matz
f18f865159 Handle always_inline as GNU inline
this is needed for multi-file testcases using stdio.h, as
the __sputc function is implemented as a extern inline
function (with gnu_inline attribute, but we don't support that for now).

Without this change that leads to multiply defined symbols when using
multiple units including stdio.h.

It also has an always_inline attribute, which we can use to guide our
behaviour, as in ISO-C an always_inline can't be defined with ISO
'extern inline' semantics.  This is the minimal change and not a full
implementation of GNU inline semantics, which would require thorough
testcases.

If __clang__ would be defined the header would make use of C99 semantics,
which would work for us.  It would also do that if _GNUC_ wouldn't be
defined.  But we can't do the latter (as the whole MacOSX SDK refuses
to be compiled with anything not defining that).  I haven't tested
defining __clang__, but suspect that's going to be problematic.
2020-06-20 22:14:56 +02:00
Michael Matz
5d2f4cb403 macos: various fixes
* alloca needs to be _alloca
* int64_t and uint64_t are also defined in <sys/_types/_int64_t.h>
  but as long long, not as long, so adjust out <stddef.h> to agree
  on Darwin
* define __APPLE_CC__ so that <TargetConditionals.h> correctly defines
  various macros like TARGET_OS_MAC and TARGET_CPU_X86_64
2020-06-20 22:14:56 +02:00
Michael Matz
b0b0e48409 macos: Use <dispatch.h> for locking
the global compiler lock can't be a POSIX non-shared semaphore on
Darwin (because not implemented there).  Use the dispatch framework.
2020-06-20 22:14:56 +02:00
Michael Matz
57ba50e611 macos: support bounds checking
* non-process-shared POSIX semaphores aren't supported on
  Darwin, we use the dispatch framework
* dlsym segfaults with RTLD_NEXT from JIT code, so we must not
  even try this for -run.  So we need to know in __bound_init
  if called from -run code, or from normal code, which means passing
  this down also from __bt_init and hence from the stub added in
  tcc_add_btstub
* Darwin uses different structures for <ctype.h> facilities, this
  merely adds a warning about this
* __libc_freeres doesn't exist
* for non -run modus the context (.prog_base member) is constructed
  incorrectly (uses symbol zero for trying to get at the load bias,
  which doesn't really work that way), on Mach-O this errors out
  (and could also error out on ELF).  For now deactivate this, which
  makes backtraces not be symbolic on MacOS for not -run.
2020-06-20 22:14:56 +02:00
Michael Matz
0b3c8360a0 macos: section syms, strtab, runtime
uncovered by the backtrace/boundcheck tests:

* handle STT_SECTION symbols
* call tcc_add_runtime (to get the bcheck.o/bt-exe.o files added)
* add .stab strtab into segments (we should probably add all stab
  syms to the output LC_SYMTAB eventually, but right now TCC uses
  32 bit stabs, while mach-o uses 32/64bit stabs
2020-06-20 22:14:56 +02:00
Michael Matz
91cb41330d macos: Adjust tests2.112 and tests2 Makefile
* <malloc.h> isn't as portable as <stdlib.h>
* skip 113_btdll.c on Darwin
* replace [...]\+ with [...]\{1,\} in the sed regex (basic REs
  have no + even some sed(1) accept it as \+, but bounds _are_ part
  of POSIX BREs)
2020-06-20 22:14:38 +02:00
Michael Matz
71b0634168 Add find_c_sym and friends
for handling leading underscores when looking up symbols.
Necessary on MacOS, as there C symbols have a '_' prepended.
get_sym_addr (replacing get_elf_sym_addr) gets an argument to
specify if bare/raw/ELF symbols should be looked up or if decorated
C symbols should be looked up.  That reflects into tcc_get_symbol.
tcc_add_symbol is _not_ yet changed, but probably should be.
2020-06-20 22:12:02 +02:00
Michael Matz
cc11750bda macos: Support .init_array and .fini_array
called different on Mach-O, but same functionality (list of pointers
to init and fini functions).
2020-06-20 22:12:02 +02:00
Michael Matz
ef730c4d2f macos: don't like in crt startup code
it's not needed with LC_MAIN executables.
2020-06-20 22:12:02 +02:00
Michael Matz
51f15d9971 macos: Deal with leading underscore on Mach-O
all C/C++/ObjC symbols in symbols tables have a leading underscore
in Mach-O.  Within TCC there's some confusion with tcc_add_symbol
(not adding it) and tcc_get_elf_symbol (not expecting it), and
resolve_syms (using dlsym, which doesn't expect it) and -run support.
But this sort of works.
2020-06-20 22:12:02 +02:00
Michael Matz
1ca209dad0 macos: set LC_MAIN entrypoint correctly
to the file offset of 'main' not simply to the start of TEXT.
2020-06-20 22:12:02 +02:00
Michael Matz
84c3fecf5e macos: Support external functions
these are resolved non-lazy for now.  We only need to generate
the jump stub (using the GOT slot that will be initialized due
to the non-lazy pointer marking, like with data symbols).  On
x86-64 we don't even need special marking of these stubs (with
S_SYMBOL_STUBS and associated additional indirect symbol entries),
as that's only used on i386 (where the stubs are self-modifying).

So, this now works:

extern int _printf(const char*, ...);
int _start(void)
{
  _printf("hello\n");
  return 0;
}
2020-06-20 22:12:02 +02:00
Michael Matz
d82e64c163 macos: handle undefined symbols a bit
at least data symbols coming from dylibs can be used now, as in the
below.  Note in the example that optind is defined in libc (really in
libsystem_c.dylib, reexported from libSystem.B.dylib):

static int loc;
extern int _optind;
int _start(void)
{
  _optind = 0;
  loc = 42 + _optind;
  return loc - 42;
}
2020-06-20 22:12:02 +02:00
Michael Matz
fab8787b23 macos: Fix GOT access to local defined symbols
if a GOT slot is required (due to codegen, indicated by
presence of some relocation types), then it needs to contain
the address of the wanted symbol, also when it's local and defined,
i.e. not overridable.  For simplicity we use a GOT slot for that as
well (other storage would require rewriting relocs and symbols,
as resolving of GOT relocs is hardwired to be based on s1->got).
But that means we need to fill in its indirect symbol mapping slot as
well, for which Mach-O provides a mean to say "not symbol based,
resolved locally".  So this fixes this testcase:

static int loc;
int _start(void)
{
  loc = 42;
  return loc - 42;
}

(where our codegen currently uses a GOT-based access for the write
by accident)
2020-06-20 22:12:02 +02:00
Michael Matz
6ebd463021 macos: dynamic symbol table and __got
this now sorts the symbols properly (local, global defined, undefined;
the latter two by name), marks the three ranges within LC_DYSYMTAB,
generates a __got section (non-lazy pointers) and slots for
relocations which need them, and the indirect symbol mapping for
them.

This doesn't yet deal with undefined symbols.  But it means compared to
last example now this also works, i.e. read access to _global_ data:

% cat simple3.c
int loc = 42;
int _start(void)
{
  return loc - 42;
}
2020-06-20 22:12:02 +02:00
Michael Matz
bbccb13566 for_each_element for all; add tcc_exit_state
this moves the for_each_element macro to tcc.h and adds a
tcc_exit_state (and uses it) pairing with tcc_enter_state.
2020-06-20 22:11:57 +02:00
Michael Matz
1320d85742 macos: Create symtab
this creates a proper LC_SYMTAB, with reasonable entries.  It's
not sorted, so not usable for LC_DYSYMTAB.  But 'nm -x -no-sort'
allows to see us some useful info.

This also relocates sections and symbols, so now this example
works as well (i.e. read access to static local data):

% cat simple2.c
static int loc = 42;
int main(void)
{
  return loc - 42;
}
2020-06-20 22:09:21 +02:00
Michael Matz
94066765ed macos: First cut at generating Mach-O executables
this does generate a working executable for a very simple
example input, e.g. this:

% cat simple.c
int main(void)
{
  return 0;
}
% ./tcc -B. -c simple.c
% ./tcc -nostdlib -B. simple.o -lc
% ./a.out && echo okay
okay

(the -lc is actually not necessary right now, see below).  This
has many limitations:

* no symbol table, hence no calls to external functions from
  e.g. libc, aka libSystemB
* no proper entry point (should be main, but is hardcoded to first
  real .text address)
* libSystemB is hardcoded, no other libs are supported (but again
  no external calls anyway)
* generated Mach-O executable is in old format: neither LC_DYLD_INFO
  no export tries for symbols are created (no symbol table at all!)
* the __LINKEDIT segment is faked and empty, as dyld doesn't like
  it empty even if no symbols point into it
* same with __DATA, dyld wants a non-empty writable segment which
  we enforce with useless data
* no relocations, hence no function call stubs (lazy or not) are
  generated
* hardcodes some other constants as well
2020-06-20 22:09:21 +02:00
Michael Matz
b882cad67e macos: Don't try to load dylibs or linker scripts
we ignore dylibs for now (can't inspect them yet for meta-info).
Also don't try to load GNU linker scripts, it's simply an unknown file
type (e.g. when mentioning Mach-O object files).
2020-06-20 22:09:21 +02:00
Michael Matz
9d2ce50d2f macos: Use SDK path for headers
until we have it properly encoded within TCC itself.
2020-06-20 22:09:21 +02:00
Michael Matz
6178e47345 macos: detect clang also when called as gcc 2020-06-20 22:09:21 +02:00
Michael Matz
032664bf7f macos: make Mach-O somewhat ELF_OBJ_ONLY
we only emit .o files in ELF format, but do use the .got building
from generic ELF code also on MacOS for -run.
2020-06-20 22:09:21 +02:00
Michael Matz
c16f5d2fe6 Fake __has_include handling
cctools for MacOS 10.14 (at least) unconditionally uses the
__has_include preprocessor directive (i.e. without checking
  if defined __has_include
as normally suggested for portable code).  So we need to handle
it a little bit.  For now we simply say "nope" aka evaluate to 0.
2020-06-20 22:09:21 +02:00
Michael Matz
34a5658564 macos: Add system include dir for building libs
we use xcrun to determine the correct path.  In the future those
should be determined at configure time or even runtime of tcc itself.
2020-06-20 22:09:17 +02:00
Michael Matz
4cc27b816f macos: Use newer MACOSX_DEPLOYMENT_TARGET
10.6 seems to still be supported on Mojave, but not 10.4
anymore.
2020-06-20 22:07:24 +02:00
Michael Matz
7b6931ed1f Fix some string literal expressions in initializers
Things like 'char x[] = { "foo"[1], 0 }' (which should initialize
a two-character array 'x' with "o\0").  See added testcase.
2020-06-20 22:00:18 +02:00
herman ten brugge
b2d351e0ec Fix fetch_and_add code
Change type from signed char to int.
Make assembly code work with tcc and gcc.
2020-06-18 07:21:48 +02:00
grischka
b5faa45d90 tccpp: faster next()
- call TOK_GET() as a function only for tokens with values
- get rid of 'next_nomacro_spc()'
- be sligtly more efficient in next()

This made about 4-5%+ speed in my tests.

Also: tcc.h: reorder tokens
2020-06-17 18:44:11 +02:00
grischka
781872517d tccpp.c: compile 'tcc_predefs' from string
It might have advantages in cases if tcc/libtcc does not
depend on extern files.

Also:
- apply "stray \\ ..." check to macros only. For files it
  was already checked. Add positive test.
2020-06-17 18:01:45 +02:00
grischka
cbef54653a tcc -MD: drop system includes and duplicates
Also:
- print filenames in errors for binary input files
  (was lost with the thread support recently)
2020-06-17 18:01:40 +02:00
grischka
e7a4140d28 tcc --help -v: cleanup
from e640ed1aeb

Also:
- cleanup -std, -O, -pthread
- tcc.h:win32: use win32-type include paths even for cross
  compilers (needed for loading tcc_predefs.h in cases)
- Makefile: simplify OSX .dylib clause
2020-06-17 17:41:46 +02:00
herman ten brugge
8fb8d88ea6 Add vla bound support for arm,arm64 and riscv64
tcctok.h:
- Add __bound_new_region

arm-gen.c arm64-gen.c riscv64-gen.c:
- Add bound checking support to gen_vla_alloc

tests/Makefile boundtest.c
- Add test18 vla bound checking test
2020-06-17 11:24:17 +02:00
herman ten brugge
44019e874f Fix debug info for functions
Add code for VT_FUNC.
Use octal number for unsigned int/unsigned long for 32 bits targets.
Add VT_BYTE | VT_UNSIGNED for targets with default unsigned signed char.
Remove extra ',' in default_debug struct.
2020-06-17 07:58:18 +02:00
herman ten brugge
0b8ee7364a Add bound checking to arm, arm64 and riscv64
Checked on:
- i386/x86_64 (linux/windows)
- arm/arm64 (rapberry pi)
- riscv64 (simulator)
Not tested for arm softfloat because raspberry pi does not support it.

Modifications:

Makefile:
  add arm-asm.c to arm64_FILES
  add riscv64-asm.c (new file) to riscv64_FILES

lib/Makefile:
  add fetch_and_add_arm.o(new file) to ARM_O
  add fetch_and_add_arm64.o(new file) to ARM64_O
  add fetch_and_add_riscv64.o(new file) to RISCV64_O
  add $(BCHECK_O) to OBJ-arm/OBJ-arm64/OBJ-riscv64

tcc.h:
  Enable CONFIG_TCC_BCHECK for arm32/arm64/riscv64
  Add arm-asm.c, riscv64-asm.c

tcctok.h:
  for arm use memmove4 instead of memcpy4
  for arm use memmove8 instead of memcpy8

tccgen.c:
  put_extern_sym2: for arm check memcpy/memmove/memset/memmove4/memmove8
                   only use alloca for i386/x86_64
  for arm use memmove4 instead of memcpy4
  for arm use memmove8 instead of memcpy8
  fix builtin_frame_address/builtin_return_address for arm/riscv64

tccrun.c:
  Add riscv64 support
  fix rt_getcontext/rt_get_caller_pc for arm

tccelf.c:
  tcc_load_dll: Print filename for bad architecture

libtcc.c:
  add arm-asm.c/riscv64-asm.c

tcc-doc.texi:
  Add arm, arm64, riscv64 support for bound checking

lib/bcheck.c:
  add __bound___aeabi_memcpy/__bound___aeabi_memmove
      __bound___aeabi_memmove4/__bound___aeabi_memmove8
      __bound___aeabi_memset for arm
  call fetch_and_add_arm/fetch_and_add_arm64/fetch_and_add_riscv64
  __bound_init: Fix type for start/end/ad
  __bound_malloc/__bound_memalign/__bound_realloc/__bound_calloc: Use size + 1

arm-gen.c:
  add bound checking code like i386/x86_64
  assign_regs: only malloc if nb_args != 0
  gen_opi/gen_opf: Fix reload problems

arm-link.c:
  relocate_plt: Fix address calculating

arm64-gen.c:
  add bound checking code like i386/x86_64
  load/store: remove VT_BOUNDED from sv->r
  arm64_hfa_aux/arm64_hfa_aux: Fix array code
  gfunc_prolog: only malloc if n != 0

arm64-link.c:
  code_reloc/gotplt_entry_type/relocate: add R_AARCH64_LDST64_ABS_LO12_NC
  relocate: Use addXXle instead of writeXXle

riscv64-gen.c:
  add bound checking code like i386/x86_64
  add NB_ASM_REGS/CONFIG_TCC_ASM

riscv64-link.c:
  relocate: Use addXXle instead of writeXXle

i386-gen.c/x86_64-gen.c
  gen_bounds_epilog: Fix code (unrelated)

tests/Makefile:
  add $(BTESTS) for arm/arm64/riscv64

tests/tests2/Makefile:
  Use 85 only on i386/x86_64 because of asm code
  Use 113 only on i386/x86_64 because of DLL code
  Add 112/114/115/116 for arm/arm64/riscv64
  Fix FILTER (failed on riscv64)

tests/boundtest.c:
  Only use alloca for i386/x86_64
2020-06-16 07:39:48 +02:00
Michael Matz
9eef33993a Fix type compatiblity of enums and ints
an enum must be compatible with one or more integer type,
so adjust the test accordingly.  That means the following
redeclarations should work:

  enum e6 { E1 = -1, E0 };
  void f3(enum e6);
  void f3(int);        // should work as int and e6 are compatible

while the following should not:

  void f4(enum e6 e);
  void f4(unsigned e); // should error as unsigned and e6 are incompatible
2020-06-05 16:02:08 +02:00
Michael Matz
068d5b3d20 pp: Move errors about stray backslashes
they were emitted too early, in particular also in macro
substitution which might turn out to not be stray in case it's
further stringified.  Check in next() instead.  Add two testcases
that an error is still emitted for obvious top-level baskslashes,
and that stringifying such a thing works.
2020-06-03 18:38:11 +02:00
Christian Jullien
e640ed1aeb Also support --help -v, i.e. arguments in different order. 2020-06-01 18:01:12 +02:00
Christian Jullien
bbb3f79bb8 Missied a commit in order to support -v --help the same way as gcc does 2020-06-01 17:23:20 +02:00