Files

Motiejus Jakštys 47c9dd8636 stages_test: full per-function Air comparison between C and Zig sema

Replace the count-only check with a faithful textual comparison,
analogous to how expectEqualZir compares AstGen output:

- Export Zcu from test_exports so tests can construct a PerThread
- Parse Zig verbose_air output into per-function sections keyed by FQN
- For each C function Air, render it as text via air.write() using
  the Zig PerThread (InternPool indices must match between C and Zig
  for the same source), then compare against the Zig reference text

For the current corpus (codecs.zig, no functions), both sides produce
zero entries so the comparison loop is empty. When zirFunc is ported
and a corpus file with functions is added, this will exercise real
per-function Air matching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-19 13:04:30 +00:00

.claude

WIP: wire up Zig Compilation for reference sema in stage0 tests

2026-02-19 09:14:28 +00:00

.clang-format

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

.gitignore

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

air.c

stage0: add Sema data structures (Phase A)

2026-02-17 19:42:24 +00:00

air.h

stage0: add Sema data structures (Phase A)

2026-02-17 19:42:24 +00:00

ast.c

unify error handling: SET_ERROR(ctx, msg) for both parser and astgen

2026-02-17 17:02:00 +00:00

ast.h

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

astgen_test.zig

remove some duplicated tests

2026-02-18 22:49:36 +02:00

astgen.c

cppcheck: remove all suppressions, fix all warnings

2026-02-17 18:13:52 +00:00

astgen.h

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

CLAUDE.md

skills: replace port-astgen + fix-stages with unified enable-tests

2026-02-18 21:10:42 +00:00

common.h

unify error handling: SET_ERROR(ctx, msg) for both parser and astgen

2026-02-17 17:02:00 +00:00

intern_pool.c

stage0: implement InternPool core (Phase B)

2026-02-17 19:56:05 +00:00

intern_pool.h

stage0: implement InternPool core (Phase B)

2026-02-17 19:56:05 +00:00

main.c

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

parser_test.zig

use C parser in AstGen

2026-02-17 10:56:11 +00:00

parser.c

cppcheck: remove all suppressions, fix all warnings

2026-02-17 18:13:52 +00:00

parser.h

unify error handling: SET_ERROR(ctx, msg) for both parser and astgen

2026-02-17 17:02:00 +00:00

README.md

we have zig-out/bin/zig

2026-02-18 16:46:31 +02:00

sema_c.zig

sema: return per-function Air list instead of flat module-wide Air

2026-02-19 12:15:04 +00:00

sema_test.zig

sema: return per-function Air list instead of flat module-wide Air

2026-02-19 12:15:04 +00:00

sema.c

sema: return per-function Air list instead of flat module-wide Air

2026-02-19 12:15:04 +00:00

sema.h

sema: return per-function Air list instead of flat module-wide Air

2026-02-19 12:15:04 +00:00

sema.zig

sema: capture per-function Air text via verbose_air_output

2026-02-19 12:27:50 +00:00

stages_test.zig

stages_test: full per-function Air comparison between C and Zig sema

2026-02-19 13:04:30 +00:00

test_all.zig

stage0: add sema test framework skeleton (Phase C)

2026-02-17 20:04:31 +00:00

tokenizer_test.zig

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

tokenizer.c

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

tokenizer.h

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

type.c

stage0: add Sema data structures (Phase A)

2026-02-17 19:42:24 +00:00

type.h

stage0: add Sema data structures (Phase A)

2026-02-17 19:42:24 +00:00

value.c

stage0: add Sema data structures (Phase A)

2026-02-17 19:42:24 +00:00

value.h

stage0: add Sema data structures (Phase A)

2026-02-17 19:42:24 +00:00

zig0_bridge.zig

stage0-specific changes

2026-02-14 00:03:26 +02:00

zig0.c

sema: return per-function Air list instead of flat module-wide Air

2026-02-19 12:15:04 +00:00

zir.c

Add 'stage0/' from commit 'b3d106ec971300a9c745f4681fab3df7518c4346'

2026-02-13 23:32:08 +02:00

zir.h

Merge commit '6204bb245b4a05e0f4f00bb48d83b76ebcd899e2' into zig0-0.15.2

2026-02-14 10:05:42 +02:00

README.md

About

zig0 aspires to be an interpreter of Zig 0.15.2, written in C.

This is written with help from an LLM:

Lexer:
- Data structures 100% human.
- Helper functions 100% human.
- Lexing functions 50/50 human/bot.
Parser:
- Data structures 100% human.
- Helper functions 50/50.
- Parser functions 5/95 human/bot.
AstGen: TBD.

Testing

Quick test:

./zig-out/bin/zig build fmt-zig0 test-zig0

Full test and static analysis with all supported compilers and valgrind (run before commit, takes a while):

./zig-out/bin/zig build all-zig0 -Dvalgrind

Debugging tips

Test never terminates? Build the test program executable without running it:

./zig-out/bin/zig build test-zig0 -Dzig0-no-exec

And then run it, capturing the stack trace:

gdb -batch \
    -ex "python import threading; threading.Timer(1.0, lambda: gdb.post_event(lambda: gdb.execute('interrupt'))).start()" \
    -ex run \
    -ex "bt full" \
    -ex quit \
    zig-out/bin/test

You are welcome to replace -ex "bt full" with any other gdb command of interest.

Float handling

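Float literals are parsed with strtold() (C11 standard, portable). On x86-64 Linux, long double is the 80-bit x87 extended-precision format: 1 sign bit, a 15-bit exponent, and a 64-bit significand made of an explicit integer bit plus 63 fraction bits.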

When a float doesn't round-trip through f64, it's emitted as f128 (ZIR float128 instruction). The 80-bit extended value is converted to IEEE 754 binary128 encoding by bit manipulation — both formats share the same 15-bit exponent with bias 16383. The top 63 of binary128's 112 fraction bits come from the 80-bit value; the bottom 49 are zero-padded.
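The repacking described above can be sketched roughly as follows. This is an illustration, not the code in astgen.c: the F128Bits struct and all function names are hypothetical, and f128_from_long_double assumes a little-endian target where long double is the x87 80-bit format (as on x86-64 Linux). The f64 round-trip test shown is one plausible way to express "doesn't round-trip through f64"; the real check may differ.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the 128 bits of an IEEE 754 binary128 value. */
typedef struct {
    uint64_t hi; /* sign (1 bit) | exponent (15 bits) | top 48 fraction bits */
    uint64_t lo; /* bottom 64 fraction bits */
} F128Bits;

/* Re-pack the two x87 fields as binary128. The sign and the 15-bit exponent
 * (bias 16383) carry over unchanged; the explicit integer bit is dropped and
 * the 63 fraction bits become the top 63 of the 112 binary128 fraction bits. */
static F128Bits f128_from_x87_fields(uint16_t sign_exp, uint64_t significand) {
    uint64_t frac63 = significand & UINT64_C(0x7FFFFFFFFFFFFFFF);
    F128Bits out;
    out.hi = ((uint64_t)sign_exp << 48) | (frac63 >> 15); /* top 48 fraction bits */
    out.lo = frac63 << 49; /* remaining 15 bits, then 49 zero bits */
    return out;
}

/* Assumption: little-endian x87 layout, i.e. bytes 0..7 hold the significand
 * and bytes 8..9 hold the sign and exponent. */
static F128Bits f128_from_long_double(long double v) {
    uint64_t significand;
    uint16_t sign_exp;
    assert(LDBL_MANT_DIG == 64); /* refuse to run on other long double formats */
    memcpy(&significand, &v, sizeof significand);
    memcpy(&sign_exp, (const unsigned char *)&v + 8, sizeof sign_exp);
    return f128_from_x87_fields(sign_exp, significand);
}

/* One way to test whether a parsed literal survives a trip through f64.
 * NaN compares unequal to itself, so it would need separate handling. */
static int fits_in_f64(long double v) {
    return (long double)(double)v == v;
}
```

For example, 1.0L has sign/exponent 0x3FFF and significand 0x8000000000000000, so the sketch yields hi = 0x3FFF000000000000 and lo = 0, which is the binary128 encoding of 1.0.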

This means float128 literals lose ~49 bits of precision compared to the upstream Zig implementation (which uses native f128). This is acceptable because stage0 is a bootstrap tool — the real Zig compiler re-parses all source with full f128 precision in later stages. The test comparison mask in astgen_test.zig skips float128 payloads to account for this.

The previous approach used __float128/strtof128 (GCC/glibc extensions) for full precision, but these are not portable to TCC and other C11 compilers.