Replace the old approach of linking verbose_air.zig (and the full Zig
compiler internals) into the test binary with a build-time generator
(verbose_air_gen.zig) that pre-computes AIR data for corpus files.
The generator runs as a build step, compiling each corpus file through
the Zig compiler and serializing the resulting AIR to binary files.
It produces air_data.zig and tag_names.zig bridge files that the test
binary imports as anonymous modules. This removes the heavyweight
zig_compile_air extern dependency from the test binary.
Key changes:
- build.zig: add air_gen executable build+run step, anonymous imports
- verbose_air_gen.zig (new): build-time AIR generator with symlink
  workaround to avoid lib/std/ module path conflicts
- corpus.zig (new): centralized corpus file list with num_passing
- sema_test.zig: replace zig_compile_air extern with parsePrecomputedAir
- stages_test.zig: use corpus.zig and @import("air_data")
- sema.c: zero dead block data in comptime switch handler so the
  dead-block skip rule fires correctly with precomputed data
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In safety-checked builds, Zir.Inst.Data is a tagged union where
@sizeOf(Data) > 8 due to the safety tag. saveZirCache strips these
tags by reinterpreting each Data as a HackDataLayout and copying the
first 8 bytes into a safety_buffer.
Union variants that use fewer than 8 bytes of payload leave the
remaining bytes uninitialised. The bulk copy propagates these
uninitialised V-bits into safety_buffer, causing valgrind to report:

    Syscall param pwritev(vector[...]) points to uninitialised byte(s)

when the buffer is written to the cache file. This is harmless:
loadZirCache reconstructs the safety tag from the tag array, and each
variant only reads its own fields — the padding is never interpreted.
@memset before the copy does not help: the assignment
`safety_buffer[i] = as_struct.data` copies all 8 bytes from the
source union, and valgrind propagates the per-byte defined/undefined
status (V-bits) from source to destination, re-tainting the padding.
Use makeMemDefined after the copy loop to inform valgrind that the
padding contents are intentional. This compiles to a no-op when not
running under valgrind.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Export air_tag_name from verbose_air.zig to convert AIR tag u8 values
to their string names (e.g. "arg", "ret", "block"). Use it in
sema_test.zig error messages so mismatches show readable names instead
of raw numbers. Also add refKindStr to distinguish ip/inst refs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add verbose_air.c/h implementing a human-readable AIR printer for
debugging the C sema, ported from src/Air/print.zig. Types print as
human-readable names (u32, *const u8, fn (...) noreturn) instead of
raw IP indices. Add --verbose-air flag to zig0 CLI and a `zig build
zig0` target for building the standalone executable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Defense-in-depth for the cache_mode fix: track how many times the
verbose_air_callback fires and fail the test if C sema produced
functions but the Zig callback was never invoked.
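
A minimal C sketch of such a guard, for illustration only. The
AirCollector struct and both function names are hypothetical and do not
appear in the test harness; the real check lives on the Zig side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical collector state; field names are illustrative. */
typedef struct {
    size_t callback_count; /* how many times the Zig-side callback fired */
    size_t zig_func_count; /* functions collected from the Zig compiler */
} AirCollector;

/* Stand-in for the callback: records one invocation and one function. */
void on_air_callback(AirCollector *c) {
    assert(c != NULL);
    c->callback_count++;
    c->zig_func_count++;
}

/* Returns false when the C side produced functions but the Zig callback
 * never fired, i.e. when a "match" would say nothing about correctness. */
bool comparison_is_trustworthy(const AirCollector *c, size_t c_func_count) {
    assert(c != NULL);
    if (c_func_count > 0 && c->callback_count == 0) {
        return false;
    }
    return true;
}
```

A test would call comparison_is_trustworthy() before comparing function
counts and fail immediately when it returns false.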
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
With .cache_mode = .whole, comp.update() returns immediately on a
cache hit, skipping sema entirely. The verbose_air_callback never
fires, so the collector returns 0 functions. This causes spurious
test passes when the C sema also returns 0 functions (e.g. for
unported @export), because 0 == 0 looks like a match.
Switch to .incremental so that sema always runs and the callback
always fires, making test results deterministic across runs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move padding awareness from collection (verbose_air.zig) to the test
comparison (sema_test.zig). Air.Inst.Data is an 8-byte union where
some variants (un_op, no_op, ty, repeat) use fewer bytes; the rest is
uninitialised padding. Instead of zeroing padding at collection time,
compare only the meaningful bytes per tag in the test harness.
This reverts the verbose_air.zig zeroing from 67b821e925 and
replaces the bulk std.mem.eql in airCompareOne with a per-instruction
loop that also gives better diagnostics on mismatch (instruction
index, tag, byte count).
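
The per-instruction loop can be sketched in C roughly as follows. The
tag subset and the byte counts per tag are assumptions made for the
example (the real table is derived from the AIR tag set), and the
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DATA_SIZE = 8 }; /* every data slot is 8 bytes wide */

/* Hypothetical tag subset, for illustration only. */
typedef enum { TAG_NO_OP = 0, TAG_UN_OP = 1, TAG_BIN_OP = 2 } DemoTag;

/* Number of payload bytes a tag actually defines (assumed sizes). */
size_t meaningful_bytes(uint8_t tag) {
    switch (tag) {
    case TAG_NO_OP: return 0;         /* slot is entirely padding */
    case TAG_UN_OP: return 4;         /* one 4-byte operand, 4 bytes padding */
    default:        return DATA_SIZE; /* full 8-byte payload */
    }
}

/* Compares two flat arrays of 8-byte data slots, looking only at the
 * meaningful bytes of each slot. Returns the index of the first
 * mismatching instruction, or -1 when all instructions match. */
long first_data_mismatch(const uint8_t *tags, const uint8_t *lhs,
                         const uint8_t *rhs, size_t count) {
    assert(tags != NULL && lhs != NULL && rhs != NULL);
    for (size_t i = 0; i < count; i++) {
        size_t n = meaningful_bytes(tags[i]);
        if (memcmp(lhs + i * DATA_SIZE, rhs + i * DATA_SIZE, n) != 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Returning the instruction index rather than a bare boolean is what makes
the more specific diagnostics possible.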
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Air.Inst.Data is a union; variants smaller than 8 bytes (un_op,
no_op, ty, repeat) leave padding bytes uninitialised. Zero the
destination buffer and copy only the meaningful bytes per instruction
so that byte-level comparisons in tests are deterministic and
valgrind-clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the heap-allocated error_msg with a fixed-size caller-provided
buffer, making error reporting infallible. All error paths now write
into the buffer via bufPrint with truncation on overflow.
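
As a rough C analogue of the same pattern (the change itself uses Zig's
bufPrint), an infallible reporter that truncates on overflow might look
like the sketch below. report_error is a hypothetical name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Formats an error message into a caller-provided buffer. Never
 * allocates and never fails: output that does not fit is truncated and
 * the buffer is always NUL-terminated. Returns the number of characters
 * stored, excluding the terminator. */
size_t report_error(char *buf, size_t cap, const char *fmt, ...) {
    assert(buf != NULL && fmt != NULL);
    if (cap == 0) {
        return 0; /* nowhere to write, not even a terminator */
    }
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, cap, fmt, ap);
    va_end(ap);
    if (n < 0) {
        buf[0] = '\0'; /* encoding error: report an empty message */
        return 0;
    }
    /* vsnprintf returns the untruncated length; clamp it to what fits. */
    return (size_t)n < cap ? (size_t)n : cap - 1;
}
```

The caller owns the storage, so error paths have no allocation that
could itself fail.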
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port zirFunc to C sema so that `export fn f() void {}` produces
matching AIR on both the C and Zig sides. Configure verbose_air.zig
to use the self-hosted wasm backend (use_llvm=false) with ReleaseFast,
which eliminates error tracing and safety-check instructions.
Enable the "sema air: empty void function" test case.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
With emit_bin=false, the compiler never queued functions for analysis,
making the verbose_air_callback a no-op — all corpus files produced 0
AIR functions on both C and Zig sides, so Stage 3 comparison was
vacuous.
Set emit_bin=true so that `export fn` and `@export` trigger function
analysis. Disable 5 compiler_rt corpus files that now expose the
Zig-vs-C mismatch (they use @export, but zirFunc is not yet ported to
the C sema). Add 3 skipped export-fn test tiers in sema_test.zig as
targets for incremental zirFunc porting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use allocatedSlice() instead of .items to get the full [0..capacity]
slice for realloc. .items only covers [0..len] which causes an
out-of-bounds access when capacity > len.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
src/verbose_air.zig now only collects Air data (zig_compile_air) instead
of comparing it (zig_compare_air). The comparison logic lives in
stage0/sema_test.zig, keeping testing infrastructure in stage0/.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Hardcoding the module root to dirname(src_path) caused "import of file
outside module path" for any @import("../...") in corpus files. Add an
optional module_root parameter so stages_test can symlink to the repo
root, allowing all relative imports to resolve within the module path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of writing file content to /tmp (which broke relative imports
like codecs/asn1.zig), symlink the original file's directory into
.zig-cache/tmp/zig0_test. This keeps the module root outside lib/std/
(avoiding module path conflicts) while preserving subdirectory import
resolution through the symlink.
In verbose_air.zig, use Compilation.Path.fromUnresolved to construct the
module root so it gets the same canonical root enum (.local_cache, etc.)
as file paths computed during import resolution, avoiding isNested
root mismatches.
Fixes the codecs.zig test failure (434/434 tests now pass).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both zig_dump_intern_pool and c_dump_intern_pool were unimplemented stubs
with no callers. InternPool correctness is validated by unit tests and Air
comparison.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass C-side SemaFuncAir arrays into zig_compare_air so the callback
can compare Air tags/datas/extra directly against the Zig compiler's
in-memory arrays, eliminating 4 heap allocations + 3 memcpys per
function.
Fix the early-return guard in PerThread.zig to also check
verbose_air_callback, so the callback fires even when
enable_debug_extensions is false (ReleaseFast).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Export raw Air arrays (tags, datas, extra) from the Zig compiler via a
RawAirCallback on Compilation, and memcmp them against C-produced arrays
instead of comparing formatted text output. This is more robust (catches
any byte-level divergence) and eliminates the need for the C-side text
formatter.
- Add RawAirCallback type and field to Compilation
- Rewrite src/verbose_air.zig: raw array export instead of text capture
- Update stage0 tests to use compareAir with expectEqualSlices
- Delete stage0/verbose_air.{c,h} (no longer needed)
- Remove verbose_air.c/h from build.zig file lists
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove all @import("zig_internals") from stage0/ so that test_obj
compilation is independent of the Zig compiler (~6min). The sema
comparison now uses text-based dumpers:
- Zig side (src/verbose_air.zig): compiles source through the full Zig
  pipeline, captures verbose_air output, exports zig_dump_air() as a C
  function. Compiled as a separate dumper_obj that is cached
  independently.
- C side (stage0/verbose_air.c): formats C Air structs to text in the
  same format as Zig's Air/print.zig.
Changing stage0 code no longer triggers Zig compiler recompilation:
C compile + cached test_obj + cached dumper + link = seconds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the count-only check with a faithful textual comparison,
analogous to how expectEqualZir compares AstGen output:
- Export Zcu from test_exports so tests can construct a PerThread
- Parse Zig verbose_air output into per-function sections keyed by FQN
- For each C function Air, render it as text via air.write() using
  the Zig PerThread (InternPool indices must match between C and Zig
  for the same source), then compare against the Zig reference text
For the current corpus (codecs.zig, no functions), both sides produce
zero entries so the comparison loop is empty. When zirFunc is ported
and a corpus file with functions is added, this will exercise real
per-function Air matching.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a verbose_air_output field to Compilation that redirects verbose Air
dumps to a caller-provided writer instead of stderr. When set, liveness
is omitted from the output to support textual comparison in stage0 tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add zigSema helper (stage0/sema.zig) that creates a Compilation,
points it at a source file, and runs the full Zig sema pipeline.
Export Compilation and Package from test_exports.zig. Wire up in
stagesCheck to run Zig sema alongside C sema.
Not yet working: files under lib/ conflict with the auto-created
std module ("file exists in modules 'root' and 'std'"). The fix
(using .root = .none with absolute path) needs testing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add stage0/sema_c.zig that converts C Sema output (Air struct) to Zig's
Air type via MultiArrayList, with per-tag data dispatch. Update
stagesCheck to use the conversion, and extend the const x = 42 test to
verify both Air structure and InternPool contents.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire src/test_exports.zig through build.zig so zig0 tests can import
the real Zig InternPool. Add a test that initializes both the C and Zig
InternPools and compares all 124 pre-interned entries index by index.
Also add a rule to the skill file: never run `zig build test` or bare
`zig build` (they test upstream Zig and take ages).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The RoundMode packed struct had Direction as enum(u4) occupying bits 3:0,
which pushed the precision exception suppress field to bit 4. Per Intel
SDM, the ROUNDSS/VROUNDSS/VCVTPS2PH immediate layout is:
    bits 1:0 = rounding mode
    bit  2   = rounding source (MXCSR.RC vs immediate)
    bit  3   = precision exception suppress
    bits 7:4 = reserved (must be 0)
The old encoding emitted e.g. vroundss $0x12 for ceil-suppress (bit 4
set, reserved), which CPUs silently ignore but valgrind 3.26.0 correctly
rejects with SIGILL. Fix by changing Direction to enum(u3) so precision
lands at bit 3, producing the correct $0x0a encoding.
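
The difference between the two layouts can be checked with a small C
sketch. The enum and function names are hypothetical stand-ins for the
Zig packed struct; only the bit positions come from the layout above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical direction values: bits 1:0 select the rounding mode and
 * bit 2 selects MXCSR.RC as the rounding source. */
typedef enum {
    ROUND_NEAREST = 0,
    ROUND_DOWN    = 1,
    ROUND_UP      = 2, /* ceil */
    ROUND_ZERO    = 3,
    ROUND_MXCSR   = 4
} RoundDirection;

/* Fixed layout: 3-bit direction, precision-suppress flag at bit 3. */
uint8_t round_imm_fixed(RoundDirection dir, bool suppress) {
    assert((unsigned)dir <= 7u);
    return (uint8_t)(((unsigned)dir & 0x7u) | ((suppress ? 1u : 0u) << 3));
}

/* Old layout: 4-bit direction pushes the suppress flag into reserved
 * bit 4. Kept here only to show the difference. */
uint8_t round_imm_old(RoundDirection dir, bool suppress) {
    assert((unsigned)dir <= 15u);
    return (uint8_t)(((unsigned)dir & 0xFu) | ((suppress ? 1u : 0u) << 4));
}
```

For ceil with precision suppression, round_imm_old(ROUND_UP, true)
yields 0x12 and round_imm_fixed(ROUND_UP, true) yields 0x0a.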
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously there was a subtle ordering bug: a dependency specified
as both lazy and eager in different parts of the dependency tree
could end up not being fetched, depending on the order in which
it was encountered. Lazy deps that are promoted to eager are now
resubmitted for fetching, so that they are available to the
downstream builds that expect them to be eager.
--debug-rt previously would make rt libs match the root module. Now they
are always debug when --debug-rt is passed. This includes compiler-rt,
fuzzer lib, and others.
Before https://github.com/ziglang/zig/pull/18160, error tracing defaulted to true in ReleaseSafe, but that is no longer the case. These option descriptions were never updated accordingly.