From 40c55b93aa3f8a06947355f3d0cd162e062cc3f0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Motiejus=20Jak=C5=A1tys?=
Date: Wed, 25 Feb 2026 17:46:23 +0000
Subject: [PATCH] CLAUDE.md: add InternPool gap porting workflow

Document the workflow for closing the IP entry gap between the Zig
compiler and C sema. Starting with neghf2.zig, corpus tests require
the C sema to create ~878 module-level IP entries (struct types,
ptr_nav, enum types, etc.) matching the Zig compiler's output.

The workflow describes how to dump the Zig compiler's IP state,
compare with the C sema, port module-system functions
(createFileRootStruct, scanNamespace, etc.), and iterate.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 stage0/CLAUDE.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/stage0/CLAUDE.md b/stage0/CLAUDE.md
index d0455ebdba..2d463b629e 100644
--- a/stage0/CLAUDE.md
+++ b/stage0/CLAUDE.md
@@ -43,6 +43,70 @@ take shortcuts. Instead, decompose:
 5. Repeat until enough pieces are in place for the corpus test to pass.
 6. Return to [The loop](#the-loop).
 
+## Closing the InternPool gap (neghf2 and beyond)
+
+Starting with `neghf2.zig`, corpus tests require the C sema to produce
+InternPool entries that match the Zig compiler's module-level analysis.
+The Zig compiler creates ~878 IP entries before analyzing `__neghf2`'s
+function body (struct types for modules, pointer-to-Nav entries, enum
+types/values, function types, etc.). The C sema must create the same
+entries so that IP indices in the function body AIR match.
+
+### Background
+
+- Nav entries (declarations) are stored in a **separate** list from IP
+  Items — they do NOT consume IP indices. But `ptr_nav` entries (pointers
+  TO Navs) DO consume IP indices.
+- The Zig compiler creates IP entries through `createFileRootStruct`
+  (in `src/Zcu/PerThread.zig`) and `scanNamespace`. These must be
+  ported to C.
+- Source of truth for the IP entry sequence: add a temporary debug print
+  in `src/Zcu/PerThread.zig` at the `verbose_air_callback` call site
+  (around line 4478) to dump `ip.locals[0].shared.items` for the
+  function being debugged. Build with `zig build test-zig0` to compile
+  a new `air_gen`, then run it directly on the target file. **Always
+  revert** the debug print before committing.
+
+### The module-system porting loop
+
+1. **Dump the Zig IP.** Temporarily add debug output in
+   `src/Zcu/PerThread.zig` at the `verbose_air_callback` site.
+   Rebuild, run the `air_gen` binary directly on the target corpus
+   file (e.g. `lib/compiler_rt/neghf2.zig`), capture the IP entries.
+   Revert the debug print.
+2. **Compare.** Run `zig build test-zig0` with `num_passing` bumped.
+   Note the mismatch: `a=0x???[ip] b=0x???[ip]`. The gap `a − b` is
+   the number of missing IP entries.
+3. **Port the next batch.** Identify what the Zig compiler creates for
+   the next ~10 IP entries (struct types, ptr_nav, enum types, etc.).
+   Port the corresponding logic from `src/Zcu/PerThread.zig` and
+   `src/Sema.zig` into `stage0/sema.c`. Key functions to port:
+   - `createFileRootStruct` → creates `type_struct` IP entry for a
+     module's root.
+   - `scanNamespace` → iterates declarations, creates `ptr_nav` entries
+     for each Nav.
+   - `getStructType` / `getEnumType` → creates type entries in IP.
+   - `ensureFileAnalyzed` → recursively processes imported modules.
+   - `zirExport` → forces resolution of exported declarations.
+4. **Test.** `zig build test-zig0` — the gap should shrink.
+5. **Clean up & commit** (see [Cleaning Up](#cleaning-up)). Keep
+   `num_passing` at whatever value passes; don't bump it until the gap
+   reaches zero.
+6. **Repeat** until the gap is zero and `num_passing` can be incremented.
+
+### Important constraints
+
+- Do NOT hardcode IP entries from a dump. The entries must be computed
+  from the ZIR, matching the Zig compiler's processing.
+- Do NOT include generated `.c` files. All logic belongs in `sema.c`.
+- Entry ORDER matters. The C sema must create entries in the same
+  order as the Zig compiler. Follow the Zig compiler's processing
+  sequence (struct type → scan declarations → process imports →
+  resolve comptime blocks).
+- Deduplication matters. If the function body interns a value that was
+  already created during module-level analysis, `ipIntern` must return
+  the existing index (not create a duplicate).
+
 ## AIR comparison exceptions
 
 C and Zig AIR must match byte-by-byte except: