commit 40c55b93aa3f8a06947355f3d0cd162e062cc3f0
parent b85c3b0f761199945e4900c1506befcc9ae1c79b
Author: Motiejus Jakštys <motiejus@jakstys.lt>
Date: Wed, 25 Feb 2026 17:46:23 +0000
CLAUDE.md: add InternPool gap porting workflow

Document the workflow for closing the IP entry gap between the Zig
compiler and C sema. Starting with neghf2.zig, corpus tests require
the C sema to create ~878 module-level IP entries (struct types,
ptr_nav, enum types, etc.) matching the Zig compiler's output.

The workflow describes how to dump the Zig compiler's IP state, compare
with the C sema, port module-system functions (createFileRootStruct,
scanNamespace, etc.), and iterate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat:
M stage0/CLAUDE.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+), 0 deletions(-)
diff --git a/stage0/CLAUDE.md b/stage0/CLAUDE.md
@@ -43,6 +43,70 @@ take shortcuts. Instead, decompose:
5. Repeat until enough pieces are in place for the corpus test to pass.
6. Return to [The loop](#the-loop).
+## Closing the InternPool gap (neghf2 and beyond)
+
+Starting with `neghf2.zig`, corpus tests require the C sema to produce
+InternPool entries that match the Zig compiler's module-level analysis.
+The Zig compiler creates ~878 IP entries before analyzing `__neghf2`'s
+function body (struct types for modules, pointer-to-Nav entries, enum
+types/values, function types, etc.). The C sema must create the same
+entries so that IP indices in the function body AIR match.
+
+### Background
+
+- Nav entries (declarations) are stored in a **separate** list from IP
+ Items — they do NOT consume IP indices. But `ptr_nav` entries (pointers
+ TO Navs) DO consume IP indices.
+- The Zig compiler creates IP entries through `createFileRootStruct`
+ (in `src/Zcu/PerThread.zig`) and `scanNamespace`. These must be
+ ported to C.
+- The Zig compiler's own IP state is the source of truth for the entry
+ sequence. To dump it, add a temporary debug print in
+ `src/Zcu/PerThread.zig` at the `verbose_air_callback` call site
+ (around line 4478) that prints `ip.locals[0].shared.items` for the
+ function being debugged. Run `zig build test-zig0` to build a new
+ `air_gen`, then run that binary directly on the target file. **Always
+ revert** the debug print before committing.
+
+### The module-system porting loop
+
+1. **Dump the Zig IP.** Temporarily add debug output in
+ `src/Zcu/PerThread.zig` at the `verbose_air_callback` site.
+ Rebuild, run the `air_gen` binary directly on the target corpus
+ file (e.g. `lib/compiler_rt/neghf2.zig`), capture the IP entries.
+ Revert the debug print.
+2. **Compare.** Run `zig build test-zig0` with `num_passing` bumped.
+ Note the mismatch: `a=0x???[ip] b=0x???[ip]`. The gap `a − b` is
+ the number of missing IP entries.
+3. **Port the next batch.** Identify what the Zig compiler creates for
+ the next ~10 IP entries (struct types, ptr_nav, enum types, etc.).
+ Port the corresponding logic from `src/Zcu/PerThread.zig` and
+ `src/Sema.zig` into `stage0/sema.c`. Key functions to port:
+ - `createFileRootStruct` → creates `type_struct` IP entry for a
+ module's root.
+ - `scanNamespace` → iterates declarations and creates a `ptr_nav`
+ entry for each Nav.
+ - `getStructType` / `getEnumType` → create type entries in the IP.
+ - `ensureFileAnalyzed` → recursively processes imported modules.
+ - `zirExport` → forces resolution of exported declarations.
+4. **Test.** `zig build test-zig0` — the gap should shrink.
+5. **Clean up & commit** (see [Cleaning Up](#cleaning-up)). Keep
+ `num_passing` at whatever value passes; don't bump it until the gap
+ reaches zero.
+6. **Repeat** until the gap is zero and `num_passing` can be incremented.
+
+### Important constraints
+
+- Do NOT hardcode IP entries from a dump. The entries must be computed
+ from the ZIR, matching the Zig compiler's processing.
+- Do NOT include generated `.c` files. All logic belongs in `sema.c`.
+- Entry ORDER matters. The C sema must create entries in the same
+ order as the Zig compiler. Follow the Zig compiler's processing
+ sequence (struct type → scan declarations → process imports →
+ resolve comptime blocks).
+- Deduplication matters. If the function body interns a value that was
+ already created during module-level analysis, `ipIntern` must return
+ the existing index (not create a duplicate).
+
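The Nav/`ptr_nav` distinction from the background notes and the deduplication constraint above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `SketchPool`, `SketchItem`, `sketch_add_nav`, and `sketch_intern` are hypothetical names, not the actual `stage0/sema.c` API, and the linear scan stands in for whatever lookup the real `ipIntern` uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; not the real stage0/sema.c data layout. */
enum { SKETCH_CAP = 64 };

typedef enum { SK_TYPE_STRUCT, SK_PTR_NAV } SketchTag;

typedef struct {
    SketchTag tag;
    uint32_t data; /* for SK_PTR_NAV: index into the Nav list */
} SketchItem;

typedef struct {
    SketchItem items[SKETCH_CAP]; /* IP items: position == IP index */
    uint32_t items_len;
    const char *navs[SKETCH_CAP]; /* Navs: separate list, separate numbering */
    uint32_t navs_len;
} SketchPool;

/* Adding a Nav grows only the Nav list; items_len is left untouched,
 * so no IP index is consumed. */
static uint32_t sketch_add_nav(SketchPool *p, const char *name) {
    assert(p->navs_len < SKETCH_CAP);
    p->navs[p->navs_len] = name;
    return p->navs_len++;
}

/* Intern with deduplication: if an equal item already exists, return its
 * index instead of appending a duplicate. Otherwise append and return the
 * new index. */
static uint32_t sketch_intern(SketchPool *p, SketchItem key) {
    for (uint32_t i = 0; i < p->items_len; i++) {
        if (p->items[i].tag == key.tag && p->items[i].data == key.data)
            return i;
    }
    assert(p->items_len < SKETCH_CAP);
    p->items[p->items_len] = key;
    return p->items_len++;
}
```

With a zero-initialized `SketchPool`, calling `sketch_add_nav` leaves `items_len` at 0, while interning a `SK_PTR_NAV` item for that Nav consumes one IP index. Interning the same item a second time returns the same index, which is the behavior the function-body analysis relies on when it re-interns a value created during module-level analysis.
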
## AIR comparison exceptions
C and Zig AIR must match byte-by-byte except: