CLAUDE.md: add InternPool gap porting workflow

Document the workflow for closing the IP entry gap between the Zig
compiler and C sema.  Starting with neghf2.zig, corpus tests require
the C sema to create ~878 module-level IP entries (struct types,
ptr_nav, enum types, etc.) matching the Zig compiler's output.

The workflow describes how to dump the Zig compiler's IP state, compare
with the C sema, port module-system functions (createFileRootStruct,
scanNamespace, etc.), and iterate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-25 17:46:23 +00:00
parent b85c3b0f76
commit 40c55b93aa


@@ -43,6 +43,70 @@ take shortcuts. Instead, decompose:
5. Repeat until enough pieces are in place for the corpus test to pass.
6. Return to [The loop](#the-loop).
## Closing the InternPool gap (neghf2 and beyond)
Starting with `neghf2.zig`, corpus tests require the C sema to produce
InternPool entries that match the Zig compiler's module-level analysis.
The Zig compiler creates ~878 IP entries before analyzing `__neghf2`'s
function body (struct types for modules, pointer-to-Nav entries, enum
types/values, function types, etc.). The C sema must create the same
entries so that IP indices in the function body AIR match.
### Background
- Nav entries (declarations) are stored in a **separate** list from IP
Items — they do NOT consume IP indices. But `ptr_nav` entries (pointers
TO Navs) DO consume IP indices.
- The Zig compiler creates IP entries through `createFileRootStruct`
(in `src/Zcu/PerThread.zig`) and `scanNamespace`. These must be
ported to C.
- The source of truth for the IP entry sequence is the Zig compiler
itself. Add a temporary debug print in `src/Zcu/PerThread.zig` at the
`verbose_air_callback` call site (around line 4478) to dump
`ip.locals[0].shared.items` for the function being debugged. Build
with `zig build test-zig0` to compile a new `air_gen`, then run it
directly on the target file. **Always revert** the debug print before
committing.
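
To make the first background point concrete, here is a minimal sketch of the index accounting. It is a toy model, not the real stage0 or Zig data layout: `DemoPool`, `demo_create_nav`, and `demo_create_ptr_nav` are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model; NOT the real stage0 structures. */
enum { DEMO_CAP = 16 };

typedef struct {
    uint32_t nav_count;          /* Navs live in their own list */
    uint32_t item_count;         /* IP items; the count is the next IP index */
    uint32_t item_nav[DEMO_CAP]; /* for a ptr_nav item: the Nav it points to */
} DemoPool;

/* Creating a Nav returns a Nav index and leaves item_count untouched. */
static uint32_t demo_create_nav(DemoPool *p) {
    assert(p->nav_count < DEMO_CAP);
    return p->nav_count++;
}

/* Creating a ptr_nav entry appends an IP item, so it consumes an IP index. */
static uint32_t demo_create_ptr_nav(DemoPool *p, uint32_t nav) {
    assert(nav < p->nav_count);
    assert(p->item_count < DEMO_CAP);
    p->item_nav[p->item_count] = nav;
    return p->item_count++;
}
```

In this model, creating two Navs leaves `item_count` at zero, and the first `demo_create_ptr_nav` call returns IP index 0.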
### The module-system porting loop
1. **Dump the Zig IP.** Temporarily add debug output in
`src/Zcu/PerThread.zig` at the `verbose_air_callback` site.
Rebuild, run the `air_gen` binary directly on the target corpus
file (e.g. `lib/compiler_rt/neghf2.zig`), capture the IP entries.
Revert the debug print.
2. **Compare.** Run `zig build test-zig0` with `num_passing` temporarily
bumped. Note the mismatch: `a=0x???[ip] b=0x???[ip]`. The difference
between `a` and `b` is the number of missing IP entries.
3. **Port the next batch.** Identify what the Zig compiler creates for
the next ~10 IP entries (struct types, ptr_nav, enum types, etc.).
Port the corresponding logic from `src/Zcu/PerThread.zig` and
`src/Sema.zig` into `stage0/sema.c`. Key functions to port:
- `createFileRootStruct` → creates `type_struct` IP entry for a
module's root.
- `scanNamespace` → iterates declarations, creates `ptr_nav` entries
for each Nav.
- `getStructType` / `getEnumType` → creates type entries in IP.
- `ensureFileAnalyzed` → recursively processes imported modules.
- `zirExport` → forces resolution of exported declarations.
4. **Test.** `zig build test-zig0` — the gap should shrink.
5. **Clean up & commit** (see [Cleaning Up](#cleaning-up)). Keep
`num_passing` at whatever value passes; don't bump it until the gap
reaches zero.
6. **Repeat** until the gap is zero and `num_passing` can be incremented.
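
The gap arithmetic in the Compare step can be sketched as a small helper. It assumes the mismatch text has exactly the shape `a=0x???[ip] b=0x???[ip]` with hex digits in place of `???`. The function name `demo_ip_gap` is hypothetical, and because the text above does not say which side is the C sema, the sketch reports the absolute difference.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Parse a line shaped like "a=0x40[ip] b=0x2a[ip]" and store |a - b|
 * in *gap. Returns 1 on success, 0 if the line does not match.
 * Hypothetical helper: the exact test output format is an assumption. */
static int demo_ip_gap(const char *line, uint32_t *gap) {
    unsigned int a, b;
    assert(line != NULL && gap != NULL);
    if (sscanf(line, "a=0x%x[ip] b=0x%x[ip]", &a, &b) != 2)
        return 0;
    *gap = (uint32_t)(a > b ? a - b : b - a);
    return 1;
}
```

A shrinking value across iterations of the loop is the signal that the latest batch of ported logic created the expected entries.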
### Important constraints
- Do NOT hardcode IP entries from a dump. The entries must be computed
from the ZIR, matching the Zig compiler's processing.
- Do NOT include generated `.c` files. All logic belongs in `sema.c`.
- Entry ORDER matters. The C sema must create entries in the same
order as the Zig compiler. Follow the Zig compiler's processing
sequence (struct type → scan declarations → process imports →
resolve comptime blocks).
- Deduplication matters. If the function body interns a value that was
already created during module-level analysis, `ipIntern` must return
the existing index (not create a duplicate).
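
The ordering and deduplication constraints can be illustrated with a toy intern pool. This is only a sketch under simplifying assumptions: keys are a plain `(tag, data)` pair, lookup is a linear scan, and `DemoKey`, `DemoInternPool`, and `demo_intern` are hypothetical names. It is not the real `ipIntern`.

```c
#include <assert.h>
#include <stdint.h>

enum { DEMO_IP_CAP = 32 };

/* Hypothetical key: real IP keys are richer than a (tag, data) pair. */
typedef struct { uint32_t tag; uint32_t data; } DemoKey;

typedef struct {
    DemoKey items[DEMO_IP_CAP];
    uint32_t len; /* next index to hand out */
} DemoInternPool;

/* Return the index of an existing equal key, or append the key and
 * return its new index. Indices therefore follow first-intern order. */
static uint32_t demo_intern(DemoInternPool *ip, DemoKey key) {
    for (uint32_t i = 0; i < ip->len; i++) {
        if (ip->items[i].tag == key.tag && ip->items[i].data == key.data)
            return i; /* already interned: reuse, do not duplicate */
    }
    assert(ip->len < DEMO_IP_CAP);
    ip->items[ip->len] = key;
    return ip->len++;
}
```

Interning key A, then B, then A again yields indices 0, 1, 0 and leaves the pool at length 2. Interning B before A would swap the indices, which is why the creation order has to follow the Zig compiler's.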
## AIR comparison exceptions
C and Zig AIR must match byte-by-byte except: