docs: update CLAUDE.md with current status and conventions

- Unified corpus (single array, num_passing=4, sema tests first)
- Bidirectional AIR comparison in stages_test.zig
- callconv=.c and wasm32-wasi hardcoding marked as OK
- Next blocker: reify_int.zig IP index mismatch
- Add note to update CLAUDE.md upon progress

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-01 15:59:44 +00:00
parent 7d6be1e23d
commit ee8456fb0f


@@ -1,15 +1,20 @@
# Sema porting instructions
Goal: make `corpus_test.zig` skip over fewer tests. Continue until all corpus
tests pass.
Goal: make `stages_test.zig` skip over fewer corpus tests. Continue until all
corpus tests pass.
## Key files
- `src/Sema.zig` — upstream Zig Sema (source of truth).
- `stage0/sema.c`, `stage0/sema.h` — C port (what you are writing).
- `stage0/corpus.zig` — corpus test list, `num_passing` counter, sema unit
test list, `num_sema_passing` counter.
- `stage0/corpus.zig` unified corpus test list with single `num_passing`
counter. Sema unit tests (stage0/sema_tests/) come first, then lib/ files.
- `stage0/sema_tests/` — focused unit tests for decomposing complex problems.
These are part of the main corpus and go through the full pipeline
(parser → ZIR → sema).
- `stage0/stages_test.zig` — runs all stages for corpus files with
**bidirectional** AIR comparison (every C function must match Zig, AND
every Zig function must exist in C output).
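As an illustration of what "bidirectional" means here, the following is a minimal C sketch, not the code in `stages_test.zig`. The `FuncAir` struct, `find_func`, and `air_sets_match` are hypothetical names invented for this example, and it assumes each function's AIR can be treated as a named byte buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one function's AIR: a name plus raw bytes. */
typedef struct {
    const char *name;
    const unsigned char *bytes;
    size_t len;
} FuncAir;

/* Linear search by name; returns NULL when the function is absent. */
static const FuncAir *find_func(const FuncAir *set, size_t n, const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i].name, name) == 0) return &set[i];
    return NULL;
}

/* Returns true only if both directions of the comparison hold. */
static bool air_sets_match(const FuncAir *c, size_t nc,
                           const FuncAir *z, size_t nz) {
    /* Direction 1: every C function must exist in Zig with identical bytes. */
    for (size_t i = 0; i < nc; i++) {
        const FuncAir *m = find_func(z, nz, c[i].name);
        if (m == NULL || m->len != c[i].len) return false;
        if (m->len != 0 && memcmp(m->bytes, c[i].bytes, m->len) != 0) return false;
    }
    /* Direction 2: every Zig function must also appear in the C output. */
    for (size_t i = 0; i < nz; i++)
        if (find_func(c, nc, z[i].name) == NULL) return false;
    return true;
}
```

A one-directional check would accept a C port that silently drops functions. The second loop is what rejects that case.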
## The loop
@@ -43,20 +48,28 @@ take shortcuts. Instead, decompose:
1. Create a focused test file in `stage0/sema_tests/` that isolates one piece
of the problem.
2. Add it to the `sema_unit_tests` array in `stage0/corpus.zig`.
3. Increment `num_sema_passing` in `stage0/corpus.zig`.
4. Get that small test passing.
5. Repeat until enough pieces are in place for the corpus test to pass.
6. Return to [The loop](#the-loop).
2. Add it to the `files` array in `stage0/corpus.zig` (before the lib/ files).
3. Get that small test passing.
4. Repeat until enough pieces are in place for the corpus test to pass.
5. Return to [The loop](#the-loop).
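The role of the single counter can be sketched in C as follows. This assumes `num_passing` acts as a prefix length over the `files` array (entries below it run, the rest are skipped), which is how the text above reads but is not confirmed against `corpus.zig`. The names `corpus_action` and `max_valid_num_passing` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TEST_RUN, TEST_SKIP } TestAction;

/* Assumed rule: entries with index < num_passing run, the rest are skipped. */
static TestAction corpus_action(size_t index, size_t num_passing) {
    return index < num_passing ? TEST_RUN : TEST_SKIP;
}

/* Largest counter value that covers only passing tests:
 * the length of the leading run of passes. */
static size_t max_valid_num_passing(const bool *passes, size_t n) {
    assert(passes != NULL || n == 0);
    size_t k = 0;
    while (k < n && passes[k]) k++;
    return k;
}
```

Under this assumption, a new sema unit test placed before the lib/ files must pass before the counter can advance past it, which is why the focused tests come first in the array.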
## Closing the InternPool gap (neghf2 and beyond)
## Closing the InternPool gap
Starting with `neghf2.zig`, corpus tests require the C sema to produce
InternPool entries that match the Zig compiler's module-level analysis.
The Zig compiler creates ~878 IP entries before analyzing `__neghf2`'s
function body (struct types for modules, pointer-to-Nav entries, enum
types/values, function types, etc.). The C sema must create the same
entries so that IP indices in the function body AIR match.
### Current status (2026-03-01)
**num_passing = 4.** The first 4 sema unit tests pass the full bidirectional
pipeline (empty.zig, const_decl.zig, empty_void_function.zig,
type_identity_fn.zig). These are standalone files without imports and without
complex type references.
**Next blocker: reify_int.zig** (index 5). Uses `const U32 = @Type(...)`.
The C sema produces IP index 35 for u32 instead of the pre-interned index 8.
Root cause: the module-level IP entry creation via `createFileRootStructC`
produces different entries than the Zig compiler, shifting all subsequent
IP indices.
ALL sema tests with functions that reference types (index 5+) fail with
similar IP reference mismatches (`zig_ip_base` differs between C and Zig).
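To see why one divergent module-level entry breaks every later reference, consider a toy append-only intern pool in C. This is a simplified model, not the real InternPool: the `Pool` type, `pool_intern`, and the string keys are all hypothetical, and real entries are typed keys rather than strings.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POOL_CAP 64

/* Toy pool: an entry's index is simply its insertion position. */
typedef struct {
    const char *keys[POOL_CAP];
    uint32_t len;
} Pool;

/* Return the index of key, appending it only if it is not already present. */
static uint32_t pool_intern(Pool *p, const char *key) {
    for (uint32_t i = 0; i < p->len; i++)
        if (strcmp(p->keys[i], key) == 0) return i;
    assert(p->len < POOL_CAP);
    p->keys[p->len] = key;
    return p->len++;
}
```

If one pool receives a single entry the other does not, every key interned afterwards lands one index higher, so AIR that embeds those indices no longer matches. Conversely, interning a key that is already present must return the existing index rather than append a new entry, which is the behavior expected for a pre-interned type such as `u32`.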
### Background
@@ -103,10 +116,13 @@ entries so that IP indices in the function body AIR match.
- Do NOT hardcode IP entries from a dump. The entries must be computed
from the ZIR, matching the Zig compiler's processing.
- Do NOT hardcode target-specific values (enum tags, field indices,
type indices, calling conventions, etc.) into sema.c. All such values
must be computed by porting the upstream Zig functions that produce
them. Port the function, not the output.
- **callconv = .c and wasm32-wasi target hardcoding are OK.** This is a
bootstrap interpreter targeting only that platform. Do not spend time
making these configurable.
- Do NOT hardcode other target-specific values (enum tags, field indices,
type indices, etc.) into sema.c. All such values must be computed by
porting the upstream Zig functions that produce them. Port the function,
not the output.
- Do NOT include generated `.c` files. All logic belongs in `sema.c`.
- Entry ORDER matters. The C sema must create entries in the same
order as the Zig compiler. Follow the Zig compiler's processing
@@ -156,11 +172,14 @@ C and Zig AIR must match byte-by-byte except:
## Cleaning up
Update `stage0/CLAUDE.md` with brief current status (num_passing, next
blocker) when meaningful progress is made. Keep it concise.
Before committing, ensure the branch stays green:
1. Ensure `num_passing` (and `num_sema_passing`, if changed) in
`stage0/corpus.zig` only cover tests that actually pass. If the test you
just enabled still fails, lower the counter to exclude it.
1. Ensure `num_passing` in `stage0/corpus.zig` only covers tests that actually
pass. If the test you just enabled still fails, lower the counter to exclude
it.
2. Remove or comment out all debug printf statements.
3. Run: `zig build fmt-zig0 test-zig0` — must pass with no extraneous output.
4. Run: `zig build all-zig0 -Doptimize=ReleaseSafe` — must pass with no