commit ee8456fb0fe3c0a9049d7f6544cd0c675cd1726c
parent 7d6be1e23d8a65e8d6e8b8745024a4641e71523e
Author: Motiejus <motiejus@jakstys.lt>
Date: Sun, 1 Mar 2026 15:59:44 +0000

docs: update CLAUDE.md with current status and conventions

- Unified corpus (single array, num_passing=4, sema tests first)
- Bidirectional AIR comparison in stages_test.zig
- callconv=.c and wasm32-wasi hardcoding marked as OK
- Next blocker: reify_int.zig IP index mismatch
- Add note to update CLAUDE.md upon progress

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Diffstat:
 stage0/CLAUDE.md | 65 ++++++++++++++++++++++++++++++++++++++++++-----------------------
1 file changed, 42 insertions(+), 23 deletions(-)
diff --git a/stage0/CLAUDE.md b/stage0/CLAUDE.md
@@ -1,15 +1,20 @@
# Sema porting instructions
-Goal: make `corpus_test.zig` skip over fewer tests. Continue until all corpus
-tests pass.
+Goal: make `stages_test.zig` skip over fewer corpus tests. Continue until all
+corpus tests pass.
## Key files
- `src/Sema.zig` — upstream Zig Sema (source of truth).
- `stage0/sema.c`, `stage0/sema.h` — C port (what you are writing).
-- `stage0/corpus.zig` — corpus test list, `num_passing` counter, sema unit
- test list, `num_sema_passing` counter.
+- `stage0/corpus.zig` — unified corpus test list with single `num_passing`
+ counter. Sema unit tests (stage0/sema_tests/) come first, then lib/ files.
- `stage0/sema_tests/` — focused unit tests for decomposing complex problems.
+ These are part of the main corpus and go through the full pipeline
+ (parser → ZIR → sema).
+- `stage0/stages_test.zig` — runs all stages for corpus files with
+ **bidirectional** AIR comparison (every C function must match Zig, AND
+ every Zig function must exist in C output).
## The loop
@@ -43,20 +48,28 @@ take shortcuts. Instead, decompose:
1. Create a focused test file in `stage0/sema_tests/` that isolates one piece
of the problem.
-2. Add it to the `sema_unit_tests` array in `stage0/corpus.zig`.
-3. Increment `num_sema_passing` in `stage0/corpus.zig`.
-4. Get that small test passing.
-5. Repeat until enough pieces are in place for the corpus test to pass.
-6. Return to [The loop](#the-loop).
+2. Add it to the `files` array in `stage0/corpus.zig` (before the lib/ files).
+3. Get that small test passing.
+4. Repeat until enough pieces are in place for the corpus test to pass.
+5. Return to [The loop](#the-loop).
-## Closing the InternPool gap (neghf2 and beyond)
+## Closing the InternPool gap
-Starting with `neghf2.zig`, corpus tests require the C sema to produce
-InternPool entries that match the Zig compiler's module-level analysis.
-The Zig compiler creates ~878 IP entries before analyzing `__neghf2`'s
-function body (struct types for modules, pointer-to-Nav entries, enum
-types/values, function types, etc.). The C sema must create the same
-entries so that IP indices in the function body AIR match.
+### Current status (2026-03-01)
+
+**num_passing = 4.** The first 4 sema unit tests pass the full bidirectional
+pipeline (empty.zig, const_decl.zig, empty_void_function.zig,
+type_identity_fn.zig). These are standalone files without imports and without
+complex type references.
+
+**Next blocker: reify_int.zig** (index 5). Uses `const U32 = @Type(...)`.
+The C sema produces IP index 35 for u32 instead of the pre-interned index 8.
+Root cause: the module-level IP entry creation via `createFileRootStructC`
+produces different entries than the Zig compiler, shifting all subsequent
+IP indices.
+
+ALL sema tests with functions that reference types (index 5+) fail with
+similar IP reference mismatches (`zig_ip_base` differs between C and Zig).
### Background
@@ -103,10 +116,13 @@ entries so that IP indices in the function body AIR match.
- Do NOT hardcode IP entries from a dump. The entries must be computed
from the ZIR, matching the Zig compiler's processing.
-- Do NOT hardcode target-specific values (enum tags, field indices,
- type indices, calling conventions, etc.) into sema.c. All such values
- must be computed by porting the upstream Zig functions that produce
- them. Port the function, not the output.
+- **callconv = .c and wasm32-wasi target hardcoding are OK.** This is a
+ bootstrap interpreter targeting only that platform. Do not spend time
+ making these configurable.
+- Do NOT hardcode other target-specific values (enum tags, field indices,
+ type indices, etc.) into sema.c. All such values must be computed by
+ porting the upstream Zig functions that produce them. Port the function,
+ not the output.
- Do NOT include generated `.c` files. All logic belongs in `sema.c`.
- Entry ORDER matters. The C sema must create entries in the same
order as the Zig compiler. Follow the Zig compiler's processing
@@ -156,11 +172,14 @@ C and Zig AIR must match byte-by-byte except:
## Cleaning up
+Update `stage0/CLAUDE.md` with brief current status (num_passing, next
+blocker) when meaningful progress is made. Keep it concise.
+
Before committing, ensure the branch stays green:
-1. Ensure `num_passing` (and `num_sema_passing`, if changed) in
- `stage0/corpus.zig` only cover tests that actually pass. If the test you
- just enabled still fails, lower the counter to exclude it.
+1. Ensure `num_passing` in `stage0/corpus.zig` only covers tests that actually
+ pass. If the test you just enabled still fails, lower the counter to exclude
+ it.
2. Remove or comment out all debug printf statements.
3. Run: `zig build fmt-zig0 test-zig0` — must pass with no extraneous output.
4. Run: `zig build all-zig0 -Doptimize=ReleaseSafe` — must pass with no