split README/CLAUDE: human docs vs agent instructions

README.md keeps project overview, testing commands, debugging tips, and float handling. CLAUDE.md gets the full Sema porting loop, decomposition strategy, AIR exceptions, cleanup policy, and general rules.

Also fixes: stages.zig -> corpus.zig, sema_test.zig -> sema_tests/ + num_sema_passing, nether -> neither.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-25 08:35:27 +00:00
parent f37ea1113f
commit 1e9d79518d
2 changed files with 83 additions and 70 deletions
--- a/stage0/CLAUDE.md
+++ b/stage0/CLAUDE.md
@@ -1,22 +1,93 @@
- when porting features from upstream Zig, it should be a mechanical copy.
-  Don't invent. Most of what you are doing is invented, but needs to be re-done
-  in C. Keep the structure in place, name functions and types the same way (or
-  within reason equivalently if there are namespacing constraints). It should
-  be easy to reference one from the other; and, if there are semantic
+# Sema porting instructions
+
+Goal: make `corpus_test.zig` skip over fewer tests. Continue until all corpus
+tests pass.
+
+## Key files
+
+- `src/Sema.zig` — upstream Zig Sema (source of truth).
+- `stage0/sema.c`, `stage0/sema.h` — C port (what you are writing).
+- `stage0/corpus.zig` — corpus test list, `num_passing` counter, sema unit
+  test list, `num_sema_passing` counter.
+- `stage0/sema_tests/` — focused unit tests for decomposing complex problems.
+
+## The loop
+
+Repeat until all corpus tests in `stage0/corpus.zig` pass:
+
+1. **Bump.** Increment `num_passing` by 1 in `stage0/corpus.zig`.
+2. **Run.** `zig build test-zig0` — observe the failure.
+3. **Port.** Mechanically copy the needed logic from `src/Sema.zig` into
+   `stage0/sema.c` / `stage0/sema.h`. Ground rules:
+    - Function names should match (except for `sema` prefix when appropriate).
+    - Function control flow should match.
+    - Data structures should be the same, C <-> Zig interop permitting. I.e.
+      struct definitions in `stage0/sema.h` should be, language permitting, the
+      same as in `src/Sema.zig`.
+    - Add functions in the same order as in the original Zig file.
+4. **Test.** `zig build test-zig0` — iterate until the new test passes.
+5. **Clean up & commit.** See [Cleaning Up](#cleaning-up).
+6. **Go to 1.**
+
+## When stuck
+
+If a single corpus test requires too many new functions, causes unclear
+failures, or needs multiple unrelated features at once — do NOT give up or
+take shortcuts. Instead, decompose:
+
+1. Create a focused test file in `stage0/sema_tests/` that isolates one piece
+   of the problem.
+2. Add it to the `sema_unit_tests` array in `stage0/corpus.zig`.
+3. Increment `num_sema_passing` in `stage0/corpus.zig`.
+4. Get that small test passing.
+5. Repeat until enough pieces are in place for the corpus test to pass.
+6. Return to [The loop](#the-loop).
+
+## AIR comparison exceptions
+
+C and Zig AIR must match byte-by-byte except:
+
+1. If floats don't round-trip through f64, we allow some imprecision. See
+   `astgen.c` and [Float Handling](README.md#float-handling) in `README.md`.
+2. Padding: Zig compiler leaves `undefined` bytes in some places where they are
+   never read (e.g. in case of shorter tags). In C we zero them out. Since
+   those bytes are `undefined` and never read, they can differ.
+
+## Cleaning up
+
+Before committing, ensure the branch stays green:
+
+1. Ensure `num_passing` (and `num_sema_passing`, if changed) in
+   `stage0/corpus.zig` only cover tests that actually pass. If the test you
+   just enabled still fails, lower the counter to exclude it.
+2. Remove or comment out all debug printf statements.
+3. Run: `zig build fmt-zig0 test-zig0` — must pass with no extraneous output.
+4. Run: `zig build all-zig0 -Doptimize=ReleaseSafe` — must pass with no
+   extraneous output.
+
+If a test that previously passed now fails, that is a regression. Do not commit.
+Go back and fix it — never commit with fewer passing tests than before. If
+it's not a test failure but a formatting/linting issue, fix it before committing.
+
+# General rules
+
+- When porting features from upstream Zig, it should be a **mechanical copy**.
+  Don't invent. Keep the structure in place, name functions and types the same
+  way (or within reason equivalently if there are namespacing constraints). It
+  should be easy to reference one from the other; and, if there are semantic
  differences, they *must* be because Zig or C does not support certain
  features (like errdefer).
- See README.md for useful information about this project, incl. how to test
-  this.
- **Never ever** remove zig-cache, nether local nor global.
- Zig code is in ~/code/zig, don't look at /nix/...
- when translating functions from Zig to C (mechanically, remember?), add them
+- When translating functions from Zig to C (mechanically, remember?), add them
  in the same order as in the original Zig file.
- debug printfs: add printfs only when debugging a specific issue; when done
+- **Never ever** remove zig-cache, neither local nor global.
+- Zig code is in ~/code/zig, don't look at /nix/...
+- Debug printfs: add printfs only when debugging a specific issue; when done
  debugging, remove them (or comment them if you may find them useful later). I
  prefer committing code only when `zig build` returns no output.
 - Always complete all tasks before stopping. Do not stop to ask for
  confirmation mid-task. If you have remaining work, continue without waiting
  for input.
- no `cppcheck` suppressions. They are here for a reason. If it is complaining
+- No `cppcheck` suppressions. They are here for a reason. If it is complaining
  about automatic variables, make it non-automatic. I.e. find a way to satisfy
  the linter, do not suppress it.
+- See `stage0/README.md` for testing commands and debugging tips.
--- a/stage0/README.md
+++ b/stage0/README.md
@@ -13,64 +13,6 @@ The goal of stage0 is to be able to implement enough zig to be able to build
 3. AstGen: DONE, written fully by an LLM.
 4. Sema: in progress.

-# Sema porting approach
-
-Goal: make `corpus_test.zig` skip over less tests. Rules:
-
-1. We have extensive AIR comparator: we generate AIR from the upstream Zig
-   compiler and compare it byte-by-byte (see [Exceptions](#exceptions)) to the
-   C implementation.
-2. Run Red/Green TDD. The first step of Red/Green TDD is expanding the test
-   suite by bumping the `num_passing` in `stage0/stages.zig`.
-3. Once test fails, we need to port enough code mechanically from Zig to C to
-   make it pass. Ground rules:
-    - Function names should match (except for `sema` prefix when appropriate).
-    - Function control flow should match.
-    - Data structures should be the same, C <-> Zig interop permitting. I.e.
-      struct definitions in `stage0/sema.h` should be, language permitting, the
-      same as in `src/Sema.zig`.
-4. If the changes required to enable a single test case feel too complex to
-   tackle in one go (subjectively: too many new functions, unclear failures,
-   multiple unrelated features needed at once), split it into smaller tractable
-   problems by adding focused tests to `sema_test.zig`. Get those passing
-   first, then return to the corpus test.
-5. Once progress is made (e.g. more AIR matches between C and Zig), clean up
-   (see [Cleaning Up](#cleaning-up)) and commit.
-6. If you get stuck (e.g. a test won't pass after several attempts), decompose
-   the problem into smaller pieces as described in step 4. Do not give up or
-   take shortcuts — keep splitting until the pieces are small enough to solve.
-
-Once a new test case has been enabled and passes, enable the _next_ test case
-by bumping `num_passing` and repeat the process. Continue until all corpus
-tests pass.
-
-## Exceptions
-
-C and Zig AIR must match byte-by-byte except:
-
-1. If floats don't round-trip through f64, we allow some imprecision. See
-   `astgen.c` and `Float Handling` (later in the README) to understand what to
-   do & why.
-2. Padding: Zig compiler leaves `undefined` bytes in some places where they are
-   never read (e.g. in case of shorter tags). In C we zero them out. Since
-   those bytes are `undefined` and never read, they can differ.
-
-## Cleaning Up
-
-Before committing, ensure master stays green:
-
-1. Revert `num_passing` back so that only tests that actually pass are enabled.
-   If the test you just enabled still fails, lower `num_passing` to exclude it.
-2. Remove or comment out all printf statements.
-3. Run the quick test (`zig build fmt-zig0 test-zig0`, see [Testing](#testing)),
-   ensure it passes and there is no extraneous output.
-4. Run the more elaborate test (`zig build all-zig0 -Doptimize=ReleaseSafe`,
-   see [Testing](#testing)), ensure it passes and there is no extraneous output.
-
-If a test that previously passed now fails, that is a regression. Do not commit.
-Go back and fix it — we never commit with fewer passing tests than before. If
-it's not a test failure but a formatting/linting issue, fix it before committing.
-
 # Testing

 Quick test: