diff --git a/stage0/.claude/skills/enable-tests/SKILL.md b/stage0/.claude/skills/enable-tests/SKILL.md index d6b0ef1f16..d24502b6fe 100644 --- a/stage0/.claude/skills/enable-tests/SKILL.md +++ b/stage0/.claude/skills/enable-tests/SKILL.md @@ -1,30 +1,33 @@ --- name: enable-tests -description: Iteratively enable disabled tests in a given zig test file, fix divergences by mechanical porting from upstream, and commit. +description: Sequentially enable disabled tests, fixing divergences by mechanical porting from upstream, and commit. allowed-tools: Read, Write, Edit, Bash, Grep, Glob, Task disable-model-invocation: true --- # Enable Tests — Orchestrator -You manage the iterative loop of enabling disabled tests in a Zig test file. -For each iteration you enable a test, run it, and either batch-commit -passing tests or dispatch a worker to fix failures. +You manage the sequential loop of enabling disabled tests in a Zig test file. +For each test you enable it, dispatch a worker sub-agent to test and fix it, +then verify and commit. You go through the corpus **in order** and do not +skip tests. **Input:** The test file is provided as the skill argument (e.g., `stage0/astgen_test.zig`). If no argument is given, ask the user. -**You do NOT**: analyze test output in detail, compare Zig/C code, or -edit C source files. The worker handles all of that. +**You do NOT**: run test commands (except Phase 0 baseline and one +verification before committing), analyze test output in detail, compare +Zig/C code, or edit C source files. The worker handles all of that. **CRITICAL RULES:** 1. **NEVER stop early.** Do not pause to ask the user if you should continue. Do not summarize remaining work and wait. Keep looping until every disabled - test is enabled and passing. -2. **ALWAYS run the test command yourself before committing.** Never trust - the worker's claim that tests pass — verify it. -3. **Batch passing tests.** When multiple consecutive tests pass without code - changes, commit them together in one commit. + test is enabled and passing, or a test is stuck (see guard rail). +2. **Go in order.** Enable tests sequentially. Do not skip ahead to easier + tests. If a test requires porting a major feature (e.g. `zirFunc`), that + is the worker's job. +3. **ALWAYS verify before committing.** Run the test command once after the + worker returns. Never trust the worker's claim alone. 4. **NEVER run `zig build test` or `zig build` without arguments.** These build and test upstream Zig, which takes ages and is irrelevant to zig0. Always use `zig build test-zig0 -Dzig0-cc=tcc` for testing. @@ -63,8 +66,8 @@ cd ~/code/zig ./zig-out/bin/zig build test-zig0 -Dzig0-cc=tcc 2>&1 | tail -5 ``` -If non-zero exit or failures: dispatch a worker (Step 4) with the failure -context. After the worker returns, follow Steps 5-7. Re-run Phase 0 until +If non-zero exit or failures: dispatch a worker (Step 3) with the failure +context. After the worker returns, follow Steps 4-5. Re-run Phase 0 until clean. If clean: proceed to the main loop. @@ -73,7 +76,7 @@ If clean: proceed to the main loop. Repeat until no disabled tests remain in the test file. -### Step 1: Find the next disabled test +### Step 1: Enable the next disabled test Search the test file in priority order: @@ -82,84 +85,90 @@ Search for lines matching: ``` if (true) return error.SkipZigTest ``` -Pick the first one. Note the test name (the `test "..."` header above -the skip line) and the line number. +Pick the first one. Comment out the skip line to enable the test. **Priority B -- Commented corpus entries:** Search for lines matching `//"..` inside the `corpus_files` tuple -(between `const corpus_files = .{` and `};`). Pick the first one. -Note the file name and line number. +(between `const corpus_files = .{` and `};`). Pick the **first** one — +do not skip ahead. Uncomment the line (remove `//` prefix). If neither is found, all tests are enabled -- go to Final Check. -### Step 2: Enable the test +### Step 2: Accumulate trivially-passing tests -**For SkipZigTest:** Comment out the `if (true) return error.SkipZigTest;` -line to enable the test. +Before dispatching a worker, check if the test passes without code changes. +Dispatch a worker (Step 3). If the worker reports `pass` with no C code +changes, **do not commit yet** — go back to Step 1 and enable the next +test. Accumulate these trivially-passing tests. -**For commented corpus entry:** Uncomment the line (remove `//` prefix). +When a worker reports anything other than `pass`-without-code-changes, or +when you have accumulated ~10 passing tests, flush the batch: -### Step 3: Smoke test +1. Verify (Step 5). +2. If green, commit all accumulated enables together. +3. Then handle the current worker's non-pass result (Step 4). -```sh -./zig-out/bin/zig build test-zig0 -Dzig0-cc=tcc 2>&1 | tail -10 -``` +### Step 3: Dispatch worker -### Step 4: Evaluate and dispatch - -**If it passes** -- go back to Step 1 and enable the next test. Accumulate -passing tests without committing yet. - -**If it fails** -- first, flush any accumulated passing tests. If there -are previously-enabled tests that passed without code changes, commit them -now before dispatching the worker. This isolates the worker's code changes -from the batch of trivially-passing tests: - -```sh -git add -git commit -m ": enable - -Co-Authored-By: " -``` - -Then dispatch a worker: - -1. Read `.claude/skills/enable-tests/worker-prompt.md`. +1. Read `stage0/.claude/skills/enable-tests/worker-prompt.md`. 2. Replace `{{TEST_CONTEXT}}` with: - Test file name - What was enabled: test name or corpus file path + line number - - First ~20 lines of test output from Step 3 (use `tail -20`) - Modifiable C files (from the configuration table above) - Key reference files (from the configuration table above) + - If this is a retry after `progress`, include the previous worker's + COMMIT_MSG so the new worker knows what was already done. 3. Launch via the Task tool: ``` subagent_type=general-purpose prompt= ``` -### Step 5: Parse worker response +### Step 4: Handle worker result -Extract: +Extract from the worker response: - `STATUS`: one of `pass`, `progress`, or `no-progress` - `COMMIT_MSG`: a descriptive commit message If the worker response does not end with the expected STATUS format, -treat it as `no-progress` and verify the test file was reverted. +treat it as `no-progress`. -### Step 6: Handle by status +**If STATUS is `pass`:** -**If STATUS is `pass` or `progress`:** +The worker made the test pass (possibly with C code changes). Go to +Step 5 to verify and commit. -Verify before committing (use the fast test command, not valgrind): +**If STATUS is `progress`:** + +The worker made partial progress but the test still fails. The worker +has re-disabled the test. Go to Step 5 to verify and commit the partial +C code fixes. After committing, re-enable the same test and go to +Step 3, including the previous COMMIT_MSG in the context. + +**If STATUS is `no-progress`:** + +Worker couldn't make progress. The worker has re-disabled the test. +Check `git diff stage0/` — if there are no meaningful C code changes, +revert with `git checkout -- `. Otherwise go to Step 5 +to verify and commit any useful changes. + +Re-enable the same test and dispatch another worker (Step 3). + +**Guard rail:** If the same test gets `no-progress` three times in a +row, stop the loop and report to the user — the test likely needs a +design discussion. + +### Step 5: Verify and commit + +Run the test command: ```sh ./zig-out/bin/zig build test-zig0 -Dzig0-cc=tcc 2>&1 | tail -10 ; echo "EXIT: $?" ``` -Check: (1) EXIT is 0, (2) no test failures. If either fails, dispatch -another worker to fix. +**If verification passes (EXIT 0, no failures):** -Only if clean, commit all accumulated changes: +Commit all staged changes: ```sh git add @@ -168,50 +177,38 @@ git commit -m " Co-Authored-By: " ``` -If `progress` (not `pass`), the worker re-disabled the test. Commit -anyway -- the passing tests + partial fix are still valuable. +Go back to Step 1. -**If STATUS is `no-progress`:** +**If verification fails:** -Worker should have re-disabled the test. Verify: - -```sh -git diff stage0/ -``` - -If there are accumulated passing tests, commit those: - -```sh -git add -git commit -m ": enable - -Co-Authored-By: " -``` - -If nothing to commit, revert and continue: +The worker's changes broke something. Revert all uncommitted changes: ```sh git checkout -- ``` -### Step 7: Repeat +Re-enable the test (it was reverted too) and dispatch a fresh worker +(Step 3), noting in the context that the previous attempt caused a +regression. + +**Guard rail:** If verification fails twice in a row for the same test, +stop the loop and report to the user. + +### Step 6: Repeat Go back to Step 1. **Never stop early** -- continue until all disabled tests are enabled. -**Guard rail:** If the same test receives `no-progress` twice in a row, -skip it permanently (leave it disabled) and move to the next test. Log -which test was skipped. - ## Commit message conventions -- Batch of passing tests: `": enable , , ..."` -- Worker fix + passing tests: use the worker's COMMIT_MSG +- Batch of trivially-passing tests: `": enable , , ..."` +- Single test enabled with C fixes: use the worker's COMMIT_MSG +- Partial progress (test re-disabled): `": "` - If the list is too long: `": enable N files from /"` ## Final Check -When no disabled tests remain (or all remaining are permanently skipped), +When no disabled tests remain (or the loop was stopped for a stuck test), run the full suite with valgrind (this is the only place where valgrind is used -- per-iteration verification uses the fast `test-zig0` command): diff --git a/stage0/.claude/skills/enable-tests/worker-prompt.md b/stage0/.claude/skills/enable-tests/worker-prompt.md index a3c7118525..3f2d37f781 100644 --- a/stage0/.claude/skills/enable-tests/worker-prompt.md +++ b/stage0/.claude/skills/enable-tests/worker-prompt.md @@ -1,13 +1,16 @@ -# Enable Tests -- Worker (single iteration) +# Enable Tests -- Worker -You are a worker agent fixing a test failure. The orchestrator has enabled a -test for you (either by uncommenting a corpus entry or by commenting out a +You are a worker agent fixing a test failure. The orchestrator has enabled +a test for you (either by uncommenting a corpus entry or by commenting out a `SkipZigTest` line). Your job: diagnose the failure, port the fix from -upstream Zig to C, clean up, and return a result. +upstream Zig to C, and make the test pass. This is a **mechanical translation** -- no creativity, no invention. When the C code differs from Zig, copy the Zig structure into C. +**Your goal is to make the test pass.** Do not re-disable the test unless you +have exhausted your iteration budget and cannot make further progress. + ## Context from orchestrator {{TEST_CONTEXT}} @@ -23,9 +26,16 @@ cd ~/code/zig Capture the last ~50 lines. If tests pass, skip to Step 5. -### Step 2: Determine failure stage +### Step 2: Determine failure type -From the output, identify the failure type: +From the output, identify the failure: + +**"Zig function '...' not found in C output":** +- C sema doesn't produce a function that Zig does. This means a sema + feature is missing (e.g. `zirFunc`, inline function analysis, etc.). +- Fix target: `stage0/sema.c`, reference: `src/Sema.zig` +- This is the most common failure type. Port the missing sema + functionality from upstream. **Parser failure** (stack trace through `parser_test` or `expectAstConsistent`): - AST node/token mismatch between C and Zig parsers @@ -41,11 +51,18 @@ From the output, identify the failure type: **Sema failure** (stack trace through `sema` or Air checks): - `has_compile_errors` -- C Sema emitted compile errors -- Air array null checks failed +- Air instruction/extra mismatch - Fix target: `stage0/sema.c`, reference: `src/Sema.zig` ### Step 3: Analyze the failure +**For "function not found in C output":** Identify which sema feature +is missing. Check what ZIR instructions the source file produces and +which sema handler needs to be ported. Common missing features: +- `zirFunc` / `zirFuncFancy` -- function declarations +- `zirCall` -- function calls +- Various `zir*` handlers for specific instructions + **For `has_compile_errors`:** Temporarily add `#include ` and `fprintf(stderr, ...)` to `setCompileError()` (or the sema equivalent) to find which error fires. Run the test again and note the function @@ -73,6 +90,7 @@ Enumerate differences that affect output: - String table entries - Break payload values - AST node tags and token associations +- Missing sema handlers (functions that exist in Zig but not in C) Apply the minimal mechanical change to match the upstream. Make ONE change at a time. After each change, run: @@ -88,13 +106,15 @@ Check that no previously-passing tests broke. - First tag mismatch position moved later - Compile errors resolved (even if other mismatch remains) - AST mismatch moved to a later node +- A different function name now appears in "not found" error +- The error changes from "not found" to a different type ### Step 5: Clean up 1. Remove ALL `fprintf`/`printf` debug statements from C files. 2. Remove `#include ` if it was added for debugging. 3. Verify with: `grep -n 'fprintf\|printf' stage0/astgen.c stage0/parser.c stage0/sema.c` -4. If the test still fails: +4. If the test still fails after exhausting your iteration budget: - **For SkipZigTest:** re-add the `if (true) return error.SkipZigTest;` line with a TODO comment describing the remaining diff. - **For corpus entry:** re-comment the line (add `//` prefix back) and @@ -115,7 +135,8 @@ COMMIT_MSG: ``` - `pass` -- the enabled test now passes (entry remains enabled) -- `progress` -- partial progress was made (entry was re-disabled with TODO) +- `progress` -- partial progress was made toward making the test pass + (entry was re-disabled with TODO); C code changes are worth committing - `no-progress` -- no measurable improvement (entry was re-disabled) ## Rules @@ -140,7 +161,7 @@ COMMIT_MSG: whole will blow your context. - **Budget your iterations.** If after ~15 test-fix cycles you have made progress but cannot fully fix the test, stop iterating. Clean up (Step 5) - and report `progress`. The orchestrator will commit partial work and can + and report `progress`. The orchestrator will commit partial work and dispatch you again later. Do not exhaust your context chasing diminishing returns. - **Avoid modifying files under `src/`.** Changing any `src/` file invalidates