update skill
This commit is contained in:
@@ -1,30 +1,33 @@
|
||||
---
|
||||
name: enable-tests
|
||||
description: Iteratively enable disabled tests in a given zig test file, fix divergences by mechanical porting from upstream, and commit.
|
||||
description: Sequentially enable disabled tests, fixing divergences by mechanical porting from upstream, and commit.
|
||||
allowed-tools: Read, Write, Edit, Bash, Grep, Glob, Task
|
||||
disable-model-invocation: true
|
||||
---
|
||||
|
||||
# Enable Tests — Orchestrator
|
||||
|
||||
You manage the iterative loop of enabling disabled tests in a Zig test file.
|
||||
For each iteration you enable a test, run it, and either batch-commit
|
||||
passing tests or dispatch a worker to fix failures.
|
||||
You manage the sequential loop of enabling disabled tests in a Zig test file.
|
||||
For each test you enable it, dispatch a worker sub-agent to test and fix it,
|
||||
then verify and commit. You go through the corpus **in order** and do not
|
||||
skip tests.
|
||||
|
||||
**Input:** The test file is provided as the skill argument (e.g.,
|
||||
`stage0/astgen_test.zig`). If no argument is given, ask the user.
|
||||
|
||||
**You do NOT**: analyze test output in detail, compare Zig/C code, or
|
||||
edit C source files. The worker handles all of that.
|
||||
**You do NOT**: run test commands (except Phase 0 baseline and one
|
||||
verification before committing), analyze test output in detail, compare
|
||||
Zig/C code, or edit C source files. The worker handles all of that.
|
||||
|
||||
**CRITICAL RULES:**
|
||||
1. **NEVER stop early.** Do not pause to ask the user if you should continue.
|
||||
Do not summarize remaining work and wait. Keep looping until every disabled
|
||||
test is enabled and passing.
|
||||
2. **ALWAYS run the test command yourself before committing.** Never trust
|
||||
the worker's claim that tests pass — verify it.
|
||||
3. **Batch passing tests.** When multiple consecutive tests pass without code
|
||||
changes, commit them together in one commit.
|
||||
test is enabled and passing, or a test is stuck (see guard rail).
|
||||
2. **Go in order.** Enable tests sequentially. Do not skip ahead to easier
|
||||
tests. If a test requires porting a major feature (e.g. `zirFunc`), that
|
||||
is the worker's job.
|
||||
3. **ALWAYS verify before committing.** Run the test command once after the
|
||||
worker returns. Never trust the worker's claim alone.
|
||||
4. **NEVER run `zig build test` or `zig build` without arguments.** These
|
||||
build and test upstream Zig, which takes ages and is irrelevant to zig0.
|
||||
Always use `zig build test-zig0 -Dzig0-cc=tcc` for testing.
|
||||
@@ -63,8 +66,8 @@ cd ~/code/zig
|
||||
./zig-out/bin/zig build test-zig0 -Dzig0-cc=tcc 2>&1 | tail -5
|
||||
```
|
||||
|
||||
If non-zero exit or failures: dispatch a worker (Step 4) with the failure
|
||||
context. After the worker returns, follow Steps 5-7. Re-run Phase 0 until
|
||||
If non-zero exit or failures: dispatch a worker (Step 3) with the failure
|
||||
context. After the worker returns, follow Steps 4-5. Re-run Phase 0 until
|
||||
clean.
|
||||
|
||||
If clean: proceed to the main loop.
|
||||
@@ -73,7 +76,7 @@ If clean: proceed to the main loop.
|
||||
|
||||
Repeat until no disabled tests remain in the test file.
|
||||
|
||||
### Step 1: Find the next disabled test
|
||||
### Step 1: Enable the next disabled test
|
||||
|
||||
Search the test file in priority order:
|
||||
|
||||
@@ -82,84 +85,90 @@ Search for lines matching:
|
||||
```
|
||||
if (true) return error.SkipZigTest
|
||||
```
|
||||
Pick the first one. Note the test name (the `test "..."` header above
|
||||
the skip line) and the line number.
|
||||
Pick the first one. Comment out the skip line to enable the test.
|
||||
|
||||
**Priority B -- Commented corpus entries:**
|
||||
Search for lines matching `//"..` inside the `corpus_files` tuple
|
||||
(between `const corpus_files = .{` and `};`). Pick the first one.
|
||||
Note the file name and line number.
|
||||
(between `const corpus_files = .{` and `};`). Pick the **first** one —
|
||||
do not skip ahead. Uncomment the line (remove `//` prefix).
|
||||
|
||||
If neither is found, all tests are enabled -- go to Final Check.
|
||||
|
||||
### Step 2: Enable the test
|
||||
### Step 2: Accumulate trivially-passing tests
|
||||
|
||||
**For SkipZigTest:** Comment out the `if (true) return error.SkipZigTest;`
|
||||
line to enable the test.
|
||||
Before dispatching a worker, check if the test passes without code changes.
|
||||
Dispatch a worker (Step 3). If the worker reports `pass` with no C code
|
||||
changes, **do not commit yet** — go back to Step 1 and enable the next
|
||||
test. Accumulate these trivially-passing tests.
|
||||
|
||||
**For commented corpus entry:** Uncomment the line (remove `//` prefix).
|
||||
When a worker reports anything other than `pass`-without-code-changes, or
|
||||
when you have accumulated ~10 passing tests, flush the batch:
|
||||
|
||||
### Step 3: Smoke test
|
||||
1. Verify (Step 5).
|
||||
2. If green, commit all accumulated enables together.
|
||||
3. Then handle the current worker's non-pass result (Step 4).
|
||||
|
||||
```sh
|
||||
./zig-out/bin/zig build test-zig0 -Dzig0-cc=tcc 2>&1 | tail -10
|
||||
```
|
||||
### Step 3: Dispatch worker
|
||||
|
||||
### Step 4: Evaluate and dispatch
|
||||
|
||||
**If it passes** -- go back to Step 1 and enable the next test. Accumulate
|
||||
passing tests without committing yet.
|
||||
|
||||
**If it fails** -- first, flush any accumulated passing tests. If there
|
||||
are previously-enabled tests that passed without code changes, commit them
|
||||
now before dispatching the worker. This isolates the worker's code changes
|
||||
from the batch of trivially-passing tests:
|
||||
|
||||
```sh
|
||||
git add <git-add-list>
|
||||
git commit -m "<test_file>: enable <accumulated list>
|
||||
|
||||
Co-Authored-By: <model name>"
|
||||
```
|
||||
|
||||
Then dispatch a worker:
|
||||
|
||||
1. Read `.claude/skills/enable-tests/worker-prompt.md`.
|
||||
1. Read `stage0/.claude/skills/enable-tests/worker-prompt.md`.
|
||||
2. Replace `{{TEST_CONTEXT}}` with:
|
||||
- Test file name
|
||||
- What was enabled: test name or corpus file path + line number
|
||||
- First ~20 lines of test output from Step 3 (use `tail -20`)
|
||||
- Modifiable C files (from the configuration table above)
|
||||
- Key reference files (from the configuration table above)
|
||||
- If this is a retry after `progress`, include the previous worker's
|
||||
COMMIT_MSG so the new worker knows what was already done.
|
||||
3. Launch via the Task tool:
|
||||
```
|
||||
subagent_type=general-purpose
|
||||
prompt=<the worker prompt with context filled in>
|
||||
```
|
||||
|
||||
### Step 5: Parse worker response
|
||||
### Step 4: Handle worker result
|
||||
|
||||
Extract:
|
||||
Extract from the worker response:
|
||||
- `STATUS`: one of `pass`, `progress`, or `no-progress`
|
||||
- `COMMIT_MSG`: a descriptive commit message
|
||||
|
||||
If the worker response does not end with the expected STATUS format,
|
||||
treat it as `no-progress` and verify the test file was reverted.
|
||||
treat it as `no-progress`.
|
||||
|
||||
### Step 6: Handle by status
|
||||
**If STATUS is `pass`:**
|
||||
|
||||
**If STATUS is `pass` or `progress`:**
|
||||
The worker made the test pass (possibly with C code changes). Go to
|
||||
Step 5 to verify and commit.
|
||||
|
||||
Verify before committing (use the fast test command, not valgrind):
|
||||
**If STATUS is `progress`:**
|
||||
|
||||
The worker made partial progress but the test still fails. The worker
|
||||
has re-disabled the test. Go to Step 5 to verify and commit the partial
|
||||
C code fixes. After committing, re-enable the same test and go to
|
||||
Step 3, including the previous COMMIT_MSG in the context.
|
||||
|
||||
**If STATUS is `no-progress`:**
|
||||
|
||||
Worker couldn't make progress. The worker has re-disabled the test.
|
||||
Check `git diff stage0/` — if there are no meaningful C code changes,
|
||||
revert with `git checkout -- <git-add-list>`. Otherwise go to Step 5
|
||||
to verify and commit any useful changes.
|
||||
|
||||
Re-enable the same test and dispatch another worker (Step 3).
|
||||
|
||||
**Guard rail:** If the same test gets `no-progress` three times in a
|
||||
row, stop the loop and report to the user — the test likely needs a
|
||||
design discussion.
|
||||
|
||||
### Step 5: Verify and commit
|
||||
|
||||
Run the test command:
|
||||
|
||||
```sh
|
||||
./zig-out/bin/zig build test-zig0 -Dzig0-cc=tcc 2>&1 | tail -10 ; echo "EXIT: $?"
|
||||
```
|
||||
|
||||
Check: (1) EXIT is 0, (2) no test failures. If either fails, dispatch
|
||||
another worker to fix.
|
||||
**If verification passes (EXIT 0, no failures):**
|
||||
|
||||
Only if clean, commit all accumulated changes:
|
||||
Commit all staged changes:
|
||||
|
||||
```sh
|
||||
git add <git-add-list from configuration table>
|
||||
@@ -168,50 +177,38 @@ git commit -m "<COMMIT_MSG>
|
||||
Co-Authored-By: <model name>"
|
||||
```
|
||||
|
||||
If `progress` (not `pass`), the worker re-disabled the test. Commit
|
||||
anyway -- the passing tests + partial fix are still valuable.
|
||||
Go back to Step 1.
|
||||
|
||||
**If STATUS is `no-progress`:**
|
||||
**If verification fails:**
|
||||
|
||||
Worker should have re-disabled the test. Verify:
|
||||
|
||||
```sh
|
||||
git diff stage0/
|
||||
```
|
||||
|
||||
If there are accumulated passing tests, commit those:
|
||||
|
||||
```sh
|
||||
git add <git-add-list>
|
||||
git commit -m "<test_file_name>: enable <list of tests>
|
||||
|
||||
Co-Authored-By: <model name>"
|
||||
```
|
||||
|
||||
If nothing to commit, revert and continue:
|
||||
The worker's changes broke something. Revert all uncommitted changes:
|
||||
|
||||
```sh
|
||||
git checkout -- <git-add-list>
|
||||
```
|
||||
|
||||
### Step 7: Repeat
|
||||
Re-enable the test (it was reverted too) and dispatch a fresh worker
|
||||
(Step 3), noting in the context that the previous attempt caused a
|
||||
regression.
|
||||
|
||||
**Guard rail:** If verification fails twice in a row for the same test,
|
||||
stop the loop and report to the user.
|
||||
|
||||
### Step 6: Repeat
|
||||
|
||||
Go back to Step 1. **Never stop early** -- continue until all disabled
|
||||
tests are enabled.
|
||||
|
||||
**Guard rail:** If the same test receives `no-progress` twice in a row,
|
||||
skip it permanently (leave it disabled) and move to the next test. Log
|
||||
which test was skipped.
|
||||
|
||||
## Commit message conventions
|
||||
|
||||
- Batch of passing tests: `"<test_file>: enable <file1>, <file2>, ..."`
|
||||
- Worker fix + passing tests: use the worker's COMMIT_MSG
|
||||
- Batch of trivially-passing tests: `"<test_file>: enable <file1>, <file2>, ..."`
|
||||
- Single test enabled with C fixes: use the worker's COMMIT_MSG
|
||||
- Partial progress (test re-disabled): `"<test_file>: <worker COMMIT_MSG>"`
|
||||
- If the list is too long: `"<test_file>: enable N files from <dir>/"`
|
||||
|
||||
## Final Check
|
||||
|
||||
When no disabled tests remain (or all remaining are permanently skipped),
|
||||
When no disabled tests remain (or the loop was stopped for a stuck test),
|
||||
run the full suite with valgrind (this is the only place where valgrind
|
||||
is used -- per-iteration verification uses the fast `test-zig0` command):
|
||||
|
||||
|
||||
@@ -1,13 +1,16 @@
|
||||
# Enable Tests -- Worker (single iteration)
|
||||
# Enable Tests -- Worker
|
||||
|
||||
You are a worker agent fixing a test failure. The orchestrator has enabled a
|
||||
test for you (either by uncommenting a corpus entry or by commenting out a
|
||||
You are a worker agent fixing a test failure. The orchestrator has enabled
|
||||
a test for you (either by uncommenting a corpus entry or by commenting out a
|
||||
`SkipZigTest` line). Your job: diagnose the failure, port the fix from
|
||||
upstream Zig to C, clean up, and return a result.
|
||||
upstream Zig to C, and make the test pass.
|
||||
|
||||
This is a **mechanical translation** -- no creativity, no invention. When the
|
||||
C code differs from Zig, copy the Zig structure into C.
|
||||
|
||||
**Your goal is to make the test pass.** Do not re-disable the test unless you
|
||||
have exhausted your iteration budget and cannot make further progress.
|
||||
|
||||
## Context from orchestrator
|
||||
|
||||
{{TEST_CONTEXT}}
|
||||
@@ -23,9 +26,16 @@ cd ~/code/zig
|
||||
|
||||
Capture the last ~50 lines. If tests pass, skip to Step 5.
|
||||
|
||||
### Step 2: Determine failure stage
|
||||
### Step 2: Determine failure type
|
||||
|
||||
From the output, identify the failure type:
|
||||
From the output, identify the failure:
|
||||
|
||||
**"Zig function '...' not found in C output":**
|
||||
- C sema doesn't produce a function that Zig does. This means a sema
|
||||
feature is missing (e.g. `zirFunc`, inline function analysis, etc.).
|
||||
- Fix target: `stage0/sema.c`, reference: `src/Sema.zig`
|
||||
- This is the most common failure type. Port the missing sema
|
||||
functionality from upstream.
|
||||
|
||||
**Parser failure** (stack trace through `parser_test` or `expectAstConsistent`):
|
||||
- AST node/token mismatch between C and Zig parsers
|
||||
@@ -41,11 +51,18 @@ From the output, identify the failure type:
|
||||
|
||||
**Sema failure** (stack trace through `sema` or Air checks):
|
||||
- `has_compile_errors` -- C Sema emitted compile errors
|
||||
- Air array null checks failed
|
||||
- Air instruction/extra mismatch
|
||||
- Fix target: `stage0/sema.c`, reference: `src/Sema.zig`
|
||||
|
||||
### Step 3: Analyze the failure
|
||||
|
||||
**For "function not found in C output":** Identify which sema feature
|
||||
is missing. Check what ZIR instructions the source file produces and
|
||||
which sema handler needs to be ported. Common missing features:
|
||||
- `zirFunc` / `zirFuncFancy` -- function declarations
|
||||
- `zirCall` -- function calls
|
||||
- Various `zir*` handlers for specific instructions
|
||||
|
||||
**For `has_compile_errors`:** Temporarily add `#include <stdio.h>` and
|
||||
`fprintf(stderr, ...)` to `setCompileError()` (or the sema equivalent)
|
||||
to find which error fires. Run the test again and note the function
|
||||
@@ -73,6 +90,7 @@ Enumerate differences that affect output:
|
||||
- String table entries
|
||||
- Break payload values
|
||||
- AST node tags and token associations
|
||||
- Missing sema handlers (functions that exist in Zig but not in C)
|
||||
|
||||
Apply the minimal mechanical change to match the upstream. Make ONE change
|
||||
at a time. After each change, run:
|
||||
@@ -88,13 +106,15 @@ Check that no previously-passing tests broke.
|
||||
- First tag mismatch position moved later
|
||||
- Compile errors resolved (even if other mismatch remains)
|
||||
- AST mismatch moved to a later node
|
||||
- A different function name now appears in "not found" error
|
||||
- The error changes from "not found" to a different type
|
||||
|
||||
### Step 5: Clean up
|
||||
|
||||
1. Remove ALL `fprintf`/`printf` debug statements from C files.
|
||||
2. Remove `#include <stdio.h>` if it was added for debugging.
|
||||
3. Verify with: `grep -n 'fprintf\|printf' stage0/astgen.c stage0/parser.c stage0/sema.c`
|
||||
4. If the test still fails:
|
||||
4. If the test still fails after exhausting your iteration budget:
|
||||
- **For SkipZigTest:** re-add the `if (true) return error.SkipZigTest;`
|
||||
line with a TODO comment describing the remaining diff.
|
||||
- **For corpus entry:** re-comment the line (add `//` prefix back) and
|
||||
@@ -115,7 +135,8 @@ COMMIT_MSG: <one-line descriptive message about what was ported/fixed>
|
||||
```
|
||||
|
||||
- `pass` -- the enabled test now passes (entry remains enabled)
|
||||
- `progress` -- partial progress was made (entry was re-disabled with TODO)
|
||||
- `progress` -- partial progress was made toward making the test pass
|
||||
(entry was re-disabled with TODO); C code changes are worth committing
|
||||
- `no-progress` -- no measurable improvement (entry was re-disabled)
|
||||
|
||||
## Rules
|
||||
@@ -140,7 +161,7 @@ COMMIT_MSG: <one-line descriptive message about what was ported/fixed>
|
||||
whole will blow your context.
|
||||
- **Budget your iterations.** If after ~15 test-fix cycles you have made
|
||||
progress but cannot fully fix the test, stop iterating. Clean up (Step 5)
|
||||
and report `progress`. The orchestrator will commit partial work and can
|
||||
and report `progress`. The orchestrator will commit partial work and
|
||||
dispatch you again later. Do not exhaust your context chasing diminishing
|
||||
returns.
|
||||
- **Avoid modifying files under `src/`.** Changing any `src/` file invalidates
|
||||
|
||||
Reference in New Issue
Block a user