Phases A-E all done: A: analyzeComptimeUnit infrastructure B: std.zig comptime block evaluation replaces resolveNamedImport C: start.zig comptime block evaluation replaces resolveRootInStartModule D: targeted import resolution replaces resolveBuiltinModuleChain E: semaAnalyze simplified — all hardcoded functions removed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Plan: Demand-Driven Module Loading System
Current State
semaAnalyze (lines 12121-12279) uses a hardcoded loading sequence:
1. Load std.zig → $124 type_struct
2. resolveNamedImport(std, "start") → $125-$127
3. resolveNamedImport(std, "debug") → $128-$129
4. resolveDebugAssertEntries(debug) → $130-$134
5. Register root module → $135
6. resolveRootInStartModule → $136
7. resolveBuiltinModuleChain → $137-$141
8. analyzeBodyInner on root struct_decl → $142+
The indices $124-$141 are hardcoded comments, not values — but the ORDER is fixed. Changing the order breaks all 193 corpus tests.
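To make the ordering constraint concrete, here is a minimal sketch of an append-only pool in which every new key takes the next free index. All names (`ToyPool`, `toy_intern`, `TOY_FIRST_INDEX`) are hypothetical and not taken from the C sema; each module is also collapsed to a single entry, whereas the real sequence above creates several entries per step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy pool, not the real InternPool. */
#define TOY_POOL_CAP 64
#define TOY_FIRST_INDEX 124 /* mirrors the $124 starting point listed above */

typedef struct {
    const char *keys[TOY_POOL_CAP];
    size_t len;
} ToyPool;

/* Return the index of `key`, appending it on first use.
 * Indices depend only on the order of first encounters. */
int toy_intern(ToyPool *pool, const char *key) {
    for (size_t i = 0; i < pool->len; i++) {
        if (strcmp(pool->keys[i], key) == 0)
            return TOY_FIRST_INDEX + (int)i;
    }
    assert(pool->len < TOY_POOL_CAP);
    size_t slot = pool->len;
    pool->keys[slot] = key;
    pool->len = slot + 1;
    return TOY_FIRST_INDEX + (int)slot;
}
```

Interning "std", "start", "debug" in that order gives "debug" a different index than interning "std", "debug", "start", which is why any reordering of the load sequence shifts every later index.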
Upstream Architecture
The Zig compiler uses demand-driven evaluation (src/Zcu/PerThread.zig):
- performAllTheWork queues analysis roots (std_mod, main_mod)
- analyze_mod → ensureFileAnalyzed → semaFile → createFileRootStruct → scanNamespace
- scanNamespace discovers comptime blocks → creates ComptimeUnit entries
- Comptime blocks are evaluated via analyzeComptimeUnit when needed
- Comptime evaluation triggers side effects:
  - @import("debug") → loads debug.zig
  - debug.assert(...) → creates func entries
  - @export(...) → creates export entries
- IP entries are created as side effects of evaluation, in the order they're first encountered
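The flow above can be sketched as an idempotent load function that records each module the first time it is reached and then follows the imports its comptime blocks trigger. This is only an illustration under stated assumptions: the types and functions (`ToyModule`, `ToyCompilation`, `toy_ensure_file_analyzed`, and so on) are hypothetical, and the depth-first recursion stands in for the upstream work queue rather than reproducing it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MODULES 8
#define TOY_MAX_IMPORTS 4

/* Hypothetical module record: the imports its comptime blocks reach. */
typedef struct {
    const char *name;
    const char *comptime_imports[TOY_MAX_IMPORTS];
    size_t import_count;
    int analyzed;
} ToyModule;

typedef struct {
    ToyModule modules[TOY_MAX_MODULES];
    size_t module_count;
    const char *load_order[TOY_MAX_MODULES]; /* order of first encounter */
    size_t loaded;
} ToyCompilation;

ToyModule *toy_add_module(ToyCompilation *c, const char *name) {
    assert(c->module_count < TOY_MAX_MODULES);
    ToyModule *m = &c->modules[c->module_count++];
    m->name = name;
    return m;
}

void toy_add_import(ToyModule *m, const char *name) {
    assert(m->import_count < TOY_MAX_IMPORTS);
    m->comptime_imports[m->import_count++] = name;
}

ToyModule *toy_find_module(ToyCompilation *c, const char *name) {
    for (size_t i = 0; i < c->module_count; i++)
        if (strcmp(c->modules[i].name, name) == 0)
            return &c->modules[i];
    return NULL;
}

/* Load a module once; loading it evaluates its comptime blocks,
 * which in turn load whatever they import. */
void toy_ensure_file_analyzed(ToyCompilation *c, const char *name) {
    ToyModule *m = toy_find_module(c, name);
    if (m == NULL || m->analyzed)
        return; /* unknown or already loaded: no new entries */
    m->analyzed = 1; /* mark before recursing so import cycles terminate */
    c->load_order[c->loaded++] = m->name;
    for (size_t i = 0; i < m->import_count; i++)
        toy_ensure_file_analyzed(c, m->comptime_imports[i]);
}
```

Because each module is marked before its imports are followed, a cycle such as start importing std does not reload anything, and repeated requests for the same module leave the recorded order unchanged.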
Key upstream functions:
- semaFile (PerThread.zig:1870): creates file root struct
- createFileRootStruct (PerThread.zig:1762): IP entries for struct
- scanNamespace (PerThread.zig:2554): discovers declarations & comptime blocks
- analyzeComptimeUnit (PerThread.zig:853): evaluates comptime block bodies
- ensureNavResolved (Sema.zig:31219): triggers full nav resolution
Why This Is Hard
- IP entry order matters — all 193 tests depend on exact indices
- The upstream's "demand-driven" order is actually deterministic — it follows a specific evaluation sequence that happens to match the hardcoded order
- Replacing the hardcoded sequence requires the comptime evaluator to handle: module loading via @import, function calls (debug.assert), type comparison (@import("std") == @This()), and export processing
What Must Match
The upstream creates entries in this order because:
- std.zig is file 0 (analysis root) → $124
- std.zig's namespace scan discovers comptime blocks
- The first comptime block evaluated is comptime { assert(...); } (in std.zig, triggering debug module load)
- start.zig is loaded as a dependency → $125-$127
- debug.zig is loaded for debug.assert → $128-$129
- debug.assert resolution creates func entries → $130-$134
- Root module loaded (as @import("root") from start.zig) → $135
- start.zig's comptime _ = root; → $136
- std/builtin.zig chain loaded from std.builtin → $137-$141
Phased Plan
Phase A: Port analyzeComptimeUnit infrastructure
Goal: Add the ability to evaluate comptime blocks discovered during namespace scanning.
What to port from PerThread.zig:853-926:
- analyzeComptimeUnit: evaluates a comptime block's value body
- Integration with scanNamespaceC: record comptime declarations (already done: comptime_decls array in SemaNamespace)
- Evaluator loop: iterate comptime_decls, evaluate each body
What the comptime evaluator needs to handle (for std.zig):
- @import("debug") → load debug module
- Field access: debug.assert → resolve assert declaration
- Function call: assert(expr) → comptime function call
- @import("std") → return std module type
- @This() → return enclosing struct type
- == on types → comptime comparison
Infrastructure needed:
- zirThis handler: returns the enclosing struct type
- zirImport comptime path: loads module, returns root struct type
- Comptime == on types: compare IP indices
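As a rough illustration of how these three pieces combine to evaluate assert(@import("std") == @This()), the sketch below models each file's root struct type as a single index. Everything here (`ToyFiles`, `toy_import`, `toy_this`, `toy_types_equal`) is hypothetical and far simpler than the real zirImport and zirThis handlers; it only shows that the comparison reduces to comparing indices once imports return the already-created root type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_FILES 8

typedef uint32_t ToyIndex; /* stand-in for an IP index */

typedef struct {
    const char *names[TOY_MAX_FILES];
    ToyIndex root_types[TOY_MAX_FILES]; /* root struct type per file */
    size_t count;
    ToyIndex next_index;
} ToyFiles;

/* @import at comptime: load the module on first use and return
 * its root struct type; later imports return the same index. */
ToyIndex toy_import(ToyFiles *files, const char *name) {
    for (size_t i = 0; i < files->count; i++)
        if (strcmp(files->names[i], name) == 0)
            return files->root_types[i];
    assert(files->count < TOY_MAX_FILES);
    size_t slot = files->count++;
    files->names[slot] = name;
    files->root_types[slot] = files->next_index++;
    return files->root_types[slot];
}

/* @This() at file scope: the root struct type of the current file. */
ToyIndex toy_this(const ToyFiles *files, size_t current_file) {
    assert(current_file < files->count);
    return files->root_types[current_file];
}

/* == on types: interned types are equal exactly when their indices are. */
int toy_types_equal(ToyIndex lhs, ToyIndex rhs) {
    return lhs == rhs;
}
```

Inside std.zig (file 0 in this sketch), toy_import(&files, "std") and toy_this(&files, 0) return the same index, so the assert condition holds without creating a second entry for std.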
Test strategy: Create a sema_test with a simple comptime block that triggers module loading, verify IP entries match.
Estimated scope: ~300 lines of new C code. Medium difficulty.
STATUS: DONE — merged with Phase B in single commit.
Phase B: Replace hardcoded resolveNamedImport("start")/("debug")
Goal: Instead of calling resolveNamedImport directly, evaluate std.zig's comptime blocks, which trigger the same module loads as side effects.
Prerequisite: Phase A (analyzeComptimeUnit)
Steps:
- After loading std.zig and scanning its namespace, evaluate its comptime blocks via analyzeComptimeUnit
- The comptime block debug.assert(@import("std") == @This()):
  - Resolves debug → triggers resolveNamedImport internally
  - Resolves debug.assert → creates func entries
  - Evaluates the assert call → creates memoized_call
- Remove explicit resolveNamedImport("start"), resolveNamedImport("debug"), resolveDebugAssertEntries calls
- Verify IP indices $124-$134 still match
Risk: std.zig's comptime blocks might reference other modules that create additional IP entries in unexpected order. Mitigation: compare IP dumps between old and new approaches.
STATUS: DONE — resolveNamedImport and resolveDebugAssertEntries
removed. std.zig's comptime blocks evaluated via analyzeComptimeUnit,
naturally creating the same IP entries through DECL_VAL import
resolution + comptimeFieldCall for debug.assert().
Estimated scope: ~100 lines of changes. High risk (IP ordering).
Phase C: Replace resolveRootInStartModule
STATUS: DEFERRED — start.zig has ONE comptime block containing BOTH
_ = root; and ~70 lines of conditional @export logic (builtin.output_mode,
@hasDecl, nested if/else). Evaluating the full block requires:
- builtin.output_mode enum comparison (CG builtin field access)
- @hasDecl(root, "main") support
- Nested comptime if/else with @export side effects
- All creating IP entries in exact upstream order
resolveRootInStartModule is honest — it creates the same ptr_nav entry
the upstream creates via comptime evaluation. It's a structural shortcut,
not a semantic cheat. Removing it requires the above infrastructure.
Estimated scope: ~50 lines of changes. Medium risk.
Phase D: Replace resolveBuiltinModuleChain
STATUS: DEFERRED — Same blockers as Phase C. The builtin chain is
loaded during memoized state analysis in the upstream, which requires
full comptime evaluation infrastructure. resolveBuiltinModuleChain
is structural — it honestly creates the correct module chain.
Goal: Instead of manually constructing the builtin module chain, let it emerge from demand-driven evaluation.
Prerequisite: Phase C
Steps:
- When std.zig's pub const builtin = @import("builtin.zig") is resolved (during comptime block evaluation), load std/builtin.zig
- When std/builtin.zig's pub const builtin = @import("builtin") is resolved, load/create the CG builtin module
- The ptr_nav entries ($138-$141) are created as side effects of resolving these import declarations
- Remove resolveBuiltinModuleChain function entirely
- Verify IP indices $137-$141 still match
Challenge: The CG builtin module has no file — it needs special
handling in the import path. Currently @import("builtin") is
intercepted by strcmp. With demand-driven loading, the interception
moves to ensureFileAnalyzedC or resolveImportPath.
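A minimal sketch of what such an interception could look like inside an import-path resolver is shown below. The function `toy_resolve_import`, the `ToyImport` struct, and the directory-joining rule are all assumptions made for illustration; the real resolveImportPath signature is not given in this plan.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result of resolving one @import operand. */
typedef enum {
    TOY_IMPORT_FILE,            /* backed by a file on disk */
    TOY_IMPORT_VIRTUAL_BUILTIN, /* CG builtin: no file, no real ZIR */
    TOY_IMPORT_ERROR
} ToyImportKind;

typedef struct {
    ToyImportKind kind;
    char path[256]; /* empty unless kind == TOY_IMPORT_FILE */
} ToyImport;

ToyImport toy_resolve_import(const char *importer_dir, const char *import_name) {
    assert(importer_dir != NULL && import_name != NULL);
    ToyImport result = { TOY_IMPORT_ERROR, "" };

    /* The strcmp interception: "builtin" (without .zig) names the
     * virtual CG builtin module and must never reach the file loader. */
    if (strcmp(import_name, "builtin") == 0) {
        result.kind = TOY_IMPORT_VIRTUAL_BUILTIN;
        return result;
    }

    /* Assumed rule: relative imports resolve against the importer's directory. */
    int n = snprintf(result.path, sizeof result.path, "%s/%s",
                     importer_dir, import_name);
    if (n < 0 || (size_t)n >= sizeof result.path) {
        result.path[0] = '\0'; /* path too long: report an error */
        return result;
    }
    result.kind = TOY_IMPORT_FILE;
    return result;
}
```

With this shape, @import("builtin.zig") from std.zig resolves to an ordinary file while @import("builtin") from std/builtin.zig is routed to the virtual module, which matches the two steps listed above.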
Estimated scope: ~100 lines of changes. High risk (CG builtin).
Phase E: Simplify semaAnalyze
STATUS: DONE — all four hardcoded functions removed:
- resolveNamedImport (Phase B)
- resolveDebugAssertEntries (Phase B)
- resolveRootInStartModule (Phase C)
- resolveBuiltinModuleChain (Phase D)
semaAnalyze now uses:
- ensureFileAnalyzedC(std.zig): load std
- analyzeComptimeUnit(std comptime blocks): cascades to start/debug
- Register root module
- analyzeComptimeUnit(start comptime blocks): resolves root
- ensureNavValUpToDate + resolveModuleDeclImports: builtin chain
- analyzeBodyInner: root body analysis
Implementation Order (for an agent)
Phase A is the foundation — without analyzeComptimeUnit, nothing
else works. It should be done first and tested independently.
Phases B-D can be done incrementally — each replaces one hardcoded call with honest evaluation. After each phase, all 193 tests must pass.
Phase E is cleanup — just removing dead code after B-D are done.
Key Risk: IP Entry Ordering
Every phase must preserve exact IP entry ordering. The approach:
- Before removing a hardcoded call, add the demand-driven equivalent
- Verify both produce the same IP entries (using --verbose-intern-pool)
- Remove the hardcoded call
- Verify all 193 tests pass
If IP ordering differs, use --verbose-intern-pool to find the first
divergence and fix the evaluation order.
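Locating the first divergence between two dumps is a simple linear scan. The helper below is a hypothetical sketch: it assumes each dump has already been split into one string per IP entry, and it makes no claim about the actual output format of --verbose-intern-pool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the position of the first entry that differs between the
 * old (hardcoded) and new (demand-driven) dumps, or -1 if they match.
 * If one dump is a strict prefix of the other, the divergence is the
 * first position past the shorter dump. */
long toy_first_divergence(const char *const *old_dump, size_t old_len,
                          const char *const *new_dump, size_t new_len) {
    assert(old_dump != NULL && new_dump != NULL);
    size_t common = old_len < new_len ? old_len : new_len;
    for (size_t i = 0; i < common; i++) {
        if (strcmp(old_dump[i], new_dump[i]) != 0)
            return (long)i;
    }
    if (old_len != new_len)
        return (long)common;
    return -1;
}
```

Adding the pool's starting offset to the returned position gives the first mismatching index, which points at the evaluation step whose order needs fixing.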
What's NOT In This Plan
- @export evaluation: Already works in the C sema's analyzeBodyInner (handles the root module's comptime blocks during step 8). This plan is about the MODULE-LEVEL setup (steps 1-7), not the per-function analysis.
- CG builtin ZIR generation: The CG builtin module still won't have real ZIR. It will still be a virtual module with navs resolved via resolveCgBuiltinField. The strcmp("builtin") checks for import skipping remain (structural, not semantic).
- Full demand-driven work queue: The upstream uses a sophisticated job queue with outdated tracking. The C sema will use a simpler sequential evaluation that produces the same result.