zig/stage0/plan-demand-driven-modules.md
Motiejus a35a70ad16 docs: mark all demand-driven module loading phases complete
Phases A-E all done:
  A: analyzeComptimeUnit infrastructure
  B: std.zig comptime block evaluation replaces resolveNamedImport
  C: start.zig comptime block evaluation replaces resolveRootInStartModule
  D: targeted import resolution replaces resolveBuiltinModuleChain
  E: semaAnalyze simplified — all hardcoded functions removed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-28 16:56:45 +00:00


Plan: Demand-Driven Module Loading System

Current State

semaAnalyze (lines 12121-12279) uses a hardcoded loading sequence:

1. Load std.zig                         → $124 type_struct
2. resolveNamedImport(std, "start")     → $125-$127
3. resolveNamedImport(std, "debug")     → $128-$129
4. resolveDebugAssertEntries(debug)     → $130-$134
5. Register root module                 → $135
6. resolveRootInStartModule             → $136
7. resolveBuiltinModuleChain            → $137-$141
8. analyzeBodyInner on root struct_decl → $142+

The indices $124-$141 appear only in comments; they are not hardcoded as values. The ORDER of the calls, however, is fixed: changing it shifts every later index and breaks all 193 corpus tests.

Upstream Architecture

The Zig compiler uses demand-driven evaluation (src/Zcu/PerThread.zig):

  1. performAllTheWork queues analysis roots (std_mod, main_mod)
  2. analyze_mod → ensureFileAnalyzed → semaFile → createFileRootStruct → scanNamespace
  3. scanNamespace discovers comptime blocks → creates ComptimeUnit entries
  4. Comptime blocks are evaluated via analyzeComptimeUnit when needed
  5. Comptime evaluation triggers side effects:
    • @import("debug") → loads debug.zig
    • debug.assert(...) → creates func entries
    • @export(...) → creates export entries
  6. IP entries are created as side effects of evaluation, in the order they're first encountered
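
To make the last point concrete, here is a minimal sketch of why evaluation order determines indices. It is a toy pool, not the real InternPool: `ToyPool` and `toyPoolIntern` are hypothetical names, and keys are plain strings rather than real IP keys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy pool: a key is just a string, an index is its slot. */
enum { TOY_POOL_CAP = 64 };

typedef struct {
    const char *keys[TOY_POOL_CAP];
    size_t len;
} ToyPool;

/* Return the index of `key`, appending it only on first encounter. */
size_t toyPoolIntern(ToyPool *pool, const char *key) {
    for (size_t i = 0; i < pool->len; i++) {
        if (strcmp(pool->keys[i], key) == 0)
            return i; /* already interned: the index never changes */
    }
    assert(pool->len < TOY_POOL_CAP);
    pool->keys[pool->len] = key;
    return pool->len++;
}
```

Interning "std" and then "debug" yields indices 0 and 1; interning them in the opposite order swaps the indices. Re-interning an existing key returns its original index, so only the first encounter matters.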

Key upstream functions:

  • semaFile (PerThread.zig:1870) — creates file root struct
  • createFileRootStruct (PerThread.zig:1762) — IP entries for struct
  • scanNamespace (PerThread.zig:2554) — discovers declarations & comptime blocks
  • analyzeComptimeUnit (PerThread.zig:853) — evaluates comptime block bodies
  • ensureNavResolved (Sema.zig:31219) — triggers full nav resolution

Why This Is Hard

  1. IP entry order matters — all 193 tests depend on exact indices
  2. The upstream's "demand-driven" order is actually deterministic — it follows a specific evaluation sequence that happens to match the hardcoded order
  3. Replacing the hardcoded sequence requires the comptime evaluator to handle: module loading via @import, function calls (debug.assert), type comparison (@import("std") == @This()), and export processing

What Must Match

The upstream creates entries in this order because:

  • std.zig is file 0 (analysis root) → $124
  • std.zig's namespace scan discovers comptime blocks
  • The first comptime block evaluated is comptime { assert(...); } (in std.zig, triggering debug module load)
  • start.zig is loaded as a dependency → $125-$127
  • debug.zig is loaded for debug.assert → $128-$129
  • debug.assert resolution creates func entries → $130-$134
  • Root module loaded (as @import("root") from start.zig) → $135
  • start.zig's comptime _ = root; → $136
  • std/builtin.zig chain loaded from std.builtin → $137-$141

Phased Plan

Phase A: Port analyzeComptimeUnit infrastructure

Goal: Add the ability to evaluate comptime blocks discovered during namespace scanning.

What to port from PerThread.zig:853-926:

  1. analyzeComptimeUnit — evaluates a comptime block's value body
  2. Integration with scanNamespaceC — record comptime declarations (already done: comptime_decls array in SemaNamespace)
  3. Evaluator loop: iterate comptime_decls, evaluate each body
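
As a rough sketch of item 3, the evaluator loop could have the following shape. All names here (`ToyNamespace`, `ToyEvalFn`, `toyEvalComptimeDecls`, `toyRecordEval`) are hypothetical stand-ins. The real SemaNamespace layout and the analyzeComptimeUnit signature are not shown in this plan, so the callback simply records which declaration it was asked to evaluate.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_MAX_COMPTIME = 16 };

/* Hypothetical stand-in for a namespace; field names are illustrative only. */
typedef struct {
    unsigned comptime_decls[TOY_MAX_COMPTIME]; /* comptime block ids, in scan order */
    size_t comptime_len;
} ToyNamespace;

/* Evaluates one comptime block body; returns 0 on success. */
typedef int (*ToyEvalFn)(unsigned decl_id, void *ctx);

/* Evaluate every recorded comptime block in scan order.
 * Stops at the first failure and returns its error code. */
int toyEvalComptimeDecls(const ToyNamespace *ns, ToyEvalFn eval, void *ctx) {
    assert(ns->comptime_len <= TOY_MAX_COMPTIME);
    for (size_t i = 0; i < ns->comptime_len; i++) {
        int rc = eval(ns->comptime_decls[i], ctx);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Example evaluator that only records the order in which it was called. */
typedef struct {
    unsigned seen[TOY_MAX_COMPTIME];
    size_t len;
} ToyTrace;

int toyRecordEval(unsigned decl_id, void *ctx) {
    ToyTrace *trace = ctx;
    if (trace->len == TOY_MAX_COMPTIME)
        return -1;
    trace->seen[trace->len++] = decl_id;
    return 0;
}
```

Iterating in scan order matters here because each evaluation may create IP entries as a side effect, so the loop order is part of the observable result.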

What the comptime evaluator needs to handle (for std.zig):

  • @import("debug") → load debug module
  • Field access: debug.assert → resolve assert declaration
  • Function call: assert(expr) → comptime function call
  • @import("std") → return std module type
  • @This() → return enclosing struct type
  • == on types → comptime comparison

Infrastructure needed:

  • zirThis handler — returns the enclosing struct type
  • zirImport comptime path — loads module, returns root struct type
  • Comptime == on types — compare IP indices

Test strategy: Create a sema_test with a simple comptime block that triggers module loading, verify IP entries match.

Estimated scope: ~300 lines of new C code. Medium difficulty.

STATUS: DONE. Merged with Phase B in a single commit.


Phase B: Replace hardcoded resolveNamedImport("start")/("debug")

Goal: Instead of calling resolveNamedImport directly, evaluate std.zig's comptime blocks which trigger the same module loads as side effects.

Prerequisite: Phase A (analyzeComptimeUnit)

Steps:

  1. After loading std.zig and scanning its namespace, evaluate its comptime blocks via analyzeComptimeUnit
  2. The comptime block debug.assert(@import("std") == @This()):
    • Resolves debug → triggers resolveNamedImport internally
    • Resolves debug.assert → creates func entries
    • Evaluates the assert call → creates memoized_call
  3. Remove explicit resolveNamedImport("start"), resolveNamedImport("debug"), resolveDebugAssertEntries calls
  4. Verify IP indices $124-$134 still match

Risk: std.zig's comptime blocks might reference other modules that create additional IP entries in unexpected order. Mitigation: compare IP dumps between old and new approaches.

STATUS: DONE. resolveNamedImport and resolveDebugAssertEntries are removed. std.zig's comptime blocks are evaluated via analyzeComptimeUnit, which creates the same IP entries through DECL_VAL import resolution plus comptimeFieldCall for debug.assert().

Estimated scope: ~100 lines of changes. High risk (IP ordering).


Phase C: Replace resolveRootInStartModule

STATUS: DONE. start.zig comptime block evaluation replaces resolveRootInStartModule (see Phase E). This phase was initially deferred because start.zig has ONE comptime block containing BOTH _ = root; and ~70 lines of conditional @export logic (builtin.output_mode, @hasDecl, nested if/else). Evaluating the full block requires:

  • builtin.output_mode enum comparison (CG builtin field access)
  • @hasDecl(root, "main") support
  • Nested comptime if/else with @export side effects
  • All of the above creating IP entries in exact upstream order

resolveRootInStartModule was a structural shortcut, not a semantic cheat: it created the same ptr_nav entry that the upstream creates via comptime evaluation. Removing it required the infrastructure listed above.

Estimated scope: ~50 lines of changes. Medium risk.


Phase D: Replace resolveBuiltinModuleChain

STATUS: DONE. Targeted import resolution replaces resolveBuiltinModuleChain (see Phase E). This phase was initially deferred with the same blockers as Phase C: the upstream loads the builtin chain during memoized state analysis, which requires full comptime evaluation infrastructure. resolveBuiltinModuleChain was likewise a structural shortcut that created the correct module chain.

Goal: Instead of manually constructing the builtin module chain, let it emerge from demand-driven evaluation.

Prerequisite: Phase C

Steps:

  1. When std.zig's pub const builtin = @import("builtin.zig") is resolved (during comptime block evaluation), load std/builtin.zig
  2. When std/builtin.zig's pub const builtin = @import("builtin") is resolved, load/create the CG builtin module
  3. The ptr_nav entries ($138-$141) are created as side effects of resolving these import declarations
  4. Remove resolveBuiltinModuleChain function entirely
  5. Verify IP indices $137-$141 still match

Challenge: The CG builtin module has no file — it needs special handling in the import path. Currently @import("builtin") is intercepted by strcmp. With demand-driven loading, the interception moves to ensureFileAnalyzedC or resolveImportPath.
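
A minimal sketch of the interception described above, assuming the check is an exact string match on the import operand. `ToyImportKind` and `toyClassifyImport` are hypothetical names; where the real check lives (ensureFileAnalyzedC or resolveImportPath) is left open, as in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    TOY_IMPORT_FILE,      /* ordinary import backed by a source file */
    TOY_IMPORT_CG_BUILTIN /* virtual module with no file on disk */
} ToyImportKind;

/* Classify an import operand. Only the exact name "builtin" is treated as
 * the CG builtin module; "builtin.zig" is a regular file import. */
ToyImportKind toyClassifyImport(const char *operand) {
    assert(operand != NULL);
    if (strcmp(operand, "builtin") == 0)
        return TOY_IMPORT_CG_BUILTIN;
    return TOY_IMPORT_FILE;
}
```

The exact match is what keeps steps 1 and 2 apart: @import("builtin.zig") takes the file path, while @import("builtin") takes the virtual-module path.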

Estimated scope: ~100 lines of changes. High risk (CG builtin).


Phase E: Simplify semaAnalyze

STATUS: DONE — all four hardcoded functions removed:

  • resolveNamedImport (Phase B)
  • resolveDebugAssertEntries (Phase B)
  • resolveRootInStartModule (Phase C)
  • resolveBuiltinModuleChain (Phase D)

semaAnalyze now uses:

  1. ensureFileAnalyzedC(std.zig) — load std
  2. analyzeComptimeUnit(std comptime blocks) — cascades to start/debug
  3. Register root module
  4. analyzeComptimeUnit(start comptime blocks) — resolves root
  5. ensureNavValUpToDate + resolveModuleDeclImports — builtin chain
  6. analyzeBodyInner — root body analysis

Implementation Order (for an agent)

Phase A is the foundation — without analyzeComptimeUnit, nothing else works. It should be done first and tested independently.

Phases B-D can be done incrementally — each replaces one hardcoded call with honest evaluation. After each phase, all 193 tests must pass.

Phase E is cleanup — just removing dead code after B-D are done.

Key Risk: IP Entry Ordering

Every phase must preserve exact IP entry ordering. The approach:

  1. Before removing a hardcoded call, add the demand-driven equivalent
  2. Verify both produce the same IP entries (using --verbose-intern-pool)
  3. Remove the hardcoded call
  4. Verify all 193 tests pass

If IP ordering differs, use --verbose-intern-pool to find the first divergence and fix the evaluation order.
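
Finding the first divergence can be automated once both dumps are available as one string per entry. The sketch below assumes that representation; the actual format of the --verbose-intern-pool output is not specified here, and `toyFirstDivergence` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first entry that differs between two dumps, or
 * (size_t)-1 if they are identical. If one dump is a strict prefix of the
 * other, the divergence is reported at the shorter length. */
size_t toyFirstDivergence(const char *const *old_dump, size_t old_len,
                          const char *const *new_dump, size_t new_len) {
    size_t common = old_len < new_len ? old_len : new_len;
    for (size_t i = 0; i < common; i++) {
        assert(old_dump[i] != NULL && new_dump[i] != NULL);
        if (strcmp(old_dump[i], new_dump[i]) != 0)
            return i;
    }
    return old_len == new_len ? (size_t)-1 : common;
}
```

Because every later index shifts after the first mismatch, only the first divergence is informative; later differences are usually consequences of it.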

What's NOT In This Plan

  • @export evaluation: Already works in the C sema's analyzeBodyInner (handles the root module's comptime blocks during step 8). This plan is about the MODULE-LEVEL setup (steps 1-7), not the per-function analysis.
  • CG builtin ZIR generation: The CG builtin module still won't have real ZIR. It will still be a virtual module with navs resolved via resolveCgBuiltinField. The strcmp("builtin") checks for import skipping remain (structural, not semantic).
  • Full demand-driven work queue: The upstream uses a sophisticated job queue with outdated tracking. The C sema will use a simpler sequential evaluation that produces the same result.
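
For the last item, a sequential load-on-first-demand scheme might look roughly like the sketch below. It is a toy model under stated assumptions: modules are identified by table index, imports are followed depth-first, and `ToyModule`, `ToyLoader` and `toyEnsureLoaded` are hypothetical names, not functions from the C sema.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_MAX_MODULES = 8, TOY_MAX_IMPORTS = 4 };

typedef struct {
    size_t imports[TOY_MAX_IMPORTS]; /* indices into the module table */
    size_t import_len;
    bool loaded;
} ToyModule;

typedef struct {
    ToyModule modules[TOY_MAX_MODULES];
    size_t module_len;
    size_t load_order[TOY_MAX_MODULES]; /* module ids in first-load order */
    size_t load_len;
} ToyLoader;

/* Load module `id` on first demand, then follow its imports depth-first.
 * A module that is already loaded is skipped, so import cycles terminate. */
void toyEnsureLoaded(ToyLoader *loader, size_t id) {
    assert(id < loader->module_len);
    ToyModule *mod = &loader->modules[id];
    if (mod->loaded)
        return;
    mod->loaded = true;
    loader->load_order[loader->load_len++] = id;
    for (size_t i = 0; i < mod->import_len; i++)
        toyEnsureLoaded(loader, mod->imports[i]);
}
```

With no job queue and no outdated tracking, the load order is a pure function of the import graph and the starting module, which is the property the ordering checks in this plan rely on.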