
Motiejus a35a70ad16 docs: mark all demand-driven module loading phases complete

Phases A-E all done:
  A: analyzeComptimeUnit infrastructure
  B: std.zig comptime block evaluation replaces resolveNamedImport
  C: start.zig comptime block evaluation replaces resolveRootInStartModule
  D: targeted import resolution replaces resolveBuiltinModuleChain
  E: semaAnalyze simplified — all hardcoded functions removed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-02-28 16:56:45 +00:00


Plan: Demand-Driven Module Loading System

Current State

semaAnalyze (lines 12121-12279) uses a hardcoded loading sequence:

1. Load std.zig                         → $124 type_struct
2. resolveNamedImport(std, "start")     → $125-$127
3. resolveNamedImport(std, "debug")     → $128-$129
4. resolveDebugAssertEntries(debug)     → $130-$134
5. Register root module                 → $135
6. resolveRootInStartModule             → $136
7. resolveBuiltinModuleChain            → $137-$141
8. analyzeBodyInner on root struct_decl → $142+

The indices $124-$141 appear only in comments; no code hardcodes the values. The ORDER of the calls is fixed, though: each step appends IP entries, so reordering the steps shifts every later index and breaks all 193 corpus tests.
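The arithmetic behind that fragility can be sketched as follows. The per-step entry counts are derived from the index ranges listed above (for example, $125-$127 is three entries); the `LoadStep` type, `first_index_of`, and the two arrays are hypothetical names used only for illustration, not part of the C sema.

```c
#include <assert.h>
#include <stddef.h>

/* One step of the loading sequence and how many IP entries it appends.
 * Hypothetical type; counts come from the ranges in the list above. */
typedef struct {
    const char *name;
    unsigned entry_count;
} LoadStep;

/* Returns the first IP index handed to steps[target] when the steps run in
 * array order and the first step starts at first_index. Passing target == n
 * yields the first free index after all steps. */
unsigned first_index_of(const LoadStep *steps, size_t n, size_t target,
                        unsigned first_index) {
    assert(target <= n);
    unsigned next = first_index;
    for (size_t i = 0; i < target; i++)
        next += steps[i].entry_count;
    return next;
}

/* The hardcoded order, steps 1-7. */
const LoadStep hardcoded_order[] = {
    {"std.zig", 1},              /* $124       */
    {"start", 3},                /* $125-$127  */
    {"debug", 2},                /* $128-$129  */
    {"debug.assert entries", 5}, /* $130-$134  */
    {"root module", 1},          /* $135       */
    {"root in start", 1},        /* $136       */
    {"builtin chain", 5},        /* $137-$141  */
};

/* Same steps with "start" and "debug" swapped. */
const LoadStep swapped_order[] = {
    {"std.zig", 1},
    {"debug", 2},
    {"start", 3},
    {"debug.assert entries", 5},
    {"root module", 1},
    {"root in start", 1},
    {"builtin chain", 5},
};
```

With the original order, `first_index_of(hardcoded_order, 7, 2, 124)` gives 128 for debug. After the swap, debug starts at 125 and start at 127, so any test that expects the original indices for those modules fails.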

Upstream Architecture

The Zig compiler uses demand-driven evaluation (src/Zcu/PerThread.zig):

performAllTheWork queues analysis roots (std_mod, main_mod)
analyze_mod → ensureFileAnalyzed → semaFile → createFileRootStruct → scanNamespace
scanNamespace discovers comptime blocks → creates ComptimeUnit entries
Comptime blocks are evaluated via analyzeComptimeUnit when needed
Comptime evaluation triggers side effects:
- @import("debug") → loads debug.zig
- debug.assert(...) → creates func entries
- @export(...) → creates export entries
IP entries are created as side effects of evaluation, in the order they're first encountered
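A minimal sketch of the "first encountered" rule, assuming a flat table keyed by module name. `ModuleTable` and `ensure_loaded` are hypothetical names and do not mirror the upstream data structures; the point is only that a module gets its slot the first time something asks for it, and later requests return the same slot without creating anything new.

```c
#include <assert.h>
#include <string.h>

#define MAX_MODULES 16

/* Hypothetical module table: a module's index is the position of the first
 * request for it. The caller keeps the name strings alive. */
typedef struct {
    const char *names[MAX_MODULES];
    int count;
} ModuleTable;

/* Returns the index of `name`. The module is appended on the first request
 * only; repeated requests find the existing entry. */
int ensure_loaded(ModuleTable *t, const char *name) {
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0)
            return i;
    }
    assert(t->count < MAX_MODULES);
    t->names[t->count] = name;
    return t->count++;
}
```

Because indices depend only on the order of first requests, the evaluation order of the comptime blocks fully determines the resulting layout.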

Key upstream functions:

semaFile (PerThread.zig:1870) — creates file root struct
createFileRootStruct (PerThread.zig:1762) — IP entries for struct
scanNamespace (PerThread.zig:2554) — discovers declarations & comptime blocks
analyzeComptimeUnit (PerThread.zig:853) — evaluates comptime block bodies
ensureNavResolved (Sema.zig:31219) — triggers full nav resolution

Why This Is Hard

IP entry order matters — all 193 tests depend on exact indices
The upstream's "demand-driven" order is actually deterministic: it follows a specific evaluation sequence, and the hardcoded order reproduces that sequence
Replacing the hardcoded sequence requires the comptime evaluator to handle: module loading via @import, function calls (debug.assert), type comparison (@import("std") == @This()), and export processing

What Must Match

The upstream creates entries in this order because:

std.zig is file 0 (analysis root) → $124
std.zig's namespace scan discovers comptime blocks
The first comptime block evaluated is comptime { assert(...); } (in std.zig, triggering debug module load)
start.zig is loaded as a dependency → $125-$127
debug.zig is loaded for debug.assert → $128-$129
debug.assert resolution creates func entries → $130-$134
Root module loaded (as @import("root") from start.zig) → $135
start.zig's comptime _ = root; → $136
std/builtin.zig chain loaded from std.builtin → $137-$141

Phased Plan

Phase A: Port `analyzeComptimeUnit` infrastructure

Goal: Add the ability to evaluate comptime blocks discovered during namespace scanning.

What to port from PerThread.zig:853-926:

analyzeComptimeUnit — evaluates a comptime block's value body
Integration with scanNamespaceC — record comptime declarations (already done: comptime_decls array in SemaNamespace)
Evaluator loop: iterate comptime_decls, evaluate each body
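The evaluator loop could look roughly like the sketch below. `ComptimeDecl`, `EvalFn`, `eval_comptime_decls`, `EvalLog`, and `log_eval` are hypothetical stand-ins; the real comptime_decls array in SemaNamespace and the real analyzeComptimeUnit signature may differ. The sketch assumes each block carries an "analyzed" flag so that it is evaluated at most once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one comptime block found during namespace scan. */
typedef struct {
    int body_id;   /* identifies the block's body */
    bool analyzed; /* set once the block has been evaluated */
} ComptimeDecl;

/* Hypothetical callback standing in for the per-block evaluator. */
typedef void (*EvalFn)(int body_id, void *ctx);

/* Evaluates every not-yet-analyzed block in declaration order and returns
 * how many blocks this call evaluated. */
size_t eval_comptime_decls(ComptimeDecl *decls, size_t n, EvalFn eval,
                           void *ctx) {
    size_t evaluated = 0;
    for (size_t i = 0; i < n; i++) {
        if (decls[i].analyzed)
            continue;
        /* Mark before evaluating so a nested pass over the same array
         * skips this block instead of evaluating it twice. */
        decls[i].analyzed = true;
        eval(decls[i].body_id, ctx);
        evaluated++;
    }
    return evaluated;
}

/* Example callback: records the order in which bodies were evaluated. */
typedef struct {
    int order[8];
    size_t len;
} EvalLog;

void log_eval(int body_id, void *ctx) {
    EvalLog *log = ctx;
    assert(log->len < 8);
    log->order[log->len++] = body_id;
}
```

Running the loop a second time over the same array evaluates nothing, which is the property that keeps repeated requests from creating duplicate IP entries.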

What the comptime evaluator needs to handle (for std.zig):

@import("debug") → load debug module
Field access: debug.assert → resolve assert declaration
Function call: assert(expr) → comptime function call
@import("std") → return std module type
@This() → return enclosing struct type
== on types → comptime comparison

Infrastructure needed:

zirThis handler — returns the enclosing struct type
zirImport comptime path — loads module, returns root struct type
Comptime == on types — compare IP indices
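The three pieces fit together as in this sketch, which assumes that every type is interned exactly once, so two types are equal exactly when their IP indices are equal. `IpIndex`, `FileModule`, and the three functions are hypothetical names, not the actual zirThis or zirImport handlers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t IpIndex; /* hypothetical alias for an intern pool index */

/* Hypothetical per-file record holding the file's root struct type. */
typedef struct {
    IpIndex root_struct_type;
} FileModule;

/* @This() at file scope: the enclosing struct is the file's root struct. */
IpIndex this_type(const FileModule *enclosing) {
    assert(enclosing != 0);
    return enclosing->root_struct_type;
}

/* @import(...) on the comptime path: the value is the imported file's
 * root struct type. */
IpIndex import_type(const FileModule *imported) {
    assert(imported != 0);
    return imported->root_struct_type;
}

/* Comptime == on types: interned types are equal iff their indices match. */
bool types_equal(IpIndex a, IpIndex b) {
    return a == b;
}
```

For the block in std.zig, `@import("std") == @This()` then reduces to comparing one index with itself, so the assert passes without any structural comparison of the types.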

Test strategy: Create a sema_test with a simple comptime block that triggers module loading, verify IP entries match.

Estimated scope: ~300 lines of new C code. Medium difficulty.

STATUS: DONE — merged with Phase B in single commit.

Phase B: Replace hardcoded `resolveNamedImport("start")`/`("debug")`

Goal: Instead of calling resolveNamedImport directly, evaluate std.zig's comptime blocks which trigger the same module loads as side effects.

Prerequisite: Phase A (analyzeComptimeUnit)

Steps:

After loading std.zig and scanning its namespace, evaluate its comptime blocks via analyzeComptimeUnit
The comptime block debug.assert(@import("std") == @This()):
- Resolves debug → triggers resolveNamedImport internally
- Resolves debug.assert → creates func entries
- Evaluates the assert call → creates memoized_call
Remove explicit resolveNamedImport("start"), resolveNamedImport("debug"), resolveDebugAssertEntries calls
Verify IP indices $124-$134 still match

Risk: std.zig's comptime blocks might reference other modules that create additional IP entries in unexpected order. Mitigation: compare IP dumps between old and new approaches.

STATUS: DONE — resolveNamedImport and resolveDebugAssertEntries removed. std.zig's comptime blocks evaluated via analyzeComptimeUnit, naturally creating the same IP entries through DECL_VAL import resolution + comptimeFieldCall for debug.assert().

Estimated scope: ~100 lines of changes. High risk (IP ordering).

Phase C: Replace `resolveRootInStartModule`

STATUS: DONE. start.zig's comptime block evaluation replaces resolveRootInStartModule (removed in Phase E).

This phase was initially deferred: start.zig has ONE comptime block containing BOTH _ = root; and ~70 lines of conditional @export logic (builtin.output_mode, @hasDecl, nested if/else). Evaluating the full block requires:

builtin.output_mode enum comparison (CG builtin field access)
@hasDecl(root, "main") support
Nested comptime if/else with @export side effects
All creating IP entries in exact upstream order

Until its removal, resolveRootInStartModule was a structural shortcut, not a semantic cheat: it created the same ptr_nav entry that the upstream creates via comptime evaluation.

Estimated scope: ~50 lines of changes. Medium risk.

Phase D: Replace `resolveBuiltinModuleChain`

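STATUS: DONE. Targeted import resolution replaces resolveBuiltinModuleChain (removed in Phase E). This phase was initially deferred with the same blockers as Phase C: the upstream loads the builtin chain during memoized state analysis, which requires full comptime evaluation infrastructure. Until its removal, resolveBuiltinModuleChain was a structural shortcut that created the correct module chain.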

Goal: Instead of manually constructing the builtin module chain, let it emerge from demand-driven evaluation.

Prerequisite: Phase C

Steps:

When std.zig's pub const builtin = @import("builtin.zig") is resolved (during comptime block evaluation), load std/builtin.zig
When std/builtin.zig's pub const builtin = @import("builtin") is resolved, load/create the CG builtin module
The ptr_nav entries ($138-$141) are created as side effects of resolving these import declarations
Remove resolveBuiltinModuleChain function entirely
Verify IP indices $137-$141 still match

Challenge: The CG builtin module has no file — it needs special handling in the import path. Currently @import("builtin") is intercepted by strcmp. With demand-driven loading, the interception moves to ensureFileAnalyzedC or resolveImportPath.
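A sketch of what the interception might look like once it lives in the import path. `ImportKind`, `ImportTarget`, and `classify_import` are hypothetical; the real ensureFileAnalyzedC and resolveImportPath signatures are not shown in this plan. The one detail taken from the text is the strcmp check on the exact name "builtin", which must not match std's "builtin.zig" file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of an @import operand. */
typedef enum {
    IMPORT_FILE,           /* backed by a source file */
    IMPORT_VIRTUAL_BUILTIN /* the CG builtin module, which has no file */
} ImportKind;

typedef struct {
    ImportKind kind;
    const char *path; /* import string for IMPORT_FILE, NULL otherwise */
} ImportTarget;

/* Decides how an import string is handled. Only the exact name "builtin"
 * is intercepted; "builtin.zig" is an ordinary file import. */
ImportTarget classify_import(const char *import_name) {
    assert(import_name != NULL);
    ImportTarget target;
    if (strcmp(import_name, "builtin") == 0) {
        target.kind = IMPORT_VIRTUAL_BUILTIN;
        target.path = NULL;
    } else {
        target.kind = IMPORT_FILE;
        target.path = import_name;
    }
    return target;
}
```

Keeping the check in one place means the rest of the loader can treat the CG builtin as just another module whose contents happen to be synthesized.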

Estimated scope: ~100 lines of changes. High risk (CG builtin).

Phase E: Simplify `semaAnalyze`

STATUS: DONE — all four hardcoded functions removed:

resolveNamedImport (Phase B)
resolveDebugAssertEntries (Phase B)
resolveRootInStartModule (Phase C)
resolveBuiltinModuleChain (Phase D)

semaAnalyze now uses:

ensureFileAnalyzedC(std.zig) — load std
analyzeComptimeUnit(std comptime blocks) — cascades to start/debug
Register root module
analyzeComptimeUnit(start comptime blocks) — resolves root
ensureNavValUpToDate + resolveModuleDeclImports — builtin chain
analyzeBodyInner — root body analysis

Implementation Order (for an agent)

Phase A is the foundation — without analyzeComptimeUnit, nothing else works. It should be done first and tested independently.

Phases B-D can be done incrementally — each replaces one hardcoded call with honest evaluation. After each phase, all 193 tests must pass.

Phase E is cleanup — just removing dead code after B-D are done.

Key Risk: IP Entry Ordering

Every phase must preserve exact IP entry ordering. The approach:

Before removing a hardcoded call, add the demand-driven equivalent
Verify both produce the same IP entries (using --verbose-intern-pool)
Remove the hardcoded call
Verify all 193 tests pass

If IP ordering differs, use --verbose-intern-pool to find the first divergence and fix the evaluation order.
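Finding the first divergence can be automated with a small helper like the one below. It assumes each dump has already been read into an array with one string per IP entry; the dump format itself is not specified here, and `first_divergence` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the position of the first entry that differs between two dumps,
 * or -1 if the dumps are identical. If one dump is a strict prefix of the
 * other, the result is the length of the shorter dump. */
long first_divergence(const char *const *a, size_t na,
                      const char *const *b, size_t nb) {
    assert(a != NULL && b != NULL);
    size_t common = na < nb ? na : nb;
    for (size_t i = 0; i < common; i++) {
        if (strcmp(a[i], b[i]) != 0)
            return (long)i;
    }
    return na == nb ? -1 : (long)common;
}
```

The returned position is an offset into the dump arrays; add the index of the first dumped entry to turn it into an IP index.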

What's NOT In This Plan

@export evaluation: Already works in the C sema's analyzeBodyInner (handles the root module's comptime blocks during step 8). This plan is about the MODULE-LEVEL setup (steps 1-7), not the per-function analysis.
CG builtin ZIR generation: The CG builtin module still won't have real ZIR. It will still be a virtual module with navs resolved via resolveCgBuiltinField. The strcmp("builtin") checks for import skipping remain (structural, not semantic).
Full demand-driven work queue: The upstream uses a sophisticated job queue with outdated tracking. The C sema will use a simpler sequential evaluation that produces the same result.
