ZIR uses the Zig Signedness enum convention (signed=0, unsigned=1) while
the InternPool uses the inverted convention (unsigned=0, signed=1). Fix
both places that read signedness from ZIR: resolveEnumDeclFromZir (for
explicit enum tag types) and zirIntType (for int_type instructions).
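A minimal sketch of the conversion, assuming hypothetical enum and function names (only the numeric values are taken from the description above):

```c
#include <assert.h>

/* Hypothetical enum names; only the numeric values come from the text. */
typedef enum { ZIR_SIGNEDNESS_SIGNED = 0, ZIR_SIGNEDNESS_UNSIGNED = 1 } ZirSignedness;
typedef enum { IP_SIGNEDNESS_UNSIGNED = 0, IP_SIGNEDNESS_SIGNED = 1 } IpSignedness;

/* Translate a raw signedness value read from ZIR into the InternPool's
 * encoding instead of passing the integer through unchanged. */
static IpSignedness ipSignednessFromZir(unsigned zir_raw) {
    assert(zir_raw <= 1); /* only two valid encodings */
    return zir_raw == ZIR_SIGNEDNESS_SIGNED ? IP_SIGNEDNESS_SIGNED
                                            : IP_SIGNEDNESS_UNSIGNED;
}
```

Routing both readers through one helper like this would keep the inversion in a single place.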
Add resolveBuiltinDeclTypes() to resolve BuiltinDecl types from
std.builtin in order, matching Sema.zig's analyzeMemoizedState. Currently
resolves Signedness and AddressSpace, creating IP entries $213-$251.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add new IP key types: IP_KEY_BYTES, IP_KEY_PTR_UAV,
IP_KEY_PTR_UAV_ALIGNED, IP_KEY_PTR_SLICE, IP_KEY_OPT_PAYLOAD
with full hash/equality/typeOf/verbose support.
Add internStringLiteral() helper that creates the complete sequence
of IP entries for a comptime string literal:
- [len:0]u8 array type + bytes value
- *const [len:0]u8 pointer + ptr_uav + ptr_uav_aligned
- int_usize(len) + ptr_slice for [:0]const u8
- [len]u8 array type + pointer + individual u8 values + bytes
Create entries $198-$212 for the "main" string from
@hasDecl(root, "main") in start.zig's comptime block.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The enum_tag IP key was incorrectly storing a single InternPoolIndex
when it should store a struct with both the enum type and the integer
tag value. Add hash, equality, typeOf, and verbose printing support.
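A rough sketch of what such a key could look like. The struct layout, field names, and the FNV-1a hash are assumptions for illustration; the actual hash used by the C InternPool is not stated here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t InternPoolIndex;

/* Hypothetical layout: the text only says the key holds the enum type
 * and the integer tag value. */
typedef struct {
    InternPoolIndex ty;      /* the enum type */
    InternPoolIndex int_val; /* interned integer holding the tag value */
} EnumTagKey;

static bool enumTagKeyEql(EnumTagKey a, EnumTagKey b) {
    return a.ty == b.ty && a.int_val == b.int_val;
}

/* FNV-1a over the bytes of both fields (an assumed hash, for illustration). */
static uint32_t enumTagKeyHash(EnumTagKey k) {
    uint32_t h = 2166136261u;
    uint32_t words[2];
    words[0] = k.ty;
    words[1] = k.int_val;
    for (int i = 0; i < 2; i++) {
        for (int shift = 0; shift < 32; shift += 8) {
            h ^= (words[i] >> shift) & 0xffu;
            h *= 16777619u;
        }
    }
    return h;
}
```

With both fields in the key, two tags of the same enum type that differ only in value no longer compare equal.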
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port enum_decl ZIR parsing to create the correct IP entries for
CompilerBackend: enum_type + 24 int entries (comptime_int and u64
coercions for each field value) + ptr_nav.
Key additions:
- resolveZirRefValue: resolves ZIR refs to IP indices, handles
AS_NODE by following the operand chain
- coerceIntToTagType: coerces comptime_int to typed int (e.g. u64)
- resolveEnumDeclFromZir: parses enum_decl extra data, creates
type_enum + field value int entries in correct order
- ensureNavValUpToDate: finds Nav's ZIR and dispatches to type resolver
- resolveStartComptimePreamble: orchestrates start.zig comptime processing
Also removes unused preCreateExportedFuncEntries (dead code).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create debug.assert entries ($130-$134) matching Zig compiler's evaluation
of std.zig's `comptime { debug.assert(@import("std") == @This()); }`.
Add resolveRootInStartModule for start.zig's "root" Nav ($136).
Reorder resolveBuiltinModuleChain to create std ptr_nav before
compiler-gen builtin struct ($139-$141).
Entries $124-$141 now match the Zig compiler exactly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restructure semaAnalyze to match the Zig compiler's module loading order:
- Load std.zig first (file_idx=0), creating its struct type at IP $124
- Selectively load start.zig and debug.zig via resolveNamedImport
- Load the root module at a later file_idx
This matches IP entries $124-$129 between the C sema and Zig compiler.
Preparation for neghf2.zig (test #4) which requires exact IP index match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port the module loading part of analyzeMemoizedState from
PerThread.zig. After the root module's exported function entries
($130-$134), the Zig compiler loads the builtin module chain:
std → std/builtin.zig → compiler-generated builtin
This creates type_struct and ptr_nav entries $135-$139:
$135 = type_struct (std/builtin.zig root)
$136 = ptr_nav (builtin in std)
$137 = type_struct (compiler-generated builtin)
$138 = ptr_nav (builtin in std/builtin.zig)
$139 = ptr_nav (std in std/builtin.zig)
Also adds findNavInNamespace helper for looking up Navs by name.
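One possible shape for such a lookup helper, as a sketch only. The Nav and Namespace types below are simplified placeholders, and comparing C strings is an assumption (the real code may compare interned name indices instead).

```c
#include <stdint.h>
#include <string.h>

#define NAV_NOT_FOUND UINT32_MAX

/* Hypothetical, simplified types: a Nav has a name, and a namespace
 * lists the indices of the Navs it owns. */
typedef struct { const char *name; } Nav;
typedef struct { const uint32_t *navs; uint32_t len; } Namespace;

/* Linear scan over the namespace's Navs; returns the Nav index or
 * NAV_NOT_FOUND when no Nav in this namespace has the given name. */
static uint32_t findNavInNamespace(const Nav *all_navs, const Namespace *ns,
                                   const char *name) {
    for (uint32_t i = 0; i < ns->len; i++) {
        uint32_t nav_idx = ns->navs[i];
        if (strcmp(all_navs[nav_idx].name, name) == 0) return nav_idx;
    }
    return NAV_NOT_FOUND;
}
```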
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add IP_KEY_FUNC and IP_KEY_MEMOIZED_CALL support to InternPool
(hash, equality, typeOf cases)
- Change InternPoolKey.func from simple index to struct{owner_nav, ty}
and memoized_call from simple index to struct{func, result}
- Restructure zirStructDecl with multi-pass approach:
1. Record ALL declaration names first (so comptime blocks can find
forward-referenced declarations via DECL_REF)
2. Pre-create func_type + func_decl + ptr_type + ptr_nav +
memoized_call IP entries for @export targets before comptime
block body analysis
3. Process bodies (comptime blocks and functions)
- Extract parseDeclValueBody helper for declaration parsing
- This fixes the IP entry ordering where func_type/func_decl now
appear before enum_literal entries, matching the Zig compiler's
demand-driven resolution order
Closes 3 entries of the IP gap for neghf2.zig (853 remaining).
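The reason for recording names first can be sketched as follows. This covers only passes 1 and 3 in a heavily simplified form; DeclTable and both function names are hypothetical, and the IP entry pre-creation of pass 2 is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DECLS 16
#define DECL_NOT_FOUND UINT32_MAX

typedef struct {
    const char *names[MAX_DECLS];
    uint32_t len;
} DeclTable;

/* Pass 1: record every declaration name before any body is analyzed. */
static void recordDeclNames(DeclTable *t, const char *const *names, uint32_t n) {
    assert(n <= MAX_DECLS);
    for (uint32_t i = 0; i < n; i++) t->names[i] = names[i];
    t->len = n;
}

/* Used while analyzing bodies (pass 3): a comptime block may reference a
 * declaration that appears later in the source. */
static uint32_t lookupDecl(const DeclTable *t, const char *name) {
    for (uint32_t i = 0; i < t->len; i++)
        if (strcmp(t->names[i], name) == 0) return i;
    return DECL_NOT_FOUND;
}
```

Because the table is complete before any body runs, a lookup for a later declaration succeeds regardless of source order.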
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Increase IP_MAX_NAVS from 4096 to 16384 to support loading larger
module trees (e.g. std). Fix zig0.c module_root derivation to handle
relative paths like "lib/compiler_rt" which lack a leading "/lib/".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix resolveImportPath to handle .zig-extension imports (e.g.
"BitStack.zig") as relative paths instead of module names
- Increase SEMA_NS_MAX_NAVS 256→1024, MAX_LOADED_MODULES 256→512,
MAX_NAMESPACES 128→512 to handle std library modules
- Reduce resolveModuleDeclImports recursion depth 3→2 to avoid
exceeding Nav limits while still covering neghf2→common→std chain
- Add source_dir/module_root support to zig0.c for standalone IP dumps
- Document anti-pattern: analysis paralysis when facing large IP gaps
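The .zig-extension check in the first bullet could look roughly like this. The helper name is hypothetical and the real resolveImportPath may make the distinction differently.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper name. Returns true when the import string ends in
 * ".zig" (with at least one character before the extension) and should
 * therefore be resolved relative to the importing file. */
static bool importIsRelativeFile(const char *import_name) {
    static const char ext[] = ".zig";
    size_t n = strlen(import_name);
    size_t e = sizeof(ext) - 1;
    return n > e && memcmp(import_name + n - e, ext, e) == 0;
}
```

Under this sketch "BitStack.zig" is treated as a relative file, while "std" is still looked up as a module name.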
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port module-level analysis infrastructure from upstream Zig to create
InternPool entries that match the Zig compiler's output. This is the
first step toward closing the IP gap for neghf2.zig (corpus test 4).
- Add Nav struct to intern_pool.h with ipCreateDeclNav/ipGetNav/
ipResetNavs/ipNavCount management functions
- Add IP_KEY_PTR_NAV to InternPool (hash, equality, typeOf)
- Add SemaNamespace struct to sema.h for declaration grouping
- Port createFileRootStruct, scanNamespace from PerThread.zig
- Port ensureFileAnalyzed for recursive import resolution
- Add resolveModuleDeclImports: creates type_struct + type_pointer +
ptr_nav entries for import declarations, matching upstream order
- Add internPtrConst and internNavPtr helpers (from analyzeNavRefInner)
The C sema now creates entries $124-$129 matching the Zig compiler:
type_struct(neghf2), type_struct(common), type_pointer(*const type),
ptr_nav(common), type_struct(std), ptr_nav(std). The remaining ~870
entries will be added in subsequent commits.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zig0 doesn't use any C11-specific features. Lowering to C99
enables bootstrapping on platforms with only C99 compilers,
such as OpenBSD on exotic architectures (GCC 4.2.1).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace GNU statement expressions ({...}) in common.h with a static
inline function and do...while(0) macros. Expand case range expressions
(case 'a' ... 'z') in tokenizer.c to individual case labels. Replace
empty initializer braces {} with {0} in parser.c. Add a dummy member
to the empty struct in ast.h. Add -pedantic to zig0_cflags in build.zig
to prevent future regressions.
zig0 now compiles with any C11-conforming compiler, not just those
supporting GNU extensions. This enables bootstrapping with MSVC,
cproc, and other strict C11 compilers.
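The kinds of rewrites described above, shown on made-up examples. None of these names are taken from common.h, tokenizer.c, parser.c, or ast.h; they only illustrate each pattern.

```c
#include <stdbool.h>

/* Before (GNU statement expression, rejected by -pedantic):
 *   #define MAX_U32(a, b) ({ unsigned _a = (a), _b = (b); _a > _b ? _a : _b; })
 * After: a static inline function evaluates each argument exactly once. */
static inline unsigned maxU32(unsigned a, unsigned b) {
    return a > b ? a : b;
}

/* Statement-like macro wrapped in do...while(0) so it behaves as one
 * statement, e.g. inside an unbraced if/else. */
#define CLAMP_TO(var, limit) \
    do {                     \
        if ((var) > (limit)) \
            (var) = (limit); \
    } while (0)

/* Before (GNU case range): case 'a' ... 'f':
 * After: every label spelled out. */
static bool isLowerHexLetter(char c) {
    switch (c) {
    case 'a': case 'b': case 'c':
    case 'd': case 'e': case 'f':
        return true;
    default:
        return false;
    }
}

/* A struct with no members is a GNU extension; a dummy member makes it
 * standard C. */
struct Empty { char dummy; };

/* `{}` as an initializer is not valid before C23; `{0}` is. */
static const struct Empty empty_value = {0};
```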
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add two points to the IP gap constraints section:
- Port functions mechanically, don't analyze individual entries first
- Time-box investigation to ~10 minutes before coding
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add verbose_intern_pool.c/h for dumping the C sema's InternPool
entries. Integrate as --verbose-intern-pool flag in zig0, mirroring
the Zig compiler's flag.
Fix InternPool.zig dump crash on locals with zero-capacity items
(skip empty locals in dumpStatsFallible and dumpAllFallible).
Update CLAUDE.md with IP debugging tools documentation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After creating the function type IP entry, also create a
pointer-to-function type (*const fn(...) ...) matching what the Zig
compiler creates when taking the address of a function for @export.
For neghf2.zig (num_passing=4), gap shrinks from 861 to 860.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Before analyzing an exported function's body, create a type_function
IP entry matching what the Zig compiler's ensureNavValUpToDate
creates when resolving the function declaration during @export
processing.
For neghf2.zig (num_passing=4), gap shrinks from 862 to 861.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove scanZirImportsRecursive which eagerly scanned all imports in
each loaded module, creating 255 struct types vs the Zig compiler's
108. The eager approach creates entries for modules never accessed
during the Zig compiler's lazy analysis, which would cause IP index
overshoot when other entry types are added.
Keep the demand-driven approach: struct types are created via
DECL_VAL (when imports are first referenced during analysis) and
loadImportZirFromPath (when modules are loaded for cross-module
calls). Currently creates 2 struct types (root + common.zig) for
neghf2.zig; gap = 862.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When DECL_VAL encounters a declaration whose value is an @import,
create a struct type IP entry for the imported module. Additionally,
scan the imported module's ZIR for its own imports and recursively
create struct types for those too. This matches the Zig compiler's
ensureFileAnalyzed → semaFile → createFileRootStruct → scanNamespace
sequence, where importing a file triggers analysis of that file which
discovers further imports.
The struct type creation is triggered lazily from the DECL_VAL handler
during analysis (not eagerly upfront), matching the Zig compiler's
demand-driven processing order.
For neghf2.zig (num_passing=4), the IP index gap shrinks from 862 to
607 as ~255 struct type entries are created for transitively imported
modules.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The answer to "how do I proceed?" is always the same: follow what the
upstream Zig compiler does. There is no reason to stop and ask.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three changes to prevent the agent from stopping between iterations:
1. Add bold "Do NOT stop between iterations" notice at the top of
the main loop — each commit is a checkpoint, not a stopping point.
2. Main loop step 6: reinforce "keep looping until blocked or all
corpus tests pass."
3. Module-system sub-loop: step 5 now says "immediately continue to
step 3. Do NOT stop here." Step 6 is renamed "Exit condition"
with explicit criteria (gap zero, num_passing incremented, tests
pass).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Embed the Zig compiler's IP items count (func_ip) in the .air binary
format so that test mismatch errors show [zig_ip_base=N], eliminating
the need for temporary debug prints in src/Zcu/PerThread.zig.
Add lazy module-level struct type creation in the C sema: each imported
module gets a type_struct IP entry when first loaded via
loadImportZirFromPath, matching the Zig compiler's demand-driven
ensureFileAnalyzed → createFileRootStruct sequence. The root module's
struct type is created at the start of semaAnalyze.
For neghf2.zig (num_passing=4), the IP index gap shrinks from 864 to
862 (root struct + common.zig struct created lazily during cross-module
call resolution).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the workflow for closing the IP entry gap between the Zig
compiler and C sema. Starting with neghf2.zig, corpus tests require
the C sema to create ~878 module-level IP entries (struct types,
ptr_nav, enum types, etc.) matching the Zig compiler's output.
The workflow describes how to dump the Zig compiler's IP state, compare
with the C sema, port module-system functions (createFileRootStruct,
scanNamespace, etc.), and iterate.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The upstream Zig sema is lazy — it only evaluates const declarations
when first accessed. The C sema was eagerly evaluating ALL non-function
declarations, including ones never accessed during analysis (e.g.
`pub const panic = common.panic` in neghf2.zig).
Only evaluate comptime declarations (id == 3) and function declarations
(detected by ZIR_INST_FUNC / FUNC_FANCY). Skip all other const/var
declarations, matching upstream behavior.
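The filter amounts to something like the following sketch. The value 3 for comptime declarations comes from the text above; the function name and the enum tag values are placeholders, not the real ZIR encoding.

```c
#include <stdbool.h>

/* The value 3 for comptime declarations comes from the text; the other
 * names and tag values below are placeholders. */
enum { DECL_ID_COMPTIME = 3 };
typedef enum { ZIR_INST_OTHER, ZIR_INST_FUNC, ZIR_INST_FUNC_FANCY } ZirInstTag;

/* Returns true only for declarations that are evaluated up front:
 * comptime blocks and function declarations. Everything else is left
 * for first use. */
static bool declIsAnalyzedEagerly(unsigned decl_id, ZirInstTag value_tag) {
    if (decl_id == DECL_ID_COMPTIME) return true;
    return value_tag == ZIR_INST_FUNC || value_tag == ZIR_INST_FUNC_FANCY;
}
```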
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
README.md keeps project overview, testing commands, debugging tips, and
float handling. CLAUDE.md gets the full Sema porting loop, decomposition
strategy, AIR exceptions, cleanup policy, and general rules.
Also fixes: stages.zig -> corpus.zig, sema_test.zig -> sema_tests/ +
num_sema_passing, nether -> neither.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
air_gen: replace per-file symlink workaround with two-pass compilation.
Pass 1 compiles lib/std/std.zig as root with use_root_as_std=true
(one compilation, all lib/std/ functions). Pass 2 compiles non-lib/std/
files standalone. Symlink workaround eliminated entirely.
build.zig: pass all corpus.files (not 0..num_passing) to air_gen,
skipping lib/std/ files. Bumping num_passing no longer invalidates
the air_gen cache.
air_data.zig: route lib/std/ paths to the combined std.zig.air file.
sema_test.zig: switch to unidirectional comparison (C→Zig only) and
exact FQN matching. Remove stripModulePrefix, bare-name fallback, and
unused cNameSpan. Add pathToModulePrefix and pathStem helpers.
sema.h/sema.c: add root_fqn, module_prefix, and is_test fields to
Sema struct. Function names use "{root_fqn}[.{prefix}].{name}" format
to match Zig's FQN convention.
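A sketch of the name construction, assuming a hypothetical helper name and assuming that an empty or NULL module_prefix means the bracketed segment is omitted:

```c
#include <stdio.h>
#include <string.h>

/* Builds "{root_fqn}[.{prefix}].{name}"; the prefix segment is omitted
 * when module_prefix is NULL or empty. Returns the snprintf result. */
static int buildFqn(char *out, size_t cap, const char *root_fqn,
                    const char *module_prefix, const char *name) {
    if (module_prefix != NULL && module_prefix[0] != '\0')
        return snprintf(out, cap, "%s.%s.%s", root_fqn, module_prefix, name);
    return snprintf(out, cap, "%s.%s", root_fqn, name);
}
```

For illustrative inputs, ("std", "math", "nan") would give "std.math.nan", and an empty prefix gives just "root.name".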
stages_test.zig: set root_fqn and module_prefix on C sema so FQNs
match Zig's naming. Remove symlink workaround — C sema uses real
paths directly. Set is_test=false to match air_gen.
corpus.zig: remove lib/init/src/main.zig (template file with
@import(".NAME") that cannot compile standalone).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove all normalization layers from the AIR comparison:
- canonicalizeRef: was renumbering IP refs sequentially by first
appearance to hide raw index differences
- stripAnonSuffix: was stripping __anon_NNN suffix from generic
function names
- canonicalizeExtraRefs: was canonicalizing refs in extra payloads
The C and Zig InternPools now produce identical indices for 431 of
433 tests. Two tests still fail due to IP index gaps:
- return_integer.zig: value 42 at IP 0xd8 (Zig) vs 0x7d (C)
- neghf2.zig: value at IP 0x3e1 (Zig) vs 0x81 (C)
These gaps come from upstream interning intermediate values during
module-level analysis (struct declarations, function types, export
validation) that the C sema doesn't yet replicate.
Also uses IP index (not ZIR inst) for __anon_ suffix in generic
function names, matching upstream's finishFuncInstance.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change the generic monomorphization naming from func_inst (ZIR
instruction index) to func_val_ip (InternPool index), matching
upstream's finishFuncInstance which uses @intFromEnum(func_index).
Pass the func_val_ip through analyzeFuncBodyAndRecord and store it
in SemaFuncAir.func_ip. The anon suffix now uses the same numbering
scheme as upstream, though the actual numbers still differ because
the C and Zig InternPools intern values in different order.
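The naming scheme can be sketched as below. The helper name is hypothetical; the only part taken from this log is the `__anon_` suffix followed by a decimal index.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Appends "__anon_<n>" where n is the InternPool index of the function
 * value (func_val_ip), not the ZIR instruction index. */
static int formatGenericInstanceName(char *out, size_t cap,
                                     const char *base_name,
                                     uint32_t func_val_ip) {
    return snprintf(out, cap, "%s__anon_%u", base_name, (unsigned)func_val_ip);
}
```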
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes to match upstream Sema.zig behavior for addhf3:
1. ComptimeReturn: don't roll back air_inst_len at all (upstream keeps
all body instructions as dead instructions in the AIR array).
This preserves nested dead blocks from comptime inline calls.
2. dbg_arg_inline: skip emission when the declared param type is
comptime-only (comptime_int, comptime_float, enum_literal).
Ported from addDbgVar's val_ty.comptimeOnlySema() check.
The C sema doesn't coerce comptime IP values to the param type,
so we check the ZIR param type body directly.
3. Param type body scanning: always register calls in the global
seen_calls set (even when the dead block is skipped due to
type_fn_created). This ensures that after type_fn_created is
reset by analyzeFuncBodyAndRecord, subsequent calls still dedup.
Enables num_passing = 9 (addhf3) and adds comptime_arg_dbg.zig unit test.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In upstream Sema.zig:7872, when an inline call returns at comptime
(ComptimeReturn), the pre-allocated block instruction is NOT rolled
back — it remains as a dead block in the AIR. The C sema was
incorrectly discarding it by rolling back air_inst_len to before the
block.
Fix: roll back to block_inst_idx+1 (keep dead block, discard body
instructions). This produces dead blocks for comptime inline calls
in comptime context (e.g., floatExponentMax, mantissaOne called
from within nan(T)'s comptime-evaluated body).
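In sketch form, with a hypothetical AirBuf type standing in for the real Sema state, the rollback described in this commit is:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of Sema that tracks AIR length. */
typedef struct { uint32_t air_inst_len; } AirBuf;

/* Keep the pre-allocated block instruction at block_inst_idx as a dead
 * block and discard only the body instructions emitted after it. */
static void rollbackAfterComptimeReturn(AirBuf *air, uint32_t block_inst_idx) {
    assert(block_inst_idx < air->air_inst_len);
    air->air_inst_len = block_inst_idx + 1;
}
```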
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a global seen_call_names/seen_call_nargs set to Sema that persists
across analyzeFuncBodyAndRecord calls (not reset per-function). This
matches upstream Zig's InternPool memoization which is global: when a
type-returning function (Int, Log2Int, etc.) is called in one function's
body and later in another function's body, upstream memoizes the result
and skips the dead block on the second call.
The set is checked at three points:
- Unresolved type function path (callee not found, known type name)
- Param type body scanning (generic param type resolution)
- Resolved type function path (returns_type handler)
After creating a dead block, the call is registered in the set so
subsequent calls with the same callee name and arg count skip it.
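A simplified sketch of such a set, keyed on callee name and argument count as described. The SeenCalls type, the fixed capacity, and the function names are assumptions; the real fields live on the Sema struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SEEN_CALLS_MAX 64

typedef struct {
    const char *names[SEEN_CALLS_MAX];
    uint32_t nargs[SEEN_CALLS_MAX];
    uint32_t len;
} SeenCalls;

static bool seenCallsContains(const SeenCalls *s, const char *name,
                              uint32_t nargs) {
    for (uint32_t i = 0; i < s->len; i++)
        if (s->nargs[i] == nargs && strcmp(s->names[i], name) == 0) return true;
    return false;
}

/* Returns true when a dead block should be emitted for this call, i.e.
 * the (name, nargs) pair has not been seen before. A new pair is
 * registered so later calls skip the dead block. */
static bool seenCallsShouldEmitDeadBlock(SeenCalls *s, const char *name,
                                         uint32_t nargs) {
    if (seenCallsContains(s, name, nargs)) return false;
    assert(s->len < SEEN_CALLS_MAX);
    s->names[s->len] = name;
    s->nargs[s->len] = nargs;
    s->len++;
    return true;
}
```

Since the set is not reset per function, a second call to the same helper from a different function body takes the skip path.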
Also add two new sema unit tests:
- cross_fn_memoized_call.zig: two exports calling same inline helper
- nested_inline_dead_blocks.zig: nested comptime inline calls
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two fixes toward enabling addhf3.zig corpus test:
1. dbg_arg_inline: upstream emits dbg_arg_inline for all inline
params whose resolved type is not comptime-only. The C sema was
skipping all comptime-declared params (ZIR_INST_PARAM_COMPTIME).
Now it checks whether the argument value is a type (param's type
is `type`) and only skips those, matching upstream behavior.
E.g. `comptime bits: u16` now gets dbg_arg_inline.
2. Log2Int dead blocks: when Log2Int is called from a comptime
sub-block whose parent is runtime (e.g. @as(Log2Int(T), ...)),
create 2 dead blocks (1 for Log2Int + 1 for nested Int call).
This fixes normalize__anon_1028 which was missing 2 instructions.
Also lifts skip_block out of inner scope in the resolved-callee path
for visibility by the Log2Int handler, and resolves the TODO about
Log2Int comptime context dead blocks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The skip_first_int fix in 67cb6933 was insufficient: normalize's AIR
still mismatches by 4 instructions. The root cause is that the C sema
needs broader handling of comptime-only return types (comptime_int, not
just type) and proper memoization of inline comptime calls across
function boundaries. Revert to 8 passing corpus files until the dead
block generation for comptime function calls matches upstream.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In upstream Zig, finishFuncInstance evaluates param type bodies and
memoizes type function calls (e.g. Int) in InternPool. When the
function body contains an identical call, it hits the memo and skips
dead block creation. The C port's shortcut (call_arg_types) skips
type body evaluation, so the memo is never set.
Add skip_first_int flag: set by analyzeFuncBodyAndRecord when a generic
param type body contains both ptr_type and a call instruction (the
*Int(...) pattern). Consumed once by site2's dead block creation.
Also fix cppcheck lint: const-qualify call_arg_types parameter.
normalize__anon_1028 still off by -2 (missing Log2Int dead blocks
from comptime sub-expressions) — to be addressed separately.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>