Port two missing features from the upstream AstGen.zig ret() function:
1. Add any_defer_node field to GenZir (AstGen.zig:11812) to track
whether we're inside a defer expression. Set it in defer body
generation and propagate via makeSubBlock. retExpr now checks
this field and errors with "cannot return from defer expression"
(AstGen.zig:8127-8135). Also reorder retExpr checks to match
upstream: fn_block null check first, then any_defer_node check,
then emitDbgNode.
2. Add reachableExpr wrapper (AstGen.zig:408-416) that calls exprRl
and checks refIsNoReturn to detect unreachable code. Use it in
retExpr instead of plain exprRl for the return operand
(AstGen.zig:8185-8186). nameStratExpr is left as TODO since
containerDecl does not yet accept a name_strategy parameter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add two fixes from audit of ptrTypeExpr against upstream AstGen.zig ptrType:
1. Reject `[*c]allowzero T` with a compile error matching upstream
(AstGen.zig:3840-3842). C pointers always allow address zero, so
the allowzero modifier is invalid on them.
2. Save source_offset/source_line/source_column before typeExpr and
restore them before evaluating each trailing expression (sentinel,
addrspace, align). This ensures correct debug info source locations
matching upstream (AstGen.zig:3844-3846, 3859-3861, 3876-3878,
3885-3887).
Issue 3 (addrspace RL using addBuiltinValue) is skipped as
addBuiltinValue is not yet implemented.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port assignShift (AstGen.zig:3786) and assignShiftSat (AstGen.zig:3812)
from upstream, handling <<=, >>=, and <<|= operators as both statements
in blockExprStmts and expressions in exprRl. Previously these fell
through to SET_ERROR.
Add grouped_expression unwrapping loop in blockExprStmts (matching
AstGen.zig:2569-2630) so that parenthesized statements like `(x += 1)`
are correctly dispatched to assignment handlers instead of going through
the default unusedResultExpr path.
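
The unwrapping step can be sketched as below. This is a minimal illustration, not the ported code: the node table, the tag names and `unwrap_grouped` are hypothetical stand-ins, and the sketch assumes a grouped expression stores its inner node in `lhs`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced node representation used only for this sketch. */
typedef enum { N_GROUPED_EXPRESSION, N_ASSIGN_ADD, N_CALL } NodeTagSketch;

typedef struct {
    NodeTagSketch tag;
    uint32_t lhs; /* assumed: inner expression of a grouped_expression */
} NodeSketch;

/* Strips any number of enclosing parentheses and returns the inner node,
 * so the caller can dispatch on the tag of the real statement. */
static uint32_t unwrap_grouped(const NodeSketch *nodes, uint32_t node) {
    while (nodes[node].tag == N_GROUPED_EXPRESSION) {
        node = nodes[node].lhs;
    }
    return node;
}
```

With this shape, `((x += 1))` resolves to the assignment node before the statement switch runs, rather than reaching the default path as a grouped expression.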
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix three issues in the RL annotation pre-pass (rlExpr):
1. Label detection for `inline while`/`inline for` now accounts for
the `keyword_inline` token before checking for `identifier colon`,
matching upstream fullWhileComponents/fullForComponents logic.
2. `assign_destructure` now recurses into variable nodes and the value
expression with RL_RI_NONE, matching upstream behavior instead of
returning false without visiting sub-expressions.
3. `rlTokenIdentEqual` now handles @"..."-quoted identifiers by comparing
the quoted content rather than stopping at the `@` character, which
previously caused all @-quoted identifiers to compare as equal.
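
A rough sketch of the quoted-identifier comparison from item 3 follows. It assumes tokens are addressed by byte offset into the source text; `ident_span` and `ident_equal_sketch` are hypothetical names, and escape sequences inside `@"..."` are skipped over but not decoded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the name of the identifier starting at
 * src[start]. For @"..." the name is the text between the quotes; for a
 * bare identifier it is the run of [A-Za-z0-9_] characters. */
static void ident_span(const char *src, size_t start,
                       const char **out, size_t *out_len) {
    const char *p = src + start;
    if (p[0] == '@' && p[1] == '"') {
        const char *begin = p + 2;
        const char *end = begin;
        while (*end != '\0' && *end != '"') {
            if (*end == '\\' && end[1] != '\0') end++; /* skip escaped char */
            end++;
        }
        *out = begin;
        *out_len = (size_t)(end - begin);
        return;
    }
    size_t n = 0;
    while ((p[n] >= 'a' && p[n] <= 'z') || (p[n] >= 'A' && p[n] <= 'Z') ||
           (p[n] >= '0' && p[n] <= '9') || p[n] == '_') {
        n++;
    }
    *out = p;
    *out_len = n;
}

/* Compares two identifiers by name, looking through the @"..." quoting. */
static bool ident_equal_sketch(const char *src, size_t a, size_t b) {
    const char *sa, *sb;
    size_t la, lb;
    ident_span(src, a, &sa, &la);
    ident_span(src, b, &sb, &lb);
    return la == lb && memcmp(sa, sb, la) == 0;
}
```

A comparison that stops at `@` sees an empty name on both sides, which is how every quoted identifier ended up equal to every other one.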
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The exprRl function was wrapping blockExprExpr's return value in an
extra rvalue() call, but blockExprExpr already applies rvalue internally
for labeled blocks when need_result_rvalue=true. The upstream expr()
function at AstGen.zig:991 returns blockExpr's result directly without
extra rvalue wrapping. This could produce duplicate coercion/store
instructions for non-trivial result locations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port two missing checks from upstream AstGen.zig rvalueInner to the C
rvalue function:
1. isAlwaysVoid (Zir.zig:1343-1608): When the result refers to an
instruction that always produces void (e.g., dbg_stmt, store_node,
export, memcpy, etc.), replace the result with void_value before
proceeding. This prevents emitting unnecessary type coercions or
stores on always-void instructions.
2. endsWithNoReturn (AstGen.zig:11068): When the current GenZir block
ends with a noreturn instruction, return the result immediately
without emitting any rvalue instructions. This avoids emitting dead
ZIR instructions after noreturn.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port the \u{NNNNNN} unicode escape parsing from upstream Zig's
string_literal.zig:parseEscapeSequence into both strLitAsString
(string literal decoding with UTF-8 encoding) and char_literal
(codepoint value extraction). Without this, \u escapes fell through
to the default branch which wrote a literal 'u' character, producing
incorrect ZIR string bytes.
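
For illustration, the two steps (parse the braces, then UTF-8 encode) might look like the sketch below. The function names are hypothetical and not taken from the port; surrogate codepoints are not rejected here, which a faithful port may need to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parses "\u{H...}" (1 to 6 hex digits) at s. On success stores the
 * codepoint in *cp and returns the number of bytes consumed; returns 0 on
 * malformed input. */
static size_t parse_unicode_escape(const char *s, uint32_t *cp) {
    if (s[0] != '\\' || s[1] != 'u' || s[2] != '{') return 0;
    size_t i = 3;
    size_t digits = 0;
    uint32_t value = 0;
    for (; s[i] != '\0' && s[i] != '}'; i++) {
        char c = s[i];
        uint32_t d;
        if (c >= '0' && c <= '9') d = (uint32_t)(c - '0');
        else if (c >= 'a' && c <= 'f') d = (uint32_t)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') d = (uint32_t)(c - 'A' + 10);
        else return 0;
        if (++digits > 6) return 0;
        value = value * 16 + d;
    }
    if (s[i] != '}' || digits == 0 || value > 0x10FFFF) return 0;
    *cp = value;
    return i + 1;
}

/* Encodes cp as UTF-8 into out (room for 4 bytes); returns bytes written. */
static size_t utf8_encode(uint32_t cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

A string-literal decoder would call both functions and append the encoded bytes, while a char-literal path would only need the codepoint from the first.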
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BuiltinCall.Flags has ensure_result_used at bit 1, not bit 3 like
Call/FieldCall. Separate the case to use the correct bit.
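
A small sketch of the separated case, using the bit positions stated above. The enum, macro and function names are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags standing in for the three call-like instructions. */
typedef enum { SK_CALL, SK_FIELD_CALL, SK_BUILTIN_CALL } CallKindSketch;

#define CALL_ENSURE_RESULT_USED_BIT 3u         /* Call / FieldCall flags */
#define BUILTIN_CALL_ENSURE_RESULT_USED_BIT 1u /* BuiltinCall flags */

/* Returns flags with ensure_result_used set at the position that matches
 * the instruction kind. */
static uint32_t set_ensure_result_used(CallKindSketch kind, uint32_t flags) {
    switch (kind) {
    case SK_BUILTIN_CALL:
        return flags | (1u << BUILTIN_CALL_ENSURE_RESULT_USED_BIT);
    case SK_CALL:
    case SK_FIELD_CALL:
        return flags | (1u << CALL_ENSURE_RESULT_USED_BIT);
    }
    return flags;
}
```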
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three bugs found by auditing against upstream AstGen.zig/AstRlAnnotate.zig:
1. rlExpr: defer was recursing into nd.rhs (always 0) instead of nd.lhs
(the actual deferred expression), so the RL annotation pass never
visited defer bodies.
2. addEnsureResult: compile_error was missing from the noreturn
instruction list, causing spurious ensure_result_used instructions
to be emitted after @compileError calls.
3. blockExprExpr: force_comptime was derived from gz->is_comptime,
but upstream blockExpr always passes force_comptime=false to
labeledBlockExpr. This caused labeled blocks in comptime contexts
to incorrectly emit BLOCK_COMPTIME + BREAK_INLINE instead of
BLOCK + BREAK.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Match Zig's Signedness enum values (unsigned=1, signed=0) and
reorder int_type struct fields to match Zig's layout:
[src_node, bit_count, signedness, pad].
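
The layout can be pinned down with static assertions, as in the sketch below. The field order comes from this commit; the field widths (32-bit src_node, 16-bit bit_count, 8-bit signedness and pad) and all type names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as stated above: signed=0, unsigned=1. */
typedef enum {
    SIGNEDNESS_SIGNED = 0,
    SIGNEDNESS_UNSIGNED = 1
} SignednessSketch;

/* Assumed widths; only the field order is taken from the commit text. */
typedef struct {
    int32_t src_node;
    uint16_t bit_count;
    uint8_t signedness;
    uint8_t pad;
} IntTypeDataSketch;

_Static_assert(sizeof(IntTypeDataSketch) == 8, "int_type must fill 8 bytes");
_Static_assert(offsetof(IntTypeDataSketch, bit_count) == 4, "bit_count offset");
_Static_assert(offsetof(IntTypeDataSketch, signedness) == 6, "signedness offset");

static IntTypeDataSketch make_int_type(int32_t src_node, uint16_t bit_count,
                                       SignednessSketch signedness) {
    IntTypeDataSketch d;
    d.src_node = src_node;
    d.bit_count = bit_count;
    d.signedness = (uint8_t)signedness;
    d.pad = 0;
    return d;
}
```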
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Inline index_inst at its usage site to narrow its scope, and initialize
var_init_rl.ctx to RI_CTX_NONE (matching the upstream default).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- whileExpr: emit emitDbgNode before condition evaluation to match
upstream AstGen.zig:6579. Fixes astgen_test.zig corpus (1 missing
DBG_STMT).
- Block expressions in exprRl: wrap blockExprExpr result with rvalue()
to handle result location storage (RL_PTR → STORE_NODE, etc.).
Fixes parser_test.zig inst_len to exact match.
- parser_test.zig corpus now has matching inst_len and all tags, but
has 1 int_type data signedness mismatch (pre-existing issue).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Empty typed struct init (SomeType{}) was returning the result directly
without going through rvalue(), missing the STORE_NODE/STORE_TO_INFERRED_PTR/
COERCE_PTR_ELEM_TY+REF emissions when the result location requires storage.
Reduces parser_test.zig corpus diff from 5 to 1 instruction.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- retExpr: check nodesNeedRl to use RL_PTR with ret_ptr/ret_load instead of
always RL_COERCED_TY with ret_node. Handle .always/.maybe error paths with
load from ptr when needed.
- Use typeExpr() instead of expr()/exprRl() for type sub-expressions in
optional_type, error_union, merge_error_sets, and array elem types in
structInitExpr/arrayInitExpr. This generates BLOCK_COMPTIME wrappers for
non-primitive type identifiers.
- arrayInitExpr: only use ARRAY_INIT_REF for RL_REF (not RL_REF_COERCED_TY),
and pass non-ref results through rvalue().
- slice_sentinel: emit SLICE_SENTINEL_TY and coerce sentinel to that type.
All slice variants: coerce start/end to usize.
- COERCE_PTR_ELEM_TY in rvalue for RL_REF_COERCED_TY.
- rvalueNoCoercePreRef for local variable references.
- structInitExprPtr/arrayInitExprPtr for RL_PTR with OPT_EU_BASE_PTR_INIT.
- Typed struct init: use RL_COERCED_TY with field type for init expressions.
Reduces parser_test.zig corpus diff from 225 to 5 instructions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- @as builtin: propagate RL_TY with dest_type through exprRl instead of
evaluating operand with RL_NONE and manually emitting as_node. Matches
upstream AstGen.zig lines 8909-8920.
- rlResultType: add missing RL_REF_COERCED_TY case (elem_type extraction).
- continue handler: use AST_NODE_OFFSET_NONE for addBreak operand_src_node
instead of computing node offset. Upstream uses addBreak (not
addBreakWithSrcNode), which writes .none.
- varDecl: set init_rl.src_node = 0 for RL_PTR (upstream leaves
PtrResultLoc.src_node at default .none).
Enables astgen_test.zig corpus test — all corpus tests now pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
addInstruction() already returns idx + ZIR_REF_START_INDEX (a ref),
so the extra + ZIR_REF_START_INDEX on the inplace_arith_result_ty path
resulted in a double-offset (+248 instead of +124) being stored in
extra data for += and -= compound assignments.
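
The double offset is easy to reproduce in isolation. In the sketch below, `add_instruction_sketch` is a hypothetical stand-in for addInstruction() and the value 124 is taken from the numbers quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define ZIR_REF_START_INDEX 124u

/* Stand-in for addInstruction(): allocates the next instruction index and
 * returns it already converted to a ref. */
static uint32_t add_instruction_sketch(uint32_t *next_index) {
    uint32_t idx = (*next_index)++;
    return idx + ZIR_REF_START_INDEX;
}

/* Converts a ref back to the instruction index it names. */
static uint32_t ref_to_index(uint32_t ref) {
    return ref - ZIR_REF_START_INDEX;
}
```

Storing the returned ref as is round-trips to the right instruction; adding ZIR_REF_START_INDEX again yields a ref that points 124 instructions too far.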
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- BREAK/CONTINUE: lhs is opt_token (null=UINT32_MAX), not opt_node
(null=0). Check nd.lhs != UINT32_MAX instead of != 0.
- ERROR_VALUE: last token is main_token + 2 (error.name has 3 tokens),
not main_token.
- advanceSourceCursor: replace silent return on backward movement with
assert, matching upstream behavior.
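
The difference between the two optional encodings from the first bullet can be sketched as follows. The macro and function names are hypothetical; the null values are the ones given above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OPT_TOKEN_NONE UINT32_MAX /* opt_token: null is UINT32_MAX */
#define OPT_NODE_NONE 0u          /* opt_node: null is 0 */

/* Token index 0 is a real token, so only UINT32_MAX means "absent". */
static bool opt_token_present(uint32_t token) {
    return token != OPT_TOKEN_NONE;
}

/* For opt_node, 0 means "absent". */
static bool opt_node_present(uint32_t node) {
    return node != OPT_NODE_NONE;
}
```

Testing a break label with the opt_node rule treats every unlabeled break (lhs = UINT32_MAX) as labeled, and a label at token 0 as missing.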
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Save source cursor before evaluating sub-expressions in array_access
and @tagName (cursor was being mutated by inner expr calls)
- Add is_comptime guard to advanceSourceCursorToMainToken matching
upstream maybeAdvanceSourceCursorToMainToken (skip in comptime)
- Re-skip astgen_test.zig corpus (dbg_stmt mismatch remains at inst 1557)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mechanically match the upstream comptimeExpr signature, which accepts ResultInfo.
This fixes coercion in comptime contexts (e.g. sentinel 0 becoming zero_u8
instead of generic zero when elem_type is u8).
- comptimeExpr: add ResultLoc rl parameter, thread to exprRl
- typeExpr: pass coerced_ty=type_type (matching upstream coerced_type_ri)
- ptrType: pass ty=elem_type for sentinel, coerced_ty=u29 for align,
coerced_ty=u16 for bit_range
- retExpr: set RI_CTX_RETURN
- tryExpr: set RI_CTX_ERROR_HANDLING_EXPR for operand
- orelseCatchExpr: set RI_CTX_ERROR_HANDLING_EXPR when do_err_trace
- ifExpr: set RI_CTX_ERROR_HANDLING_EXPR for error union condition
- shiftOp: set RI_CTX_SHIFT_OP, use as_shift_operand in rvalue
- breakResultInfo: don't forward ctx for discard case
- fnDecl ret_body break: use AST_NODE_OFFSET_NONE
Passes corpus tests for test_all.zig, build.zig, tokenizer_test.zig.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace linear scan of all string_bytes with a string_table that
only contains explicitly registered strings (via identAsString and
strLitAsString). This prevents false deduplication against multiline
string content that upstream's hash table would never match.
Also handle embedded null bytes in strLitAsString: when decoded string
contains \x00, skip dedup and don't add trailing null, matching upstream
AstGen.zig:11560. Fix c_include extended instruction small field to
0xAAAA (undefined) matching upstream addExtendedPayload.
Passes corpus tests for test_all.zig, build.zig, tokenizer_test.zig.
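
A reduced sketch of the registered-strings idea is shown below. It is not the ported data structure: the fixed-size pool, the linear table scan and every name in it are assumptions made for brevity, and strings with embedded null bytes are out of scope (the commit skips dedup for those).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SB_CAP 256u
#define TABLE_CAP 32u

typedef struct {
    char bytes[SB_CAP];        /* null-terminated strings, back to back */
    uint32_t bytes_len;
    uint32_t table[TABLE_CAP]; /* offsets of explicitly registered strings */
    uint32_t table_len;
} StringPoolSketch;

static void pool_init(StringPoolSketch *p) {
    memset(p, 0, sizeof *p);
    p->bytes_len = 1; /* index 0 is reserved */
}

/* Appends bytes without registering them (e.g. multiline string content). */
static uint32_t pool_append_raw(StringPoolSketch *p, const char *s, uint32_t len) {
    uint32_t off = p->bytes_len;
    assert(off + len + 1 <= SB_CAP);
    memcpy(p->bytes + off, s, len);
    p->bytes[off + len] = '\0';
    p->bytes_len = off + len + 1;
    return off;
}

/* Deduplicates only against registered strings, never against raw bytes. */
static uint32_t pool_intern(StringPoolSketch *p, const char *s, uint32_t len) {
    for (uint32_t i = 0; i < p->table_len; i++) {
        const char *cand = p->bytes + p->table[i];
        if (strlen(cand) == len && memcmp(cand, s, len) == 0) return p->table[i];
    }
    assert(p->table_len < TABLE_CAP);
    uint32_t off = pool_append_raw(p, s, len);
    p->table[p->table_len++] = off;
    return off;
}
```

Scanning all of `bytes` instead of `table` would return the raw copy's offset for a later identifier with the same text, which is the false deduplication described above.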
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Major fixes to match upstream AstGen.zig:
- Call/FieldCall: flags at offset 0, scratch_extra for arg bodies,
pop_error_return_trace from ResultCtx instead of hardcoded true
- CondBr: write {condition, then_body_len, else_body_len} then bodies
(was interleaving lengths with bodies)
- For loop: use instructionsSliceUpto, resurrect loop_scope for
increment/repeat after then/else unstacked
- validate_struct_init_result_ty: un_node encoding (no extra payload)
- addEnsureResult: flags always at pi+0 for all call types
- addFunc: param_insts extra refs for correct body attribution
- array_init_elem_type: addBin instead of addPlNodeBin
- Pre-register struct field names for correct string ordering
- comptime break_inline: AST_NODE_OFFSET_NONE
- varDecl: pass RI_CTX_CONST_INIT context
- Rewrite test infrastructure with field-by-field ZIR comparison
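
As an illustration of the CondBr bullet, the corrected payload order can be sketched like this. `write_cond_br` is a hypothetical helper writing into a caller-provided buffer, not the function used in the port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes a CondBr payload as {condition, then_body_len, else_body_len}
 * followed by the then body and the else body. The caller must provide
 * room for 3 + then_len + else_len words. Returns the words written. */
static size_t write_cond_br(uint32_t *extra, uint32_t condition,
                            const uint32_t *then_body, uint32_t then_len,
                            const uint32_t *else_body, uint32_t else_len) {
    size_t n = 0;
    extra[n++] = condition;
    extra[n++] = then_len;
    extra[n++] = else_len;
    for (uint32_t i = 0; i < then_len; i++) extra[n++] = then_body[i];
    for (uint32_t i = 0; i < else_len; i++) extra[n++] = else_body[i];
    return n;
}
```

The earlier interleaved form placed each length directly before its body, so a reader expecting the fixed three-word header would read the first then-body instruction as else_body_len.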
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Comprehensive firstToken: handle all AST node types matching upstream
Ast.zig (call, struct_init, slice, binary ops, fn_decl, blocks, etc.)
instead of falling through to main_token for unknown types.
- Slice LHS uses .ref rl: pass RL_REF_VAL for slice_open/slice/
slice_sentinel LHS evaluation, matching upstream AstGen.zig:882-939.
- fnDecl param name before type: resolve parameter name via
identAsString before evaluating the type expression, matching upstream
AstGen.zig:4283-4335 ordering.
- Break label comparison: use tokenIdentEql (source text comparison)
instead of identAsString to avoid adding label names to string_bytes,
matching upstream AstGen.zig:2176 tokenIdentEql.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix call instruction not being appended to gz's instruction list due
to a debug range check left in callExpr. This caused emitDbgStmt's
dedup logic to not see call instructions, resulting in 10 missing
dbg_stmt instructions in the build.zig corpus test.
Also port shiftOp from upstream (AstGen.zig:9978) for shl/shr operators,
which need typeof_log2_int_type for RHS coercion and their own emitDbgStmt.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port the missing rvalue() call in orelseCatchExpr's then-branch
(AstGen.zig:6088-6091). The upstream applies rvalue with
block_scope.break_result_info to the unwrapped payload before
breaking, which emits as_node coercion when needed. The C code
was passing the unwrapped value directly to addBreak without
coercion.
Also update the corpus build.zig TODO with current diff state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use nodeIndexToRelative(decl_node) = node - proto_node for the
break_inline that returns the func to its declaration, matching upstream
AstGen.zig:4495. The code previously used AST_NODE_OFFSET_NONE, which
produced incorrect extra data values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Handle anonymous struct init (.{.a = b}) when the result location has
a type (RL_TY/RL_COERCED_TY). Emit validate_struct_init_result_ty and
struct_init_field_type instructions, matching upstream AstGen.zig:
1706-1731 and structInitExprTyped.
Also add validate_struct_init_result_ty to test comparison functions
and fix char literal escape sequences.
build.zig corpus: improved from 25 to 3 inst diff (remaining:
as_node coercion in rvalue).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add RL_REF_COERCED_TY to the result location enum, matching the upstream
ref_coerced_ty variant. This carries a pointer type through the result
location so that array init and struct init expressions can generate
validate_array_init_ref_ty and struct_init_empty_ref_result instructions.
- Use RL_REF_COERCED_TY in address_of when result type is available
- Handle in arrayInitDotExpr to emit validate_array_init_ref_ty
- Handle in structInitExpr for empty .{} to emit struct_init_empty_ref_result
- Add RL_IS_REF() macro for checking both RL_REF and RL_REF_COERCED_TY
- Update rvalue to treat RL_REF_COERCED_TY like RL_REF
tokenizer_test.zig corpus: instructions now match (7026). Extra and
string_bytes still have diffs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add escape sequence handling to strLitAsString (\n, \r, \t, \\, \',
\", \xNN). Previously copied string content byte-for-byte.
- Fix strLitAsString quote scanning to skip escaped quotes (\").
- Handle @"..." quoted identifiers in identAsString.
- Add test name and field name strings to scanContainer to match
upstream string table insertion order.
- Skip dedup against reserved index 0 in strLitAsString to match
upstream hash table behavior.
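
A minimal decoder for the escapes listed in the first bullet might look like the sketch below. The function names and the error convention (returning `(size_t)-1`) are assumptions for this example, and `\u{...}` is deliberately not covered here.

```c
#include <assert.h>
#include <stddef.h>

static int hex_val(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes the content between the quotes of a string literal. out must
 * have room for in_len bytes. Returns the decoded length, or (size_t)-1
 * on a malformed or unsupported escape. */
static size_t decode_escapes(const char *in, size_t in_len, unsigned char *out) {
    size_t o = 0;
    for (size_t i = 0; i < in_len; i++) {
        if (in[i] != '\\') {
            out[o++] = (unsigned char)in[i];
            continue;
        }
        if (++i >= in_len) return (size_t)-1; /* trailing backslash */
        switch (in[i]) {
        case 'n': out[o++] = '\n'; break;
        case 'r': out[o++] = '\r'; break;
        case 't': out[o++] = '\t'; break;
        case '\\': out[o++] = '\\'; break;
        case '\'': out[o++] = '\''; break;
        case '"': out[o++] = '"'; break;
        case 'x': {
            if (i + 2 >= in_len) return (size_t)-1;
            int hi = hex_val(in[i + 1]);
            int lo = hex_val(in[i + 2]);
            if (hi < 0 || lo < 0) return (size_t)-1;
            out[o++] = (unsigned char)(hi * 16 + lo);
            i += 2;
            break;
        }
        default:
            return (size_t)-1;
        }
    }
    return o;
}
```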
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port the emitDbgNode(parent_gz, cond_expr) call from upstream
AstGen.zig:6335 into ifExpr. This emits a DBG_STMT instruction
before evaluating the if condition, matching the reference output.
Enable astgen_test.zig corpus test (still has extra_len and
string_bytes mismatches to fix).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix arrayInitExpr for [_]T{...} patterns to use elem_type as the
coercion target for each element expression (RL_COERCED_TY), matching
upstream AstGen.zig:1598-1642. Previously used RL_NONE_VAL which
produced different instruction sequences.
Add struct init typed and enum decl isolated tests.
Note: build.zig corpus still needs ref_coerced_ty result location
support and fn body ordering fixes — left as TODO.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add enumDeclInner and setEnum, ported from upstream AstGen.zig:5508-5729.
Dispatch in containerDecl based on main_token keyword (struct vs enum).
Fix fnDecl to pass proto_node (not fn_decl node) to makeDeclaration,
matching upstream AstGen.zig:4090.
Improve is_pub detection in fnDecl to use token tags instead of string
comparison.
Add func/func_inferred proto_hash to the test hash skip mask, and
enum_decl fields_hash skipping.
Tests added: enum decl.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rewrite globalVarDecl to properly handle extern/export/pub/threadlocal
variables with type/align/linksection/addrspace bodies. Port the full
Declaration extra data layout from upstream AstGen.zig:13883, including
lib_name, type_body, and special bodies fields.
Add extractVarDecl to decode all VarDecl node types (global, local,
simple, aligned) and computeVarDeclId to select the correct
Declaration.Flags.Id.
Fix firstToken to scan backwards for modifier tokens (extern, export,
pub, threadlocal, comptime) on var decl nodes, matching upstream
Ast.zig:634-643.
Test added: extern var.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port errorSetDecl from upstream AstGen.zig:5905-5955. Replaces the
SET_ERROR placeholder at the ERROR_SET_DECL case. Loops tokens between
lbrace and rbrace, collecting identifier strings into the ErrorSetDecl
payload.
Also add error_set_decl to the test comparison functions.
Tests added: empty error set, error set with members.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port WipMembers, field processing loop, nodeImpliesMoreThanOnePossibleValue,
and nodeImpliesComptimeOnly from upstream AstGen.zig. Struct fields are now
properly emitted with type expressions, default values, alignment, and
comptime annotations.
Also fix structDeclInner to add the reserved instruction to the GenZir
body (matching upstream gz.reserveInstructionIndex behavior) and use
AST_NODE_OFFSET_NONE for break_inline src_node in field bodies.
Tests added: single field, multiple fields, field with default, field
with alignment, comptime field.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- continue: emit check_comptime_control_flow and
restore_err_ret_index_unconditional (matching AstGen.zig:2328-2334)
- forExpr: set loop_scope.continue_block = cond_block
(matching AstGen.zig:6974), allowing continue inside for loops
to target the correct scope
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add emitDbgStmt and result type from RL to typeCast builtins
(@intCast, @truncate, @ptrCast, @enumFromInt, @bitCast)
- Pass ResultLoc to builtinCall for result type access
- Fix @memset: upstream derives elem_ty via typeof+indexable_ptr_elem_type
and evaluates value with coerced_ty RL
- Fix @memcpy/@memset to return void_value (not instruction ref)
- Add builtinEvalToError: per-builtin eval_to_error lookup instead of
always returning MAYBE for all builtins
- Fix nodeMayAppendToErrorTrace: pass loop var 'n' to nodeMayEvalToError
instead of original 'node' parameter
Corpus: ref=4177 got=4160, mismatch at inst[557], gap=17
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add genDefers() with DEFER_NORMAL_ONLY/DEFER_BOTH_SANS_ERR modes
- Add countDefers() for checking defer types in scope chain
- Add genDefers calls to breakExpr, continueExpr, retExpr, tryExpr
- Add fn_block tracking to AstGenCtx (set in fnDecl/testDecl)
- Add return error.Foo fast path using ret_err_value instruction
- Fix fullBodyExpr scope: pass &body_gz.base instead of params_scope
- Fix blockExprStmts: guard genDefers with noreturn_stmt check
- Fix retExpr MAYBE path: correct dbg_stmt/restore ordering
- Save/restore fn_block in containerDecl (set NULL for nested structs)
- addEnsureResult now returns bool indicating noreturn
First ZIR tag mismatch moved from inst[211] to inst[428].
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port several AstGen.zig patterns to C:
- Thread ResultLoc through fullBodyExpr, ifExpr, switchExpr, callExpr,
calleeExpr (for proper type coercion and decl_literal handling)
- Add rlBr() and breakResultInfo() helpers mirroring upstream ri.br()
and setBreakResultInfo
- Implement labeled blocks with label on GenZir (matching upstream),
restoreErrRetIndex before break, and break_result_info
- Fix breakExpr to emit restoreErrRetIndex and use break_result_info
for value/void breaks (AstGen.zig:2150-2237)
- Add setBlockComptimeBody with comptime_reason field (was using
setBlockBody which omitted the reason, causing wrong extra layout)
- Add comptime_reason parameter to comptimeExpr with correct reasons
for type/array_sentinel/switch_item/comptime_keyword contexts
- Handle enum_literal in calleeExpr (decl_literal_no_coerce)
- Fix decl_literal rvalue wrapping for ty/coerced_ty result locs
All 5 corpus files now pass byte-by-byte ZIR comparison.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port scope chain infrastructure, function parameters, local var_decl,
control flow (if/for/while/switch/orelse/catch/defer), labeled blocks,
break/continue, comparison/boolean/unary operators, array access,
field access rvalue, rvalue type coercion optimization, and many
builtins from upstream AstGen.zig. test_all.zig corpus passes;
4 remaining corpus files still have mismatches (WIP).
Also fix cppcheck/lint issues: safe realloc pattern, null checks,
const correctness, enable inline suppressions, comment out test
debug output for clean `zig build`.
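
The safe realloc pattern referred to above is commonly written as in the sketch below: the result of realloc() goes into a temporary so the original pointer is not lost on failure. The list type and function name are hypothetical, not taken from the port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t *items;
    size_t len;
    size_t cap;
} U32ListSketch;

/* Appends value, growing the buffer when full. On allocation failure the
 * list is left unchanged and false is returned. */
static bool u32_list_push(U32ListSketch *list, uint32_t value) {
    if (list->len == list->cap) {
        size_t new_cap = list->cap ? list->cap * 2 : 8;
        if (new_cap > SIZE_MAX / sizeof(uint32_t)) return false;
        uint32_t *grown = realloc(list->items, new_cap * sizeof(uint32_t));
        if (grown == NULL) return false; /* list->items is still valid */
        list->items = grown;
        list->cap = new_cap;
    }
    list->items[list->len++] = value;
    return true;
}
```

Assigning realloc()'s result straight back to `list->items` would leak the old buffer and leave a NULL pointer behind when the allocation fails, which is the pattern static analyzers such as cppcheck tend to flag.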
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce zir.h/zir.c with ZIR instruction types (269 tags, 56 extended
opcodes, 8-byte Data union) ported from lib/std/zig/Zir.zig, and
astgen.h/astgen.c implementing the empty-container fast path that produces
correct ZIR for empty source files.
The test infrastructure in astgen_test.zig compares C astGen() output
field-by-field against Zig's std.zig.AstGen.generate() using tag-based
dispatch, avoiding raw byte comparison since Zig's Data union has no
guaranteed in-memory layout.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>