* AstGen: use Ast.zig helper methods to avoid copy pasting token counting logic
- take advantage of the `first_doc_comment` field we already have for
param AST nodes
* Add missing ZIR docs
The previous commit that implemented doc comment zir support for
decls did not properly account for all the possible attribute
keyword combinations (threadlocal, extern, and such).
* AIR instruction vector_init gains the ability to init arrays and
tuples in addition to vectors. This will probably also gain the
ability to initialize structs and be renamed to `aggregate_init`.
* AstGen prefers to use an `anon_array_init` ZIR instruction for
local variables when the init expr is an array literal and there is
no type.
The ZIR instructions `switch_capture_else` and `switch_capture_ref` are
removed because they are not needed. Instead, the prong index is set to
max int for the special prong.
Else prong with error sets is not handled yet.
Adds a new behavior test because there was not a prior on to cover only
the capture value of else on a switch.
There is a mechanism to avoid redundant `as` ZIR instructions which is
to pass `ResultLoc.coerced_ty` instead of `ResultLoc.ty` when it is
known by AstGen that Sema will do the coercion.
This commit downgrades `coerced_ty` to `ty` when a result location
passes through an expression that branches, such as `if`, `switch`,
`while`, and `for`, causing the `as` ZIR instruction to be emitted.
This ensures that the type of a result location will be applied to, e.g.
a `comptime_int` on either side of a branch on a runtime condition.
Introduce `validate_array_init_comptime`, similar to
`validate_struct_init_comptime` introduced in
713d2a9b38.
`zirValidateArrayInit` is improved to detect comptime array literals and
emit AIR accordingly. This code is very similar to the changes
introduced in that same commit for `zirValidateStructInit`.
The C backend needed some improvements to continue passing the same set
of tests:
* `resolveInst` for arrays now will add a local `static const` with the
array value and so then `elem_val` instructions reference that local.
It memoizes accesses using `value_map`, which is changed to use
`Air.Inst.Ref` as the key rather than `Air.Inst.Index`.
* This required a mechanism for writing to a "header" which is lines
that appear at the beginning of a function body, before everything
else.
* dbg_stmt output comments rather than `#line` directives.
TODO comment reproduced here:
We need to re-evaluate whether to emit these or not. If we naively emit
these directives, the output file will report bogus line numbers because
every newline after the #line directive adds one to the line.
We also don't print the filename yet, so the output is strictly unhelpful.
If we wanted to go this route, we would need to go all the way and not output
newlines until the next dbg_stmt occurs.
Perhaps an additional compilation option is in order?
`Value.elemValue` is improved to support `elem_ptr` values.
AIR:
* `array_elem_val` is now allowed to be used with a vector as the array
type.
* New instructions: splat, vector_init
AstGen:
* The splat ZIR instruction uses coerced_ty for the ResultLoc, avoiding
an unnecessary `as` instruction, since the coercion will be performed
in Sema.
* Builtins that accept vectors now ignore the type parameter. Comment
from this commit reproduced here:
The accepted proposal #6835 tells us to remove the type parameter from
these builtins. To stay source-compatible with stage1, we still observe
the parameter here, but we do not encode it into the ZIR. To implement
this proposal in stage2, only AstGen code will need to be changed.
Sema:
* `clz` and `ctz` ZIR instructions are now handled by the same function
which accept AIR tag and comptime eval function pointer to
differentiate.
* `@typeInfo` for vectors is implemented.
* `@splat` is implemented. It takes advantage of `Value.Tag.repeated` 😎
* `elemValue` is implemented for vectors, when the index is a scalar.
Handling a vector index is still TODO.
* Element-wise coercion is implemented for vectors. It could probably
be optimized a bit, but it is at least complete & correct.
* `Type.intInfo` supports vectors, returning int info for the element.
* `Value.ctz` initial implementation. Needs work.
* `Value.eql` is implemented for arrays and vectors.
LLVM backend:
* Implement vector support when lowering `array_elem_val`.
* Implement vector support when lowering `ctz` and `clz`.
* Implement `splat` and `vector_init`.
Add a variant of the `validate_struct_init` ZIR instruction:
`validate_struct_init_comptime` which is the same thing except it
indicates a comptime scope.
Sema code for this instruction now handles default struct field
values and detects when the struct initialization resulted in a
comptime value, replacing the already-emitted AIR instructions
to store each individual field with a single `store` instruction
with a comptime struct value as the operand.
In the case of a comptime scope, there is a simpler path that only
evals the implicit store instructions for default field values, avoiding
the mechanism for detecting comptime values.
This regressed one test case for the wasm backend, but it's just hitting
a different prong of `emitConstant` which currently has "TODO" in there,
so I think it's fine.
The main problem was that the loop body was treated as an expression
that was one of the peer result values of a loop, when in reality the
loop body is noreturn and only the `break` operands are the result
values of loops.
This was solved by introducing an override that prevents rvalue() from
emitting a store to result location instruction for loop bodies.
An orthogonal change also included in this commit is switching
`elem_val` index expressions to using `coerced_ty` and doing the
coercion to `usize` inside `Sema`, resulting in smaller ZIR (since the
cast becomes implied).
I also changed the break operand expression to use `reachableExpr`,
introducing a new compile error for double break.
This makes a few more behavior tests pass for `while` and `for` loops.
Previously, function parameter instructions for function prototypes would be
generated in the parent block. This caused issues in blocks where multiple
prototypes would be generated in, such as the block for struct fields for
example. This change introduces an inline block around every prototype such
that all parameters for a prototype are confined to a unique block.
dd62a6d2e8 short-circuited the logic of
`asmExpr` by emitting ZIR for `@compileError("...")`. This caused false
positive "unreachable code" errors for stage1 when there was an
expression in the asm template.
This commit makes such cases instead go through logic of `asmExpr` like
normal, however the asm template is set to 0. This is then picked up in
Sema (part of stage2, not stage1) and reported as "assembly code must
use string literal syntax".
The end-game for inline assembly is that the syntax is more integrated
with zig, and it will not allow string concatenation for the assembler
code, for the same reasons that Zig does not have a preprocessor.
However, inline assembly in zig right now is lacking for a variety of
use cases (take a look at the open issues having to do with inline
assembly for example), and being able to use comptime expressions to
concatenate text is a workaround that real-world users are exploiting to
get by in the short term.
This commit keeps "assembly code must use string literal syntax" as a
compile error when using stage2, but allows it through when using
stage1.
I expect to revert this commit after making enough improvements to
inline assembly that our real world users' needs are satisfied.
Switch prong values are fetched by index in semantic analysis by prong
offset, but these were computed as capture offset. This means that a switch
where the first prong does not capture and the second does, the switch_capture
zir instruction would be assigned switch_prong 0 instead of 1.
* AstGen: always use `typeof` and never `typeof_elem` on the
`switch_cond`/`switch_cond_ref` instruction because both variants
return a value and not a pointer.
- Delete the `typeof_elem` ZIR instruction since it is no longer
needed.
* Sema: validateUnionInit now recognizes a comptime mutable value and
no longer emits a compile error saying "cannot evaluate constant
expression"
- Still to-do is detecting comptime union values in a function that
is not being executed at compile-time.
- This is still to-do for structs too.
* Sema: when emitting a call AIR instruction, call resolveTypeLayout on
all the parameter types as well as the return type.
* `Type.structFieldOffset` now works for unions in addition to structs.
After a discussion about language specs, this seems like the best way to
go, because it's simpler to reason about both for humans and compilers.
The `bitcast_result_ptr` ZIR instruction is no longer needed.
This commit also implements writing enums, arrays, and vectors to
virtual memory at compile-time.
This unlocked some more of compiler-rt being able to build, which
in turn unlocks saturating arithmetic behavior tests.
There was also a memory leak in the comptime closure system which is now
fixed.
* New AIR instruction: slice, which constructs a slice out of a pointer
and a length.
* AstGen: use `coerced_ty` for start and end expressions, use `none`
for the sentinel, and don't try to load the result of the slice
operation because it returns a by-value result.
* Sema: pointer arithmetic is extracted into analyzePointerArithmetic
and it is used by the implementation of slice.
- Also I implemented comptime pointer addition.
* Sema: extract logic into analyzeSlicePtr, analyzeSliceLen and use them
inside the slice semantic analysis.
- The approach in stage2 is much cleaner than stage1 because it uses
more granular analysis calls for obtaining the slice pointer, doing
arithmetic on it, and checking if the length is comptime-known.
* Sema: use the slice Value Tag for slices when doing coercion from
pointer-to-array.
* LLVM backend: detect when emitting a GEP instruction into a
pointer-to-array and add the extra index that is required.
* Type: ptrAlignment for c_void returns 0.
* Implement Value.hash and Value.eql for slices.
* Remove accidentally duplicated behavior test.
* AstGen: Move `refToIndex` and `indexToRef` to Zir
* ZIR: the switch_block_*_* instruction tags are collapsed into one
switch_block tag which uses 4 bits for flags, and reduces the
scalar_cases_len field from 32 to 28 bits.
This freed up more ZIR tags, 2 of which are now used for
`switch_cond` and `switch_cond_ref` for producing the switch
condition value. For example, for union values it returns the
corresponding enum value.
* switching with multiple cases and ranges is not yet supported because
I want to change the ZIR encoding to store index pointers into the
extra array rather than storing prong indexes. This will avoid O(N^2)
iteration over prongs.
* AstGen now adds a `switch_cond` on the operand and then passes the
result of that to the `switch_block` instruction.
* Sema: partially implement `switch_capture_*` instructions.
* Sema: `unionToTag` notices if the enum type has only one possible value.