Commit Graph

1820 Commits

Author SHA1 Message Date
mlugg
edfada4317 Sema: mark transitive failure when @import refers to a failed file 2023-09-21 16:27:52 -07:00
mlugg
2e5d13e9cf Sema: queue type resolution when analyzing ret_ptr during inline call 2023-09-21 14:48:41 -07:00
Andrew Kelley
d3f3de73ba Sema: less aggressive type resolution for field_val
This is an alternate version of c675daac7d0d6328ad89297e8938a6b17b59c489
2023-09-21 14:48:41 -07:00
mlugg
1b672e41c5 InternPool,Sema,type,llvm: alignment fixes
This changeset fixes the handling of alignment in several places. The
new rules are:
* `@alignOf(T)` where `T` is a runtime zero-bit type is at least 1,
  maybe greater.
* Zero-bit fields in `extern` structs *do* force alignment, potentially
  offsetting following fields.
* Zero-bit fields *do* have addresses within structs which can be
  observed and are consistent with `@offsetOf`.

These are not necessarily all implemented correctly yet (see disabled
test), but this commit fixes all regressions compared to master, and
makes one new test pass.
2023-09-21 14:48:41 -07:00
Andrew Kelley
cd242b7440 Sema: queue type resolution when adding a struct_field_val instruction 2023-09-21 14:48:40 -07:00
Andrew Kelley
dca17d3603 Sema: fix struct alignment regressions 2023-09-21 14:48:40 -07:00
Andrew Kelley
d06bf707ed compiler: make pointer type canonicalization always work
Previously it would canonicalize or not depending on some volatile
internal state of the compiler, now it forces resolution of the element
type to determine the alignment if it needs to.
2023-09-21 14:48:40 -07:00
Andrew Kelley
5ea3de55c4 Sema: fix dependency loop regression on struct field alignment 2023-09-21 14:48:40 -07:00
Andrew Kelley
c79da0d731 Sema: allow users to provide alignment 0 to mean default 2023-09-21 14:48:40 -07:00
mlugg
ada83fa557 compiler: get codegen of behavior tests working on at least one backend
We're hitting false compile errors, but this is progress!
2023-09-21 14:48:40 -07:00
Andrew Kelley
2da62a7106 Sema: restore master branch field reordering behavior
Let's try to reduce the explosive scope of this branch.
2023-09-21 14:48:40 -07:00
Andrew Kelley
baea62a8ad fix regressions from this branch 2023-09-21 14:48:40 -07:00
Andrew Kelley
fa1beba74f InternPool: implement getStructType
This also modifies AstGen so that struct types use 1 bit each from the
flags to communicate if there are nonzero inits, alignments, or comptime
fields. This allows adding a struct type to the InternPool without
looking ahead in memory to find out the answers to these questions,
which is easier for CPUs as well as for me, coding this logic right now.
2023-09-21 14:48:40 -07:00
Andrew Kelley
accd5701c2 compiler: move struct types into InternPool proper
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.

After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.

Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.

I also couldn't resist migrating over a bunch of alignment API over to
use the log2 Alignment type rather than a mismash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
2023-09-21 14:48:40 -07:00
r00ster91
ee4ced9683 write function types consistently with a space before fn keyword
Currently, the compiler (like @typeName) writes it `fn(...) Type` but
zig fmt writes it `fn (...) Type` (notice the space after `fn`).
This inconsistency is now resolved and function types are consistently
written the zig fmt way. Before this there were more `fn (...) Type`
occurrences than `fn(...) Type` already.
2023-09-19 15:15:05 +03:00
mlugg
6df78c3bc1 Sema: mark pointers to inline functions as comptime-only
This is supposed to be the case, similar to how pointers to generic
functions are comptime-only (several pieces of logic already assumed
this). These types being considered runtime was causing `dbg_var_val`
AIR instructions to be wrongly emitted for such values, causing codegen
backends to create a runtime reference to the inline function, which (at
least on the LLVM backend) triggers an error.

Resolves: #38
2023-09-15 21:46:38 -07:00
mlugg
f366d9f879 compiler: start using destructure syntax 2023-09-15 11:42:08 -07:00
mlugg
88f5315ddf compiler: implement destructuring syntax
This change implements the following syntax into the compiler:

```zig
const x: u32, var y, foo.bar = .{ 1, 2, 3 };
```

A destructure expression may only appear within a block (i.e. not at
comtainer scope). The LHS consists of a sequence of comma-separated var
decls and/or lvalue expressions. The RHS is a normal expression.

A new result location type, `destructure`, is used, which contains
result pointers for each component of the destructure. This means that
when the RHS is a more complicated expression, peer type resolution is
not used: each result value is individually destructured and written to
the result pointers. RLS is always used for destructure expressions,
meaning every `const` on the LHS of such an expression creates a true
stack allocation.

Aside from anonymous array literals, Sema is capable of destructuring
the following types:
* Tuples
* Arrays
* Vectors

A destructure may be prefixed with the `comptime` keyword, in which case
the entire destructure is evaluated at comptime: this means all `var`s
in the LHS are `comptime var`s, every lvalue expression is evaluated at
comptime, and the RHS is evaluated at comptime. If every LHS is a
`const`, this is not allowed: as with single declarations, the user
should instead mark the RHS as `comptime`.

There are a few subtleties in the grammar changes here. For one thing,
if every LHS is an lvalue expression (rather than a var decl), a
destructure is considered an expression. This makes, for instance,
`if (cond) x, y = .{ 1, 2 };` valid Zig code. A destructure is allowed
in almost every context where a standard assignment expression is
permitted. The exception is `switch` prongs, which cannot be
destructures as the comma is ambiguous with the end of the prong.

A follow-up commit will begin utilizing this syntax in the Zig compiler.

Resolves: #498
2023-09-15 11:33:53 -07:00
mlugg
50ef10eb49 Sema: add missing compile error for runtime-known const with comptime-only type
When RLS is used to initialize a value with a comptime-only type, the
usual "value with comptime-only type depends on runtime control flow"
error message isn't hit, because we don't use results from a block. When
we reach `make_ptr_const`, we must validate that the value is
comptime-known.
2023-09-15 14:29:57 +01:00
Andrew Kelley
8592c5cdac compiler: rework capture scopes in-memory layout
* Use 32-bit integers instead of pointers for compactness and
  serialization friendliness.
* Use a separate hash map for runtime and comptime capture scopes,
  avoiding the 1-bit union tag.
* Use a compact array representation instead of a tree of hash maps.
* Eliminate the only instance of ref-counting in the compiler, instead
  relying on garbage collection (not implemented yet but is the plan for
  almost all long-lived objects related to incremental compilation).

Because a code modification may need to access capture scope data, this
makes capture scope data long-lived state. My goal is to get incremental
compilation state serialization down to a single pwritev syscall, by
unifying the on-disk representation with the in-memory representation.
This commit eliminates the last remaining pointer field of
`Module.Decl`.
2023-09-15 00:55:07 -07:00
Andrew Kelley
cb6201715a InternPool: prevent anon struct UAF bugs with type safety
Instead of using actual slices for InternPool.Key.AnonStructType, this
commit changes to use Slice types instead, which store a
long-lived index rather than a pointer.

This is a follow-up to 7ef1eb1c27.
2023-09-12 20:08:56 -04:00
Jacob Young
e2ff486de5 Sema: cleanup coerceExtra
* remove unreachable code
 * remove already handled cases
 * avoid `InternPool.getCoerced`
 * add some undef checks
 * error when converting undef int to float

Closes #16987
2023-08-30 16:50:30 -04:00
Jacob Young
7a251c4cb8 Sema: revert reference trace changes that are no longer needed 2023-08-28 17:43:37 -04:00
Jacob Young
232077dcf9 Sema: create "called from here" notes before reference traces
This allows reference traces to begin at the outermost inline call.
2023-08-28 17:43:19 -04:00
Jacob Young
fe737d7a8f Sema: refactor to add a block param to failWithOwnedErrorMsg
We will need this in a bit...
2023-08-28 17:25:39 -04:00
Jacob Young
acd35a1aa7 Sema: factor out NeededComptimeReason from comptime value resolution
This makes the call sites easier to read, reduces the number of `catch`
expressions required, and prepares for comptime reasons to appear
earlier in the list of notes.
2023-08-28 17:25:39 -04:00
Jacob Young
bdbe16c47a Sema: cleanup to use more enum literals 2023-08-28 17:25:39 -04:00
Jacob Young
70e0d8170f Sema: cleanup @as invasion 2023-08-28 17:25:39 -04:00
mlugg
b8e6c42688 compiler: provide result type for @memset value
Resolves: #16986
2023-08-28 12:33:36 -07:00
mlugg
8d036d1d78 Sema: allow cast builtins on vectors
The following cast builtins did not previously work on vectors, and have
been made to:

* `@floatCast`
* `@ptrFromInt`
* `@intFromPtr`
* `@floatFromInt`
* `@intFromFloat`
* `@intFromBool`

Resolves: #16267
2023-08-28 12:32:02 -07:00
Jacob Young
c6024691cf Sema: implement reference trace with called from here notes
Closes #15124
2023-08-28 12:28:25 -07:00
Andrew Kelley
ada0010471 compiler: move unions into InternPool
There are a couple concepts here worth understanding:

Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, and field names, field
types, and field alignments are not available with this.

InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType` which asserts that the union's enum tag type
has been resolved. This one has all the information available.

Additionally:

* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
  semantic analysis know whether a union has explicit alignment on any
  fields (usually not).
* Sema: delete `resolveTypeRequiresComptime` which had the same type
  signature and near-duplicate logic to `typeRequiresComptime`.
  - Make opaque types not report comptime-only (this was inconsistent
    between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
2023-08-22 13:54:14 -07:00
mlugg
6a5463951f Sema: disallow C pointer to slice coercion
Resolves: #16719
2023-08-21 11:31:22 -07:00
mlugg
82c8e45a7e Sema: check @memset operand provides length
Resolves: #16698
2023-08-21 11:30:20 -07:00
mlugg
321961d860 AstGen: add result location analysis pass
The main motivation for this change is eliminating the `block_ptr`
result location and corresponding `store_to_block_ptr` ZIR instruction.
This is achieved through a simple pass over the AST before AstGen which
determines, for AST nodes which have a choice on whether to provide a
result location, which choice to make, based on whether the result
pointer is consumed non-trivially.

This eliminates so much logic from AstGen that we almost break even on
line count! AstGen no longer has to worry about instruction rewriting
based on whether or not a result location was consumed: it always knows
what to do ahead of time, which simplifies a *lot* of logic. This also
incidentally fixes a few random AstGen bugs related to result location
handling, leading to the changes in `test/` and `lib/std/`.

This opens the door to future RLS improvements by making them much
easier to implement correctly, and fixes many bugs. Most ZIR is made
more compact after this commit, mainly due to not having redundant
`store_to_block_ptr` instructions lying around, but also due to a few
bugs in the old system which are implicitly fixed here.
2023-08-20 11:58:14 -07:00
Lewis Gaul
387b0ac4f1 Make NaNs quiet by default and other NaN tidy-up (#16826)
* Generalise NaN handling and make std.math.nan() give quiet NaNs

* Address uses of std.math.qnan_* and std.math.nan_* consts

* Comment out failing test due to issues with signalling NaN

* Fix issue in c_builtins.zig where we need qnan_u32
2023-08-18 02:07:49 -04:00
Andrew Kelley
7ef1eb1c27 InternPool: safer enum API
The key changes in this commit are:

```diff
-        names: []const NullTerminatedString,
+        names: NullTerminatedString.Slice,
-        values: []const Index,
+        values: Index.Slice,
```

Which eliminates the slices from `InternPool.Key.EnumType` and replaces
them with structs that contain `start` and `len` indexes. This makes the
lifetime of `EnumType` change from expiring with updates to InternPool,
to expiring when the InternPool is garbage-collected, which is currently
never.

This is gearing up for a larger change I started working on locally
which moves union types into InternPool.

As a bonus, I fixed some unnecessary instances of `@as`.
2023-08-17 18:16:03 -07:00
mlugg
083ee8e0e2 InternPool: preserve indices of builtin types when resolved
Some builtin types have a special InternPool index (e.g.
`.type_info_type`) so that AstGen can refer to them before semantic
analysis. Unfortunately, this previously led to a second index existing
to refer to the type once it was resolved, complicating Sema by having
the concept of an "unresolved" type index.

This change makes Sema modify these InternPool indices in-place to
contain the expanded representation when resolved. The analysis of the
corresponding decls is caught in `Module.semaDecl`, and a field is set
on Sema telling it which index to place struct/union/enum types at. This
system could break if `std.builtin` contained complex decls which
evaluate multiple struct types, but this will be caught by the
assertions in `InternPool.resolveBuiltinType`.

The AstGen result types which were disabled in 6917a8c have been
re-enabled.

Resolves: #16603
2023-08-15 11:45:23 +01:00
mlugg
6e2eb208aa Sema: queue type resolution whem emitting array_elem_val
This type not being resolved was a bug which was being triggered by the
next commit.
2023-08-15 11:42:25 +01:00
mlugg
8f3ccbbe36 Sema: provide source location when analyzing panic handler
The panic handler decl_val was previously given a `unneeded` source
location, which was then added to the reference trace, resulting in a
crash if the source location was used in the reference trace. This
commit makes two trivial changes:

* Don't add unneeded source locations to the ref table (panic in debug, silently ignore in release)
* Pass a real source location when analyzing the panic handler
2023-08-14 11:43:21 -07:00
mlugg
5e0107fbce Sema: remove redundant addConstant functions
After ff37ccd, interned values are trivial to convert to Air refs, using
`Air.internedToRef`. This made functions like `Sema.addConstant` effectively
redundant. This commit removes `Sema.addConstant` and `Sema.addType`, replacing
them with direct usages of `Air.internedToRef`.

Additionally, a new helper `Module.undefValue` is added, and the following
functions are moved into Module:
* `Sema.addConstUndef` -> `Module.undefRef`
* `Sema.addUnsignedInt` -> `Module.intRef` (now also works for signed types)

The general pattern here is that any `Module.xyzValue` helper may also have a
corresponding `Module.xyzRef` helper, which just wraps the call in
`Air.internedToRef`.
2023-08-11 11:02:24 -07:00
Jacob Young
8b9161179d Sema: avoid deleting runtime side-effects in comptime initializers
Closes #16744
2023-08-11 11:01:47 -07:00
Xavier Bouchoux
77dd64b5f4 Sema: fix coerceArrayLike() for vectors with padding
as explainded at https://llvm.org/docs/LangRef.html#vector-type :

"In general vector elements are laid out in memory in the same way as array types.
Such an analogy works fine as long as the vector elements are byte sized.
However, when the elements of the vector aren’t byte sized it gets a bit more complicated.
One way to describe the layout is by describing what happens when a vector such
as <N x iM> is bitcasted to an integer type with N*M bits, and then following the
rules for storing such an integer to memory."

"When <N*M> isn’t evenly divisible by the byte size the exact memory layout
is unspecified (just like it is for an integral type of the same size)."
2023-08-10 16:10:59 -07:00
Andrew Kelley
b820d5df79 Merge pull request #16747 from jacobly0/llvm-wo-libllvm
llvm: enable the backend even when not linked to llvm
2023-08-10 12:02:57 -07:00
mlugg
f32b9bc776 Sema: add references to generic instantiations
This makes the reference trace appear for generic calls where it
previously did not.

Resolves: #16725
2023-08-10 10:00:37 +01:00
mlugg
7a7d0225d9 Sema: detect invalid field stores in tuple initialization
This bug was exposed by the previous commit, since array_init is now
used for tuple parameters.
2023-08-10 10:00:37 +01:00
mlugg
f2c8fa769a Sema: refactor generic calls to interleave argument analysis and parameter type resolution
AstGen provides all function call arguments with a result location,
referenced through the call instruction index. The idea is that this
should be the parameter type, but for `anytype` parameters, we use
generic poison, which is required to be handled correctly.

Previously, generic instantiations and inline calls worked by evaluating
all args in advance, before resolving generic parameter types. This
means any generic parameter (not just `anytype` ones) had generic poison
result types. This caused missing result locations in some cases.

Additionally, the generic instantiation logic caused `zirParam` to
analyze the argument types a second time before coercion. This meant
that for nominal types (struct/enum/etc), a *new* type was created,
distinct to the result type which was previously forwarded to the
argument expression.

This commit fixes both of these issues. Generic parameter type
resolution is now interleaved with argument analysis, so that we don't
have unnecessary generic poison types, and generic instantiation logic
now handles parameters itself rather than falling through to the
standard zirParam logic, so avoids duplicating the types.

Resolves: #16566
Resolves: #16258
Resolves: #16753
2023-08-10 10:00:26 +01:00
mlugg
93e53d1e00 compiler: fix crash on invalid result type for @splat
This introduces a new ZIR instruction, `vec_elem_type`.

Co-Authored-By: Ali Chraghi <alichraghi@proton.me>
Resolves: #16567
2023-08-09 19:46:58 +01:00
mlugg
6917a8c258 AstGen: handle ty result location for struct and array init correctly
Well, this was a journey!

The original issue I was trying to fix is covered by the new behavior
test in array.zig: in essence, `ty` and `coerced_ty` result locations
were not correctly propagated.

While fixing this, I noticed a similar bug in struct inits: the type was
propagated to *fields* fine, but the actual struct init was
unnecessarily anonymous, which could lead to unnecessary copies. Note
that the behavior test added in struct.zig was already passing - the bug
here didn't change any easy-to-test behavior - but I figured I'd add it
anyway.

This is a little harder than it seems, because the result type may not
itself be an array/struct type: it could be an optional / error union
wrapper. A new ZIR instruction is introduced to unwrap these.

This is also made a little tricky by the fact that it's possible for
result types to be unknown at the time of semantic analysis (due to
`anytype` parameters), leading to generic poison. In these cases, we
must essentially downgrade to an anonymous initialization.

Fixing these issues exposed *another* bug, related to type resolution in
Sema. That issue is now tracked by #16603. As a temporary workaround for
this bug, a few result locations for builtin function operands have been
disabled in AstGen. This is technically a breaking change, but it's very
minor: I doubt it'll cause any breakage in the wild.
2023-08-09 19:46:55 +01:00
Jacob Young
736df27663 Sema: use the correct decl for generic argument source locations
Closes #16746
2023-08-09 10:09:01 -04:00