Made most `Value` functions require a `Type`. If the provided type is a
vector, the operation is automatically vectorized and returns another
vector. The Sema side can then become vectorized with minimal changes.
There are already a few manually vectorized instructions, but we can
simplify those later.
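A rough user-level analogue of the pattern (illustrative only; the real
change is to the compiler-internal `Value` API, and the helper below is
hypothetical):

```
const std = @import("std");

// A scalar-only operation (the branch does not auto-vectorize).
fn scalarAbs(x: f32) f32 {
    return if (x < 0) -x else x;
}

// Given a type, apply the scalar operation element-wise when the type
// is a vector, returning another vector.
fn absForType(comptime T: type, x: T) T {
    switch (@typeInfo(T)) {
        .Vector => |info| {
            var result: T = undefined;
            comptime var i = 0;
            inline while (i < info.len) : (i += 1) {
                result[i] = scalarAbs(x[i]);
            }
            return result;
        },
        else => return scalarAbs(x),
    }
}

test "same entry point for scalars and vectors" {
    try std.testing.expect(absForType(f32, -1.5) == 1.5);
    const v = absForType(@Vector(2, f32), [2]f32{ -1.0, 2.0 });
    try std.testing.expect(v[0] == 1.0 and v[1] == 2.0);
}
```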
Notably, Value.eql and Value.hash are improved to treat NaN as equal to
itself, so that Type/Value can be used as hash map keys. Likewise,
float hashing normalizes the float value before computing the hash.
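A minimal sketch of the idea, assuming a Wyhash-style hasher from the
standard library (the actual logic lives on `Value`):

```
const std = @import("std");

// IEEE NaN compares unequal to itself, which would break the hash map
// invariant that eql(a, b) implies hash(a) == hash(b).
fn floatEql(a: f64, b: f64) bool {
    if (std.math.isNan(a) and std.math.isNan(b)) return true;
    return a == b;
}

// One plausible normalization before hashing: collapse all NaN bit
// patterns to a canonical NaN (and -0.0 to +0.0) so that eql-equal
// values always hash identically.
fn floatHash(hasher: *std.hash.Wyhash, x: f64) void {
    const canonical: f64 = if (std.math.isNan(x))
        std.math.nan(f64)
    else if (x == 0.0)
        0.0
    else
        x;
    hasher.update(std.mem.asBytes(&canonical));
}
```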
* don't store `has_well_defined_layout` in memory.
* remove the struct-specific `hasWellDefinedLayout` logic; it's just
  `layout != .Auto`. This means we only need one implementation, in
  Type (see the sketch after this list).
* fix some cases that were wrong in `hasWellDefinedLayout`, such as
  optional pointers.
* move the `tag_ty_inferred` field into a position that makes it more
  obvious how the struct will be laid out. We don't have a compiler
  that intelligently reorders fields, so this layout is also better.
* Sema: don't `resolveTypeLayout` in `zirCoerceResultPtr` unless
necessary.
* Rename the `ComptimePtrLoadKit` field `target` to `pointee`, to avoid
  confusion with other meanings of `target` (e.g. the compilation
  target).
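A minimal sketch of the consolidated rule, written against user-level
type info (the real implementation lives on `Type` and covers every
type tag):

```
const std = @import("std");

// Illustrative only: a struct's layout is well-defined exactly when
// its container layout is not .Auto (i.e. it is extern or packed).
fn hasWellDefinedLayout(comptime T: type) bool {
    return switch (@typeInfo(T)) {
        .Struct => |info| info.layout != .Auto,
        // Optional pointers such as `?*T` have pointer layout; other
        // optionals do not have a well-defined layout.
        .Optional => |info| @typeInfo(info.child) == .Pointer,
        .Int, .Float, .Pointer => true,
        else => false,
    };
}

test "well-defined layout examples" {
    try std.testing.expect(hasWellDefinedLayout(extern struct { x: u32 }));
    try std.testing.expect(!hasWellDefinedLayout(struct { x: u32 }));
    try std.testing.expect(hasWellDefinedLayout(?*u8));
    try std.testing.expect(!hasWellDefinedLayout(?u8));
}
```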
The core change here is that we no longer blindly trust that parent
pointers (.elem_ptr, .field_ptr, .eu_payload_ptr, .union_payload_ptr)
were derived from the "true" type of the underlying decl. When types
diverge, direct dereference fails and we are forced to bitcast, as
usual.
In order to maximize our chances to have a successful bitcast, this
includes several changes to the dereference procedure:
- `root` is now `parent` and is the largest Value containing the
  dereference target, with the condition that its layout and the
  byte offset of the target within it are both well-defined.
- If the target cannot be dereferenced directly, because the
pointers were not derived from the true type of the underlying
decl, then it is returned as null.
- `beginComptimePtrDeref` now accepts an optional `array_ty` param,
  which is used to directly dereference an array from an `elem_ptr`,
  if necessary. This allows us to dereference array types without
  well-defined layouts (e.g. `[N]?u8`) at an offset.
The `load_ty` also allows us to correctly "over-read" an `.elem_ptr`
into an array `[N]T`, if necessary. This makes direct dereference work
for array types even in the presence of an offset, which is necessary
when the array has no well-defined layout (e.g. loading from `[6]?u8`).
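For instance, a comptime load like the following, where the element
pointer addresses an array whose layout is not well-defined, can now be
dereferenced directly (an illustrative test, not taken from the actual
test suite):

```
test "comptime load through an elem_ptr into [6]?u8" {
    comptime {
        const arr: [6]?u8 = .{ 1, 2, null, 4, null, 6 };
        const ptr = &arr[2]; // elem_ptr at a nonzero offset
        if (ptr.* != null) @compileError("expected null");
    }
}
```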
Detect when we are storing an array operand through a bitcasted vector
pointer. If so, we instead reach through the bitcast to the underlying
vector pointer, bitcast the array operand to a vector, and then lower
this as a store of a vector value to a vector pointer. This generally
results in better code, and also works around an LLVM bug.
See #11154
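The pattern in question looks roughly like this (illustrative; the
two-argument `@ptrCast` produces the bitcasted pointer):

```
const std = @import("std");

test "array store through a bitcasted vector pointer" {
    var v: @Vector(4, u8) = [4]u8{ 0, 0, 0, 0 };
    const arr_ptr = @ptrCast(*[4]u8, &v);
    // Previously lowered as an array store through the bitcast; now
    // the array operand is bitcast to a vector and stored to &v.
    arr_ptr.* = .{ 1, 2, 3, 4 };
    try std.testing.expect(v[3] == 4);
}
```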
This implements type equality for error sets, done through element-wise
comparison of the error names. Inferred error sets are always distinct
types; all other error sets are always sorted, which makes the
element-wise comparison possible. See #11022.
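In practice (a small illustrative test):

```
const std = @import("std");

test "error set type equality" {
    // Non-inferred error sets are sorted, so equality is element-wise.
    try std.testing.expect(error{ Foo, Bar } == error{ Bar, Foo });
    try std.testing.expect(error{ Foo, Bar } != error{ Foo, Baz });
}
```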
* mul_add AIR instruction: use `pl_op` instead of `ty_pl`. The type is
always the same as the operand; no need to waste bytes redundantly
storing the type.
* AstGen: use coerced_ty for all operands except one, which we use to
  communicate the type.
* Sema: use the correct source location for requireRuntimeBlock in
handling of `@mulAdd`.
* native backends: handle liveness even for the functions that are
TODO.
* C backend: implement `@mulAdd`. It lowers to libc calls.
* LLVM backend: make `@mulAdd` handle all float types.
- improved fptrunc and fpext to handle f80 with compiler-rt calls.
* Value.mulAdd: handle all float types and use the `@mulAdd` builtin.
* behavior tests: revert the changes to the `@mulAdd` tests. Those
  changes broke test coverage by exercising `@mulAdd` only at compile
  time.
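For reference, the builtin these changes exercise (a usage sketch):

```
const std = @import("std");

test "@mulAdd usage" {
    // Fused multiply-add: computes (a * b) + c with a single rounding.
    const a: f32 = 1.5;
    const b: f32 = 2.0;
    const c: f32 = 3.0;
    try std.testing.expectEqual(@as(f32, 6.0), @mulAdd(f32, a, b, c));
}
```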
Improved f80 support:
* std.math.fma handles f80
* move fma functions from freestanding libc to compiler-rt
- add __fmax and fmal
- make __fmax and fmaq only exported when they don't alias fmal.
- make their linkage weak just like the rest of compiler-rt symbols.
* removed `longDoubleIsF128` and replaced it with `longDoubleIs`, which
  takes a type as a parameter. The implementation is now more accurate
  and handles more targets. Similarly, in stage2, `CTypes.sizeInBits`
  is now more accurate for `long double` on more targets.
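With these changes, `std.math.fma` works for f80 as well
(illustrative):

```
const std = @import("std");

test "fma on f80" {
    // (2.0 * 3.0) + 1.0 == 7.0, fused into a single rounding step.
    try std.testing.expectEqual(@as(f80, 7.0), std.math.fma(f80, 2.0, 3.0, 1.0));
}
```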
Similar to how Type.eql was reworked in the previous commit, this commit
reworks Type.hash to check all the different kinds of tags that a Type
can be represented with. It also completes the implementation for all
types except error sets, which need to have Type.eql enhanced as well.
This implements #10113 for the self-hosted compiler only. It removes the
ability to override alignment of packed struct fields, and removes the
ability to put pointers and arrays inside packed structs.
After this commit, nearly all the behavior tests that involve packed
structs pass for the stage2 LLVM backend.
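After this change a packed struct is effectively an integer with named
bit ranges; pointer and array fields and per-field alignment overrides
are out (a small illustration):

```
const std = @import("std");

// Allowed: integer-backed fields with a well-defined bit layout.
const Flags = packed struct {
    a: bool, // bit 0
    b: u3,   // bits 1..3
    c: u4,   // bits 4..7
};

// No longer allowed under #10113 (compile errors, illustrative):
//   packed struct { p: *u8 }          // pointer field
//   packed struct { arr: [4]u8 }      // array field
//   packed struct { x: u8 align(4) }  // alignment override

test "packed struct is one byte" {
    try std.testing.expectEqual(@as(usize, 1), @sizeOf(Flags));
}
```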
I didn't implement the compile errors or compile error tests yet. I'm
waiting until we have stage2 building itself and then I want to rework
the compile error test harness with inspiration from Vexu's arocc test
harness. At that point it should be a much nicer dev experience to work
on compile errors.
Big-int functions were updated to respect the provided abi_size rather
than implicitly inferring a potentially incorrect one.
In combination with the convention that any required padding bits are
added on the MSB end, this means that exotic integers can potentially
have a well-defined memory layout.
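For example, under the MSB-padding convention a u12 occupies its full
ABI size, with the four padding bits at the most-significant end
(illustrative sketch, assuming a little-endian target):

```
const std = @import("std");

test "u12 memory layout" {
    const x: u12 = 0xABC;
    const bytes = std.mem.toBytes(x); // [2]u8: ABI size of u12 is 2
    // Little-endian: low byte first; the 4 padding bits occupy the
    // high nibble of the most significant byte, so mask them off.
    try std.testing.expectEqual(@as(u8, 0xBC), bytes[0]);
    try std.testing.expectEqual(@as(u8, 0x0A), bytes[1] & 0x0F);
}
```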
Get rid of `std.math.F80Repr`. Instead of trying to match the memory
layout of f80, we treat it as a value, same as the other floating point
types. The functions `make_f80` and `break_f80` are introduced to
compose an f80 value out of its parts and to perform the inverse
operation.
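A round trip through the new helpers (assuming the parts struct exposes
`fraction` and `exp` fields as described):

```
const std = @import("std");
const math = std.math;

test "make_f80 / break_f80 round trip" {
    // 1.0 in x87 extended precision: biased exponent 0x3FFF and the
    // explicit integer bit set at the top of the fraction.
    const one = math.make_f80(.{ .fraction = 0x8000000000000000, .exp = 0x3FFF });
    try std.testing.expectEqual(@as(f80, 1.0), one);
    const back = math.break_f80(one);
    try std.testing.expectEqual(@as(u16, 0x3FFF), back.exp);
}
```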
stage2 LLVM backend: fix pointer to zero-length array tripping an LLVM
assertion. It now checks for when the element type is a zero-bit type
and lowers such a pointer the same way that pointers to other zero-bit
types are lowered.
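An example of the kind of decl that used to trip the assertion
(illustrative):

```
const empty: [0]u8 = .{};

// A pointer to a zero-length array has a zero-bit pointee; it is now
// lowered like any other pointer to a zero-bit type.
export fn zeroLenPtr() *const [0]u8 {
    return &empty;
}
```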
Both stage1 and stage2 LLVM backends are adjusted so that f80 is lowered
as x86_fp80 on x86_64 and i386 architectures, and identical to a u80 on
others. LLVM constants are lowered in a less hacky way now that #10860
is fixed, by building the expression `(exp << 64) | fraction` out of
LLVM constants.
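The same formula, worked for the constant 1.0 (illustrative):

```
const std = @import("std");

test "f80 constant bit pattern" {
    const exp: u80 = 0x3FFF; // biased exponent of 1.0
    const fraction: u80 = 0x8000000000000000; // explicit integer bit set
    const bits: u80 = (exp << 64) | fraction;
    try std.testing.expectEqual(@as(u80, 0x3FFF_8000_0000_0000_0000), bits);
}
```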
Sema is improved to handle c_longdouble by recursing on whatever the
underlying float bit width is, in both stage1 and stage2.
* F80Repr extern struct needs no explicit padding; let's match the
target padding.
* stage2: fix lowering of f80 constants.
* stage1: decide ABI size and alignment of f80 based on the alignment
  of u64. x86 has alignof(u64) equal to 4, but ARM has it as 8.
* stage2: fix Value.floatReadFromMemory to use F80Repr
* pass more x64 behavior tests
* return a TODO error when lowering a decl with no runtime bits
* insert some debug logs for tracing recursive descent down the
type-value tree when lowering types
* print `Decl`'s name when print debugging `decl_ref` value
* pass air_tag instead of zir_tag
* also pass the eval function so that the branch happens only once and
  the body of zirUnaryMath is simplified
* Value.sqrt: update to handle f80 and f128 in the normalized way that
  includes handling c_longdouble (exercised in the sketch below).
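An illustrative exercise of the updated paths:

```
const std = @import("std");

test "@sqrt on f80 and c_longdouble" {
    const x: f80 = 4.0;
    try std.testing.expect(@sqrt(x) == 2.0);
    // Comptime sqrt goes through Value.sqrt, including c_longdouble.
    comptime std.debug.assert(@sqrt(@as(c_longdouble, 9.0)) == 3.0);
}
```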
Semi-related change: fix incorrect sqrt builtin name for f80 in stage1.
Support for f128, comptime_float, and c_longdouble requires
improvements to compiler_rt and will be implemented in a later PR. Some
of the code in this commit could be made more generic; for instance,
`llvm.airSqrt` could probably become `llvm.airUnaryMath`, but let's
cross that bridge when we get to it.
Currently Zig lowers `@intToFloat` for f80 incorrectly on non-x86
targets:
```
broken LLVM module found:
UIToFP result must be FP or FP vector
%62 = uitofp i64 %61 to i128
SIToFP result must be FP or FP vector
%66 = sitofp i64 %65 to i128
```
This happens because on such targets we use i128 instead of x86_fp80 in
order to avoid "LLVM ERROR: Cannot select". `@intToFloat` must be
lowered differently to account for this representation as well.
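A minimal reproducer of the pattern (illustrative):

```
// On targets where f80 is represented as i128, this used to produce
// the invalid `uitofp i64 ... to i128` shown above.
export fn toF80(x: u64) f80 {
    return @intToFloat(f80, x);
}
```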