This commit makes some big changes to how we track state for Zig source
files. In particular, it changes:
* How `File` tracks its path on-disk
* How AstGen discovers files
* How file-level errors are tracked
* How `builtin.zig` files and modules are created
The original motivation here was to address incremental compilation bugs
with the handling of files, such as #22696. To fix this, a few changes
are necessary.
Just like declarations may become unreferenced on an incremental update,
meaning we suppress analysis errors associated with them, it is also
possible for all imports of a file to be removed on an incremental
update, in which case file-level errors for that file should be
suppressed. As such, after AstGen, the compiler must traverse files
(starting from analysis roots) and discover the set of "live files" for
this update.
Additionally, the compiler's previous handling of retryable file errors
was not very good; the source location the error was reported as was
based only on the first discovered import of that file. This source
location also disappeared on future incremental updates. So, as a part
of the file traversal above, we also need to figure out the source
locations of imports which errors should be reported against.
Another observation I made is that the "file exists in multiple modules"
error was not implemented in a particularly good way (I get to say that
because I wrote it!). It was subject to races, where the order in which
different imports of a file were discovered affects both how errors are
printed, and which module the file is arbitrarily assigned, with the
latter in turn affecting which other files are considered for import.
The thing I realised here is that while the AstGen worker pool is
running, we cannot know for sure which module(s) a file is in; we could
always discover an import later which changes the answer.
So, here's how the AstGen workers have changed. We initially ensure that
`zcu.import_table` contains the root files for all modules in this Zcu,
even if we don't know any imports for them yet. Then, the AstGen
workers do not need to be aware of modules. Instead, they simply ignore
module imports, and only spin off more workers when they see a by-path
import.
During AstGen, we can't use module-root-relative paths, since we don't
know which modules files are in; but we don't want to unnecessarily use
absolute files either, because those are non-portable and can make
`error.NameTooLong` more likely. As such, I have introduced a new
abstraction, `Compilation.Path`. This type is a way of representing a
filesystem path which has a *canonical form*. The path is represented
relative to one of a few special directories: the lib directory, the
global cache directory, or the local cache directory. As a fallback, we
use absolute (or cwd-relative on WASI) paths. This is kind of similar to
`std.Build.Cache.Path` with a pre-defined list of possible
`std.Build.Cache.Directory`, but has stricter canonicalization rules
based on path resolution to make sure deduplicating files works
properly. A `Compilation.Path` can be trivially converted to a
`std.Build.Cache.Path` from a `Compilation`, but is smaller, has a
canonical form, and has a digest which will be consistent across
different compiler processes with the same lib and cache directories
(important when we serialize incremental compilation state in the
future). `Zcu.File` and `Zcu.EmbedFile` both contain a
`Compilation.Path`, which is used to access the file on-disk;
module-relative sub paths are used quite rarely (`EmbedFile` doesn't
even have one now for simplicity).
After the AstGen workers all complete, we know that any file which might
be imported is definitely in `import_table` and up-to-date. So, we
perform a single-threaded graph traversal; similar to what
`resolveReferences` plays for `AnalUnit`s, but for files instead. We
figure out which files are alive, and which module each file is in. If a
file turns out to be in multiple modules, we set a field on `Zcu` to
indicate this error. If a file is in a different module to a prior
update, we set a flag instructing `updateZirRefs` to invalidate all
dependencies on the file. This traversal also discovers "import errors";
these are errors associated with a specific `@import`. With Zig's
current design, there is only one possible error here: "import outside
of module root". This must be identified during this traversal instead
of during AstGen, because it depends on which module the file is in. I
tried also representing "module not found" errors in this same way, but
it turns out to be much more useful to report those in Sema, because of
use cases like optional dependencies where a module import is behind a
comptime-known build option.
For simplicity, `failed_files` now just maps to `?[]u8`, since the
source location is always the whole file. In fact, this allows removing
`LazySrcLoc.Offset.entire_file` completely, slightly simplifying some
error reporting logic. File-level errors are now directly built in the
`std.zig.ErrorBundle.Wip`. If the payload is not `null`, it is the
message for a retryable error (i.e. an error loading the source file),
and will be reported with a "file imported here" note pointing to the
import site discovered during the single-threaded file traversal.
The last piece of fallout here is how `Builtin` works. Rather than
constructing "builtin" modules when creating `Package.Module`s, they are
now constructed on-the-fly by `Zcu`. The map `Zcu.builtin_modules` maps
from digests to `*Package.Module`s. These digests are abstract hashes of
the `Builtin` value; i.e. all of the options which are placed into
"builtin.zig". During the file traversal, we populate `builtin_modules`
as needed, so that when we see this imports in Sema, we just grab the
relevant entry from this map. This eliminates a bunch of awkward state
tracking during construction of the module graph. It's also now clearer
exactly what options the builtin module has, since previously it
inherited some options arbitrarily from the first-created module with
that "builtin" module!
The user-visible effects of this commit are:
* retryable file errors are now consistently reported against the whole
file, with a note pointing to a live import of that file
* some theoretical bugs where imports are wrongly considered distinct
(when the import path moves out of the cwd and then back in) are fixed
* some consistency issues with how file-level errors are reported are
fixed; these errors will now always be printed in the same order
regardless of how the AstGen pass assigns file indices
* incremental updates do not print retryable file errors differently
between updates or depending on file structure/contents
* incremental updates support files changing modules
* incremental updates support files becoming unreferenced
Resolves: #22696
Textual PTX is just assembly language like any other. And if we do ever add
support for emitting PTX object files after reverse engineering the bytecode
format, we'd be emitting ELF files like the CUDA toolchain. So there's really no
need for a special ObjectFormat tag here, nor linker code that treats it as a
distinct format.
This allows emitting object files for s390x-zos (GOFF) and powerpc(64)-aix
(XCOFF).
Note that GOFF emission in LLVM is still being worked on upstream for LLVM 21;
the resulting object files are useless right now. Also, -fstrip is required, or
LLVM will SIGSEGV during DWARF emission.
The old logic only decremented `remaining_prelink_tasks` if `bin_file`
was not `null`. This meant that on `-fno-emit-bin` builds with
registered prelink tasks (e.g. C source files), we exited from
`Compilation.performAllTheWorkInner` early, assuming a prelink error.
Instead, when `bin_file` is `null`, we still decrement
`remaining_prelink_tasks`; we just don't do any actual work.
Resolves: #22682
Windows is a ridiculous operating system designed by toddlers, and so
requires us to close all file handles in the `tmp/xxxxxxx` cache dir
before renaming it into `o/xxxxxxx`. We have a hack in place to handle
this for the main output file, but the MachO linker also outputs a file
with debug symbols, and we weren't closing it! This led to a fuckton of
CI failures when we enabled `.whole` cache mode by default for
self-hosted backends.
thanks jacob for figuring this out while i sat there
This can also be extended to ELF later as it means roughly the same thing there.
This addresses the main issue in #21721 but as I don't have a macOS machine to
do further testing on, I can't confirm whether zig cc is able to pass the entire
cgo test suite after this commit. It can, however, cross-compile a basic program
that uses cgo to x86_64-macos-none which previously failed due to lack of -x
support. Unlike previously, the resulting symbol table does not contain local
symbols (such as C static functions).
I believe this satisfies the related donor bounty: https://ziglang.org/news/second-donor-bounty
Functions like isMinGW() and isGnuLibC() have a good reason to exist: They look
at multiple components of the target. But functions like isWasm(), isDarwin(),
isGnu(), etc only exist to save 4-8 characters. I don't think this is a good
enough reason to keep them, especially given that:
* It's not immediately obvious to a reader whether target.isDarwin() means the
same thing as target.os.tag.isDarwin() precisely because isMinGW() and similar
functions *do* look at multiple components.
* It's not clear where we would draw the line. The logical conclusion before
this commit would be to also wrap Arch.isX86(), Os.Tag.isSolarish(),
Abi.isOpenHarmony(), etc... this obviously quickly gets out of hand.
* It's nice to just have a single correct way of doing something.
Currently zig fails to build while linking the system LLVM/C++ libraries
on my Chimera Linux system due to the fact that libc++.so is a linker
script with the following contents:
INPUT(libc++.so.1 -lc++abi -lunwind)
Prior to this commit, zig would try to convert "ambiguous names" in
linker scripts such as libc++.so.1 in this example into -lfoo style
flags. This fails in this case due to the so version number as zig
checks for exactly the .so suffix.
Furthermore, I do not think that this conversion is semantically correct
since converting libfoo.so to -lfoo could theoretically end up resulting
in libfoo.a getting linked which seems wrong when a different file is
specified in the linker script.
With this patch, this attempted conversion is removed. Instead, zig
always first checks if the exact file/path in the linker script exists
relative to the current working directory.
If the file is classified as a library (including versioned shared
objects such as libfoo.so.1), zig then falls back to checking if
the exact file/path in the linker script exists relative to each
directory in the library search path, selecting the first match or
erroring out if none is found.
This behavior fixes the regression that prevents building zig while
linking the system LLVM/C++ libraries on Chimera Linux.
Instead, `source`, `tree`, and `zir` should all be optional. This is
precisely what we're actually trying to model here; and `File` isn't
optimized for memory consumption or serializability anyway, so it's fine
to use a couple of extra bytes on actual optionals here.
Makes linker functions have small error sets, required to report
diagnostics properly rather than having a massive error set that has a
lot of codes.
Other linker implementations are not ported yet.
Also the branch is not passing semantic analysis yet.
The goals of this branch are to:
* compile faster when using the wasm linker and backend
* enable saving compiler state by directly copying in-memory linker
state to disk.
* more efficient compiler memory utilization
* introduce integer type safety to wasm linker code
* generate better WebAssembly code
* fully participate in incremental compilation
* do as much work as possible outside of flush(), while continuing to do
linker garbage collection.
* avoid unnecessary heap allocations
* avoid unnecessary indirect function calls
In order to accomplish this goals, this removes the ZigObject
abstraction, as well as Symbol and Atom. These abstractions resulted
in overly generic code, doing unnecessary work, and needless
complications that simply go away by creating a better in-memory data
model and emitting more things lazily.
For example, this makes wasm codegen emit MIR which is then lowered to
wasm code during linking, with optimal function indexes etc, or
relocations are emitted if outputting an object. Previously, this would
always emit relocations, which are fully unnecessary when emitting an
executable, and required all function calls to use the maximum size LEB
encoding.
This branch introduces the concept of the "prelink" phase which occurs
after all object files have been parsed, but before any Zcu updates are
sent to the linker. This allows the linker to fully parse all objects
into a compact memory model, which is guaranteed to be complete when Zcu
code is generated.
This commit is not a complete implementation of all these goals; it is
not even passing semantic analysis.
This commit separates semantic analysis of the annotated type vs value
of a global declaration, therefore allowing recursive and mutually
recursive values to be declared.
Every `Nav` which undergoes analysis now has *two* corresponding
`AnalUnit`s: `.{ .nav_val = n }` and `.{ .nav_ty = n }`. The `nav_val`
unit is responsible for *fully resolving* the `Nav`: determining its
value, linksection, addrspace, etc. The `nav_ty` unit, on the other
hand, resolves only the information necessary to construct a *pointer*
to the `Nav`: its type, addrspace, etc. (It does also analyze its
linksection, but that could be moved to `nav_val` I think; it doesn't
make any difference).
Analyzing a `nav_ty` for a declaration with no type annotation will just
mark a dependency on the `nav_val`, analyze it, and finish. Conversely,
analyzing a `nav_val` for a declaration *with* a type annotation will
first mark a dependency on the `nav_ty` and analyze it, using this as
the result type when evaluating the value body.
The `nav_val` and `nav_ty` units always have references to one another:
so, if a `Nav`'s type is referenced, its value implicitly is too, and
vice versa. However, these dependencies are trivial, so, to save memory,
are only known implicitly by logic in `resolveReferences`.
In general, analyzing ZIR `decl_val` will only analyze `nav_ty` of the
corresponding `Nav`. There are two exceptions to this. If the
declaration is an `extern` declaration, then we immediately ensure the
`Nav` value is resolved (which doesn't actually require any more
analysis, since such a declaration has no value body anyway).
Additionally, if the resolved type has type tag `.@"fn"`, we again
immediately resolve the `Nav` value. The latter restriction is in place
for two reasons:
* Functions are special, in that their externs are allowed to trivially
alias; i.e. with a declaration `extern fn foo(...)`, you can write
`const bar = foo;`. This is not allowed for non-function externs, and
it means that function types are the only place where it is possible
for a declaration `Nav` to have a `.@"extern"` value without actually
being declared `extern`. We need to identify this situation
immediately so that the `decl_ref` can create a pointer to the *real*
extern `Nav`, not this alias.
* In certain situations, such as taking a pointer to a `Nav`, Sema needs
to queue analysis of a runtime function if the value is a function. To
do this, the function value needs to be known, so we need to resolve
the value immediately upon `&foo` where `foo` is a function.
This restriction is simple to codify into the eventual language
specification, and doesn't limit the utility of this feature in
practice.
A consequence of this commit is that codegen and linking logic needs to
be more careful when looking at `Nav`s. In general:
* When `updateNav` or `updateFunc` is called, it is safe to assume that
the `Nav` being updated (the owner `Nav` for `updateFunc`) is fully
resolved.
* Any `Nav` whose value is/will be an `@"extern"` or a function is fully
resolved; see `Nav.getExtern` for a helper for a common case here.
* Any other `Nav` may only have its type resolved.
This didn't seem to be too tricky to satisfy in any of the existing
codegen/linker backends.
Resolves: #131
Previous commits
2b0929929d4ea2f441df
had this text:
> There are no dir components, so you would think that this was
> unreachable, however we have observed on macOS two processes racing to
> do openat() with O_CREAT manifest in ENOENT.
This appears to have been a misunderstanding based on the issue
report #12138 and corresponding PR #12139 in which the steps to
reproduce removed the cache directory in a loop which also executed
detached Zig compiler processes.
There is no evidence for the macOS kernel bug however the ENOENT is
easily explained by the removal of the cache directory.
This commit reverts those commits, ultimately reporting the ENOENT as an
error rather than repeating the create file operation. However this
commit also adds an explicit error set to `std.Build.Cache.hit` as well
as changing the `failed_file_index` to a proper diagnostic field that
fully communicates what failed, leading to more informative error
messages on failure to check the cache.
The equivalent failure when occuring for AstGen performs a fatal process
kill, reasoning being that the compiler has an invariant of the cache
directory not being yanked out from underneath it while executing. This
could be made a more granular error in the future but I suspect such
thing is not valuable to pursue.
Related to #18340 but does not solve it.
Primarily, this moves linker input parsing from flush() into the linker
task queue, which is executed simultaneously with the frontend.
I also made it avoid redundantly opening the same archive file N times
for each object file inside. Furthermore, hard code fixed buffer stream
rather than using a generic stream type.
Finally, I fixed the error handling of the Wasm.Archive.parse function.
Please pay attention to this pattern of returning a struct rather than
accepting a mutable struct as an argument. This ensures function-level
atomicity and makes resource management straightforward.
Deletes the file and path fields from Archive and Object.
Removed a well-meaning but ultimately misguided suggestion about how to
think about ZigObject since thinking about it that way has led to
problematic anti-DOD patterns.
* AIX has its own bespoke format.
* Handle all Apple platforms.
* FreeBSD and OpenBSD both use the GNU format in LLVM.
* Windows has since been switched to the COFF format by default in LLVM.
LLVM recently introduced new Triple::ArchType members in 19.1.3 which broke our
static assertions in zig_llvm.cpp. When implementing a fix for that, I realized
that we don't even need a lot of the stuff we have in zig_llvm.(cpp,h) anymore.
This commit trims the interface down considerably.
these tasks have some shared data dependencies so they cannot be done
simultaneously. Future work should untangle these data dependencies so
that more can be done in parallel.
for now this commit ensures correctness by making linker input parsing
and codegen tasks part of the same queue.