diff --git a/README.md b/README.md index f3464db..2534f59 100644 --- a/README.md +++ b/README.md @@ -3,12 +3,21 @@ # Bazel zig cc toolchain This is a C/C++ toolchain that can (cross-)compile C/C++ programs. It contains -clang-13, musl, glibc (versions 2-2.34, selectable), all in a ~40MB package. -Read +clang-13, musl, glibc 2-2.35, all in a ~40MB package. Read [here](https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html) about zig-cc; the rest of the README will present how to use this toolchain from Bazel. +Configuring toolchains in Bazel is complex, under-documented, and fraught with +peril. I, the co-author of bazel-zig-cc, am still confused about how this all +works, and often wonder why it works at all. That aside, we made our best +effort to make bazel-zig-cc usable for your C/C++/CGo projects, with as many +guardrails as we could install. + +While copy-pasting the code into your project, attempt to read and understand the +text surrounding the code snippets. This will save you hours of head +scratching, I promise. + # Usage Add this to your `WORKSPACE`: @@ -25,23 +34,18 @@ http_archive( load("@bazel-zig-cc//toolchain:defs.bzl", zig_toolchains = "toolchains") -zig_toolchains() +# version, url_formats and host_platform_sha256 are optional, but highly +# recommended. Zig SDK is by default downloaded from dl.jakstys.lt, which is a +# tiny server in the closet of Yours Truly. +zig_toolchains( + version = "<...>", + url_formats = [ + "https://example.org/zig/zig-{host_platform}-{version}.tar.xz", + ], + host_platform_sha256 = { ... }, +) ``` -> ### zig sdk download control -> -> If you are using this in production, you probably want more control over -> where the zig sdk is downloaded from: -> ``` -> zig_register_toolchains( -> version = "<...>", -> url_formats = [ -> "https://example.internal/zig/zig-{host_platform}-{version}.tar.xz", -> ], -> host_platform_sha256 = { ... 
}, -> ) -> ``` - And this to `.bazelrc`: ``` @@ -49,15 +53,20 @@ build --incompatible_enable_cc_toolchain_resolution ``` The snippets above will download the zig toolchain and make the bazel -toolchains available for registration and usage. +toolchains available for registration and usage. If you do nothing else, this +may work. The `.bazelrc` snippet instructs Bazel to use the registered "new +kinds of toolchains". All of the above is required regardless of how one wants +to use it. The next steps depend on how one wants to use bazel-zig-cc. The +descriptions below also serve as a gentle introduction to C++ toolchains from +the user's perspective. -The next steps depend on your use case. +## Use case: manually build a single target with a specific zig cc toolchain -## I want to manually build a single target with a specific zig cc toolchain +This option is the least disruptive for a workflow that currently has no +hermetic C++ toolchain, and works best when trying out or getting started with +bazel-zig-cc for a subset of targets. -You may explicitly request Bazel to use a specific toolchain (compatible with -the specified platform). For example, if you wish to compile a specific binary -(or run tests) on linux/amd64/musl, you may specify: +To request Bazel to use a specific toolchain (compatible with the specified +platform) for build/tests/whatever on linux-arm64-musl, do: ``` bazel build \ @@ -66,13 +75,82 @@ bazel build \ //test/go:go ``` -This registers the toolchain `@zig_sdk//toolchain:linux_arm64_musl` for linux -arm64 targets. This toolchains links code statically with musl. We also specify -that we want to build //test/go:go for linux arm64. +There are a few things going on here; let's dissect them. 
-## I want to use zig cc as the default compiler +### Option `--platforms @zig_sdk//platform:linux_arm64` + +Specifies that our target platform is `linux_arm64`, which resolves into: + +``` +$ bazel query --output=build @zig_sdk//platform:linux_arm64 +platform( + name = "linux_arm64", + generator_name = "linux_arm64", + generator_function = "declare_platforms", + generator_location = "platform/BUILD:7:18", + constraint_values = ["@platforms//os:linux", "@platforms//cpu:aarch64"], +) +``` + +`constraint_values` instructs Bazel to look for a **toolchain** that is +compatible with (in Bazelspeak, `target_compatible_with`) **all of the** +`["@platforms//os:linux", "@platforms//cpu:aarch64"]`. + +### Option `--extra_toolchains @zig_sdk//toolchain:linux_arm64_musl` + +Inspect first: + +``` +$ bazel query --output=build @zig_sdk//toolchain:linux_arm64_musl +toolchain( + name = "linux_arm64_musl", + generator_name = "linux_arm64_musl", + generator_function = "declare_toolchains", + generator_location = "toolchain/BUILD:7:19", + toolchain_type = "@bazel_tools//tools/cpp:toolchain_type", + target_compatible_with = [ + "@platforms//os:linux", + "@platforms//cpu:aarch64", + "@zig_sdk//libc:unconstrained", + ], + toolchain = "@zig_sdk//private:aarch64-linux-musl_cc", +) +``` + +The above means the toolchain is compatible with platforms that include +`@platforms//os:linux`, `@platforms//cpu:aarch64` (an alias to +`@platforms//cpu:arm64`) and `@zig_sdk//libc:unconstrained`. For a platform to +pick up the right toolchain, the toolchain's `target_compatible_with` must be +equivalent to or a superset of the platform's `constraint_values`. Since the +toolchain's list is a superset (so `libc:unconstrained` does not matter here), +the platform is compatible with this toolchain. 
As a result, `--platforms +@zig_sdk//platform:linux_arm64` causes Bazel to select the toolchain +`@zig_sdk//toolchain:linux_arm64_musl` (because it satisfies all constraints), +which will compile and link the C/C++ code with musl. + +`@zig_sdk//libc:unconstrained` will become important later. + +### Same as above, less typing (with `--config`) + +Specifying the platform and toolchain for every target may become burdensome, +so they can be put in `.bazelrc` and used via `--config`. For example, append +this to `.bazelrc`: + +``` +build:linux_arm64 --platforms @zig_sdk//platform:linux_arm64 +build:linux_arm64 --extra_toolchains @zig_sdk//toolchain:linux_arm64_musl +``` + +And then building for linux-arm64-musl boils down to: + +``` +bazel build --config=linux_arm64 //test/go:go +``` + +## Use case: always compile with zig cc + +Instead of adding the toolchains to `.bazelrc`, they can be added +unconditionally. Append this to `WORKSPACE` after `zig_toolchains(...)`: -Replace the call to `zig_register_toolchains` with ``` register_toolchains( "@zig_sdk//toolchain:linux_amd64_gnu.2.19", @@ -82,100 +160,93 @@ register_toolchains( ) ``` -The snippets above will download the zig toolchain and register it for the -following configurations: +Append this to `.bazelrc`: -- `toolchain:linux_amd64_gnu.2.19` for `["@platforms//os:linux", "@platforms//cpu:x86_64", "@zig_sdk//libc:unconstrained"]`. -- `toolchain:linux_arm64_gnu.2.28` for `["@platforms//os:linux", "@platforms//cpu:aarch64", "@zig_sdk//libc:unconstrained"]`. -- `toolchain:darwin_arm64` for `["@platforms//os:macos", "@platforms//cpu:x86_64"]`. -- `toolchain:darwin_arm64` for `["@platforms//os:macos", "@platforms//cpu:aarch64"]`. - -> ### Naming -> -> Both Go and Bazel naming schemes are accepted. 
For convenience with -> Go, the following Go-style toolchain aliases are created: -> -> |Bazel (zig) name | Go name | -> |---------------- | -------- | -> |`x86_64` | `amd64` | -> |`aarch64` | `arm64` | -> |`macos` | `darwin` | -> -> For example, the toolchain `linux_amd64_gnu.2.28` is aliased to -> `x86_64-linux-gnu.2.28`. To find out which toolchains can be registered or -> used, run: -> -> ``` -> $ bazel query @zig_sdk//toolchain/... -> ``` - -> ### Disabling the default bazel cc toolchain -> -> It may be useful to disable the default toolchain that bazel configures for -> you, so that configuration issues can be caught early on: -> -> .bazelrc -> ``` -> build:zig_cc --action_env BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 -> ``` -> -> This is not documented in bazel, so use at your own peril. - -## I want to start using the zig cc toolchain gradually - -You can register your zig cc toolchains under a config in your .bazelrc ``` -build:zig_cc --extra_toolchains @zig_sdk//toolchain:linux_amd64_gnu.2.19 -build:zig_cc --extra_toolchains @zig_sdk//toolchain:linux_arm64_gnu.2.28 -build:zig_cc --extra_toolchains @zig_sdk//toolchain:darwin_amd64 -build:zig_cc --extra_toolchains @zig_sdk//toolchain:darwin_arm64 +build --action_env BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1 ``` -Then for your builds/tests you need to specify that the `zig_cc` config -should be used: +From Bazel's perspective, this is almost equivalent to always specifying +`--extra_toolchains` on every `bazel <...>` command-line invocation. It also +means there is no way to disable the toolchain from the command line. This +makes sense if you find bazel-zig-cc useful enough to compile all of your +targets and tools with it. + +With `BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1` Bazel stops detecting the default +host toolchain. Configuring toolchains is complicated enough, and the +auto-detection (read: fallback to non-hermetic toolchain) is a footgun best +avoided. This option is not documented in bazel, so it may break. 
If you intend to +use the hermetic toolchain exclusively, it won't hurt. + +## Use case: zig-cc for targets for multiple libc variants + +When some targets need to be built with different libcs (either different +versions of glibc or musl), use a linux toolchain from +`@zig_sdk//libc_aware/toolchain:<...>`. The toolchain will only be selected +when building for a specific libc. For example, in `WORKSPACE`: + ``` -bazel build --config zig_cc //test/go:go +register_toolchains( + "@zig_sdk//libc_aware/toolchain:linux_amd64_gnu.2.19", + "@zig_sdk//libc_aware/toolchain:linux_amd64_gnu.2.28", + "@zig_sdk//libc_aware/toolchain:x86_64-linux-musl", +) ``` -You can build a target for a different platform like so: +What does `@zig_sdk//libc_aware/toolchain:linux_amd64_gnu.2.19` mean? + ``` -bazel build --config zig_cc \ - --platforms @zig_sdk//platform:linux_arm64 \ - //test/go:go +$ bazel query --output=build @zig_sdk//libc_aware/toolchain:linux_amd64_gnu.2.19 |& grep target + target_compatible_with = ["@platforms//os:linux", "@platforms//cpu:x86_64", "@zig_sdk//libc:gnu.2.19"], ``` -## I want to use zig to build targets for multiple libc variants +To see how this relates to the platform: -If you have targets that need to be build with different glibc versions or with -musl, you can register a linux toolchain declared under `libc_aware/toolchains`. -It will only be selected when building for a specific libc version. For example - -- `libc_aware/toolchain:linux_amd64_gnu.2.19` for `["@platforms//os:linux", "@platforms//cpu:x86_64", "@zig_sdk//libc:gnu.2.19"]`. -- `libc_aware/toolchain:linux_amd64_gnu.2.28` for `["@platforms//os:linux", "@platforms//cpu:x86_64", "@zig_sdk//libc:gnu.2.28"]`. -- `libc_aware/toolchain:x86_64-linux-musl` for `["@platforms//os:linux", "@platforms//cpu:x86_64", "@zig_sdk//libc:musl"]`. 
- -With these toolchains registered, you can build a project for a specific libc -aware platform: ``` -$ bazel build --platforms @zig_sdk//libc_aware/platform:linux_amd64_gnu.2.19 //test/go:go -$ bazel build --platforms @zig_sdk//libc_aware/platform:linux_amd64_gnu.2.28 //test/go:go -$ bazel build --platforms @zig_sdk//libc_aware/platform:linux_amd64_musl //test/go:go +$ bazel query --output=build @zig_sdk//libc_aware/platform:linux_amd64_gnu.2.19 |& grep constraint + constraint_values = ["@platforms//os:linux", "@platforms//cpu:x86_64", "@zig_sdk//libc:gnu.2.19"], ``` -You can see the list of libc aware toolchains and platforms by running: +In this case, the platform's `constraint_values` and toolchain's +`target_compatible_with` are identical, causing Bazel to select the right +toolchain for the requested platform. With these toolchains registered, one can +build a project for a specific libc-aware platform; Bazel will select the +appropriate toolchain: + +``` +$ bazel run --platforms @zig_sdk//libc_aware/platform:linux_amd64_gnu.2.19 //test/c:which_libc +glibc_2.19 +$ bazel run --platforms @zig_sdk//libc_aware/platform:linux_amd64_gnu.2.28 //test/c:which_libc +glibc_2.28 +$ bazel run --platforms @zig_sdk//libc_aware/platform:linux_amd64_musl //test/c:which_libc +non-glibc +$ bazel run --run_under=file --platforms @zig_sdk//libc_aware/platform:linux_arm64_gnu.2.28 //test/c:which_libc +which_libc: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 2.0.0, stripped +``` + +To see the list of libc-aware toolchains and platforms: + ``` $ bazel query @zig_sdk//libc_aware/toolchain/... $ bazel query @zig_sdk//libc_aware/platform/... ``` -This is especially useful if you are relying on [transitions][transitions], as -transitioning `extra_platforms` will cause your host tools to be rebuilt with -the specific libc version, which takes time, and your host may not be able to -run them. 
+Libc-aware toolchains are especially useful when relying on +[transitions][transitions], as transitioning `extra_platforms` will cause the +host tools to be rebuilt with the specific libc version, which takes time; also +the build host may not be able to run them if, say, the target glibc version is +newer than the one on the host. Some tests in this repository (under `test/`) +use transitions; check them out to see how it's done. + +The `@zig_sdk//libc:variant` constraint is necessary to select a matching +toolchain. Remember: the toolchain's `target_compatible_with` must be +equivalent to or a superset of the platform's `constraint_values`. This is why +both libc-aware platforms and libc-aware toolchains reside in their own +namespace; if we try to mix non-libc-aware with libc-aware, confusion ensues. + +To use the libc constraints in the project's platform definitions, add a +`@zig_sdk//libc:variant` constraint to them. See the list of available values: -The `@zig_sdk//libc:variant` constraint is used to select a matching toolchain. -If you are using your own platform definitions, add a `@zig_sdk//libc:variant` -constraint to them. See the list of available values: ``` $ bazel query "attr(constraint_setting, @zig_sdk//libc:variant, @zig_sdk//...)" ``` @@ -183,8 +254,28 @@ $ bazel query "attr(constraint_setting, @zig_sdk//libc:variant, @zig_sdk//...)" `@zig_sdk//libc:unconstrained` is a special value that indicates that no value for the constraint is specified. The non libc aware linux toolchains are only compatible with this value to prevent accidental silent fallthrough to them. +This is a guardrail. Thanks, future me! -# UBSAN and "SIGILL: Illegal Instruction" +# Note: Naming + +Both Go and Bazel naming schemes are accepted. 
For convenience with +Go, the following Go-style toolchain aliases are created: + +|Bazel (zig) name | Go name | +|---------------- | -------- | +|`x86_64` | `amd64` | +|`aarch64` | `arm64` | +|`macos` | `darwin` | + +For example, the toolchain `linux_amd64_gnu.2.28` is aliased to +`x86_64-linux-gnu.2.28`. To find out which toolchains can be registered or +used, run: + +``` +$ bazel query @zig_sdk//toolchain/... +``` + +# Note: UBSAN and "SIGILL: Illegal Instruction" `zig cc` differs from "mainstream" compilers by [enabling UBSAN by default][ubsan1]. Which means your program may compile successfully and crash @@ -197,6 +288,7 @@ SIGILL: illegal instruction This is by design: it encourages program authors to fix the undefined behavior. There are [many ways][ubsan2] to find the undefined behavior. + # Known Issues In bazel-zig-cc These are the things you may stumble into when using bazel-zig-cc. I am diff --git a/WORKSPACE b/WORKSPACE index 759d198..3ac9b8b 100644 --- a/WORKSPACE +++ b/WORKSPACE @@ -50,8 +50,17 @@ load( zig_toolchains() register_toolchains( + # if no `--platforms` is selected, these toolchains will be used. "@zig_sdk//toolchain:linux_amd64_gnu.2.19", "@zig_sdk//toolchain:linux_arm64_gnu.2.28", "@zig_sdk//toolchain:darwin_amd64", "@zig_sdk//toolchain:darwin_arm64", + + # when a libc-aware platform is selected, these will be used. 
arm64: + "@zig_sdk//libc_aware/toolchain:linux_arm64_gnu.2.28", + "@zig_sdk//libc_aware/toolchain:linux_arm64_musl", + # ditto, amd64: + "@zig_sdk//libc_aware/toolchain:linux_amd64_gnu.2.19", + "@zig_sdk//libc_aware/toolchain:linux_amd64_gnu.2.28", + "@zig_sdk//libc_aware/toolchain:linux_amd64_musl", ) diff --git a/test/c/main.c b/test/c/main.c index 9ab0dc1..9570a6c 100644 --- a/test/c/main.c +++ b/test/c/main.c @@ -3,9 +3,9 @@ int main() { #ifdef __GLIBC__ - printf("glibc_%d.%d", __GLIBC__, __GLIBC_MINOR__); + printf("glibc_%d.%d\n", __GLIBC__, __GLIBC_MINOR__); #else - puts("non-glibc"); + printf("non-glibc\n"); #endif return 0; } diff --git a/toolchain/private/defs.bzl b/toolchain/private/defs.bzl index 671a81a..25c3f5c 100644 --- a/toolchain/private/defs.bzl +++ b/toolchain/private/defs.bzl @@ -28,7 +28,7 @@ _GLIBCS = [ "2.34", ] -LIBCS = ["musl", "gnu"] + ["gnu.{}".format(glibc) for glibc in _GLIBCS] +LIBCS = ["musl"] + ["gnu.{}".format(glibc) for glibc in _GLIBCS] def target_structs(): ret = []
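The effect of the `toolchain/private/defs.bzl` hunk above, which drops the bare `"gnu"` entry from `LIBCS`, can be sketched in plain Python, since the list comprehension reads the same in Starlark. The `_GLIBCS` values below are a hypothetical subset chosen for illustration; the real list is only partly visible in this diff.

```python
# Hypothetical subset of glibc versions, for illustration only; the real
# _GLIBCS list in toolchain/private/defs.bzl is longer.
_GLIBCS = ["2.19", "2.28", "2.34"]

# Before the change: a bare, unversioned "gnu" entry was part of the list.
libcs_before = ["musl", "gnu"] + ["gnu.{}".format(glibc) for glibc in _GLIBCS]

# After the change: only musl and explicitly versioned glibc entries remain.
libcs_after = ["musl"] + ["gnu.{}".format(glibc) for glibc in _GLIBCS]

# The only entry that disappears is the unversioned "gnu".
removed = [libc for libc in libcs_before if libc not in libcs_after]

print(libcs_after)
print(removed)
```

Under these assumptions, every glibc entry that remains carries an explicit version, which appears to match the README's emphasis on selecting a specific libc variant.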