Ken's comments

main
Motiejus Jakštys 2022-05-23 16:04:18 +03:00
parent ac96c45b1a
commit 7e82e68d7a
1 changed files with 18 additions and 14 deletions

@ -47,8 +47,8 @@ TLDR:
Uber started in 2010, has clocked over 15 billion trips, and made lots of cool
and innovative tech for it to happen. General-purpose "allowed" server-side
languages are Go and Java, with Python and Node allowed for specific use cases
(like front-end for Node and Python for data analysis/ML). Use of other
languages in back-end code is minimal.
(like front-end for Node and Python for data analysis/ML). C++ is used for a few
low level libraries. Use of other languages in back-end code is minimal.
Our Go Monorepo is larger than Linux kernel[^1], and worked on by a couple of
thousand engineers. In short, it's big.
@ -83,7 +83,8 @@ wave --- I still remember the complexity.
At the time, the Go monorepo already used a hermetic Go toolchain. Therefore,
the Go compiler used to build the monorepo was unaffected by the compiler
installed on the system, if any. Therefore, on whichever environment a Go build
was running, it always used the same version of Go.
was running, it always used the same version of Go. Bazel docs [explain this
better than me][bazel-hermetic].
{{<img src="_/2022/uber-zig-gm-221.png"
alt="A Jira task asking for a hermetic C++ toolchain."
@ -96,7 +97,7 @@ unavoidable for some our Go code to use [CGo][cgo], so it needs a C/C++
compiler. Go then links the Go and C parts to the final executable.
The C++ toolchain was not hermetic since the start of Go monorepo: Bazel would
use whatever it found on the system. That meant clang on macOS, gcc (whatever
use whatever it found on the system. That meant Clang on macOS, GCC (whatever
version) on Linux. Setting up a hermetic C++ toolchain in Bazel is a lot of
work (think person-months for our monorepo), there was no immediate need, and
it also was not painful *enough* to be picked up.
@ -122,11 +123,12 @@ All of these issues were annoying, but not enough to invest into the toolchain.
### 2020 Dec: need musl
I was working on a toy project that is built with Bazel and uses CGo. I wanted
my binary to be static, but Bazel does not make that easy. I spent a couple of
evenings creating a Bazel toolchain on top of [musl.cc](https://musl.cc), but
didn't go far, because at the time I wasn't able to make sense out of the
Bazel's toolchain documentation, and I didn't find a good example to rely on.
I was working on a non-Uber-related toy project that is built with Bazel and
uses CGo. I wanted my binary to be static, but Bazel does not make that easy. I
spent a couple of evenings creating a Bazel toolchain on top of
[musl.cc](https://musl.cc), but didn't go far, because at the time I wasn't
able to make sense out of the Bazel's toolchain documentation, and I didn't
find a good example to rely on.
### 2021 Jan: discovering `zig cc`
@ -137,12 +139,13 @@ understand the remaining article better, because I gave the talk to a Zig
audience). To sum up Andrew's article, `zig cc` has the following
advantages:
- Fully hermetic C/C++ compiler in ~40MB tarball.
- Fully hermetic C/C++ compiler in ~40MB tarball. This is an order of magnitude
smaller than the standard Clang distributions.
- Can link against a glibc version that was provided as a command-line argument
(e.g. `-target x86_64-linux-gnu.2.28` will compile for x86_64 Linux and link
against glibc 2.28).
- Host and target are decoupled. The setup is the same for both linux-aarch64
and darwin-x86_64 targets, regardless of the host.
- Host and target are decoupled. The setup is the same for both `linux-aarch64`
and `darwin-x86_64` targets, regardless of the host.
- Linking with musl is "just a different libc version": `-target
x86_64-linux-musl`.
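
To make the `-target` convention from the list above concrete, here is a minimal sketch that composes `zig cc` command lines for a few targets. The helper name `zig_cc_command`, the `main.c` source file and the `main` output name are assumptions made for illustration; only the `-target` strings for glibc 2.28 and musl come from the list. The sketch builds the argument list and prints it, and does not invoke Zig itself.

```python
import shlex


def zig_cc_command(arch, os_name, abi, glibc=None,
                   sources=("main.c",), output="main"):
    """Compose a `zig cc` argument list for one target (hypothetical helper)."""
    target = f"{arch}-{os_name}-{abi}"
    if glibc is not None:
        # A glibc version suffix is only meaningful for the gnu ABI.
        if abi != "gnu":
            raise ValueError("glibc version requires the gnu ABI")
        target += f".{glibc}"
    return ["zig", "cc", "-target", target, "-o", output, *sources]


if __name__ == "__main__":
    # x86_64 Linux linked against glibc 2.28, as in the list above.
    print(shlex.join(zig_cc_command("x86_64", "linux", "gnu", glibc="2.28")))
    # Same sources, linked against musl instead: only the target changes.
    print(shlex.join(zig_cc_command("x86_64", "linux", "musl")))
    # A different architecture: the host running this is irrelevant.
    print(shlex.join(zig_cc_command("aarch64", "linux", "gnu")))
```

The point of the sketch is that switching libc, libc version or architecture only changes the target string; the rest of the command line stays the same on any host.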
@ -201,7 +204,7 @@ dependency on system libraries and undoing of a lot of technical debt.
justification.
- bazel-zig-cc kinda works, but both bazel-zig-cc and zig cc have known bugs.
- I can't realistically implement the necessary changes or bug fixes. I tried
implementing `zig ar`, a trivial front-end for llvm's `ar`, and failed.
implementing `zig ar`, a trivial front-end for LLVM's `ar`, and failed.
- Once an issue had been identified as a Zig issue, getting attention from Zig
developers was unpredictable. Some issues got resolved within days, some took
more than 6 months, and donations didn't change `zig cc` priorities.
@ -232,7 +235,7 @@ research on this unproven prototype.
Given that we now needed a cross-compiler, we had two candidates:
- [grailbio/bazel-toolchain][grailbio/bazel-toolchain]. Uses a vanilla clang.
- [grailbio/bazel-toolchain][grailbio/bazel-toolchain]. Uses a vanilla Clang.
No risk. Well understood. Obviously safe and correct solution.
- [~motiejus/bazel-zig-cc][bazel-zig-cc]: uses `zig cc`. Buggy, risky, unsafe,
uncertain, used-by-nobody, but quite a tempting solution.
@ -366,3 +369,4 @@ Many thanks Abhinav Gupta and Loris Cro for reading drafts of this.
[milan-youtube]: https://www.youtube.com/watch?v=SCj2J3HcEfc
[zig-motiejus-issues]: https://github.com/ziglang/zig/issues?q=author%3Amotiejus+sort%3Acreated-asc
[kmicklas]: https://github.com/kmicklas
[bazel-hermetic]: https://bazel.build/concepts/hermeticity