abg's comments

Motiejus Jakštys 2022-05-21 10:44:22 +03:00
parent 304c635b27
commit 80580a9968

@@ -26,14 +26,15 @@ transcript, with some commentary and errata.
 
 TLDR:
 
-* Uber uses zig to compile it's C/C++ code. Now only in [Go
-Monorepo][go-monorepo] via [bazel-zig-cc][bazel-zig-cc], with inconcrete
-ideas to expand use of `zig cc` to other monorepos.
-* Uber does not have any plans to use zig-the-language.
+* Uber uses zig to compile its C/C++ code. Now only in the [Go
+Monorepo][go-monorepo] via [bazel-zig-cc][bazel-zig-cc], with plans to
+possibly expand use of `zig cc` to other languages that need a C/C++
+toolchain.
+* Uber does not have any plans to use zig-the-language yet.
 * Uber signed a support agreement with Zig Software Foundation (ZSF) to
 prioritize bug fixes. The contract value is disclosed in the ZSF financial
 reports.
-* Thanks my team, the Go Monorepo team, the Go Platform team, my director,
+* Thanks to my team, the Go Monorepo team, the Go Platform team, my director,
 finance, legal, and of course Zig Software Foundation for making this
 relationship happen. The relationship has been fruitful so far.
 
@@ -47,8 +48,8 @@ languages are Go and Java, with Python and Node allowed for specific use cases
 (like front-end for Node and Python for data analysis/ML). Use of other
 languages in back-end code is minimal.
 
-Our go monorepo is larger than Linux kernel[^1], and worked on by a couple of
-thousand engineers. To sum up, it is size-able.
+Our Go Monorepo is larger than Linux kernel[^1], and worked on by a couple of
+thousand engineers. In short, it's big.
 
 ## How does Uber use Zig?
 
@@ -77,10 +78,10 @@ wave --- I still remember the complexity.
 
 ### 2019: asks for a hermetic toolchain
 
-At the time, the Go monorepo already used a hermetic Go toolchain. That means
-it would download the Go SDK as part of the build process. Therefore, on
-whichever environment a Go build was running, it always used the same version
-of Go.
+At the time, the Go monorepo already used a hermetic Go toolchain. Therefore,
+the Go compiler used to build the monorepo was unaffected by the compiler
+installed on the system, if any. Therefore, on whichever environment a Go build
+was running, it always used the same version of Go.
 
 {{<img src="_/2022/uber-zig-gm-221.png"
 alt="A Jira task asking for a hermetic C++ toolchain."
@@ -88,21 +89,21 @@ of Go.
 hint="graph"
 >}}
 
-C++ toolchain is a collection of programs to compile C/C++ code. Our Go code
-uses quite a bit of [CGo][cgo], so it needs a C/C++ compiler. Go then links the
-Go and C parts to the final executable.
+A C++ toolchain is a collection of programs to compile C/C++ code. It is
+unavoidable for some our Go code to use [CGo][cgo], so it needs a C/C++
+compiler. Go then links the Go and C parts to the final executable.
 
 The C++ toolchain was not hermetic since the start of Go monorepo: Bazel would
-use whatever it found on the system. That meant clang on MacOS, gcc (whatever
-version) on Linux. Setting up C++ toolchain in Bazel is a lot of work (think
-person-months for our monorepo), there was no immediate need, and it also was
-not painful *enough* to be picked up.
+use whatever it found on the system. That meant clang on macOS, gcc (whatever
+version) on Linux. Setting up a hermetic C++ toolchain in Bazel is a lot of
+work (think person-months for our monorepo), there was no immediate need, and
+it also was not painful *enough* to be picked up.
 
 At this point it is important to understand the limitations of a non-hermetic
 C++ toolchain:
 - Cannot cross-compile. So we can't compile Linux executables on a Mac if they
-have CGo (which is most of our service code). This was worked around by...
-not cross-compiling.
+have CGo (which many of our services do). This was worked around by... not
+cross-compiling.
 - CGo executables would link to a glibc version that was found on the system.
 That means: when upgrading the OS (multi-month effort), the build fleet must
 be upgraded last. Otherwise, if build host runs a newer glibc than a
@@ -111,15 +112,18 @@ C++ toolchain:
 - We couldn't use new compilers, which have better optimizations, because we
 were running an older OS on the build fleet (backporting only the compiler,
 but not glibc, carries it's own risks).
+- Official binaries for newer versions of Go are built against a more recent
+version of GCC than some of our build machines. We had to work around this by
+compiling Go from source on these machines.
 
 All of these issues were annoying, but not enough to invest into the toolchain.
 
 ### 2020 Dec: need musl
 
 I was working on a toy project that is built with Bazel and uses CGo. I wanted
-my binary to be static, but Bazel is not easily offering that. I spent a couple
-of evenings creating a Bazel toolchain on top of [musl.cc](https://musl.cc),
-but didn't go far, because at the time I wasn't able to make sense out of the
+my binary to be static, but Bazel does not make that easy. I spent a couple of
+evenings creating a Bazel toolchain on top of [musl.cc](https://musl.cc), but
+didn't go far, because at the time I wasn't able to make sense out of the
 Bazel's toolchain documentation, and I didn't find a good example to rely on.
 
 ### 2021 Jan: discovering `zig cc`
@@ -193,12 +197,12 @@ dependency on system libraries and undoing of a lot of tech debt.
 - Various places at Uber would benefit from a hermetic C++ cross-compiler, but
 it's not funded due to a large investment and not enough justification.
 - bazel-zig-cc kinda works, but both bazel-zig-cc and zig cc have known bugs.
-- Donations don't "help" for `zig cc`, and I can't realistically implement
-them. I tried with `zig ar`, a trivial front-end for llvm's ld, and failed.
-- The monorepo-onboarding diff was simmering and waiting for it's time.
+- I can't realistically implement the necessary changes or bug fixes. I tried
+implementing `zig ar`, a trivial front-end for llvm's `ar`, and failed.
 - Once an issue had been identified as a Zig issue, getting attention from Zig
 developers was unpredictable. Some issues got resolved within days, some took
-more than 6 months.
+more than 6 months. Donations don't change `zig cc` priorities.
+- The monorepo-onboarding diff was simmering and waiting for it's time.
 
 ### 2021 End: Uber needs a cross-compiler
 
@@ -218,8 +222,8 @@ thing to manage is risk. As zig is a novel technology (not even 1.0!), it was
 truly unusual to suggest compiling all of our C and C++ code with it. We should
 be planning to stick with it for at least a decade. Questions were raised and
 evaluated with great care and scrutiny. For that I am truly grateful to the Go
-Monorepo team, especially Ken Micklas, for doing the work and research on this
-unproven prototype.
+Monorepo team, especially [Ken Micklas][kmicklas], for doing the work and
+research on this unproven prototype.
 
 ### Evaluation of different compilers
 
@@ -234,22 +238,23 @@ Given that we now needed a cross-compiler, we had two candidates:
 - configurable glibc version. In grailbio case you would need a sysroot
 (basically, a chroot with the system libraries, so the programs can be linked
 against them), which needs to be maintained.
-- a working, albeit still buggy, hermetic (cross-)compiler for OSX.
+- a working, albeit still buggy, hermetic (cross-)compiler for macOS.
 
 Glibc we can handle in either case. However, `bazel-toolchain` will unlikely
-ever have a way to compile to OSX, let alone cross-compile. Relying on the
+ever have a way to compile to macOS, let alone cross-compile. Relying on the
 system compiler is undesirable on developer laptops, and Go Platform feels that
-first-hand, especially during OSX upgrades.
+first-hand, especially during macOS upgrades.
 
-The prospect of a hermetic toolchain for OSX targets tripped the scales towards
-`zig cc`, with all it's warts, risks and instability.
+The prospect of a hermetic toolchain for macOS targets tripped the scales
+towards `zig cc`, with all its warts, risks and instability.
 
-There was another, attention problem: if we were considering to use zig in a
-serious capacity, we knew we will hit problems, but unlikely have the expertise
-to solve them. How can we, as a BigCorp, de-risk the engagement question,
-making sure that bugs important to us are handled timely? We were sure of good
-intentions of ZSF: it was obvious that, if we find and report a legitimate bug,
-it would get fixed. But how can we put an upper bound on latency?
+There was another, attention problem: if we were considering the use of Zig in
+a serious capacity, we knew we will hit problems, but would be unlikely to have
+the expertise to solve them. How can we, as a BigCorp, de-risk the engagement
+question, making sure that bugs important to us are handled timely? We were
+sure of good intentions of ZSF: it was obvious that, if we find and report a
+legitimate bug, it would get fixed. But how can we put an upper bound on
+latency?
 
 ### Money
 
@@ -271,7 +276,7 @@ bystander. We did not ask for special rights, it's explicit in the contract,
 and we don't want that.
 
 The contract was signed, the wire transfer completed, and in 2022 January we
-hpad:
+had:
 
 - A service contract with ZSF that promised to prioritize issues that we've
 registered.
@@ -287,8 +292,8 @@ hpad:
 ## 2022 and beyond
 
 In Feb 2022 the toolchain was gated behind a command-line flag
-(`--config=hermetic-cc`). As of Feb 2022, you can invoke `zig cc` in Uber's go
-monorepo without requiring a custom patch.
+(`--config=hermetic-cc`). As of Feb 2022, you can invoke `zig cc` in Uber's Go
+Monorepo without requiring a custom patch.
 
 {{<img src="_/2022/uber-zig-landed.png"
 alt="WIP DIFF onboarding the monorepo was landed"
@@ -317,7 +322,7 @@ zig-cc could have failed due to many many reasons.
 
 Looking back, I think the most important reasons for success is a killer
 feature at the right time. In our case, there were two: glibc version selection
-without a sysroot and cross-compiling to OSX.
+without a sysroot and cross-compiling to macOS.
 
 ## Appendix
 
@@ -333,6 +338,11 @@ If compilers or adopting software for other CPU architectures (and/or living in
 the Eastern Europe) is your thing, my team in Vilnius is hiring. Also, my
 sister teams in Seattle and Bay Area are hiring too. Ping me.
 
+Credits
+-------
+
+Many thanks Abhinav Gupta for reading drafts of this.
+
 [^1]: Errata: I incorrectly said "by an order of magnitude". The order of
 magnitude is the same.
 [^2]: Errata: I said Go was the first monorepo. Go was 4'th.
@@ -352,3 +362,4 @@ sister teams in Seattle and Bay Area are hiring too. Ping me.
 [grailbio/bazel-toolchain]: https://github.com/grailbio/bazel-toolchain
 [milan-youtube]: https://www.youtube.com/watch?v=SCj2J3HcEfc
 [zig-motiejus-issues]: https://github.com/ziglang/zig/issues?q=author%3Amotiejus+sort%3Acreated-asc
+[kmicklas]: https://github.com/kmicklas