abg's comments
parent 304c635b27
commit 80580a9968
@@ -26,14 +26,15 @@ transcript, with some commentary and errata.

 TLDR:

-* Uber uses zig to compile it's C/C++ code. Now only in [Go
-Monorepo][go-monorepo] via [bazel-zig-cc][bazel-zig-cc], with inconcrete
-ideas to expand use of `zig cc` to other monorepos.
-* Uber does not have any plans to use zig-the-language.
+* Uber uses zig to compile its C/C++ code. Now only in the [Go
+Monorepo][go-monorepo] via [bazel-zig-cc][bazel-zig-cc], with plans to
+possibly expand use of `zig cc` to other languages that need a C/C++
+toolchain.
+* Uber does not have any plans to use zig-the-language yet.
 * Uber signed a support agreement with Zig Software Foundation (ZSF) to
 prioritize bug fixes. The contract value is disclosed in the ZSF financial
 reports.
-* Thanks my team, the Go Monorepo team, the Go Platform team, my director,
+* Thanks to my team, the Go Monorepo team, the Go Platform team, my director,
 finance, legal, and of course Zig Software Foundation for making this
 relationship happen. The relationship has been fruitful so far.

@@ -47,8 +48,8 @@ languages are Go and Java, with Python and Node allowed for specific use cases
 (like front-end for Node and Python for data analysis/ML). Use of other
 languages in back-end code is minimal.

-Our go monorepo is larger than Linux kernel[^1], and worked on by a couple of
-thousand engineers. To sum up, it is size-able.
+Our Go Monorepo is larger than Linux kernel[^1], and worked on by a couple of
+thousand engineers. In short, it's big.

 ## How does Uber use Zig?

@@ -77,10 +78,10 @@ wave --- I still remember the complexity.

 ### 2019: asks for a hermetic toolchain

-At the time, the Go monorepo already used a hermetic Go toolchain. That means
-it would download the Go SDK as part of the build process. Therefore, on
-whichever environment a Go build was running, it always used the same version
-of Go.
+At the time, the Go monorepo already used a hermetic Go toolchain. Therefore,
+the Go compiler used to build the monorepo was unaffected by the compiler
+installed on the system, if any. Therefore, on whichever environment a Go build
+was running, it always used the same version of Go.

 {{<img src="_/2022/uber-zig-gm-221.png"
 alt="A Jira task asking for a hermetic C++ toolchain."
@@ -88,21 +89,21 @@ of Go.
 hint="graph"
 >}}

-C++ toolchain is a collection of programs to compile C/C++ code. Our Go code
-uses quite a bit of [CGo][cgo], so it needs a C/C++ compiler. Go then links the
-Go and C parts to the final executable.
+A C++ toolchain is a collection of programs to compile C/C++ code. It is
+unavoidable for some our Go code to use [CGo][cgo], so it needs a C/C++
+compiler. Go then links the Go and C parts to the final executable.

 The C++ toolchain was not hermetic since the start of Go monorepo: Bazel would
-use whatever it found on the system. That meant clang on MacOS, gcc (whatever
-version) on Linux. Setting up C++ toolchain in Bazel is a lot of work (think
-person-months for our monorepo), there was no immediate need, and it also was
-not painful *enough* to be picked up.
+use whatever it found on the system. That meant clang on macOS, gcc (whatever
+version) on Linux. Setting up a hermetic C++ toolchain in Bazel is a lot of
+work (think person-months for our monorepo), there was no immediate need, and
+it also was not painful *enough* to be picked up.

 At this point it is important to understand the limitations of a non-hermetic
 C++ toolchain:
 - Cannot cross-compile. So we can't compile Linux executables on a Mac if they
-have CGo (which is most of our service code). This was worked around by...
-not cross-compiling.
+have CGo (which many of our services do). This was worked around by... not
+cross-compiling.
 - CGo executables would link to a glibc version that was found on the system.
 That means: when upgrading the OS (multi-month effort), the build fleet must
 be upgraded last. Otherwise, if build host runs a newer glibc than a
@@ -111,15 +112,18 @@ C++ toolchain:
 - We couldn't use new compilers, which have better optimizations, because we
 were running an older OS on the build fleet (backporting only the compiler,
 but not glibc, carries it's own risks).
+- Official binaries for newer versions of Go are built against a more recent
+version of GCC than some of our build machines. We had to work around this by
+compiling Go from source on these machines.

 All of these issues were annoying, but not enough to invest into the toolchain.

 ### 2020 Dec: need musl

 I was working on a toy project that is built with Bazel and uses CGo. I wanted
-my binary to be static, but Bazel is not easily offering that. I spent a couple
-of evenings creating a Bazel toolchain on top of [musl.cc](https://musl.cc),
-but didn't go far, because at the time I wasn't able to make sense out of the
+my binary to be static, but Bazel does not make that easy. I spent a couple of
+evenings creating a Bazel toolchain on top of [musl.cc](https://musl.cc), but
+didn't go far, because at the time I wasn't able to make sense out of the
 Bazel's toolchain documentation, and I didn't find a good example to rely on.

 ### 2021 Jan: discovering `zig cc`
@@ -193,12 +197,12 @@ dependency on system libraries and undoing of a lot of tech debt.
 - Various places at Uber would benefit from a hermetic C++ cross-compiler, but
 it's not funded due to a large investment and not enough justification.
 - bazel-zig-cc kinda works, but both bazel-zig-cc and zig cc have known bugs.
-- Donations don't "help" for `zig cc`, and I can't realistically implement
-them. I tried with `zig ar`, a trivial front-end for llvm's ld, and failed.
-- The monorepo-onboarding diff was simmering and waiting for it's time.
+- I can't realistically implement the necessary changes or bug fixes. I tried
+implementing `zig ar`, a trivial front-end for llvm's `ar`, and failed.
 - Once an issue had been identified as a Zig issue, getting attention from Zig
 developers was unpredictable. Some issues got resolved within days, some took
-more than 6 months.
+more than 6 months. Donations don't change `zig cc` priorities.
+- The monorepo-onboarding diff was simmering and waiting for it's time.

 ### 2021 End: Uber needs a cross-compiler

@@ -218,8 +222,8 @@ thing to manage is risk. As zig is a novel technology (not even 1.0!), it was
 truly unusual to suggest compiling all of our C and C++ code with it. We should
 be planning to stick with it for at least a decade. Questions were raised and
 evaluated with great care and scrutiny. For that I am truly grateful to the Go
-Monorepo team, especially Ken Micklas, for doing the work and research on this
-unproven prototype.
+Monorepo team, especially [Ken Micklas][kmicklas], for doing the work and
+research on this unproven prototype.

 ### Evaluation of different compilers

@@ -234,22 +238,23 @@ Given that we now needed a cross-compiler, we had two candidates:
 - configurable glibc version. In grailbio case you would need a sysroot
 (basically, a chroot with the system libraries, so the programs can be linked
 against them), which needs to be maintained.
-- a working, albeit still buggy, hermetic (cross-)compiler for OSX.
+- a working, albeit still buggy, hermetic (cross-)compiler for macOS.

 Glibc we can handle in either case. However, `bazel-toolchain` will unlikely
-ever have a way to compile to OSX, let alone cross-compile. Relying on the
+ever have a way to compile to macOS, let alone cross-compile. Relying on the
 system compiler is undesirable on developer laptops, and Go Platform feels that
-first-hand, especially during OSX upgrades.
+first-hand, especially during macOS upgrades.

-The prospect of a hermetic toolchain for OSX targets tripped the scales towards
-`zig cc`, with all it's warts, risks and instability.
+The prospect of a hermetic toolchain for macOS targets tripped the scales
+towards `zig cc`, with all its warts, risks and instability.

-There was another, attention problem: if we were considering to use zig in a
-serious capacity, we knew we will hit problems, but unlikely have the expertise
-to solve them. How can we, as a BigCorp, de-risk the engagement question,
-making sure that bugs important to us are handled timely? We were sure of good
-intentions of ZSF: it was obvious that, if we find and report a legitimate bug,
-it would get fixed. But how can we put an upper bound on latency?
+There was another, attention problem: if we were considering the use of Zig in
+a serious capacity, we knew we will hit problems, but would be unlikely to have
+the expertise to solve them. How can we, as a BigCorp, de-risk the engagement
+question, making sure that bugs important to us are handled timely? We were
+sure of good intentions of ZSF: it was obvious that, if we find and report a
+legitimate bug, it would get fixed. But how can we put an upper bound on
+latency?

 ### Money

@@ -271,7 +276,7 @@ bystander. We did not ask for special rights, it's explicit in the contract,
 and we don't want that.

 The contract was signed, the wire transfer completed, and in 2022 January we
-hpad:
+had:

 - A service contract with ZSF that promised to prioritize issues that we've
 registered.
@@ -287,8 +292,8 @@ hpad:
 ## 2022 and beyond

 In Feb 2022 the toolchain was gated behind a command-line flag
-(`--config=hermetic-cc`). As of Feb 2022, you can invoke `zig cc` in Uber's go
-monorepo without requiring a custom patch.
+(`--config=hermetic-cc`). As of Feb 2022, you can invoke `zig cc` in Uber's Go
+Monorepo without requiring a custom patch.

 {{<img src="_/2022/uber-zig-landed.png"
 alt="WIP DIFF onboarding the monorepo was landed"
@@ -317,7 +322,7 @@ zig-cc could have failed due to many many reasons.

 Looking back, I think the most important reasons for success is a killer
 feature at the right time. In our case, there were two: glibc version selection
-without a sysroot and cross-compiling to OSX.
+without a sysroot and cross-compiling to macOS.

 ## Appendix

@@ -333,6 +338,11 @@ If compilers or adopting software for other CPU architectures (and/or living in
 the Eastern Europe) is your thing, my team in Vilnius is hiring. Also, my
 sister teams in Seattle and Bay Area are hiring too. Ping me.

+Credits
+-------
+
+Many thanks Abhinav Gupta for reading drafts of this.
+
 [^1]: Errata: I incorrectly said "by an order of magnitude". The order of
 magnitude is the same.
 [^2]: Errata: I said Go was the first monorepo. Go was 4'th.
@@ -352,3 +362,4 @@ sister teams in Seattle and Bay Area are hiring too. Ping me.
 [grailbio/bazel-toolchain]: https://github.com/grailbio/bazel-toolchain
 [milan-youtube]: https://www.youtube.com/watch?v=SCj2J3HcEfc
 [zig-motiejus-issues]: https://github.com/ziglang/zig/issues?q=author%3Amotiejus+sort%3Acreated-asc
+[kmicklas]: https://github.com/kmicklas