abg's comments
parent
304c635b27
commit
80580a9968
@ -26,14 +26,15 @@ transcript, with some commentary and errata.
|
|||||||
|
|
||||||
TLDR:
|
TLDR:
|
||||||
|
|
||||||
* Uber uses zig to compile it's C/C++ code. Now only in [Go
|
* Uber uses zig to compile its C/C++ code. Now only in the [Go
|
||||||
Monorepo][go-monorepo] via [bazel-zig-cc][bazel-zig-cc], with inconcrete
|
Monorepo][go-monorepo] via [bazel-zig-cc][bazel-zig-cc], with plans to
|
||||||
ideas to expand use of `zig cc` to other monorepos.
|
possibly expand use of `zig cc` to other languages that need a C/C++
|
||||||
* Uber does not have any plans to use zig-the-language.
|
toolchain.
|
||||||
|
* Uber does not have any plans to use zig-the-language yet.
|
||||||
* Uber signed a support agreement with Zig Software Foundation (ZSF) to
|
* Uber signed a support agreement with Zig Software Foundation (ZSF) to
|
||||||
prioritize bug fixes. The contract value is disclosed in the ZSF financial
|
prioritize bug fixes. The contract value is disclosed in the ZSF financial
|
||||||
reports.
|
reports.
|
||||||
* Thanks my team, the Go Monorepo team, the Go Platform team, my director,
|
* Thanks to my team, the Go Monorepo team, the Go Platform team, my director,
|
||||||
finance, legal, and of course Zig Software Foundation for making this
|
finance, legal, and of course Zig Software Foundation for making this
|
||||||
relationship happen. The relationship has been fruitful so far.
|
relationship happen. The relationship has been fruitful so far.
|
||||||
|
|
||||||
@ -47,8 +48,8 @@ languages are Go and Java, with Python and Node allowed for specific use cases
|
|||||||
(like front-end for Node and Python for data analysis/ML). Use of other
|
(like front-end for Node and Python for data analysis/ML). Use of other
|
||||||
languages in back-end code is minimal.
|
languages in back-end code is minimal.
|
||||||
|
|
||||||
Our go monorepo is larger than Linux kernel[^1], and worked on by a couple of
|
Our Go Monorepo is larger than Linux kernel[^1], and worked on by a couple of
|
||||||
thousand engineers. To sum up, it is size-able.
|
thousand engineers. In short, it's big.
|
||||||
|
|
||||||
## How does Uber use Zig?
|
## How does Uber use Zig?
|
||||||
|
|
||||||
@ -77,10 +78,10 @@ wave --- I still remember the complexity.
|
|||||||
|
|
||||||
### 2019: asks for a hermetic toolchain
|
### 2019: asks for a hermetic toolchain
|
||||||
|
|
||||||
At the time, the Go monorepo already used a hermetic Go toolchain. That means
|
At the time, the Go monorepo already used a hermetic Go toolchain. Therefore,
|
||||||
it would download the Go SDK as part of the build process. Therefore, on
|
the Go compiler used to build the monorepo was unaffected by the compiler
|
||||||
whichever environment a Go build was running, it always used the same version
|
installed on the system, if any. Therefore, on whichever environment a Go build
|
||||||
of Go.
|
was running, it always used the same version of Go.
|
||||||
|
|
||||||
{{<img src="_/2022/uber-zig-gm-221.png"
|
{{<img src="_/2022/uber-zig-gm-221.png"
|
||||||
alt="A Jira task asking for a hermetic C++ toolchain."
|
alt="A Jira task asking for a hermetic C++ toolchain."
|
||||||
@ -88,21 +89,21 @@ of Go.
|
|||||||
hint="graph"
|
hint="graph"
|
||||||
>}}
|
>}}
|
||||||
|
|
||||||
C++ toolchain is a collection of programs to compile C/C++ code. Our Go code
|
A C++ toolchain is a collection of programs to compile C/C++ code. It is
|
||||||
uses quite a bit of [CGo][cgo], so it needs a C/C++ compiler. Go then links the
|
unavoidable for some our Go code to use [CGo][cgo], so it needs a C/C++
|
||||||
Go and C parts to the final executable.
|
compiler. Go then links the Go and C parts to the final executable.
|
||||||
|
|
||||||
The C++ toolchain was not hermetic since the start of Go monorepo: Bazel would
|
The C++ toolchain was not hermetic since the start of Go monorepo: Bazel would
|
||||||
use whatever it found on the system. That meant clang on MacOS, gcc (whatever
|
use whatever it found on the system. That meant clang on macOS, gcc (whatever
|
||||||
version) on Linux. Setting up C++ toolchain in Bazel is a lot of work (think
|
version) on Linux. Setting up a hermetic C++ toolchain in Bazel is a lot of
|
||||||
person-months for our monorepo), there was no immediate need, and it also was
|
work (think person-months for our monorepo), there was no immediate need, and
|
||||||
not painful *enough* to be picked up.
|
it also was not painful *enough* to be picked up.
|
||||||
|
|
||||||
At this point it is important to understand the limitations of a non-hermetic
|
At this point it is important to understand the limitations of a non-hermetic
|
||||||
C++ toolchain:
|
C++ toolchain:
|
||||||
- Cannot cross-compile. So we can't compile Linux executables on a Mac if they
|
- Cannot cross-compile. So we can't compile Linux executables on a Mac if they
|
||||||
have CGo (which is most of our service code). This was worked around by...
|
have CGo (which many of our services do). This was worked around by... not
|
||||||
not cross-compiling.
|
cross-compiling.
|
||||||
- CGo executables would link to a glibc version that was found on the system.
|
- CGo executables would link to a glibc version that was found on the system.
|
||||||
That means: when upgrading the OS (multi-month effort), the build fleet must
|
That means: when upgrading the OS (multi-month effort), the build fleet must
|
||||||
be upgraded last. Otherwise, if build host runs a newer glibc than a
|
be upgraded last. Otherwise, if build host runs a newer glibc than a
|
||||||
@ -111,15 +112,18 @@ C++ toolchain:
|
|||||||
- We couldn't use new compilers, which have better optimizations, because we
|
- We couldn't use new compilers, which have better optimizations, because we
|
||||||
were running an older OS on the build fleet (backporting only the compiler,
|
were running an older OS on the build fleet (backporting only the compiler,
|
||||||
but not glibc, carries it's own risks).
|
but not glibc, carries it's own risks).
|
||||||
|
- Official binaries for newer versions of Go are built against a more recent
|
||||||
|
version of GCC than some of our build machines. We had to work around this by
|
||||||
|
compiling Go from source on these machines.
|
||||||
|
|
||||||
All of these issues were annoying, but not enough to invest into the toolchain.
|
All of these issues were annoying, but not enough to invest into the toolchain.
|
||||||
|
|
||||||
### 2020 Dec: need musl
|
### 2020 Dec: need musl
|
||||||
|
|
||||||
I was working on a toy project that is built with Bazel and uses CGo. I wanted
|
I was working on a toy project that is built with Bazel and uses CGo. I wanted
|
||||||
my binary to be static, but Bazel is not easily offering that. I spent a couple
|
my binary to be static, but Bazel does not make that easy. I spent a couple of
|
||||||
of evenings creating a Bazel toolchain on top of [musl.cc](https://musl.cc),
|
evenings creating a Bazel toolchain on top of [musl.cc](https://musl.cc), but
|
||||||
but didn't go far, because at the time I wasn't able to make sense out of the
|
didn't go far, because at the time I wasn't able to make sense out of the
|
||||||
Bazel's toolchain documentation, and I didn't find a good example to rely on.
|
Bazel's toolchain documentation, and I didn't find a good example to rely on.
|
||||||
|
|
||||||
### 2021 Jan: discovering `zig cc`
|
### 2021 Jan: discovering `zig cc`
|
||||||
@ -193,12 +197,12 @@ dependency on system libraries and undoing of a lot of tech debt.
|
|||||||
- Various places at Uber would benefit from a hermetic C++ cross-compiler, but
|
- Various places at Uber would benefit from a hermetic C++ cross-compiler, but
|
||||||
it's not funded due to a large investment and not enough justification.
|
it's not funded due to a large investment and not enough justification.
|
||||||
- bazel-zig-cc kinda works, but both bazel-zig-cc and zig cc have known bugs.
|
- bazel-zig-cc kinda works, but both bazel-zig-cc and zig cc have known bugs.
|
||||||
- Donations don't "help" for `zig cc`, and I can't realistically implement
|
- I can't realistically implement the necessary changes or bug fixes. I tried
|
||||||
them. I tried with `zig ar`, a trivial front-end for llvm's ld, and failed.
|
implementing `zig ar`, a trivial front-end for llvm's `ar`, and failed.
|
||||||
- The monorepo-onboarding diff was simmering and waiting for it's time.
|
|
||||||
- Once an issue had been identified as a Zig issue, getting attention from Zig
|
- Once an issue had been identified as a Zig issue, getting attention from Zig
|
||||||
developers was unpredictable. Some issues got resolved within days, some took
|
developers was unpredictable. Some issues got resolved within days, some took
|
||||||
more than 6 months.
|
more than 6 months. Donations don't change `zig cc` priorities.
|
||||||
|
- The monorepo-onboarding diff was simmering and waiting for it's time.
|
||||||
|
|
||||||
### 2021 End: Uber needs a cross-compiler
|
### 2021 End: Uber needs a cross-compiler
|
||||||
|
|
||||||
@ -218,8 +222,8 @@ thing to manage is risk. As zig is a novel technology (not even 1.0!), it was
|
|||||||
truly unusual to suggest compiling all of our C and C++ code with it. We should
|
truly unusual to suggest compiling all of our C and C++ code with it. We should
|
||||||
be planning to stick with it for at least a decade. Questions were raised and
|
be planning to stick with it for at least a decade. Questions were raised and
|
||||||
evaluated with great care and scrutiny. For that I am truly grateful to the Go
|
evaluated with great care and scrutiny. For that I am truly grateful to the Go
|
||||||
Monorepo team, especially Ken Micklas, for doing the work and research on this
|
Monorepo team, especially [Ken Micklas][kmicklas], for doing the work and
|
||||||
unproven prototype.
|
research on this unproven prototype.
|
||||||
|
|
||||||
### Evaluation of different compilers
|
### Evaluation of different compilers
|
||||||
|
|
||||||
@ -234,22 +238,23 @@ Given that we now needed a cross-compiler, we had two candidates:
|
|||||||
- configurable glibc version. In grailbio case you would need a sysroot
|
- configurable glibc version. In grailbio case you would need a sysroot
|
||||||
(basically, a chroot with the system libraries, so the programs can be linked
|
(basically, a chroot with the system libraries, so the programs can be linked
|
||||||
against them), which needs to be maintained.
|
against them), which needs to be maintained.
|
||||||
- a working, albeit still buggy, hermetic (cross-)compiler for OSX.
|
- a working, albeit still buggy, hermetic (cross-)compiler for macOS.
|
||||||
|
|
||||||
Glibc we can handle in either case. However, `bazel-toolchain` will unlikely
|
Glibc we can handle in either case. However, `bazel-toolchain` will unlikely
|
||||||
ever have a way to compile to OSX, let alone cross-compile. Relying on the
|
ever have a way to compile to macOS, let alone cross-compile. Relying on the
|
||||||
system compiler is undesirable on developer laptops, and Go Platform feels that
|
system compiler is undesirable on developer laptops, and Go Platform feels that
|
||||||
first-hand, especially during OSX upgrades.
|
first-hand, especially during macOS upgrades.
|
||||||
|
|
||||||
The prospect of a hermetic toolchain for OSX targets tripped the scales towards
|
The prospect of a hermetic toolchain for macOS targets tripped the scales
|
||||||
`zig cc`, with all it's warts, risks and instability.
|
towards `zig cc`, with all its warts, risks and instability.
|
||||||
|
|
||||||
There was another, attention problem: if we were considering to use zig in a
|
There was another, attention problem: if we were considering the use of Zig in
|
||||||
serious capacity, we knew we will hit problems, but unlikely have the expertise
|
a serious capacity, we knew we will hit problems, but would be unlikely to have
|
||||||
to solve them. How can we, as a BigCorp, de-risk the engagement question,
|
the expertise to solve them. How can we, as a BigCorp, de-risk the engagement
|
||||||
making sure that bugs important to us are handled timely? We were sure of good
|
question, making sure that bugs important to us are handled timely? We were
|
||||||
intentions of ZSF: it was obvious that, if we find and report a legitimate bug,
|
sure of good intentions of ZSF: it was obvious that, if we find and report a
|
||||||
it would get fixed. But how can we put an upper bound on latency?
|
legitimate bug, it would get fixed. But how can we put an upper bound on
|
||||||
|
latency?
|
||||||
|
|
||||||
### Money
|
### Money
|
||||||
|
|
||||||
@ -271,7 +276,7 @@ bystander. We did not ask for special rights, it's explicit in the contract,
|
|||||||
and we don't want that.
|
and we don't want that.
|
||||||
|
|
||||||
The contract was signed, the wire transfer completed, and in 2022 January we
|
The contract was signed, the wire transfer completed, and in 2022 January we
|
||||||
hpad:
|
had:
|
||||||
|
|
||||||
- A service contract with ZSF that promised to prioritize issues that we've
|
- A service contract with ZSF that promised to prioritize issues that we've
|
||||||
registered.
|
registered.
|
||||||
@ -287,8 +292,8 @@ hpad:
|
|||||||
## 2022 and beyond
|
## 2022 and beyond
|
||||||
|
|
||||||
In Feb 2022 the toolchain was gated behind a command-line flag
|
In Feb 2022 the toolchain was gated behind a command-line flag
|
||||||
(`--config=hermetic-cc`). As of Feb 2022, you can invoke `zig cc` in Uber's go
|
(`--config=hermetic-cc`). As of Feb 2022, you can invoke `zig cc` in Uber's Go
|
||||||
monorepo without requiring a custom patch.
|
Monorepo without requiring a custom patch.
|
||||||
|
|
||||||
{{<img src="_/2022/uber-zig-landed.png"
|
{{<img src="_/2022/uber-zig-landed.png"
|
||||||
alt="WIP DIFF onboarding the monorepo was landed"
|
alt="WIP DIFF onboarding the monorepo was landed"
|
||||||
@ -317,7 +322,7 @@ zig-cc could have failed due to many many reasons.
|
|||||||
|
|
||||||
Looking back, I think the most important reasons for success is a killer
|
Looking back, I think the most important reasons for success is a killer
|
||||||
feature at the right time. In our case, there were two: glibc version selection
|
feature at the right time. In our case, there were two: glibc version selection
|
||||||
without a sysroot and cross-compiling to OSX.
|
without a sysroot and cross-compiling to macOS.
|
||||||
|
|
||||||
## Appendix
|
## Appendix
|
||||||
|
|
||||||
@ -333,6 +338,11 @@ If compilers or adopting software for other CPU architectures (and/or living in
|
|||||||
the Eastern Europe) is your thing, my team in Vilnius is hiring. Also, my
|
the Eastern Europe) is your thing, my team in Vilnius is hiring. Also, my
|
||||||
sister teams in Seattle and Bay Area are hiring too. Ping me.
|
sister teams in Seattle and Bay Area are hiring too. Ping me.
|
||||||
|
|
||||||
|
Credits
|
||||||
|
-------
|
||||||
|
|
||||||
|
Many thanks Abhinav Gupta for reading drafts of this.
|
||||||
|
|
||||||
[^1]: Errata: I incorrectly said "by an order of magnitude". The order of
|
[^1]: Errata: I incorrectly said "by an order of magnitude". The order of
|
||||||
magnitude is the same.
|
magnitude is the same.
|
||||||
[^2]: Errata: I said Go was the first monorepo. Go was 4'th.
|
[^2]: Errata: I said Go was the first monorepo. Go was 4'th.
|
||||||
@ -352,3 +362,4 @@ sister teams in Seattle and Bay Area are hiring too. Ping me.
|
|||||||
[grailbio/bazel-toolchain]: https://github.com/grailbio/bazel-toolchain
|
[grailbio/bazel-toolchain]: https://github.com/grailbio/bazel-toolchain
|
||||||
[milan-youtube]: https://www.youtube.com/watch?v=SCj2J3HcEfc
|
[milan-youtube]: https://www.youtube.com/watch?v=SCj2J3HcEfc
|
||||||
[zig-motiejus-issues]: https://github.com/ziglang/zig/issues?q=author%3Amotiejus+sort%3Acreated-asc
|
[zig-motiejus-issues]: https://github.com/ziglang/zig/issues?q=author%3Amotiejus+sort%3Acreated-asc
|
||||||
|
[kmicklas]: https://github.com/kmicklas
|
||||||
|