---
title: "This conversation totally didn't happen at Microsoft"
slug: microsoft-git
date: 2023-12-07T14:00:00+02:00
---

Similarity to real-world events and character names is coincidental.

Characters, Microsoft employees:

* *Amy:* a high-level executive. Ex-JPMorgan. Pragmatic.
* *Harry:* an engineer in the Developer Services team. His organization
  owns code hosting, developer tools and CI infrastructure. A good listener.

# 2015 — the beginning of Git at Microsoft

Exchange between Harry and Amy in a parking lot on a chilly Redmond morning:

- *Harry:* Amy, our Skype colleagues from Tallinn have been using git since
  2006 and are making fun of us for still using perforce in 2015. Our ex-AWS
  colleagues take offense, since they know the Estonians are right. In fact,
  everyone takes offense, because nobody likes to admit the Estonians are
  right. Git is a tad too slow for large repos, which prevents a quick
  migration. Do you mind if I ask my team to look into this?
- *Amy:* sure, go ahead, Harry. I don't care about version control, do what you
  think is right as long as it works for everyone.

Harry starts poking at git to make it work better for larger repositories.

# late 2016 — money pressure and GVFS

Harry and his team implement partial clone (often used together with sparse
checkout). With careful hand-holding, crossed fingers and in good weather,
Visual Studio can now load the partially-cloned Windows repository without
crashing. Excitement grows. Friendly, congratulatory exchanges between
Estonians and Redmondians take place. Engineers start thinking the migration
is "soon".

Harry and Amy again:

- *Amy:* Harry, how's that git thing going? I said I don't care about version
  control, but for some reason I do now.
- *Harry:* pretty well, why?
- *Amy:* just curious, what would it take to migrate the whole company to git?
- *Harry:* the tooling is robust and we are ready to migrate. One last thing:
  the Windows and Office repositories are in the hundreds of gigabytes. About
  50k people will need to get their laptops' disks replaced. Oh, and we will
  kill the office network while they download the initial clone. With good
  planning, we should be good in a month or two.
- *Amy:* sounds like $20 million for the disks, plus lost productivity while
  this chaos settles down. Any other ideas?
- *Harry:* our central repositories are in the basement, and the office
  connectivity is quite good. Maybe we can use shallow clones.
- *Amy:* whatever that means. If it helps, try to make it happen.

Harry scrambles to do something about it and creates GVFS. Open-sources it.
Everyone understands it's a temporary solution, so they live with it. People
use their git.

# 2017 — migration is over and problems with GVFS

Migration is over for the last repository. People are complaining about GVFS,
but at least they are on git. Amy did not spend her political capital on
procurement, so she is happy.

GVFS is open-source, but only sort of. It bakes in many Microsoft-specific
assumptions (e.g. don't even try macOS), but companies cargo-cult GVFS and
struggle with it anyway, because it's Microsoft.

# 2018 and later: GitHub acquisition and Scalar

Microsoft buys GitHub. Estonians no longer have anything to make fun of, so
they fall back to poking the flies on their office windows. Harry has his eye
on replacing GVFS.

Harry's team keeps improving git. It rewrites GVFS in C and renames it to
`scalar`. To take revenge on the Estonians, Harry's colleague Theodoric bets
that he can put Microsoft-specific code into upstream git. He wins:

https://github.com/git/git/blob/v2.35.0/contrib/scalar/scalar.c#L144

# Late 2023

Microsoft has taught its developers to use `scalar`. Dozens of other companies
that believe their repositories are big clone Microsoft's workflow. However,
their git repositories are not in the basement of their office. So many people
unknowingly pay the price of calling into GitHub every few seconds.

The speed of light has not changed over the last decade. If your git
repository is on another continent, it will still take at least 100ms for the
round-trip (plus whatever outage your git provider has this minute). SSDs cost
~$100/TB, and that keeps decreasing.
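
A back-of-the-envelope check of that 100ms floor. The numbers are assumptions
for illustration, not measurements: a 10,000 km cable run between continents,
and light in glass fiber travelling at roughly two thirds of its vacuum speed.

```shell
# Assumed, illustrative inputs (not taken from any real network).
distance_km=10000        # one-way cable distance between two continents
fiber_km_per_s=200000    # light in fiber: roughly 2/3 of 300,000 km/s

# There and back, converted from seconds to milliseconds.
rtt_ms=$(( 2 * distance_km * 1000 / fiber_km_per_s ))
echo "minimum round trip: ${rtt_ms} ms"
```

Routing detours, TLS handshakes and server time only add to this, so the real
figure is usually worse.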

`scalar.c` has been "made official" and moved from contrib to top-level. But
the Azure ghosts are still with us:

https://github.com/git/git/blob/v2.43.0/scalar.c#L145

# Takeaways

Try this if you think your repo is big:

```
git clone -c feature.manyFiles=true git@<...>
```

And forget shallow clones. Sparse checkouts are pretty decently done, so if
your repository allows that, it may be a good thing to try.
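
A minimal sketch of what a sparse checkout looks like, assuming a reasonably
recent git. The repository and its `docs/` and `src/` directories are made up
for the demo; substitute your own layout.

```shell
# Throwaway repository with a hypothetical two-directory layout.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir docs src
echo "readme" > docs/readme.md
echo "int main(void) { return 0; }" > src/main.c
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Keep only docs/ (plus files in the repository root) in the working tree.
git sparse-checkout init --cone
git sparse-checkout set docs

git sparse-checkout list   # shows which directories are kept
ls                         # src/ is gone from disk, but still in history
```

For a fresh clone, `git clone --sparse` followed by `git sparse-checkout set`
should get you to the same place without first checking out everything.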

Also have a look at `git maintenance` and `git config core.fsmonitor`.
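
A rough sketch of both, run against a throwaway repository. The task selection
is just an example, and as far as I know the built-in file system monitor is
only available on Windows and macOS, so treat the last step as opt-in.

```shell
# Throwaway repository so the commands have something to work on.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"

# Run a couple of maintenance tasks once, in the foreground.
git maintenance run --task=commit-graph --task=loose-objects

# Opt this repository in to the file system monitor, then read it back.
# Assumption: your git build and platform support the built-in monitor.
git config core.fsmonitor true
git config core.fsmonitor
```

`git maintenance start` goes one step further and schedules these tasks in the
background, which is probably what you want on a repository you use daily.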

If you look to a large company for a solution, think about its context. Your
repository probably doesn't weigh hundreds of gigabytes, and it will not cost
$20 million to procure larger disks for your developers.