microsoft-git
parent
4606daff43
commit
7165728654
115
content/log/2023/microsoft-git.md
Normal file
@@ -0,0 +1,115 @@
---
title: "This conversation totally didn't happen at Microsoft"
slug: microsoft-git
date: 2023-12-07T14:00:00+02:00
---

Similarity to real-world events and character names is coincidental.

Characters, Microsoft employees:

* *Amy:* a high-level executive. Ex-JPMorgan. Pragmatic.
* *Harry:* an engineer in the Developer Services team. His organization
  owns code hosting, developer tools and CI infrastructure. A good listener.

# 2015: the beginning of Git at Microsoft

Exchange between Harry and Amy in a parking lot on a chilly Redmond morning:

- *Harry:* Amy, our Skype colleagues from Tallinn have been using git since
  2006 and are making fun of us for still using Perforce in 2015. Our ex-AWS
  colleagues take offense, since they know the Estonians are right. In fact,
  everyone takes offense, because nobody likes to admit the Estonians are
  right. Git is a tad too slow for large repos, which prevents a quick
  migration. Do you mind if I ask my team to look into this?
- *Amy:* sure, go ahead, Harry. I don't care about version control; do what you
  think is right as long as it works for everyone.

Harry starts poking at git to make it work better for larger repositories.

# Late 2016: money pressure and GVFS

Harry and his team implement partial clone (later renamed to sparse checkout).
With careful hand-holding, crossed fingers and good weather, Visual
Studio can now load the partially-cloned Windows repository without crashing.
Excitement grows. Friendly, congratulatory exchanges between Estonians and
Redmondians take place. Engineers get excited thinking the migration is "soon".

Harry and Amy again:

- *Amy:* Harry, how's that git thing going? I said I don't care about version
  control, but for some reason I do now.
- *Harry:* pretty well, why?
- *Amy:* just curious, what would it take to migrate the whole company to git?
- *Harry:* the tooling is robust and we are ready to migrate. One last thing:
  the Windows and Office repositories are in the hundreds of gigabytes. About
  50k people will need to get their laptops' disks replaced. Oh, and we will
  kill the office network while they download the initial clone. With good
  planning, we should be good in a month or two.
- *Amy:* sounds like $20 million for the disks and lost productivity while this
  chaos settles down. Any other ideas?
- *Harry:* our central repositories are in the basement, and the office
  connectivity is quite good. Maybe we can use shallow clones.
- *Amy:* whatever that means. If it helps, try to make it happen.

Harry scrambles to do something about it and creates GVFS. Open-sources it.
Everyone understands it's a temporary solution, so they live with it. People
use their git.

# 2017: migration is over and problems with GVFS

Migration is over for the last repository. People are complaining about GVFS,
but at least they are on git. Amy did not spend her political capital on
procurement, so she is happy.

GVFS is open-source, but only sort of. It bakes in many Microsoft assumptions
(e.g. don't even try macOS), but companies cargo-cult GVFS and struggle with it
anyway, because it's Microsoft.

# 2018 and later: GitHub acquisition and Scalar

Microsoft buys GitHub. Estonians no longer have anything to make fun of, so
they fall back to poking the flies on their office windows. Harry has an eye on
replacing GVFS.

Harry's team keeps improving git. Rewrites GVFS in C and renames it to
`scalar`. To take revenge on the Estonians, Harry's colleague Theodoric bets
that he can put Microsoft-specific code into upstream git. He wins:

https://github.com/git/git/blob/v2.35.0/contrib/scalar/scalar.c#L144

# Late 2023

Microsoft has taught its developers to use `scalar`. Dozens of other companies
that believe their repositories are big copy Microsoft's workflow. However,
their git repositories are not in the basement of their office, so many people
unknowingly pay the price of calling into GitHub every few seconds.

The speed of light did not change over the last decade. If your git
repository is on another continent, it will still take at least 100ms for the
round-trip (plus whatever outage your git provider has this minute). The cost
of SSDs is ~$100/TB, and it keeps decreasing.
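
The 100ms figure can be sanity-checked with a little arithmetic. This is a rough sketch, assuming light in optical fiber travels at about 200,000 km/s (roughly two thirds of its speed in vacuum) and a hypothetical one-way cable distance of 10,000 km. `min_rtt_ms` is a made-up helper name, and real round trips are slower once routing and protocol handshakes are added.

```shell
# Lower bound on round-trip time in milliseconds.
# $1 = one-way cable distance in km (assumption: you know or can guess it).
# Assumes ~200,000 km/s signal speed in fiber.
min_rtt_ms() {
  echo $(( 2 * $1 * 1000 / 200000 ))
}

# Hypothetical intercontinental link of 10,000 km one way.
min_rtt_ms 10000   # prints 100
```

No amount of tooling brings that number down; a local disk read does not pay it at all.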

`scalar.c` has been "made official" and moved from contrib to top-level. But
the Azure ghosts are still with us:

https://github.com/git/git/blob/v2.43.0/scalar.c#L145

# Takeaways

Try this if you think your repo is big:

```
git clone -c feature.manyFiles=true git@<...>
```

And forget shallow clones. Sparse checkouts are pretty decently done, so if
your repository allows that, it may be a good thing to try.
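
To get a feel for sparse checkouts without touching a real project, here is a minimal sketch that builds a throwaway local repository and clones it sparsely. The directory names (`docs`, `src`) and the `demo` variable are made up for illustration; with a real repository you would pass its URL to `git clone --sparse` instead. It assumes a reasonably recent Git (the `sparse-checkout` command does not exist in old releases).

```shell
# Throwaway demo: build a tiny local repository, then clone it sparsely.
demo=$(mktemp -d)
git init -q "$demo/origin"
mkdir -p "$demo/origin/docs" "$demo/origin/src"
echo "readme" > "$demo/origin/README.md"
echo "guide" > "$demo/origin/docs/guide.md"
echo "code" > "$demo/origin/src/main.c"
git -C "$demo/origin" add .
git -C "$demo/origin" -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "initial"

# --sparse still fetches the full history, but starts with a minimal
# working tree instead of checking out every directory.
git clone -q --sparse "$demo/origin" "$demo/work"

# Materialize only the directories you actually work in.
git -C "$demo/work" sparse-checkout set docs
ls "$demo/work"
```

The history is complete and local, so `git log` and `git blame` never wait on the network; only the working tree is trimmed.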

Also have a look at `git maintenance` and `git config core.fsmonitor`.
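
A minimal sketch of both, again on a throwaway repository so it is safe to paste anywhere (the `repo` variable and file names are made up). It runs a single maintenance task once; `git maintenance start` is the variant that schedules maintenance in the background. Whether `core.fsmonitor true` actually speeds anything up depends on your Git version and platform, since the built-in monitor is not available everywhere, so treat this as something to experiment with.

```shell
# Throwaway repository with one commit.
repo=$(mktemp -d)
git init -q "$repo"
echo "hello" > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
  commit -q -m "initial"

# One-off maintenance run limited to the commit-graph task.
git -C "$repo" maintenance run --task=commit-graph

# Ask Git to use a filesystem monitor instead of scanning the whole tree.
git -C "$repo" config core.fsmonitor true
git -C "$repo" config --get core.fsmonitor
```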

If you look to a large company for a solution, think about its context. Your
repository probably doesn't weigh hundreds of gigabytes, and it will not cost
$20 million to procure larger disks for developers.