---
title: "Dependencies, zig and git-subtrac"
date: 2022-04-23T05:37:51+03:00
# FIXME: "slug: dependencies" doesn't do what I meant.
url: 2022/dependencies
draft: true
# FIXME: 'list: never' keeps the link in the feed
#_build:
# list: never
---
<!-- o_ -->
TLDR: modern programming languages make it very easy to add many dependencies.
That is nice for development, but a nightmare for maintenance. Unfortunately,
Zig is following suit. I wish we could accept that adding dependencies does not
have to be trivial. If we accept that, thanks to the ubiquity of git, we may have
almost solved the dependency problem.

Adding dependencies
-------------------
All of the programming languages I've used professionally whose name does not
start with "c"[^1] have package managers[^2], which make "dependency
management" easy. These package managers will, as part of the project's build
process, download and build the dependencies, which makes adding and using
third-party dependencies easy.

Because C/C++ still does not have a universal package manager, not adding
external dependencies to C/C++ is the path of least resistance; instead, one
relies on libraries already installed in the system. Therefore, there is a
plethora of dependency managers that will discover, but not install,
dependencies: autotools, cmake, pkg-config and others. As a result, C/C++
projects I've been involved in usually had 0-5 non-system dependencies, whereas
non-C/C++ projects had tens, hundreds or thousands[^3]. Having many system
dependencies is painful for *every user* of the package (because they have to
make sure the libraries, and their correct versions, are installed), so C/C++
projects avoid having too many of them.
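
As a rough way to put a number on that weight for a Go project, the sketch below counts the distinct module paths listed in the text of a `go.sum` file. It is only an illustration: the `count_modules` name and the sample entries (module paths, versions and hashes) are made up. Since `go.sum` stores checksums, including ones for module versions that do not end up in the final build, treat the result as an upper bound rather than an exact dependency count.

```python
def count_modules(go_sum_text: str) -> int:
    """Count distinct module paths listed in the text of a go.sum file."""
    modules = set()
    for line in go_sum_text.splitlines():
        fields = line.split()
        # A go.sum line has three fields: "<module> <version>[/go.mod] <hash>".
        if len(fields) == 3:
            modules.add(fields[0])
    return len(modules)


# Made-up sample: two modules, each with a module hash and a go.mod hash.
SAMPLE = """\
example.com/alpha v1.2.3 h1:fakehash1=
example.com/alpha v1.2.3/go.mod h1:fakehash2=
example.com/beta v0.4.0 h1:fakehash3=
example.com/beta v0.4.0/go.mod h1:fakehash4=
"""

print(count_modules(SAMPLE))  # → 2
```

On a real project one would pass in the contents of the actual `go.sum` instead of `SAMPLE`.
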
Not doing things that are easy to do requires discipline: brushing teeth,
limiting candy intake, not adding dependencies all over the place. If it is
easy to add dependencies and there is no discipline to refrain from it, the
project will gain a lot of dependency "weight" with time.

{{<img src="_/2022/brick-house.jpg"
alt="House made out of Duplo pieces"
caption="Just like this brick house, \"modern\" package managers are optimized for building, not maintenance. Photo mine, house by my sons."
hint="photo"
>}}

In Go and Python, a small number of dependencies is often a sign of care and
quality. [mattn/go-sqlite3](https://github.com/mattn/go-sqlite3),
[uber/zap](https://github.com/uber-go/zap),
[apenwarr/redo](https://github.com/apenwarr/redo) and
[django](https://djangoproject.com) are good examples. Making it easy to depend
on external code is convenient during development, but frees developers from
their basic right (or obligation?) to audit and understand it. It also adds real
long-term maintenance costs.

The costs of just having dependencies are huge. I haven't done a survey and
have only my experience to base this on (read: "many anecdotes of me failing to
build stuff I wrote a decade ago"). But it is bad enough that I have a
dependency checklist and am prepared to do the grunt work to save my future
self. Here it is:

1. Does the dependency do what I want, does it work at all?
2. Is it well written? API surface, documentation, tests, error handling, error
signaling, logging, metrics, memory usage (if applicable).
3. How easy is it to build it, run it and run its tests? Related: can it be used
outside the default package manager?
4. Its system dependencies.
5. Its transitive dependencies.

Assuming a "programming-language-specific package manager that does what it's
advertised to do", the path of least resistance, when it comes to this
checklist, is doing (1), and perhaps (2). Why bother with transitive
dependencies or its build complexity, if the package manager will take care of
it all anyway?

Except it will do so only while you are adding it. The package manager will not
help you when the dependency disappears, its API changes, or it stops doing what
it advertised, among many other [problems][crash-of-leftpad].

I am trying to do all 5. If a dependency is well written, but has more
transitive dependencies than I need and there is no good alternative, I will
fork and trim it. My recent example is
[sql-migrate](https://github.com/motiejus/sql-migrate).

To sum up, the "modern" languages optimize for initial development experience,
not maintenance. As [Corbet says][linux-rust]: "We can't understand why
Kids These Days just don't want to live that way". Kids want to build, John,
not maintain. A 4-letter Danish corporation made a fortune by selling toys that
do not need to be maintained: they are designed to be disassembled and built
anew. We are still kids. Growing up requires discipline, which is very hard,
when candy is cheap and package managers (and disks and network, which make all
of it possible) are as good as they are today.

If I may combine Corbet's views with mine: if we understand and audit our
dependencies (all of them, including transitive ones), we will have fewer
dependencies and a more maintainable system. Win-win.
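
To make "all of them, including transitive ones" concrete, here is a minimal Python sketch that walks a dependency graph and lists everything a project ends up pulling in. The graph, the package names and the `transitive_deps` helper are all hypothetical; no real package manager's data format is implied.

```python
def transitive_deps(graph: dict, root: str) -> set:
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    stack = list(graph.get(root, ()))
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue  # already visited; this also guards against cycles
        seen.add(pkg)
        stack.extend(graph.get(pkg, ()))
    seen.discard(root)  # a cycle may lead back to root
    return seen


# Hypothetical project: "app" declares a single direct dependency.
GRAPH = {
    "app": ["migrate"],
    "migrate": ["sql-driver", "cli"],
    "cli": ["color", "table"],
}

print(sorted(transitive_deps(GRAPH, "app")))
# → ['cli', 'color', 'migrate', 'sql-driver', 'table']
```

One declared dependency turned into five packages to audit.
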
Which brings us to...

git-subtrac
-----------
[`git-subtrac`][git-subtrac] manages our git dependencies (in our git
repository) just like "classic" git submodules, but all refs of the
dependencies stay in the same repository. Wait, stop here. Repeat after me: _it
is git submodules, but all refs stay in the same repository_. I also call it
"good vendoring". Since all the deps are in our repo, no external force can
make our dependency unavailable or change it without notice. And it will keep the
size of the repository in check, because it's all there when you pull it.

Because `git-subtrac` is a vendoring tool, not a package manager, it only
vendors, but does not help building packages. Therefore, with `git-subtrac` it
is harder to add and "make work" (build, test, add transitive deps) a
dependency than with a language-specific package manager. Oh, what about the
transitive dependencies?

[`git-subtrac`][git-subtrac] does not deal with transitive dependencies. At
least not directly. Or I am not aware of it. Ok, I haven't tried.

If we audit and thus understand our dependencies, we will be able to add the
transitive ones. So perhaps git-subtrac shouldn't care?

What about Zig?
---------------
Zig will have a package manager ([ziglang/zig#943][943]). I am not very
enthusiastic about it; can we all use git-subtrac and be done with it? A few
weeks ago in a park in Milan my conversation with [Andrew
Kelley](https://andrewkelley.me/) was something like:
- me: "git-subtrac yadda yadda yadda submodules but better yadda yadda yadda".
- Andrew: "if I clone a repository that uses it with no extra parameters, will
it work as expected?"
- me: "no, you have to pass `--recursive`, so git will check out submodules...
even if they are already fetched."
- Andrew: "then it's a piece-of-shit-approach."

Uh, I agree. People have not grown muscle memory to clone repositories with
the `--recursive` flag and never will, so it's impossible to adopt git-subtrac
beyond well-controlled silos. Which is why we will have
yet-another-programming-language-specific-package-manager. Or at least my
argument for using and advertising `git-subtrac` (and saving a lot of time for
Zig folks, and a lot of inevitable misery for its users) stops right there.

Conclusion
----------
Can git check out submodules when they are in the same repository, so our
conversation about reconsidering (or not having) a Zig package manager doesn't
stop after 5 seconds?

[^1]: Alphabetically: Erlang, Go, Java, JavaScript, Perl, PHP, Python.

[^2]: Usually written in the same language. The zoo of package managers (sometimes
a couple of popular ones for the same programming language) is a can of worms
in and of itself, worth another blog post.

[^3]: The `go.sum` of a project I am currently involved in clocks in at around 6k
lines. This is quite a lot for Go, but still peanuts compared to Node.js.

[git-subtrac]: https://github.com/apenwarr/git-subtrac/
[linux-rust]: https://lwn.net/SubscriberLink/889924/a733d6630e3b5115/
[crash-of-leftpad]: https://drewdevault.com/2021/11/16/Cash-for-leftpad.html
[943]: https://github.com/ziglang/zig/issues/943