title: Dependencies, zig and git-subtrac
date: 2022-04-23T05:37:51+03:00
url: 2022/dependencies
draft: true

TLDR: modern programming languages make it very easy to add many dependencies. That is nice for development, but a nightmare for maintenance. Unfortunately, zig is following suit. I wish we could accept that adding dependencies does not have to be trivial. If we accept that, then thanks to the ubiquity of git, we may have almost solved the dependency problem.

Adding dependencies

All of the programming languages I've used professionally whose name does not start with "c"[1] have package managers[2], which make "dependency management" easy. These package managers will, as part of the project's build process, download and build the dependencies, which makes adding and using third-party dependencies easy.

Because C/C++ still does not have a universal package manager, not adding external dependencies to C/C++ is the path of least resistance; instead, one relies on libraries already installed in the system. Therefore, there is a plethora of dependency managers that will discover, but not install, dependencies: autotools, cmake, pkg-config and others. As a result, the C/C++ projects I've been involved in usually had 0-5 non-system dependencies, whereas the non-C/C++ projects had tens, hundreds or thousands[3]. Having many system dependencies is painful for every user of the package (because they have to make sure the libraries, and their correct versions, are installed), so C/C++ projects avoid having too many of them.

Not doing things that are easy to do requires discipline: brushing teeth, limiting candy intake, not adding dependencies all over the place. If it is easy to add dependencies and there is no discipline to refrain from doing so, the project will gain a lot of dependency "weight" over time.

{{<img src="_/2022/brick-house.jpg" alt="House made out of Duplo pieces" caption="Just like this brick house, "modern" package managers are optimized for building, not maintenance. Photo mine, house by my sons." hint="photo" >}}

In Go and Python, a small number of dependencies is often a sign of care and quality. mattn/go-sqlite3, uber/zap, apenwarr/redo and django are good examples. Making it easy to depend on external code is convenient during development, but frees developers from their basic right (or obligation?) to audit and understand it. It also adds real long-term maintenance costs.

The costs of just having dependencies are huge. I haven't done a survey and have only my experience to base this on (read: "many anecdotes of me failing to build stuff I wrote a decade ago"). But it is bad enough that I have a dependency checklist and am prepared to do the grunt work to save my future self. Here it is:

  1. Does the dependency do what I want, does it work at all?
  2. Is it well written? API surface, documentation, tests, error handling, error signaling, logging, metrics, memory usage (if applicable).
  3. How easy is it to build, run, and run its tests? Related: can it be used outside the default package manager?
  4. Its system dependencies.
  5. Its transitive dependencies.

Assuming a "programming-language-specific package manager that does what it's advertised to do", the path of least resistance, when it comes to this checklist, is doing (1), and perhaps (2). Why bother with a dependency's transitive dependencies or its build complexity, if the package manager will take care of it all anyway?

Except it will take care of it only when you are adding the dependency. The package manager will not help you when the dependency disappears, when its API changes, when it stops doing what it advertised, or with many other problems.

I am trying to do all 5. If a dependency is well written, but has more transitive dependencies than I need and there is no good alternative, I will fork and trim it. My recent example is sql-migrate.

To sum up, the "modern" languages optimize for initial development experience, not maintenance. As Corbet says: "We can't understand why Kids These Days just don't want to live that way". Kids want to build, John, not maintain. A 4-letter Danish corporation made a fortune by selling toys that do not need to be maintained: they are designed to be disassembled and built anew. We are still kids. Growing up requires discipline, which is very hard when candy is cheap and package managers (and disks and networks, which make all of it possible) are as good as they are today.

If I may combine Corbet's views with mine: if we understand and audit our dependencies (all of them, including transitive ones), we will have fewer dependencies and a more maintainable system. Win-win.

Which brings us to...

git-subtrac

git-subtrac manages our git dependencies (in our git repository) just like "classic" git submodules, but all refs of the dependencies stay in the same repository. Wait, stop here. Repeat after me: it is git submodules, but all refs stay in the same repository. I also call it "good vendoring". Since all the deps are in our repo, no external force can make our dependency unavailable or change it without notice. And it will keep the size of the repository in check, because it's all there when you pull it.

Because git-subtrac is a vendoring tool, not a package manager, it only vendors and does not help build packages. Therefore, with git-subtrac it is harder to add a dependency and "make it work" (build, test, add transitive deps) than with a language-specific package manager. Oh, what about the transitive dependencies?

git-subtrac does not deal with transitive dependencies. At least not directly. Or I am not aware of it. Ok, I haven't tried.

If we audit and thus understand our dependencies, we will be able to add the transitive ones. So perhaps git-subtrac shouldn't care?

What about Zig?

Zig will have a package manager (ziglang/zig#943). I am not very enthusiastic about it; can we all use git-subtrac and be done with it? A few weeks ago, in a park in Milan, my conversation with Andrew Kelley went something like this:

  • me: "git-subtrac yadda yadda yadda submodules but better yadda yadda yadda".
  • Andrew: "if I clone a repository that uses it with no extra parameters, will it work as expected?"
  • me: "no, you have to pass --recursive, so git will checkout submodules... even if they are already fetched."
  • Andrew: "then it's a piece-of-shit-approach."

Uh, I agree. People have not grown the muscle memory to clone repositories with the --recursive flag and never will, so it's impossible to adopt git-subtrac beyond well-controlled silos. Which is why we will have yet another programming-language-specific package manager. Or at least my argument for using and advertising git-subtrac (and saving a lot of time for Zig folks, and a lot of inevitable misery for its users) stops right there.

Conclusion

Can git check out submodules when they are in the same repository, so our conversation about reconsidering (or not having) a Zig package manager doesn't stop after 5 seconds?


  1. Alphabetically: Erlang, Go, Java, JavaScript, PHP, Perl, Python.

  2. Usually written in the same language. The zoo of package managers (sometimes a couple of popular ones for the same programming language) is a can of worms in and of itself, worth another blog post.

  3. The go.sum of a project I am currently involved in clocks in at around 6k lines. This is quite a lot for Go, but still peanuts compared to Node.js.