diff --git a/content/log/2022/dependencies.md b/content/log/2022/dependencies.md
deleted file mode 100644
index f605e74..0000000
--- a/content/log/2022/dependencies.md
+++ /dev/null
@@ -1,211 +0,0 @@
----
-title: "Dependencies, zig and git-subtrac"
-date: 2022-04-23T05:37:51+03:00
-slug: dependencies
-# FIXME: 'list: never' keeps the link in the feed
-#_build:
-# list: never
-draft: true
----
-
-TLDR
-----
-
-Modern programming languages make it very easy to add many dependencies. That
-is nice for development, but a nightmare for long-term maintenance.
-Unfortunately, Zig is following suit. I wish we could accept that adding
-dependencies does not have to be trivial. If we accept that, thanks to the
-ubiquity of git, we may have almost solved the dependency problem: not only for
-Zig, but for everyone.
-
-Adding dependencies
--------------------
-
-All of the programming languages I've used professionally, the names of which
-do not start with "c"[^1], have package managers[^2], which make "dependency
-management" easy. These package managers will, as part of the project's build
-process, download and build the dependencies. So there is virtually no
-resistance to adding dependencies when we need them.
-
-Because C/C++ still does not have a "universal" package manager, not adding
-external dependencies to C/C++ projects is the path of least resistance;
-instead, one relies on libraries already installed in the system. There is a
-plethora of tools that will discover system dependencies: autotools, cmake,
-pkg-config, and others. As a result, C/C++ projects I've participated in usually
-had 0-5 non-system dependencies, whereas non-C/C++ projects had tens, hundreds,
-or thousands[^3]. Having many system dependencies is painful for *every user* of
-the package (because they have to make sure the libraries, and their correct
-versions, are installed), so C/C++ projects tend to avoid having too many of them.
-
-In Go and Python, a small number of dependencies is often a sign of care and
-quality. [mattn/go-sqlite3](https://github.com/mattn/go-sqlite3),
-[uber/zap](https://github.com/uber-go/zap),
-[apenwarr/redo](https://github.com/apenwarr/redo) and
-[django](https://djangoproject.com) are good examples. I've built and used
-these projects in a number of environments. Conversely, projects with many
-dependencies, even when pinned, often fail to build even in the environment
-where they are developed and most tested (e.g. a specific
-OS+architecture, like `Ubuntu 16.04 x86_64`). It's even worse if the
-environment differs, no matter how trivially, from the one the developer is
-working in[^4]. Let's forget about a different OS or a different build system.
-Inability to build software, unsurprisingly, leads to user frustration,
-packagers' frustration, and the developers asking themselves why they have
-chosen a career in software instead of, say, farming.
-
-To recap, the costs of just having dependencies are huge. I haven't done a
-survey and have only my experience to base this on (read: "many anecdotes of me
-failing to build stuff I or others wrote a decade ago"). But it is bad enough
-that I have a dependency checklist and am prepared to do the grunt work to save
-my future self. Here it is:
-
-1. Does the dependency do what I want, does it work at all?
-2. Is it well written? API surface, documentation, tests, error handling, error
-   signaling, logging, metrics, memory usage (if applicable).
-3. How easy is it to build, run, and run its tests? Related: can it be used
-   outside the default package manager?
-4. Its system dependencies.
-5. Its transitive dependencies.
-
-When working with a "programming-language-specific package manager that does
-what it's advertised to do", the path of least resistance, when it comes to
-this checklist, is doing (1), and perhaps (2). Why bother with transitive
-dependencies or its build complexity, if the package manager takes care of it
-all anyway?
-
-Except the package manager will only help during the initial development, when
-the developer happily adds the package. It will work for a couple of days. The
-package manager will not help when the dependency disappears, its API changes,
-it stops doing what it has advertised, or one of many other
-[problems][crash-of-leftpad] occurs. When something breaks (and it inevitably
-will, unless it's SQLite), it falls on the maintainer to fix it.
-
-I am following my checklist. If a dependency is well written, but has more
-transitive dependencies than I need and there is no good alternative, I will
-fork and trim it. My recent example is
-[sql-migrate](https://github.com/motiejus/sql-migrate).
-
-Not doing things that are easy to do requires discipline: brushing teeth,
-limiting candy intake, not adding dependencies all over the place. If adding
-dependencies is easy (and there is no established discipline of limiting them),
-the project will tend to gain them; lots of them.
-
-{{House made out of Duplo pieces}}
-
-To sum up, the "modern" languages optimize for initial development experience,
-not maintenance. And as [Corbet says][linux-rust], "We can't understand why
-Kids These Days just don't want to live that way". Kids want to build, John,
-not maintain. A 4-letter Danish corporation made a fortune by selling toys that
-do not need to be maintained: they are designed to be disassembled and built
-anew. We are still kids. Growing up and sticking to our own rules requires
-discipline.
-
-If I may combine Corbet's views with mine: if we understand and audit our
-dependencies (all of them, including transitive ones), we will have fewer
-dependencies and a more maintainable system. Win.
-
-Which brings us to git submodules and git-subtrac.
-
-git submodules and git-subtrac
-------------------------------
-
-A quick primer on [git submodules][git-submodule], a prerequisite for
-understanding `git-subtrac`:
-* A submodule is a pointer to a particular commit in a separate repository,
-  optionally checked out in our tree. For example, `deps/cmph` would contain
-  all the files from [cmph][cmph]. This means that once the repository is fully
-  set up (technically, the submodule is synced/updated), the build system
-  (Makefiles, build.zig or what have you) can use it just like a regular
-  directory.
-* The pointer to the submodule in your repository is just a tuple: `(git URL,
-  sha1)`.
-* When cloning a repository that has submodules, git will not clone the
-  submodules; it will just leave empty directories. We must pass `--recursive`
-  for git to clone everything. Which makes sense when submodules are external
-  and may not download at all.
-
-Submodules were designed for adding external dependencies to a repository.
-However, using them incorrectly is way too easy, and is not fun when it happens. I
-see at least these significant usability problems:
-- It is too easy to commit unintended changes to a submodule, causing misery to
-  others.
-- By default, submodule contents (i.e. the code of your dependency) live *outside
-  the repository*. This means that, with time, if the dependency disappears, we
-  will not be able to compile our code. Gone.
-
-Because of the many usability problems of submodules, very few people use them.
-So [Avery Pennarun][apenwarr] (creator of [git-subtree][git-subtree], by the
-way) created [`git-subtrac`][git-subtrac]. `git-subtrac` bundles our git
-dependencies just like "classic" git submodules, but all refs of the
-dependencies stay in the same repository. Wait, stop here. Repeat after me: _it
-is git submodules, but all refs stay in the same repository_. I also call it
-"good vendoring". Since all the dependencies are in our repo, no external force
-can make our dependency unavailable. And it will keep the size of the
-repository in check, because it's all there when we pull it. [`git-subtrac`
-fixes a few other submodule usability problems][apenwarr-subtrac] along the
-way.
-
-Because `git-subtrac` is a vendoring tool, not a package manager, it only
-vendors; it does not help build packages. Therefore, with `git-subtrac` it
-is harder to add and "make work" (build, test, add transitive dependencies) a
-dependency than with a language-specific package manager.
-
-`git-subtrac`, just like git and submodules, does not understand "semantic
-versions". So we can't ask for "latest foo of version 1.2.X"; the developer
-will need to figure out, and hardcode, *exactly* which versions to use. Also,
-updating dependencies is not as easy as, say, in Gospeak, `go get -u ./...`;
-git will need a bit more hand holding.
-
-What about Zig?
----------------
-
-Zig will have a package manager ([ziglang/zig#943][943]). I am not very
-enthusiastic about it; can we all use git-subtrac and be done with it? A few
-weeks ago in a park in Milan my conversation with [Andrew
-Kelley](https://andrewkelley.me/) was something like:
-
-- me: "git-subtrac yadda yadda yadda submodules but better yadda yadda yadda".
-- Andrew: "If I clone a repository that uses subtrac with no extra parameters,
-  will it work as expected?"
-- me: "No, you have to pass `--recursive`, so git will check out submodules...
-  even if they are already fetched."
-- Andrew: "Then it's a piece-of-shit-approach."
-
-Uh, I agree. People have not grown muscle memory to clone repositories with
-the `--recursive` flag and never will, so it's impossible to adopt git-subtrac
-beyond well-controlled silos. Which is why we will have a
-yet-another-programming-language-specific-package-manager. Or at least my
-argument offering `git-subtrac` as Zig's package manager (thus saving a lot of
-time for Zig folks, and a lot of inevitable misery for its users) stops right
-there.
-
-Zig has a rich standard library; therefore it does not need many dependencies
-by design. Does it *really* need a package manager?
-
-Conclusion
-----------
-
-When all contents of the submodules are in our repository, can git check out
-submodules too? That way, my conversation with Andrew about reconsidering (or not
-having) a Zig package manager will have a chance to not stop after 5 seconds.
-
-[^1]: Alphabetically: Erlang, Go, Java, JavaScript, Perl, PHP, Python.
-[^2]: Usually written in the same language. The zoo of package managers (sometimes
-    a couple of popular ones for the same programming language) is a can of worms
-    in and of itself, worth another blog post.
-[^3]: `go.sum` of a project I am currently involved in clocks around 6k lines.
-    This is quite a lot for Go, but still peanuts to Node.js.
-[^4]: For example, they would work on Ubuntu 16.04, but fail on Ubuntu 18.04.
-
-[git-subtrac]: https://github.com/apenwarr/git-subtrac/
-[linux-rust]: https://lwn.net/SubscriberLink/889924/a733d6630e3b5115/
-[crash-of-leftpad]: https://drewdevault.com/2021/11/16/Cash-for-leftpad.html
-[943]: https://github.com/ziglang/zig/issues/943
-[git-submodule]: https://git-scm.com/book/en/v2/Git-Tools-Submodules
-[cmph]: http://cmph.sourceforge.net/
-[git-subtree]: https://git.kernel.org/pub/scm/git/git.git/plain/contrib/subtree/git-subtree.txt
-[apenwarr]: https://apenwarr.ca
-[apenwarr-subtrac]: https://apenwarr.ca/log/20191109