Turbo NSS
---------

Turbonss is a plugin for the GNU Name Service Switch ([NSS][nsswitch])
functionality of the GNU C Library (glibc). Turbonss implements lookup for
`passwd` and `group` database entries (i.e. system users, groups, and group
memberships). Its main goal is to run [`id(1)`][id] as fast as possible.

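
To make the scope concrete: `id(1)` needs a passwd lookup, a supplementary-group lookup and one group lookup per gid, and glibc routes all of these through whichever NSS module `nsswitch.conf` selects. The Python sketch below issues the same kinds of lookups through the standard `pwd`, `grp` and `os` modules. It only approximates the output format of `id`, and `describe_user` is a name made up for this illustration.

```python
import grp
import os
import pwd


def describe_user(uid: int) -> str:
    """Format a uid roughly the way id(1) does, using NSS-backed lookups only."""
    try:
        user = pwd.getpwuid(uid)  # passwd database: lookup by uid
    except KeyError:
        return f"uid={uid}"  # no passwd entry for this uid

    # Group memberships: primary gid plus all supplementary gids.
    gids = os.getgrouplist(user.pw_name, user.pw_gid)

    def label(gid: int) -> str:
        try:
            return f"{gid}({grp.getgrgid(gid).gr_name})"  # group database: lookup by gid
        except KeyError:
            return str(gid)  # gid without a group entry

    groups = ",".join(label(g) for g in gids)
    return f"uid={user.pw_uid}({user.pw_name}) gid={label(user.pw_gid)} groups={groups}"


if __name__ == "__main__":
    print(describe_user(os.getuid()))
```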
Turbonss is optimized for reading. If the data changes in any way, the whole
file needs to be regenerated. It was therefore created for, and is best suited
to, environments that have a central user and group database which needs to be
distributed to many servers/services, and where the data does not change very
often (e.g. hourly).

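
Since every change means shipping a whole new file, the swap on each server matters. This README does not say how the file should be replaced; one common approach, assumed here, is to write the new database to a temporary file in the same directory and rename it over the old one, so a reader opens either the old file or the new one and never a half-written mix. `publish_atomically` and the file names are hypothetical.

```python
import os
import tempfile


def publish_atomically(new_db: bytes, dest: str) -> None:
    """Replace `dest` with `new_db` using a single rename."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # The temporary file must be on the same filesystem as `dest`,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".db.turbo.")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_db)
            tmp.flush()
            os.fsync(tmp.fileno())  # flush the bytes to disk before the swap
        os.chmod(tmp_path, 0o644)  # mkstemp creates the file with mode 0600
        os.replace(tmp_path, dest)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave a stray temporary file behind
        raise
```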
This is the fastest known NSS passwd/group implementation for *reads*. On a
corpus with 10k users, 10k groups and 500 average members per group, `id` takes
17 seconds with the glibc default implementation, 10-17 milliseconds with a
pre-cached `nscd`, and ~8 milliseconds with `turbonss`.

Project status
--------------

The project is finished and is not recommended for production; just use nscd.
Turbonss duly implements the full user/group API in `src/libnss.zig`: feel free
to copy that.

Yours truly (the author) worked on this for about 7 months. When it was
finished, it turned out that just slapping nscd on top of the existing NSS
implementation is almost as fast as this.

Dependencies
------------

1. zig nightly compiler (0.10 should work when it comes out).
2. [cmph][cmph]: bundled with this repository.

Trying it out
-------------

Clone, compile and test first:

    $ git clone --recursive https://git.sr.ht/~motiejus/turbonss
    $ cd turbonss
    $ zig build -fstage1 test
    $ zig build -fstage1 -Dtarget=x86_64-linux-gnu.2.31 -Dcpu=x86_64_v3 -Drelease-safe=true

One may choose different options, depending on requirements. Here are some
hints:

1. `-Dcpu=<...>` for the CPU
   [microarchitecture](https://en.wikipedia.org/wiki/X86-64#Microarchitecture_levels).
2. `-Drelease-fast=true` for max speed.
3. `-Drelease-small=true` for smallest binary sizes.
4. `-Dstrip=true` to strip debug symbols.

Test it on a real system
------------------------

`db.turbo` is the TurboNSS database file. To create one from `/etc/group` and
`/etc/passwd`, use `turbonss-unix2db`:

    $ zig-out/bin/turbonss-unix2db --passwd /etc/passwd --group /etc/group
    $ zig-out/bin/turbonss-analyze db.turbo
    File:              /etc/turbonss/db.turbo
    Size:              2,624 bytes
    Version:           0
    Endian:            little
    Pointer size:      8 bytes
    getgr buffer size: 17
    getpw buffer size: 74
    Users:             19
    Groups:            39
    Shells:            1
    Most memberships:  _apt (1)
    Sections:
      Name                    Begin       End  Size bytes
      header               00000000  00000080         128
      bdz_gid              00000080  000000c0          64
      bdz_groupname        000000c0  00000100          64
      bdz_uid              00000100  00000140          64
      bdz_username         00000140  00000180          64
      idx_gid2group        00000180  00000240         192
      idx_groupname2group  00000240  00000300         192
      idx_uid2user         00000300  00000380         128
      idx_name2user        00000380  00000400         128
      shell_index          00000400  00000440          64
      shell_blob           00000440  00000480          64
      groups               00000480  00000700         640
      users                00000700  000009c0         704
      groupmembers         000009c0  00000a00          64
      additional_gids      00000a00  00000a40          64

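
The section table printed by `turbonss-analyze` can be cross-checked with a little arithmetic: each section should begin where the previous one ends, `End - Begin` should equal the size column, and the last `End` should equal the reported file size. The sketch below copies the rows from the output above. The helper names are hypothetical, and the 64-byte alignment check reflects only what this particular output shows, not a documented guarantee.

```python
# Rows copied from the `turbonss-analyze` output: name, begin (hex), end (hex), size.
SECTIONS = """\
header 00000000 00000080 128
bdz_gid 00000080 000000c0 64
bdz_groupname 000000c0 00000100 64
bdz_uid 00000100 00000140 64
bdz_username 00000140 00000180 64
idx_gid2group 00000180 00000240 192
idx_groupname2group 00000240 00000300 192
idx_uid2user 00000300 00000380 128
idx_name2user 00000380 00000400 128
shell_index 00000400 00000440 64
shell_blob 00000440 00000480 64
groups 00000480 00000700 640
users 00000700 000009c0 704
groupmembers 000009c0 00000a00 64
additional_gids 00000a00 00000a40 64
"""


def parse_sections(text):
    """Turn the table into (name, begin, end, size) tuples with integer offsets."""
    rows = []
    for line in text.splitlines():
        name, begin, end, size = line.split()
        rows.append((name, int(begin, 16), int(end, 16), int(size)))
    return rows


def layout_is_consistent(rows, file_size, alignment=64):
    """Check contiguity, size column, alignment, and that sections fill the file."""
    offset = 0
    for _name, begin, end, size in rows:
        if begin != offset or end - begin != size or begin % alignment != 0:
            return False
        offset = end
    return offset == file_size


if __name__ == "__main__":
    rows = parse_sections(SECTIONS)
    print(sum(row[3] for row in rows))       # 2624, matching "Size: 2,624 bytes"
    print(layout_is_consistent(rows, 2624))  # True
```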
Run and configure a test container that uses `turbonss` instead of the default
`files`:

    $ docker run -ti --rm -v `pwd`:/etc/turbonss -w /etc/turbonss debian:bullseye
    # cp zig-out/lib/libnss_turbo.so.2 /lib/x86_64-linux-gnu/
    # sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf

And run the commands:

    $ getent passwd
    $ getent group
    $ id root

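
For readers less used to `sed`, the expression `/passwd\|group/ s/files/turbo/` reads as: on every line that contains `passwd` or `group`, replace the first occurrence of `files` with `turbo`. The Python sketch below mirrors that behaviour on a sample `nsswitch.conf`. The sample contents and the function name are assumptions for illustration. Note that the address pattern is unanchored, so it also selects a `netgroup` line; that line is left alone here only because it does not contain `files`.

```python
import re

# Hypothetical nsswitch.conf contents, for illustration only.
SAMPLE_NSSWITCH = """\
passwd:         files
group:          files
shadow:         files
hosts:          files dns
netgroup:       nis
"""


def switch_to_turbo(conf: str) -> str:
    """Mimic: sed '/passwd\\|group/ s/files/turbo/'"""
    out = []
    for line in conf.splitlines():
        if re.search(r"passwd|group", line):          # address: /passwd\|group/
            line = line.replace("files", "turbo", 1)  # s/files/turbo/ (first match only)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(switch_to_turbo(SAMPLE_NSSWITCH), end="")
```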
More users and groups
---------------------

`turbonss-makecorpus` can synthesize more `users` and `groups`:

    # ./zig-out/bin/turbonss-makecorpus
    wrote users=10000 groups=10000 avg-members=1000 to .
    # cat group >> /etc/group
    # cat passwd >> /etc/passwd
    # time id u_1000000
    <...>
    real    0m17.380s
    user    0m13.117s
    sys     0m4.263s

17 seconds for an `id` command! Well, there are indeed many users and groups.
Let's see how turbonss fares with it:

    # zig-out/bin/turbonss-unix2db --group /etc/group --passwd /etc/passwd
    total 10968512 bytes. groups=10019 users=10039
    # ls -hs /etc/group /etc/passwd db.turbo
    48M /etc/group  668K /etc/passwd   11M db.turbo
    # sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf
    # time id u_1000000
    real    0m0.008s
    user    0m0.000s
    sys     0m0.008s

That's roughly a 2000x improvement for the `id` command (note also the roughly
4x compression ratio compared to the plain files). If the number of users and
groups is increased by 10x (to 100k each), the difference becomes even crazier:

    # time id u_1000000
    <...>
    real    3m42.281s
    user    2m30.482s
    sys     0m55.840s
    # sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf
    # time id u_1000000
    <...>
    real    0m0.008s
    user    0m0.000s
    sys     0m0.008s

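
Taking the `real` timings in the transcripts above at face value, the ratios can be recomputed in a few lines. This is plain arithmetic on the quoted numbers, not a new measurement. The compression figure is approximate because `ls -hs` rounds its sizes, and the 8 ms timing is at the resolution limit of `time`, so the speedups are best read as orders of magnitude.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a `time`-style 'XmY.YYYs' reading to seconds."""
    return 60 * minutes + seconds


FILES_10K = to_seconds(0, 17.380)    # default `files` backend, 10k users/groups
FILES_100K = to_seconds(3, 42.281)   # default `files` backend, 100k users/groups
TURBO = to_seconds(0, 0.008)         # turbonss, both corpora

speedup_10k = FILES_10K / TURBO      # roughly 2,170x
speedup_100k = FILES_100K / TURBO    # roughly 27,800x

# Sizes from the 10k corpus: `ls -hs` figures (rounded) vs. the exact db.turbo size.
plain_bytes = 48 * 2**20 + 668 * 2**10   # /etc/group + /etc/passwd
db_bytes = 10968512                      # "total 10968512 bytes" from turbonss-unix2db
compression = plain_bytes / db_bytes     # between 4x and 5x

if __name__ == "__main__":
    print(f"10k corpus speedup:  {speedup_10k:,.1f}x")
    print(f"100k corpus speedup: {speedup_100k:,.1f}x")
    print(f"compression ratio:   {compression:.2f}x")
```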
Documentation
-------------

Architecture is detailed in `docs/architecture.md`.
Development notes are in `docs/development.md`.

[git-subtrac]: https://apenwarr.ca/log/20191109
[nsswitch]: https://linux.die.net/man/5/nsswitch.conf
[id]: https://linux.die.net/man/1/id
[cmph]: http://cmph.sourceforge.net/