Turbo NSS
---------

Turbonss is a plugin for the GNU Name Service Switch ([NSS][nsswitch])
functionality of the GNU C Library (glibc). Turbonss implements lookups for
`passwd` and `group` database entries (i.e. system users, groups, and group
memberships). Its main goal is to run [`id(1)`][id] as fast as possible.

Turbonss is optimized for reading. If the data changes in any way, the whole
file needs to be regenerated. It was therefore created for, and is best suited
to, environments that have a central user and group database which needs to be
distributed to many servers or services, and where the data does not change
very often (e.g. hourly).

This is the fastest known NSS passwd/group implementation for *reads*. On my
2018-era laptop, with a corpus of 10k users, 10k groups and 500 average members
per group, `id` takes 17 seconds with the glibc default implementation, 10-17
milliseconds with a pre-cached `nscd`, and ~8 milliseconds with `turbonss`.
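
To make the measurement above more concrete, here is a minimal Python sketch of
the kind of lookups `id` performs. It is not part of turbonss: the function
names `id_like_lookup` and `time_lookup` are made up for this example, and the
sequence of calls is only an approximation of what `id(1)` does. Python's `pwd`,
`grp` and `os.getgrouplist` call into the C library, so they should go through
whichever NSS backend `nsswitch.conf` selects.

```python
import grp
import os
import pwd
import time


def id_like_lookup(username):
    """Resolve a user roughly the way id(1) does, through glibc's NSS."""
    user = pwd.getpwnam(username)                  # `passwd` database lookup
    gids = os.getgrouplist(username, user.pw_gid)  # all groups the user is in
    groups = {}
    for gid in gids:
        try:
            groups[gid] = grp.getgrgid(gid).gr_name  # `group` database lookup
        except KeyError:
            groups[gid] = None  # gid without a matching group entry
    return user.pw_uid, user.pw_gid, groups


def time_lookup(username):
    """Return the wall-clock seconds that one id-like lookup takes."""
    start = time.perf_counter()
    id_like_lookup(username)
    return time.perf_counter() - start
```

Calling `time_lookup("root")` in a fresh process for each NSS configuration
gives a rough comparison that excludes interpreter start-up time.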

Project status
--------------

The project is finished; it was never used in production and is not recommended
for it. If you are considering using turbonss, try nscd first. Turbonss is only
2-5 times faster than pre-warmed nscd, which usually does not matter enough to
jump through the hoops of using a nonstandard NSS library in the first place.

Yours truly worked on this for about 7 months. This was also my first Zig
project, which I never got around to cleaning up (nor really needed to).

Update 2022-02: I am reviving it:

- Updated to stage2, so it works on nightly again.
- I learned some Zig over the last year and will be cleaning it up.

Currently it has not been fuzz-tested, so it will crash on invalid data. Please
use `ReleaseSafe` until it is fuzzed.

Dependencies
------------

1. zig 0.11.0-dev.1580+a5b34a61a or higher.
2. [cmph][cmph]: bundled with this repository.

Trying it out
-------------

Clone, compile and test first:

    $ git clone --recursive https://git.sr.ht/~motiejus/turbonss
    $ cd turbonss
    $ zig build test
    $ zig build -Dtarget=x86_64-linux-gnu.2.16 -Dcpu=baseline -Doptimize=ReleaseSafe

One may choose different options, depending on requirements. Here are some
hints:

1. `-Dcpu=<...>` for the CPU
   [microarchitecture](https://en.wikipedia.org/wiki/X86-64#Microarchitecture_levels).
2. `-Dstrip=true` to strip debug symbols.

For reference, the sizes of the shared library and helper binaries when
compiled with `-Dstrip=true -Doptimize=ReleaseSmall`:

     28K zig-out/bin/turbonss-analyze
     20K zig-out/bin/turbonss-getent
     24K zig-out/bin/turbonss-makecorpus
    140K zig-out/bin/turbonss-unix2db
     24K zig-out/lib/libnss_turbo.so.2.0.0

Many thanks to Ulrich Drepper for [teaching how to link it properly][dso].

Test turbonss on a real system
------------------------------

`db.turbo` is the TurboNSS database file. To create one from `/etc/group` and
`/etc/passwd`, use `turbonss-unix2db`:

    $ zig-out/bin/turbonss-unix2db --passwd /etc/passwd --group /etc/group
    $ zig-out/bin/turbonss-analyze db.turbo
    File:              /etc/turbonss/db.turbo
    Size:              2,624 bytes
    Version:           0
    Endian:            little
    Pointer size:      8 bytes
    getgr buffer size: 17
    getpw buffer size: 74
    Users:             19
    Groups:            39
    Shells:            1
    Most memberships:  _apt (1)
    Sections:
      Name                  Begin     End       Size bytes
      header                00000000  00000080  128
      bdz_gid               00000080  000000c0  64
      bdz_groupname         000000c0  00000100  64
      bdz_uid               00000100  00000140  64
      bdz_username          00000140  00000180  64
      idx_gid2group         00000180  00000240  192
      idx_groupname2group   00000240  00000300  192
      idx_uid2user          00000300  00000380  128
      idx_name2user         00000380  00000400  128
      shell_index           00000400  00000440  64
      shell_blob            00000440  00000480  64
      groups                00000480  00000700  640
      users                 00000700  000009c0  704
      groupmembers          000009c0  00000a00  64
      additional_gids       00000a00  00000a40  64
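
The section table above can be sanity-checked with a few lines of Python. This
is an illustrative sketch, not a turbonss tool: the offsets are copied from the
sample output, `check_layout` is a name invented here, and the 64-byte
alignment is an assumption inferred from those offsets rather than a
documented property of the format.

```python
# (name, begin, end) triples copied from the `turbonss-analyze` output above.
SECTIONS = [
    ("header", 0x000, 0x080),
    ("bdz_gid", 0x080, 0x0C0),
    ("bdz_groupname", 0x0C0, 0x100),
    ("bdz_uid", 0x100, 0x140),
    ("bdz_username", 0x140, 0x180),
    ("idx_gid2group", 0x180, 0x240),
    ("idx_groupname2group", 0x240, 0x300),
    ("idx_uid2user", 0x300, 0x380),
    ("idx_name2user", 0x380, 0x400),
    ("shell_index", 0x400, 0x440),
    ("shell_blob", 0x440, 0x480),
    ("groups", 0x480, 0x700),
    ("users", 0x700, 0x9C0),
    ("groupmembers", 0x9C0, 0xA00),
    ("additional_gids", 0xA00, 0xA40),
]
FILE_SIZE = 2624  # "Size: 2,624 bytes" in the output above
ALIGNMENT = 64    # assumption: inferred from the offsets, not documented here


def check_layout(sections, file_size, alignment):
    """Check that sections are contiguous, aligned, and fill the whole file."""
    offset = 0
    for name, begin, end in sections:
        if begin != offset:
            raise ValueError(f"{name}: gap or overlap at {begin:#x}")
        if (end - begin) % alignment != 0:
            raise ValueError(f"{name}: size is not a multiple of {alignment}")
        offset = end
    if offset != file_size:
        raise ValueError(f"sections end at {offset}, file has {file_size} bytes")
    return offset


print(check_layout(SECTIONS, FILE_SIZE, ALIGNMENT))  # prints 2624
```

For this sample the sections are back to back, so the last `End` offset
(`0xa40`) equals the reported file size.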

Run and configure a test container that uses `turbonss` instead of the default
`files`:

    $ docker run -ti --rm -v `pwd`:/etc/turbonss -w /etc/turbonss debian:bullseye
    # cp zig-out/lib/libnss_turbo.so.2 /lib/x86_64-linux-gnu/
    # sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf

And run the commands:

    $ getent passwd
    $ getent group
    $ id root
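
For readers unfamiliar with `sed` addresses, the `sed` command used in the
container setup can be read as: on every line that mentions `passwd` or `group`,
replace the first `files` with `turbo`. The Python sketch below mirrors that
behaviour on an in-memory string. The function name `switch_to_turbo` and the
sample configuration are made up for illustration; the sample is not the actual
`nsswitch.conf` shipped in the Debian image.

```python
import re


def switch_to_turbo(conf_text):
    """Mimic: sed '/passwd\\|group/ s/files/turbo/' on nsswitch.conf text."""
    out = []
    for line in conf_text.splitlines():
        if re.search(r"passwd|group", line):
            # Like sed's s/// without the g flag: first occurrence only.
            line = line.replace("files", "turbo", 1)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical, abbreviated nsswitch.conf used only for this example.
SAMPLE = (
    "passwd:         files\n"
    "group:          files\n"
    "shadow:         files\n"
    "hosts:          files dns\n"
)

print(switch_to_turbo(SAMPLE), end="")
```

Only the `passwd` and `group` lines change; `shadow` and `hosts` keep using
`files`. Like the `sed` original, the match is purely textual, so a comment
line containing both words would be rewritten too.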

More users and groups
---------------------

`turbonss-makecorpus` can synthesize more `users` and `groups`:

    # ./zig-out/bin/turbonss-makecorpus
    wrote users=10000 groups=10000 avg-members=1000 to .
    # cat group >> /etc/group
    # cat passwd >> /etc/passwd
    # time id u_1000000
    <...>
    real    0m17.380s
    user    0m13.117s
    sys     0m4.263s

17 seconds for an `id` command! Well, there are indeed many users and groups.
Let's see how turbonss fares with it:

    # zig-out/bin/turbonss-unix2db --group /etc/group --passwd /etc/passwd
    total 10968512 bytes. groups=10019 users=10039
    # ls -hs /etc/group /etc/passwd db.turbo
    48M /etc/group 668K /etc/passwd 11M db.turbo
    # sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf
    # time id u_1000000
    real    0m0.008s
    user    0m0.000s
    sys     0m0.008s

That's a roughly 2000x improvement for the `id` command (17.38 s versus 8 ms),
and notice the roughly 4x compression ratio compared to the plain files. If the
number of users and groups is increased by 10x (to 100k each), the difference
becomes even crazier:

    # time id u_1000000
    <...>
    real    3m42.281s
    user    2m30.482s
    sys     0m55.840s
    # sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf
    # time id u_1000000
    <...>
    real    0m0.008s
    user    0m0.000s
    sys     0m0.008s

Note that, to the author's knowledge, this has not been used on any real
production or development machine.
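
To give an idea of what a synthetic corpus of users and groups looks like,
here is a small Python sketch that writes lines in the `passwd(5)` and
`group(5)` formats. It is not how `turbonss-makecorpus` works internally: the
function name `make_corpus`, the `g_` group prefix, the ID ranges, the shell and
the membership distribution are all assumptions chosen for this example. Only
the `u_` user prefix is borrowed from the transcript above.

```python
import random


def make_corpus(n_users, n_groups, avg_members, seed=0):
    """Return (passwd_lines, group_lines) for a synthetic corpus."""
    rng = random.Random(seed)  # fixed seed keeps the corpus reproducible
    users = [f"u_{i}" for i in range(n_users)]

    passwd_lines = []
    for i, name in enumerate(users):
        uid = 100000 + i                  # hypothetical UID range
        gid = 200000 + (i % n_groups)     # primary group, round-robin
        # passwd(5): name:password:UID:GID:GECOS:directory:shell
        passwd_lines.append(f"{name}:x:{uid}:{gid}::/home/{name}:/bin/sh")

    group_lines = []
    for g in range(n_groups):
        # Member count is uniform in [0, 2 * avg_members], capped at n_users.
        size = min(n_users, rng.randint(0, 2 * avg_members))
        members = rng.sample(users, size)
        # group(5): group_name:password:GID:user_list
        group_lines.append(f"g_{g}:x:{200000 + g}:{','.join(members)}")

    return passwd_lines, group_lines
```

Joining the returned lists with newlines and appending them to copies of
`/etc/passwd` and `/etc/group` would give `turbonss-unix2db` a larger input to
work with.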

Documentation
-------------

- Architecture is detailed in `docs/architecture.md`
- Development notes are in `docs/development.md`

[nsswitch]: https://linux.die.net/man/5/nsswitch.conf
[id]: https://linux.die.net/man/1/id
[cmph]: http://cmph.sourceforge.net/
[dso]: https://akkadia.org/drepper/dsohowto.pdf