update README

Motiejus Jakštys 2023-04-13 11:10:47 +03:00
parent abeb25f3e2
commit 6027508e60

README.md

@@ -21,41 +21,52 @@ milliseconds with a pre-cached `nscd`, ~8 milliseconds with uncached
Due to the nature of being built with Zig, this will work on glibc versions as Due to the nature of being built with Zig, this will work on glibc versions as
old as 2.16 (may work with even older ones, I did not test beyond that). old as 2.16 (may work with even older ones, I did not test beyond that).
Project status Project goals
-------------- -------------
This project works, but has never seen real production use. If you want to use - Make it as fast as possible. Especially optimize for the `id` command.
turbonss instead of a battle-tested, albeit slower nscd, keep the following in - Small database size (helps making it fast).
mind: - No runtime, no GC, as little as possible overhead.
- turbonss has not been fuzzed, so it will crash a program on invalid database - Easy to compile for ancient glibc versions (comes out of the box with Zig).
file.
Project status and known deficiencies
-------------------------------------
Turbonss works, but, to the author's knowledge, was not deployed to production.
If you want to use turbonss instead of a battle-tested, albeit slower nscd,
keep the following in mind:
- turbonss has not been fuzz-tested, so it will crash a program on invalid
database file. Please compile with `ReleaseSafe`. It is plenty fast with this
mode, but an invalid database will lead to defined behavior (i.e. crash with
a stack trace) instead of overwriting memory wherever.
- if the database file was replaced while the program has been running,
turbonss will not re-read the file (it holds to the previous file
descriptor).
- requires a nightly version of zig (that will change with 0.11). - requires a nightly version of zig (that will change with 0.11).
If you insist on using turbonss in prod, compile with `ReleaseSafe`. It is The license is permissive, so feel free. I am also available for
plenty as fast with this mode, but an invalid database will lead to defined [consulting][consulting] to fix all or part of those above, if that's your
behavior (i.e. crash with a stack trace) instead of overwriting memory preference.
wherever.
Dependencies Dependencies
------------ ------------
1. zig around 0.11.0-dev.2560+602029bb2. 1. zig around `0.11.0-dev.2560+602029bb2`.
2. [cmph][cmph]: bundled with this repository. 2. [cmph][cmph]: bundled with this repository.
Trying it out Demo
------------- ----
Clone, compile and test first: Clone, compile and test first:
$ git clone --recursive https://git.sr.ht/~motiejus/turbonss $ git clone --recursive https://git.jakstys.lt/motiejus/turbonss
$ zig build test $ zig build test
$ zig build -Dtarget=x86_64-linux-gnu.2.16 -Dcpu=baseline -Doptimize=ReleaseSafe $ zig build -Dtarget=x86_64-linux-gnu.2.16 -Doptimize=ReleaseSafe
One may choose different options, depending on requirements. Here are some One may choose different options, depending on requirements. Here are some
hints: hints:
1. `-Dcpu=<...>` for the CPU 1. `-Dcpu=<...>` for the CPU [microarchitecture][mcpu].
[microarchitecture](https://en.wikipedia.org/wiki/X86-64#Microarchitecture_levels).
2. `-Dstrip=true` to strip debug symbols. 2. `-Dstrip=true` to strip debug symbols.
For reference, size of the shared library and helper binaries when compiled For reference, size of the shared library and helper binaries when compiled
@@ -69,102 +80,106 @@ with `-Dstrip=true -Drelease-small=true`:
Many thanks to Ulrich Drepper for [teaching how to link it properly][dso]. Many thanks to Ulrich Drepper for [teaching how to link it properly][dso].
Test turobnss on a real system Quick test turbonss on a real system
------------------------------ ------------------------------------
`db.turbo` is the TurboNSS database file. To create one from `/etc/group` and turbonss is best tested, of course, with many users and groups. The guide below
`/etc/passwd`, use `turbonss-unix2db`: will show how to synthesize 10k users, 10k groups with an average membership
of 1k users per group, and test turbonss with such a corpus.
$ zig-out/bin/turbonss-unix2db --passwd /etc/passwd --group /etc/group 1. Synthesize some users and groups to `passwd` and `group` in the current directory:
$ zig-out/bin/turbonss-analyze db.turbo
File: /etc/turbonss/db.turbo
Size: 2,624 bytes
Version: 0
Endian: little
Pointer size: 8 bytes
getgr buffer size: 17
getpw buffer size: 74
Users: 19
Groups: 39
Shells: 1
Most memberships: _apt (1)
Sections:
Name Begin End Size bytes
header 00000000 00000080 128
bdz_gid 00000080 000000c0 64
bdz_groupname 000000c0 00000100 64
bdz_uid 00000100 00000140 64
bdz_username 00000140 00000180 64
idx_gid2group 00000180 00000240 192
idx_groupname2group 00000240 00000300 192
idx_uid2user 00000300 00000380 128
idx_name2user 00000380 00000400 128
shell_index 00000400 00000440 64
shell_blob 00000440 00000480 64
groups 00000480 00000700 640
users 00000700 000009c0 704
groupmembers 000009c0 00000a00 64
additional_gids 00000a00 00000a40 64
Run and configure a test container that uses `turbonss` instead of the default ```
`files`: $ zig-out/bin/turbonss-makecorpus
wrote users=10000 groups=10000 avg-members=1000 to .
$ ls -1hs passwd group
48M group
668K passwd
```
$ docker run -ti --rm -v `pwd`:/etc/turbonss -w /etc/turbonss debian:bullseye 2. Convert the generated `passwd` and `group` to the turbonss database. Note
# cp zig-out/lib/libnss_turbo.so.2 /lib/x86_64-linux-gnu/ the `db.turbo` database is more than 4 times smaller than the textual one:
# sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf
And run the commands: ```
$ zig-out/bin/turbonss-unix2db --group group --passwd passwd
total 10968064 bytes. groups=10000 users=10000
$ ls -1hs db.turbo
11M db.turbo
```
$ getent passwd 3. Optional: inspect the freshly created database:
$ getent group
$ id root
More users and groups ```
--------------------- $ zig-out/bin/turbonss-analyze db.turbo
File: db.turbo
Size: 10,968,064 bytes
Version: 0
Endian: little
Pointer size: 8 bytes
getgr buffer size: 18000
getpw buffer size: 57
Users: 10000
Groups: 10000
Shells: 4
Most memberships: u_1000000 (501)
Sections:
Name Begin End Size bytes
header 00000000 00000080 128
bdz_gid 00000080 00000e40 3,520
bdz_groupname 00000e40 00001c00 3,520
bdz_uid 00001c00 000029c0 3,520
bdz_username 000029c0 00003780 3,520
idx_gid2group 00003780 0000d3c0 40,000
idx_groupname2group 0000d3c0 00017000 40,000
idx_uid2user 00017000 00020c40 40,000
idx_name2user 00020c40 0002a880 40,000
shell_index 0002a880 0002a8c0 64
shell_blob 0002a8c0 0002a900 64
groups 0002a900 00065280 240,000
users 00065280 000da580 480,000
groupmembers 000da580 005a69c0 5,030,976
additional_gids 005a69c0 00a75c00 5,042,752
$ zig-out/bin/turbonss-getent --db db.turbo passwd u_1000000
u_1000000:x:1000000:1000000:User 1000000:/home/u_1000000:/bin/bash
$ zig-out/bin/turbonss-getent --db db.turbo group g_1000003
g_1000003:x:1000003:u_1000002,u_1000003,u_1000004
```
`turbonss-makecorpus` can synthesize more `users` and `groups`: 4. Now since we will be messing with the system, run all following commands in
a container:
# ./zig-out/bin/turbonss-makecorpus ```
wrote users=10000 groups=10000 avg-members=1000 to . $ docker run -ti --rm -v `pwd`:/etc/turbonss -w /etc/turbonss debian:bullseye
# cat group >> /etc/group # cp zig-out/lib/libnss_turbo.so.2 /lib/x86_64-linux-gnu/
# cat passwd >> /etc/passwd ```
# time id u_1000000
<...>
real 0m17.380s
user 0m13.117s
sys 0m4.263s
17 seconds for an `id` command! Well, there are indeed many users and groups. 5. Instruct `nsswitch.conf` to use both turbonss and the standard resolver:
Let's see how turbonss fares with it:
# zig-out/bin/turbonss-unix2db --group /etc/group --passwd /etc/passwd ```
total 10968512 bytes. groups=10019 users=10039 # sed -i '/passwd\|group/ s/files/turbo files/' /etc/nsswitch.conf
# ls -hs /etc/group /etc/passwd db.turbo # time id u_1000000
48M /etc/group 668K /etc/passwd 11M db.turbo <...>
# sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf real 0m0.006s
# time id u_1000000 user 0m0.000s
real 0m0.008s sys 0m0.008s
user 0m0.000s ```
sys 0m0.008s
That's ~1500x improvement for the `id` command (and notice about 4X compression The `id` call resolved `u_1000000` from `db.turbo`.
ratio compared to plain files). If the number of users and groups is increased
by 10x (to 100k each), the difference becomes even crazier:
# time id u_1000000 6. Compare the performance to plain `files` (that is, without turbonss):
<...>
real 3m42.281s
user 2m30.482s
sys 0m55.840s
# sed -i '/passwd\|group/ s/files/turbo/' /etc/nsswitch.conf
# time id u_1000000
<...>
real 0m0.008s
user 0m0.000s
sys 0m0.008s
Note that to author's knowledge this has not been used on any real production ```
nor a development machine. # sed -i '/passwd\|group/ s/turbo files/files/' /etc/nsswitch.conf
# cat passwd >> /etc/passwd
# cat group >> /etc/group
# time id u_1000000
<...>
real 0m17.164s
user 0m13.288s
sys 0m3.876s
```
Over 2500x difference.
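As a sanity check on that ratio, the speedup can be recomputed from the two `real` timings reported above (17.164s with plain `files`, 0.006s with turbonss). This is a minimal sketch; `speedup` is a hypothetical helper introduced here for illustration and is not part of turbonss:

```shell
# speedup SLOW FAST: print SLOW/FAST rounded to the nearest integer.
# Hypothetical helper for illustration; not shipped with turbonss.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f\n", slow / fast }'
}

# "real" timings taken from the two `time id u_1000000` runs above.
speedup 17.164 0.006
```

With these inputs the result is about 2861, consistent with "over 2500x". Timings this small (6 ms) are noisy, so the exact ratio will likely vary between runs.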
Documentation Documentation
------------- -------------
@@ -176,3 +191,5 @@ Documentation
[id]: https://linux.die.net/man/1/id [id]: https://linux.die.net/man/1/id
[cmph]: http://cmph.sourceforge.net/ [cmph]: http://cmph.sourceforge.net/
[dso]: https://akkadia.org/drepper/dsohowto.pdf [dso]: https://akkadia.org/drepper/dsohowto.pdf
[mcpu]: https://en.wikipedia.org/wiki/X86-64#Microarchitecture_levels
[consulting]: https://jakstys.lt/contact