Turbo NSS
---------

glibc NSS library for passwd and group.

Checking out and building
-------------------------

```
$ git clone --recursive https://git.sr.ht/~motiejus/turbonss
```

Alternatively, if you forgot `--recursive`:

```
$ git submodule update --init
```

Then run the tests:

```
$ zig build test
```

Other commands will be documented as they are implemented.

This project uses [git subtrac][git-subtrac] for managing dependencies.

Steps
-----

A known implementation runs id(1) at ~250 rps sequentially. Our goal is 10k
ID/s.

id(1) works as follows:
- lookup user by name.
- get all additional gids (an array attached to a member).
- for each additional gid, get the group name.

Assuming a member is in ~100 groups on average, 10k ID/s means 1M group
lookups per second (cmph can do 1M in <200ms). We need to convert gid to a
group index quickly.

API
---

The following operations need to be fast, in order of importance:

1. lookup gid -> group (this is on hot path in id) with or without members (2
   separate calls).
2. lookup uid -> user.
3. lookup groupname -> group.
4. lookup username -> user.
5. (optional) iterate users using a defined order (`getent passwd`).
6. (optional) iterate groups using a defined order (`getent group`).

Indices
-------

Preliminary results of playing with [cmph][cmph]:

BDZ: tried b=3, b=7 (default), and b=10.

* BDZ algorithm stores 1M values in (900KB, 338KB, 306KB) respectively.
* Latency for 1M keys: (170ms, 180ms, 230ms).
* Packed vs non-packed latency differences are not meaningful.

CHM retains order; however, 1M keys weigh 8MB. With 10k keys, CHM is ~20x
larger than BDZ, which outweighs the benefit of preserved ordering.

Full file structure
-------------------

The file structure starts with a magic and version number, followed by a list
of User and Group records and their indices. All indices are byte counts
relative to the beginning of the file.

```
const File = struct {
    magic: [4]u8,
    version: u4,
    padding: u4,
    num_shells: u8,
    num_users: u32,
    num_groups: u32,
    offset_cmph_gid2group: u26,
    offset_cmph_uid2user: u26,
    offset_cmph_groupname2group: u26,
    offset_cmph_username2user: u26,
    offset_sorted_groups: u26,
    offset_sorted_users: u26,
    offset_groupmembers: u26,
    offset_additional_gids: u26,
};
```

`magic` is 0xf09fa4b7, and `version` must be `0`. Offsets are indices to
further sections of the file, with zero being the first block (the magic
number). As all blobs are 64-byte aligned, each offset counts 64-byte blocks
rather than bytes and points to the beginning of a block (hence `u26`, which
is enough to address 4 GiB). All numbers are little-endian.

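
As a rough illustration of the block addressing described above, a header
offset could be turned into a byte position as follows. This is only a sketch:
`block_size` and `blockToByteOffset` are names made up for this example and
are not part of turbonss.

```zig
const std = @import("std");

// Alignment of every blob in the file, as stated above.
const block_size: u64 = 64;

// Hypothetical helper: converts a header offset (counted in 64-byte blocks)
// into a byte position relative to the beginning of the file.
fn blockToByteOffset(block: u26) u64 {
    return @as(u64, block) * block_size;
}

test "block offsets map to byte offsets" {
    // Block zero is the start of the file (the magic number).
    try std.testing.expectEqual(@as(u64, 0), blockToByteOffset(0));
    try std.testing.expectEqual(@as(u64, 64), blockToByteOffset(1));
    // The largest u26 value addresses the last block below 4 GiB.
    try std.testing.expectEqual(
        @as(u64, 4 * 1024 * 1024 * 1024 - 64),
        blockToByteOffset(std.math.maxInt(u26)),
    );
}
```
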
At the time of writing, the file header is 40 bytes.

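
The 40-byte figure can be cross-checked by summing the field widths of `File`
as listed above. The sketch below assumes the fields are packed back to back
with no extra padding; `header_field_bits` and `totalBits` are hypothetical
names used only for this check.

```zig
const std = @import("std");

// Bit widths of the header fields, in declaration order.
const header_field_bits = [_]u16{
    4 * 8, // magic: [4]u8
    4, // version: u4
    4, // padding: u4
    8, // num_shells: u8
    32, // num_users: u32
    32, // num_groups: u32
    26, 26, 26, 26, // offset_cmph_*: u26
    26, 26, 26, 26, // remaining four offset_* fields: u26
};

// Sums a list of bit widths.
fn totalBits(widths: []const u16) u32 {
    var sum: u32 = 0;
    for (widths) |w| sum += w;
    return sum;
}

test "header is 40 bytes" {
    const bits = totalBits(&header_field_bits);
    try std.testing.expectEqual(@as(u32, 320), bits);
    try std.testing.expectEqual(@as(u32, 40), bits / 8);
}
```
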
Primitive types:

```
const Group = struct {
    gid: u32,
    // index to a separate structure with a list of members
    members_offset: u29,
    padding: u3,
    groupname_len: u8,
    // a variable-sized array that will be stored immediately after this
    // struct.
    stringdata: []u8,
};

const User = struct {
    uid: u32,
    gid: u32,
    // pointer to a separate structure that contains a list of gids
    additional_gids_offset: u29,
    // shell is a different story, documented elsewhere.
    shell_here: u1,
    shell_len_or_place: u6,
    home_len: u6,
    username_len: u6,
    gecos_len: u8,
    // a variable-sized array that will be stored immediately after this
    // struct.
    stringdata: []u8,
};
```

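
To make the `Group` layout more concrete, here is a minimal sketch of decoding
one record from a byte buffer. It rests on assumptions not stated above: the
bit fields are packed starting from the least significant bit, the fixed part
takes exactly 9 bytes (32 + 29 + 3 + 8 bits), and `stringdata` holds only the
group name. `GroupView`, `readU32Le` and `decodeGroup` are hypothetical names,
not turbonss APIs.

```zig
const std = @import("std");

// Fixed-size part of a Group record: 32 + 29 + 3 + 8 bits.
const group_fixed_len = 9;

// Decoded view of a record; `name` points into the original buffer.
const GroupView = struct {
    gid: u32,
    members_offset: u32, // only the low 29 bits are ever set
    name: []const u8,
};

// Reads a little-endian u32 from the first four bytes of `b`.
fn readU32Le(b: []const u8) u32 {
    return @as(u32, b[0]) |
        (@as(u32, b[1]) << 8) |
        (@as(u32, b[2]) << 16) |
        (@as(u32, b[3]) << 24);
}

// Decodes the Group record at the start of `buf`, or returns null if the
// buffer is too short. ASSUMPTION: members_offset occupies the low 29 bits
// of the second 32-bit word and padding the high 3 bits.
fn decodeGroup(buf: []const u8) ?GroupView {
    if (buf.len < group_fixed_len) return null;
    const word = readU32Le(buf[4..8]);
    const name_len: usize = buf[8];
    if (buf.len < group_fixed_len + name_len) return null;
    return GroupView{
        .gid = readU32Le(buf[0..4]),
        .members_offset = word & 0x1fffffff,
        .name = buf[group_fixed_len .. group_fixed_len + name_len],
    };
}

test "decode a group record" {
    const record = [_]u8{
        0xe8, 0x03, 0x00, 0x00, // gid = 1000
        0x05, 0x00, 0x00, 0x00, // members_offset = 5, padding = 0
        0x05, // groupname_len = 5
        'w',  'h',  'e',  'e', 'l',
    };
    const g = decodeGroup(&record).?;
    try std.testing.expectEqual(@as(u32, 1000), g.gid);
    try std.testing.expectEqual(@as(u32, 5), g.members_offset);
    try std.testing.expectEqualStrings("wheel", g.name);
}
```
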
TODO explain:
- shells
- `additional_gids`
- `members`

[git-subtrac]: https://github.com/apenwarr/git-subtrac/
[cmph]: http://cmph.sourceforge.net/