Turbo NSS
glibc nss library for passwd and group.
Checking out and building
$ git clone --recursive https://git.sr.ht/~motiejus/turbonss
Alternatively, if you forgot --recursive:
$ git submodule update --init
And run tests:
$ zig build test
... the other commands will be documented as they are implemented.
This project uses git subtrac for managing dependencies.
Steps
A known implementation runs id(1) at ~250 requests per second sequentially. Our goal is 10k id(1) calls per second.
id(1) works as follows:
- lookup user by name.
- get all additional gids (an array attached to a member).
- for each additional gid, get the group name.
Assuming a user is in ~100 groups on average, 10k id(1) calls per second means 1M group lookups per second (cmph can do 1M in <200ms). We need to convert a gid to a group index quickly.
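The three steps above can be sketched in Zig as follows. This is a minimal illustration, not the project's implementation: the in-memory Group and User types, the sample data, and the function names userByName, groupByGid and idLookups are all hypothetical, and linear scans stand in for the real indices.

```zig
const std = @import("std");

// Hypothetical in-memory stand-ins for the on-disk records.
const Group = struct {
    gid: u32,
    name: []const u8,
};

const User = struct {
    name: []const u8,
    uid: u32,
    gid: u32,
    additional_gids: []const u32,
};

// Illustrative sample data only.
const groups = [_]Group{
    .{ .gid = 27, .name = "sudo" },
    .{ .gid = 100, .name = "users" },
    .{ .gid = 1000, .name = "alice" },
};

const alice_gids = [_]u32{ 27, 100 };

const users = [_]User{
    .{ .name = "alice", .uid = 1000, .gid = 1000, .additional_gids = &alice_gids },
};

// Step 1: look up the user by name (a linear scan stands in for an index).
fn userByName(name: []const u8) ?User {
    for (users) |u| {
        if (std.mem.eql(u8, u.name, name)) return u;
    }
    return null;
}

// Step 3: map a gid to its group. This is the hot path.
fn groupByGid(gid: u32) ?Group {
    for (groups) |g| {
        if (g.gid == gid) return g;
    }
    return null;
}

// Runs all three steps and returns the number of group lookups performed.
fn idLookups(name: []const u8) !usize {
    const user = userByName(name) orelse return error.NoSuchUser;
    var lookups: usize = 0;
    // Step 2: walk the user's additional gids.
    for (user.additional_gids) |gid| {
        if (groupByGid(gid) == null) return error.NoSuchGroup;
        lookups += 1;
    }
    return lookups;
}

test "one id call costs one group lookup per additional gid" {
    try std.testing.expect((try idLookups("alice")) == 2);
}
```

With ~100 additional gids per user, each call to idLookups would perform ~100 group lookups, which is where the 1M lookups per second figure comes from.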
API
The following operations need to be fast, in order of importance:
- lookup gid -> group (this is on the hot path in id) with or without members (2 separate calls).
- lookup uid -> user.
- lookup groupname -> group.
- lookup username -> user.
- (optional) iterate users using a defined order (getent passwd).
- (optional) iterate groups using a defined order (getent group).
Indices
Preliminary results of playing with cmph:
BDZ: tried b=3, b=7 (default), and b=10.
- BDZ algorithm stores 1M values in (900KB, 338KB, 306KB) respectively.
- Latency for 1M keys: (170ms, 180ms, 230ms).
- Packed vs non-packed latency differences are not meaningful.
CHM retains order; however, 1M keys weigh 8MB. 10k keys are ~20x larger with CHM than with BDZ, which eliminates the benefit of preserved ordering.
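To compare the index sizes on equal terms, the figures above can be converted to bits per key. The sketch below assumes 1KB = 1000 bytes and 1MB = 1000KB (the measurements do not say which unit was used); deciBitsPerKey is a hypothetical helper that uses integer math and returns tenths of a bit.

```zig
const std = @import("std");

// Index size in tenths of a bit per key, rounded down.
// Assumption: sizes are given in decimal units (1KB = 1000 bytes).
fn deciBitsPerKey(index_bytes: u64, num_keys: u64) u64 {
    return index_bytes * 8 * 10 / num_keys;
}

test "bits per key for 1M keys" {
    const keys: u64 = 1_000_000;
    try std.testing.expect(deciBitsPerKey(900_000, keys) == 72); // BDZ b=3: 7.2 bits/key
    try std.testing.expect(deciBitsPerKey(338_000, keys) == 27); // BDZ b=7: ~2.7 bits/key
    try std.testing.expect(deciBitsPerKey(306_000, keys) == 24); // BDZ b=10: ~2.4 bits/key
    try std.testing.expect(deciBitsPerKey(8_000_000, keys) == 640); // CHM: 64 bits/key
}
```

Under these assumptions, CHM spends about 64 bits per key at 1M keys, against roughly 2.4 to 7.2 bits per key for BDZ.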
Full file structure
The file structure starts with a magic and a version number, followed by a list of User and Group records and their indices. All indices are byte offsets relative to the beginning of the file.
// Bit widths describe the on-disk layout.
const File = struct {
    magic: [4]u8,
    version: u3,
    shells_oob: u1,
    padding: u4,
    num_users: u32,
    num_groups: u32,
    // <... TBD ...>
};
magic must be 0xf09fa4b7, and version must be 0. The remaining fields are indices to further sections of the file with their sizes in bytes. All numbers are little-endian.
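A reader could validate the fixed part of the header along these lines. This is a sketch under several assumptions that the text above does not settle: the magic is stored as the bytes f0 9f a4 b7 in that order, the fields are tightly packed with no alignment padding, and within the flags byte version occupies the low 3 bits with shells_oob in bit 3. Header, parseHeader and readU32Le are hypothetical names.

```zig
const std = @import("std");

// Assumption: the magic is stored byte-by-byte in this order.
const magic = [4]u8{ 0xf0, 0x9f, 0xa4, 0xb7 };

// Decoded view of the fixed header fields (hypothetical type).
const Header = struct {
    version: u8,
    shells_oob: bool,
    num_users: u32,
    num_groups: u32,
};

// Decodes a little-endian u32 from the first four bytes.
fn readU32Le(bytes: []const u8) u32 {
    return @as(u32, bytes[0]) |
        (@as(u32, bytes[1]) << 8) |
        (@as(u32, bytes[2]) << 16) |
        (@as(u32, bytes[3]) << 24);
}

// Assumed layout: magic at 0..4, flags at 4, num_users at 5..9,
// num_groups at 9..13 (tightly packed).
fn parseHeader(bytes: []const u8) !Header {
    if (bytes.len < 13) return error.TooShort;
    if (!std.mem.eql(u8, bytes[0..4], &magic)) return error.BadMagic;
    const flags = bytes[4];
    const ver = flags & 0x07; // low 3 bits
    if (ver != 0) return error.UnsupportedVersion;
    return Header{
        .version = ver,
        .shells_oob = (flags & 0x08) != 0, // bit 3
        .num_users = readU32Le(bytes[5..9]),
        .num_groups = readU32Le(bytes[9..13]),
    };
}

test "parse a well-formed header" {
    const raw = [_]u8{ 0xf0, 0x9f, 0xa4, 0xb7, 0x08, 2, 0, 0, 0, 1, 0, 0, 0 };
    const h = try parseHeader(&raw);
    try std.testing.expect(h.shells_oob);
    try std.testing.expect(h.num_users == 2);
    try std.testing.expect(h.num_groups == 1);
}
```

Decoding the integers with explicit shifts keeps the little-endian rule visible and does not depend on the host byte order.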
What's remaining, variable-length:
- A lookup list of shells (if shells_oob is True).
- 4 indices mentioned above.
- <...>
Primitive types:
const Group = struct {
    gid: u32,
    // index to a separate structure with a list of members
    members_offset: u29,
    padding: u3,
    groupname_len: u8,
    // a variable-sized array that will be stored immediately after this
    // struct.
    stringdata: []u8,
};
const User = struct {
    uid: u32,
    gid: u32,
    // index to a separate structure that contains a list of gids
    additional_gids_offset: u29,
    padding: u1,
    shell_len: u6,
    home_len: u6,
    username_len: u6,
    gecos_len: u8,
    // a variable-sized array that will be stored immediately after this
    // struct.
    stringdata: []u8,
};
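The field widths above add up to whole bytes: 72 bits (9 bytes) for the fixed part of a Group and 120 bits (15 bytes) for the fixed part of a User. The sketch below checks this with packed structs, assuming a recent Zig compiler in which @bitSizeOf of a packed struct is the sum of its field widths. PackedGroup and PackedUser are hypothetical names that mirror the fields above without the trailing string data.

```zig
const std = @import("std");

// Fixed-size part of a Group record (string data omitted).
const PackedGroup = packed struct {
    gid: u32,
    members_offset: u29,
    padding: u3,
    groupname_len: u8,
};

// Fixed-size part of a User record (string data omitted).
const PackedUser = packed struct {
    uid: u32,
    gid: u32,
    additional_gids_offset: u29,
    padding: u1,
    shell_len: u6,
    home_len: u6,
    username_len: u6,
    gecos_len: u8,
};

test "fixed-size parts end on byte boundaries" {
    try std.testing.expect(@bitSizeOf(PackedGroup) == 72); // 9 bytes
    try std.testing.expect(@bitSizeOf(PackedUser) == 120); // 15 bytes
}

test "length fields bound the string sizes" {
    try std.testing.expect(std.math.maxInt(u6) == 63); // shell, home, username
    try std.testing.expect(std.math.maxInt(u8) == 255); // gecos, group name
}
```

The length fields also cap the strings: at most 63 bytes each for shell, home and username, and 255 bytes for gecos and group names.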