# Turbo NSS

A glibc NSS library for `passwd` and `group`.
## Checking out and building

```
$ git clone --recursive https://git.sr.ht/~motiejus/turbonss
```

Alternatively, if you forgot `--recursive`:

```
$ git submodule update --init
```

And run the tests:

```
$ zig build test
```
... the other commands will be documented as they are implemented.
This project uses git subtrac for managing dependencies.
## Steps
A known implementation runs `id(1)` at ~250 requests/s sequentially. Our goal is 10k `id` calls/s.

`id(1)` works as follows:

- look up the user by name.
- get all additional gids (an array attached to the user).
- for each additional gid, resolve the group name.

Assuming a user is in ~100 groups on average, 10k calls/s means 1M group lookups per second. We need to convert a gid to a group index quickly.
## Data structures

Basic data structures that allow efficient storage:
```c
// reminder:
typedef uint32_t uid_t;
typedef uint32_t gid_t;

// 6*32b = 6*4B = 24B/user
typedef struct {
    uid_t uid;
    gid_t gid;
    uint32_t name_offset;              // offset into *usernames
    uint32_t gecos_offset;             // offset into *gecos
    uint32_t shell_offset;             // offset into *shells
    uint32_t additional_groups_offset; // offset into *additional_groups
} user;

const char* usernames;  // all concatenated usernames, fsst-compressed
const char* gecos;      // all concatenated gecos fields, fsst-compressed
const char* shells;     // all concatenated shells, fsst-compressed
const uint8_t* additional_groups; // all additional_groups, turbo-compressed

typedef struct {
    gid_t gid;
    uint32_t name_offset;    // offset into *groupnames
    uint32_t members_offset; // offset into *members
} group;

const char* groupnames; // all concatenated group names, fsst-compressed
const uint8_t* members; // all concatenated members, turbo-compressed
```
"Turbo compression" encodes a list of uids/gids with this algorithm:

- sort ascending.
- extract deltas and subtract 1:

  ```
  awk '{diff=$0-prev; prev=$0; print diff-1}'
  ```

- varint-encode these deltas into a uint32, like protobuf or utf8.

With typical group memberships (as of writing) this requires ~1.3-1.5 bytes per entry.
## Indexes

The following operations need to be fast, in order of importance:

- lookup gid -> group (this is on the hot path in `id`).
- lookup uid -> user.
- lookup groupname -> group.
- lookup username -> user.
- (optional) iterate users using a defined order (`getent passwd`).
- (optional) iterate groups using a defined order (`getent group`).
Preliminary results of playing with cmph:

BDZ: tried b=3, b=7 (default), and b=10.

- The BDZ algorithm stores 1M values in 900KB, 338KB, and 306KB, respectively.
- Latency for 1M keys: 170ms, 180ms, 230ms.
- Packed vs non-packed latency differences are not meaningful.

CHM retains insertion order, but 1M keys weigh 8MB; 10k keys are ~20x larger with CHM than with BDZ, which eliminates the benefit of preserved ordering.
## Full file structure

The file starts with the metadata section. All indexes are byte offsets relative to the beginning of the file.
```zig
const Offsets = struct {
    magic: [4]u8,
    version: u32,
    num_users: u32,
    size_num_users: u32,
    num_groups: u32,
    size_num_groups: u32,
    cmph_gid2group: u32,
    size_cmph_gid2group: u32,
    cmph_uid2user: u32,
    size_cmph_uid2user: u32,
    cmph_groupname2group: u32,
    size_cmph_groupname2group: u32,
    structs_group: u32,
    size_structs_group: u32,
    structs_user: u32,
    size_structs_user: u32,
    fsst_usernames_homes: u32,
    size_fsst_usernames_homes: u32,
    fsst_groupnames: u32,
    size_fsst_groupnames: u32,
    fsst_shells: u32,
    size_fsst_shells: u32,
};
```

`magic` must be 0xf09fa4b7, and `version` must be 0x00. The remaining fields
are offsets to further sections of the file, with their sizes in bytes. All
numbers are little-endian. Each field may be aligned to 64B (L1D cache line
size) or 4KB (standard page size); to be decided.