Turbo NSS
---------

Glibc nss library for passwd and group.

Checking out and building
-------------------------

```
$ git clone --recursive https://git.sr.ht/~motiejus/turbonss
```

Alternatively, if you forgot `--recursive`:

```
$ git submodule update --init
```

And run tests:

```
$ zig build test
```

Other commands will be documented as they are implemented.

This project uses [git subtrac][git-subtrac] for managing dependencies.

Remarks on `id(1)`
------------------

A known NSS implementation handles ~250 sequential `id(1)` invocations per
second with ~20k users and ~10k groups. Our target is 10k invocations per
second.

`id(1)` works as follows:
- lookup user by name.
- get all additional gids (an array attached to a member).
- for each additional gid, get the group name.

Assuming a member is in ~100 groups on average, the 10k id/s target translates
to 1M group lookups per second. We therefore need to convert a gid to a group
index, and a group index to the group's gid and name, quickly.
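
The three steps and the lookup budget can be sketched as follows. This is only an illustration: the `USERS` and `GROUPS` tables and the `id_lookup` helper are hypothetical stand-ins for the real data, not part of turbonss.

```python
# Hypothetical in-memory tables standing in for the turbonss file.
USERS = {"alice": {"uid": 1000, "gid": 1000, "additional_gids": [10, 20, 30]}}
GROUPS = {1000: "alice", 10: "wheel", 20: "audio", 30: "video"}


def id_lookup(username):
    """Follow the three id(1) steps; return (group names, number of gid lookups)."""
    user = USERS[username]                   # 1. look up the user by name
    gids = user["additional_gids"]           # 2. fetch the additional gids
    names = [GROUPS[gid] for gid in gids]    # 3. one gid -> name lookup per gid
    return names, len(gids)


names, lookups = id_lookup("alice")
print(names, lookups)

# Budget: target rate times the assumed average membership count.
TARGET_IDS_PER_SECOND = 10_000
AVERAGE_GROUPS_PER_MEMBER = 100
GROUP_LOOKUPS_PER_SECOND = TARGET_IDS_PER_SECOND * AVERAGE_GROUPS_PER_MEMBER
print(GROUP_LOOKUPS_PER_SECOND)
```

Step 3 dominates the cost, which is why the gid -> group lookup is the one to optimize first.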

Caveat: `struct group` contains an array of pointers to names of group members
(`char **gr_mem`). However, `id` does not use that information, resulting in
significant read amplification. Therefore, if `argv[0] == "id"`, `getgrgid(3)`
will return the group without its members. This speeds up `id` by about 10x on
a known NSS implementation.
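
A minimal sketch of that shortcut, under stated assumptions: `GROUPS` and `getgrgid_sketch` are hypothetical names, and comparing the basename of `argv[0]` (so that `/usr/bin/id` also matches) is an assumption of this sketch rather than documented behaviour.

```python
import os

# Hypothetical group table: gid -> (name, members).
GROUPS = {10: ("wheel", ["alice", "bob"])}


def getgrgid_sketch(gid, argv0):
    """Return (name, members); the member list is skipped when the caller is `id`."""
    name, members = GROUPS[gid]
    if os.path.basename(argv0) == "id":
        return name, []              # do not touch the member list at all
    return name, list(members)


print(getgrgid_sketch(10, "id"))
print(getgrgid_sketch(10, "getent"))
```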

Indices
-------

The following operations need to be fast, in order of importance:

1. lookup gid -> group (this is on the hot path of `id`), with or without
   members (2 separate calls).
2. lookup uid -> user.
3. lookup groupname -> group.
4. lookup username -> user.
5. (optional) iterate users using a defined order (`getent passwd`).
6. (optional) iterate groups using a defined order (`getent group`).

The first four can use perfect hashing, as implemented by [cmph][cmph]: it maps
a list of byte strings to a sequential list of integers. Perfect hashing
algorithms require some space and take some time to compute ("hashing
duration"). I've tested BDZ, which hashes `[][]u8` to a sequential list of
integers (not preserving order), and CHM, which does the same but preserves
order. BDZ accepts an argument `3 <= b <= 10`.
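
To illustrate what a minimal perfect hash provides, here is a toy sketch that brute-forces a seed for a handful of keys. It is neither BDZ nor CHM and does not use cmph; `build_mph`, `mph_index` and `lookup` are hypothetical helpers, and the approach only works for tiny key sets.

```python
import hashlib


def mph_index(key, seed, n):
    """Hash `key` with `seed` into range(n)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % n


def build_mph(keys):
    """Search for a seed that sends every key to a distinct slot in range(len(keys))."""
    n = len(keys)
    for seed in range(100_000):
        if len({mph_index(key, seed, n) for key in keys}) == n:
            return seed              # no collisions: the mapping is a bijection
    raise ValueError("no seed found; use a real MPH algorithm instead")


names = [b"wheel", b"audio", b"video", b"docker"]
seed = build_mph(names)

# Store each record at the slot its key hashes to.
table = [None] * len(names)
for name in names:
    table[mph_index(name, seed, len(names))] = name


def lookup(name):
    """Unknown keys still hash to some slot, so the stored key must be compared."""
    candidate = table[mph_index(name, seed, len(names))]
    return candidate if candidate == name else None


print(lookup(b"audio"), lookup(b"nosuchgroup"))
```

The comparison in `lookup` matters with any perfect hash: the function is only collision-free for the keys it was built from, so a miss has to be detected by checking the record itself.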

BDZ was tried with b=3, b=7 (default), and b=10:

* The BDZ function takes 900KB, 338KB, and 306KB respectively for 1M values.
* Latency to resolve 1M keys: 170ms, 180ms, and 230ms respectively.
* Packed vs non-packed latency differences are not meaningful.

CHM retains order; however, 1M keys weigh 8MB. 10k keys are ~20x larger with
CHM than with BDZ, eliminating the benefit of preserved ordering.
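
For a per-key view of the numbers above, the arithmetic can be written out as below. The only inputs are the sizes quoted in this section; treating KB as 1000 bytes and MB as 1000 KB is an assumption of this sketch.

```python
KEYS = 1_000_000
# Sizes quoted above, assuming decimal units (KB = 1000 bytes).
BDZ_BYTES = {3: 900_000, 7: 338_000, 10: 306_000}
CHM_BYTES = 8_000_000


def bits_per_key(total_bytes, keys=KEYS):
    return total_bytes * 8 / keys


for b, size in BDZ_BYTES.items():
    print(f"BDZ b={b}: {bits_per_key(size):.2f} bits/key")
print(f"CHM: {bits_per_key(CHM_BYTES):.2f} bits/key")
print(f"CHM / BDZ(b=7): {CHM_BYTES / BDZ_BYTES[7]:.1f}x")
```

Under these assumptions BDZ needs roughly 7.2, 2.7 and 2.4 bits per key, while CHM needs 64.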

Turbonss header
---------------

The turbonss header looks like this:

```
OFFSET     TYPE     NAME                          DESCRIPTION
   0      [4]u8     magic                         always 0xf09fa4b7
   4         u8     version                       now `0`
   5        u16     bom                           0x1234
   7         u2     padding
             u6     num_shells                    see "SHELLS" section.
   8        u32     num_users                     number of passwd entries
  12        u32     num_groups                    number of group entries
  16        u32     offset_cmph_gid2group
  20        u32     offset_cmph_uid2user
  24        u32     offset_cmph_groupname2group
  28        u32     offset_cmph_username2user
  32        u32     offset_groupmembers
  36        u32     offset_additional_gids
```

`magic` is 0xf09fa4b7, and `version` must be `0`. All integers are
native-endian. `bom` is a byte-order mark: it must resolve to `0x1234` (4660).
If it does not, the file is being read on a machine whose endianness differs
from the one it was created on. Turbonss files cannot be moved between machines
of different endianness; if that happens, turbonss will refuse to read the
file.
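
A sketch of reading and validating this header, assuming the fields are laid out back to back exactly as in the table above (40 bytes, no padding). `parse_header` and `FIELDS` are hypothetical names, and the byte holding `padding` and `num_shells` is returned undecoded because the bit order within it is not specified here.

```python
import struct

MAGIC = b"\xf0\x9f\xa4\xb7"
# "=": native byte order, no alignment padding.
# 4s magic, B version, H bom, B (u2 padding + u6 num_shells), then 8 u32 fields.
HEADER = struct.Struct("=4sBHB8I")
FIELDS = (
    "num_users", "num_groups",
    "offset_cmph_gid2group", "offset_cmph_uid2user",
    "offset_cmph_groupname2group", "offset_cmph_username2user",
    "offset_groupmembers", "offset_additional_gids",
)


def parse_header(blob):
    """Validate magic, version and bom, then return the remaining fields."""
    magic, version, bom, shells_byte, *rest = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != 0:
        raise ValueError("unsupported version")
    if bom != 0x1234:
        raise ValueError("file was created with a different endianness")
    header = dict(zip(FIELDS, rest))
    header["shells_byte"] = shells_byte
    return header


blob = HEADER.pack(MAGIC, 0, 0x1234, 3, 2, 1, 1, 2, 3, 4, 5, 6)
print(parse_header(blob))
```

A file written on a machine of the opposite endianness would make `bom` read as `0x3412`, so the check fails before any offset is interpreted.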

Offsets are indices to further sections of the file, with zero being the first
block (pointing to the `magic` field). As all blobs are 64-byte aligned, the
offsets always point to the beginning of a 64-byte "block". Therefore, all
`offset_*` values could be `u26`. As `u32` is easier to visualize with xxd, and
the header block fits in 64 bytes anyway, we are keeping them as `u32` for now.
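
The relation between a block index and a byte position can be sketched like this, reading `offset_*` as a count of 64-byte blocks. The helper names are hypothetical.

```python
BLOCK_BITS = 6                   # 64-byte blocks
BLOCK_SIZE = 1 << BLOCK_BITS


def block_to_byte_offset(block_index):
    """Byte position of the first byte of the given block."""
    return block_index << BLOCK_BITS


def byte_offset_to_block(byte_offset):
    """Inverse mapping; only valid for positions on a block boundary."""
    if byte_offset % BLOCK_SIZE != 0:
        raise ValueError("section does not start on a 64-byte boundary")
    return byte_offset >> BLOCK_BITS


print(block_to_byte_offset(3))
print(byte_offset_to_block(192))
# The last block boundary below 4 GiB has index 2**26 - 1, hence "could be u26".
print(byte_offset_to_block(2**32 - BLOCK_SIZE) == 2**26 - 1)
```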

Primitive types:

```
const Group = struct {
    gid: u32,
    // index to a separate structure with a list of members. The memberlist is
    // always 2^5-byte aligned, this is an index there.
    members_offset: u27,
    groupname_len: u5,
    // a groupname_len-sized string
    groupname: []u8,
};

const User = struct {
    uid: u32,
    gid: u32,
    // pointer to a separate structure that contains a list of gids
    additional_gids_offset: u29,
    // shell is a different story, documented elsewhere.
    shell_here: u1,
    shell_len_or_place: u6,
    home_len: u6,
    username_pos: u1,
    username_len: u5,
    gecos_len: u8,
    // a variable-sized array that will be stored immediately after this
    // struct.
    stringdata: []u8,
};
```
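
As an illustration of how the sub-byte fields fit together, the `Group` record can be packed into two native-endian `u32` words followed by the name. This sketch makes two assumptions that are not stated above: the first-declared field (`members_offset`) occupies the low 27 bits of the second word, and `groupname_len` stores the length directly. `pack_group` and `unpack_group` are hypothetical helpers.

```python
import struct

OFFSET_BITS = 27
LEN_BITS = 5


def pack_group(gid, members_offset, name):
    """Serialize gid, (members_offset | groupname_len << 27), then the name bytes."""
    if not 0 <= members_offset < (1 << OFFSET_BITS):
        raise ValueError("members_offset does not fit in u27")
    if len(name) >= (1 << LEN_BITS):
        raise ValueError("groupname length does not fit in u5")
    packed = members_offset | (len(name) << OFFSET_BITS)
    return struct.pack("=II", gid, packed) + name


def unpack_group(blob):
    """Inverse of pack_group; returns (gid, members_offset, name)."""
    gid, packed = struct.unpack_from("=II", blob)
    members_offset = packed & ((1 << OFFSET_BITS) - 1)
    name_len = packed >> OFFSET_BITS
    return gid, members_offset, blob[8:8 + name_len]


record = pack_group(10, 5, b"wheel")
print(len(record), unpack_group(record))
```

With these widths the fixed part of a group record is 8 bytes, and a name can be at most 31 bytes long.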

Complete file structure
-----------------------

```
OFFSET    SECTION              SIZE                         DESCRIPTION
  0<<6    Header               1<<6                         documented above
  *<<6    []Group              num_groups * sizeof(Group)
  *<<6    []User               num_users * sizeof(User)
  *<<6    []u32                num_groups * sizeof(u32)
  *<<6    []u32                num_users * sizeof(u32)
  *<<6    Shells               unknown                      documented in "SHELLS"
  *<<6    cmph_gid2group       unknown                      offset by offset_cmph_gid2group
  *<<6    cmph_uid2user        unknown                      offset by offset_cmph_uid2user
  *<<6    cmph_groupname2group unknown                      offset by offset_cmph_groupname2group
  *<<6    cmph_username2user   unknown                      offset by offset_cmph_username2user
  *<<6    groupmembers         unknown                      list of group members for each group
  *<<6    additional_gids      unknown                      list of gids (group membership) for each member
```
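
The `*<<6` offsets follow from placing each section at the next 64-byte boundary. A small sketch of that bookkeeping, assuming the sections are simply packed back to back in the listed order; `plan_layout` and the sizes in the example are hypothetical.

```python
BLOCK_SIZE = 64


def plan_layout(section_sizes):
    """Return ({section name: starting block index}, total number of blocks)."""
    layout = {}
    block = 0
    for name, size in section_sizes:
        layout[name] = block
        block += -(-size // BLOCK_SIZE)   # ceiling division: blocks occupied
    return layout, block


layout, total_blocks = plan_layout([
    ("header", 64),
    ("groups", 130),   # 130 bytes round up to 3 blocks
    ("users", 10),
])
print(layout, total_blocks)
```

Multiplying a starting block index by 64 (that is, shifting it left by 6) gives the byte position of the section in the file.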

TODO explain:
- shells
- `additional_gids`
- `groupmembers`

[git-subtrac]: https://github.com/apenwarr/git-subtrac/
[cmph]: http://cmph.sourceforge.net/