diff --git a/README.md b/README.md
index 4c73154..1dc099a 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 Turbo NSS
 ---------

-glibc nss library for passwd and group.
+Glibc nss library for passwd and group.

 Checking out and building
 -------------------------
@@ -22,27 +22,33 @@ And run tests:
 $ zig build test
 ```

-... the other commands will be documented as they are implemented.
+Other commands will be documented as they are implemented.

 This project uses [git subtrac][git-subtrac] for managing dependencies.

-Steps
------
+Remarks on `id(1)`
+------------------

-A known implementation runs id(1) at ~250 rps sequentially. Our goal is 10k
-ID/s.
+A known implementation runs id(1) at ~250 rps sequentially on ~20k users and
+~10k groups. Our target is 10k id/s.

-id(1) works as follows:
+`id(1)` works as follows:
 - lookup user by name.
 - get all additional gids (an array attached to a member).
 - for each additional gid, get the group name.

 Assuming a member is in ~100 groups on average, that's 1M group lookups per
-second (cmph can do 1M in <200ms). We need to convert gid to a group index
-quickly.
+second. We need to convert gid to a group index, and group index to a group
+gid/name quickly.

-API
----
+Caveat: `struct group` contains an array of pointers to names of group members
+(`char **gr_mem`). However, `id` does not use that information, resulting in
+significant read amplification. Therefore, if `argv[0] == "id"`, `getgrgid(3)`
+will return the group without its members. This speeds up `id` by about 10x on a
+known NSS implementation.
+
+Indices
+-------

 The following operations need to be fast, in order of importance:

@@ -54,50 +60,54 @@ The following operations need to be fast, in order of importance:
 5. (optional) iterate users using a defined order (`getent passwd`).
 6. (optional) iterate groups using a defined order (`getent group`).

-Indices
--------
-
-Preliminary results of playing with [cmph][cmph]:
+The first 4 can use perfect hashing like [cmph][cmph]: it hashes a list of bytes
+to a sequential list of integers. Perfect hashing algorithms require some space,
+and take some time to calculate ("hashing duration"). I've tested BDZ, which
+hashes [][]u8 to a sequential list of integers (not preserving order) and CHM, which
+does the same, but preserves order. BDZ accepts an argument 3 <= b <= 10.

 BDZ: tried b=3, b=7 (default), and b=10.

-* BDZ algorithm stores 1M values in (900KB, 338KB, 306KB) respectively.
-* Latency for 1M keys: (170ms, 180ms, 230ms).
+* BDZ algorithm requires (900KB, 338KB, 306KB, respectively) for 1M values.
+* Latency to resolve 1M keys: (170ms, 180ms, 230ms).
 * Packed vs non-packed latency differences are not meaningful.

 CHM retains order, however, 1M keys weigh 8MB. 10k keys are ~20x larger with
 CHM than with BDZ, eliminating the benefit of preserved ordering.

-Full file structure
--------------------
+Turbonss header
+---------------

 The turbonss header looks like this:
-

 ```
 OFFSET  TYPE   NAME                         DESCRIPTION
      0  [4]u8  magic                        always 0xf09fa4b7
      4  u8     version                      now `0`
-     5  u2     padding
+     5  u16    bom                          0x1234
+     7  u2     padding
         u6     num_shells                   see "SHELLS" section.
-     6  u32    num_users                    number of passwd entries
-    10  u32    num_groups                   number of group entries
-    14  u32    offset_cmph_gid2group
-    18  u32    offset_cmph_uid2user
-    22  u32    offset_cmph_groupname2group
-    26  u32    offset_cmph_username2user
-    30  u32    offset_sorted_groups
-    34  u32    offset_sorted_users
-    38  u32    offset_groupmembers
-    42  u32    offset_additional_gids
+     8  u32    num_users                    number of passwd entries
+    12  u32    num_groups                   number of group entries
+    16  u32    offset_cmph_gid2group
+    20  u32    offset_cmph_uid2user
+    24  u32    offset_cmph_groupname2group
+    28  u32    offset_cmph_username2user
+    32  u32    offset_groupmembers
+    36  u32    offset_additional_gids
 ```

-`magic` is 0xf09fa4b7, and `version` must be `0`. All integers are big-endian.
+`magic` is 0xf09fa4b7, and `version` must be `0`. All integers are
+native-endian. `bom` is a byte-order mark. It must resolve to `0x1234` (4660).
+If that's not true, the file is consumed in a different endianness than it was
+created in. Turbonss files cannot be moved across different-endianness
+computers. If that happens, turbonss will refuse to read the file.
+
 Offsets are indices to further sections of the file, with zero being the first
-block (the magic number). As all blobs are 64-byte aligned, the offsets are
-always pointing to the beginning of an 64-byte "block". Therefore, all
-`offset_*` values could be `u26`. As `u32` is easier to visualize with xxd, and
-the File block fits to 64 bytes anyway, we are keeping them as u32 now.
+block (pointing to the `magic` field). As all blobs are 64-byte aligned, the
+offsets are always pointing to the beginning of a 64-byte "block". Therefore,
+all `offset_*` values could be `u26`. As `u32` is easier to visualize with xxd,
+and the header block fits in 64 bytes anyway, we are keeping them as u32 now.

 Primitive types:

@@ -130,10 +140,29 @@ const User = struct {
 }
 ```

+Complete file structure
+-----------------------
+
+```
+OFFSET  Section               SIZE                        DESCRIPTION
+  0<<6  Header                1<<6                        documented above
+  *<<6  []Group               num_groups * sizeof(Group)
+  *<<6  []User                num_users * sizeof(User)
+  *<<6  []u32                 num_groups * sizeof(u32)
+  *<<6  []u32                 num_users * sizeof(u32)
+  *<<6  Shells                unknown                     documented in "SHELLS"
+  *<<6  cmph_gid2group        unknown                     offset by offset_cmph_gid2group
+  *<<6  cmph_uid2user         unknown                     offset by offset_cmph_uid2user
+  *<<6  cmph_groupname2group  unknown                     offset by offset_cmph_groupname2group
+  *<<6  cmph_username2user    unknown                     offset by offset_cmph_username2user
+  *<<6  groupmembers          unknown                     list of group members for each group
+  *<<6  additional_gids       unknown                     list of gids (group membership) for each member
+```
+
 TODO explain:
 - shells
 - `additional_gids`
-- `members`
+- `groupmembers`

 [git-subtrac]: https://github.com/apenwarr/git-subtrac/
 [cmph]: http://cmph.sourceforge.net/
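
The header layout and the `bom` check introduced by this change can be illustrated with a minimal sketch. It is written in Python only for brevity (the project itself is built with Zig) and is not the project's implementation. Assumptions: `parse_header` and `byte_position` are hypothetical helper names, the magic is stored as the raw bytes `f0 9f a4 b7` in that order, and `num_shells` sits in the low 6 bits of the byte at offset 7 (the README does not specify the bit order).

```python
import struct

MAGIC = bytes.fromhex("f09fa4b7")   # assumed on-disk byte order of `magic`
BOM = 0x1234                        # expected value of the `bom` field
BLOCK_BITS = 6                      # sections are aligned to 1 << 6 = 64 bytes

# "=" selects native byte order with no implicit alignment padding:
# [4]u8 magic, u8 version, u16 bom, u8 (u2 padding + u6 num_shells), 8 x u32.
HEADER_FORMAT = "=4sBHB8I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 40 bytes, inside one 64-byte block

U32_FIELDS = (
    "num_users", "num_groups",
    "offset_cmph_gid2group", "offset_cmph_uid2user",
    "offset_cmph_groupname2group", "offset_cmph_username2user",
    "offset_groupmembers", "offset_additional_gids",
)


def parse_header(buf):
    """Validate and decode the header at the start of `buf` (hypothetical helper)."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("file too short for a header")
    magic, version, bom, packed, *u32s = struct.unpack_from(HEADER_FORMAT, buf)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != 0:
        raise ValueError("unsupported version")
    if bom != BOM:
        # A byte-swapped bom reads as 0x3412: the file came from a machine
        # with the other endianness, so refuse to read it.
        raise ValueError("file was created with a different endianness")
    header = dict(zip(U32_FIELDS, u32s))
    # Assumption: num_shells occupies the low 6 bits of the byte at offset 7.
    header["num_shells"] = packed & 0x3F
    return header


def byte_position(block_offset):
    """Convert an offset_* value (counted in 64-byte blocks) to a byte position."""
    return block_offset << BLOCK_BITS
```

Because every section starts on a 64-byte boundary, an `offset_*` value of 3 corresponds to byte `3 << 6 = 192`, which is why 26 bits would be enough to address a 4 GiB file.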