
add missing fields

Motiejus Jakštys 2022-02-14 13:37:10 +02:00 committed by Motiejus Jakštys
parent c6bc383269
commit d422cdf61b
1 changed file with 37 additions and 23 deletions


@@ -12,13 +12,14 @@ To understand more about name service switch, start with
 Design & constraints
 --------------------
 
-To be fast, the user/group database (later: DB) has to be small ([highly
-recommended background viewing][data-oriented-design]). It encodes user & group
-information in a way that minimizes the DB size, and reduces jumping across the
-DB ("chasing pointers and thrashing CPU cache").
+To be fast, the user/group database (later: DB) has to be small
+([background][data-oriented-design]). It encodes user & group information in a
+way that minimizes the DB size, and reduces jumping across the DB ("chasing
+pointers and thrashing CPU cache").
 
-For example, [`getpwnam_r(3)`][getpwnam_r] accepts a username and returns
-the following user information:
+To understand how this is done efficiently, let's analyze
+[`getpwnam_r(3)`][getpwnam_r] at a high level. This API call accepts a username
+and returns the following user information:
 
 ```
 struct passwd {
@@ -35,18 +36,21 @@ struct passwd {
 Turbonss, among others, implements this call, and takes the following steps to
 resolve a username to a `struct passwd*`:
 
+- Open the DB (using `mmap`) and interpret its first 40 bytes as a `struct
+  Header`. The header stores offsets to the sections of the file. This needs to
+  be done once, when the NSS library is loaded (or on the first call).
 - Hash the username using a perfect hash function. Perfect hash function
   returns a number `n ∈ [0,N-1]`, where N is the total number of users.
-- Jump to the `n`'th location in the DB (by pointer arithmetic) which contains
-  the index `i` to the user's information.
-- Jump to the location `i` (pointer arithmetic) which stores the full user
-  information.
+- Jump to the `n`'th location in the `idx_username2user` section (by pointer
+  arithmetic), which contains the index `i` to the user's information.
+- Jump to the location `i` of section `Users` (again, using pointer arithmetic)
+  which stores the full user information.
 - Decode the user information (which is all in a continuous memory block) and
   return it to the caller.
 
 In total, that's one hash for the username (~150ns), two pointer jumps within
-the group file, and, now that the user record is found, `memcpy` for each
-field.
+the group file (to sections `idx_username2user` and `Users`), and, now that the
+user record is found, `memcpy` for each field.
 
 The turbonss DB file is `mmap`-ed, making it simple to implement pointer
 arithmetic and jumping across the file. This also reduces memory usage,
@@ -288,14 +292,24 @@ A packed list is a list of varints.
 Complete file structure
 -----------------------
 
+`idx_*` entries are of type `[]u29` and point to the respective `Groups`
+and `Users` entries (from the beginning of the respective section). Since
+entries are 8-byte aligned, 3 bits are saved from every element.
+
+Each section is padded to 64 bytes.
+
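To make the `[]u29` idea concrete, here is a hedged Python sketch of packing 29-bit indices back to back and reading the `k`-th one. It assumes a little-endian bit order and that entries really are bit-packed, which is my reading of the `len(...)*4*29/32` sizes (29 bits per entry instead of 32) and is not confirmed by the source. The helper names `pack_u29`, `get_u29` and `pad_to` are hypothetical.

```python
U29_MASK = (1 << 29) - 1


def pack_u29(values):
    """Pack integers below 2**29 into a contiguous little-endian bitstream."""
    acc = 0
    for k, v in enumerate(values):
        if not 0 <= v <= U29_MASK:
            raise ValueError(f"value {v} does not fit in 29 bits")
        acc |= v << (29 * k)
    nbytes = (29 * len(values) + 7) // 8  # same as len*4*29/32, rounded up
    return acc.to_bytes(nbytes, "little")


def get_u29(buf, k):
    """Read the k-th 29-bit entry without unpacking the whole list."""
    bit = 29 * k
    start = bit // 8
    # 29 bits starting at a bit offset of at most 7 span at most 5 bytes.
    chunk = int.from_bytes(buf[start : start + 5], "little")
    return (chunk >> (bit % 8)) & U29_MASK


def pad_to(size, alignment=64):
    """Round a section size up to the next multiple of `alignment`."""
    return size + (-size % alignment)


indices = [0, 1, 5, U29_MASK, 12345]
packed = pack_u29(indices)
# An index counts 8-byte units, so the byte offset is the index shifted by 3.
byte_offset = get_u29(packed, 2) << 3
```

Because every target entry is 8-byte aligned, the low 3 bits of a byte offset are always zero, which is why a 29-bit index can address the same range as a 32-bit byte offset.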
 ```
 SECTION              SIZE                DESCRIPTION
-Header               1<<6                documented above
-[]Group              ?                   list of Group entries
-[]User               ?                   list of User entries
-Shells               ?                   documented in "SHELLS"
+Header               40                  see "Turbonss header" section
+idx_gid2group        len(group)*4*29/32  list of gid2group indices
+idx_groupname2group  len(group)*4*29/32  list of groupname2group indices
+idx_uid2user         len(user)*4*29/32   list of uid2user indices
+idx_username2user    len(user)*4*29/32   list of username2user indices
+Groups               ?                   list of Group entries
+Users                ?                   list of User entries
+Shells               ?                   See "Shells" section
 cmph_gid2group       ?                   offset by offset_cmph_gid2group
-cmph_uid2user        ?                   offset by offset_cmph_gid2group
+cmph_uid2user        ?                   offset by offset_cmph_uid2user
 cmph_groupname2group ?                   offset by offset_cmph_groupname2group
 cmph_username2user   ?                   offset by offset_cmph_username2user
 groupmembers         ?                   offset by offset_groupmembers