user packing

2022-02-19 11:35:29 +02:00
committed by Motiejus Jakštys
parent 93c6a1c12a
commit 13b75e8046
3 changed files with 118 additions and 50 deletions


@@ -67,9 +67,9 @@ Tight packing places some constraints on the underlying data:
- Maximum database size: 4GB.
- Permitted length of username and groupname: 1-32 bytes.
- Permitted length of shell and homedir: 1-64 bytes.
- Permitted length of shell and home: 1-64 bytes.
- Permitted comment ("gecos") length: 0-255 bytes.
- Username, groupname and gecos must be utf8-encoded.
- User name, groupname and gecos must be utf8-encoded.
Checking out and building
-------------------------
@@ -219,11 +219,11 @@ const User = struct {
// pointer to a separate structure that contains a list of gids
additional_gids_offset: u29,
// shell is a different story, documented elsewhere.
shell_here: u1,
shell_here: bool,
shell_len_or_idx: u6,
homedir_len: u6,
username_is_a_suffix: u1,
username_offset_or_len: u5,
home_len: u6,
name_is_a_suffix: bool,
name_len: u5,
gecos_len: u8,
// a variable-sized array that will be stored immediately after this
// struct.
@@ -232,27 +232,27 @@ const User = struct {
```
`stringdata` contains a few string entries:
- homedir.
- username.
- home.
- name.
- gecos.
- shell (optional).
First byte of the homedir is stored right after the `gecos_len` field, and its
length is `homedir_len`. The same logic applies to all the `stringdata` fields:
First byte of the home is stored right after the `gecos_len` field, and its
length is `home_len`. The same logic applies to all the `stringdata` fields:
there is a way to calculate their relative position from the length of the
fields before them.
Additionally, two optimizations for special fields are made:
- shells are often shared across different users, see the "Shells" section.
- username is frequently a suffix of the homedir. For example, `/home/motiejus`
and `motiejus`. In which case storing both username and homedir strings is
wasteful. For those cases, username has two options:
1. `username_is_a_suffix=true`: username is a suffix of the home dir. In that
case, the username starts at the `username_offset_or_len`'th byte of the
homedir, and ends at the same place as the homedir.
2. `username_is_a_suffix=false`: username is stored separately. In that case,
it begins one byte after homedir, and its length is
`username_offset_or_len`.
- name is frequently a suffix of the home. For example, `/home/motiejus`
and `motiejus`. In which case storing both name and home strings is
wasteful. For those cases, name has two options:
1. `name_is_a_suffix=true`: name is a suffix of the home dir. In that
case, the name starts at the `home_len - name_len`'th
byte of the home, and ends at the same place as the home.
2. `name_is_a_suffix=false`: name is stored separately. In that case,
it begins one byte after home, and its length is
`name_len`.
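
The two options above can be sketched as a small lookup over `stringdata`. This is a minimal illustration, not the project's implementation: `Lengths`, `homeOf` and `nameOf` are hypothetical names, the lengths are assumed to be already decoded into plain byte counts, and a separately stored name is assumed to start at the byte immediately following the last byte of home.

```zig
const std = @import("std");

// Hypothetical, already-decoded view of the length fields. This is not the
// packed `User` struct; how the packed fields map to byte counts is assumed.
const Lengths = struct {
    home_len: usize,
    name_len: usize,
    name_is_a_suffix: bool,
};

// home always occupies the first `home_len` bytes of stringdata.
fn homeOf(stringdata: []const u8, l: Lengths) []const u8 {
    return stringdata[0..l.home_len];
}

// name either shares the trailing bytes of home, or follows home directly.
fn nameOf(stringdata: []const u8, l: Lengths) []const u8 {
    if (l.name_is_a_suffix) {
        // Starts at the (home_len - name_len)'th byte, ends where home ends.
        // Assumes name_len <= home_len, otherwise the subtraction overflows.
        return stringdata[l.home_len - l.name_len .. l.home_len];
    }
    // Stored separately: assumed to begin right after the last byte of home.
    return stringdata[l.home_len .. l.home_len + l.name_len];
}

test "name is a suffix of home" {
    const l = Lengths{ .home_len = 14, .name_len = 8, .name_is_a_suffix = true };
    try std.testing.expectEqualStrings("/home/motiejus", homeOf("/home/motiejus", l));
    try std.testing.expectEqualStrings("motiejus", nameOf("/home/motiejus", l));
}
```

In the suffix case the name costs no extra bytes in `stringdata`, which is the saving the optimization is after.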
Shells
------
@@ -315,8 +315,7 @@ const AdditionalGids = PackedList;
An entry in `members` field points to the offset into a respective `User` or
`Group` entry (number of bytes relative to the first entry of the type).
`members` in `PackedList` is sorted by the name (`username` or `groupname`) of
the record it is pointing to.
`members` in `PackedList` are sorted the same way as in the input.
A packed list is a list of varints.