Turbo NSS
---------

Turbonss is a plugin for GNU Name Service Switch (NSS) functionality of GNU C
Library (glibc). Turbonss implements lookup for `passwd` and `group` database
entries (i.e. system users, groups, and group memberships). Its main goal is
performance, with focus on making [`id(1)`][id] run as fast as possible.

Turbonss is optimized for reading. If the data changes in any way, the whole
file will need to be regenerated (the tooling supports only full
generation). It was created for, and is best suited to, environments that have
a central user & group database which then needs to be distributed to many
servers/services, and where the data does not change very often (e.g. hourly).

To understand more about name service switch, start with
[`nsswitch.conf(5)`][nsswitch].

Design & constraints
--------------------

To be fast, the user/group database (later: DB) has to be small
([background][data-oriented-design]). Turbonss encodes user & group information
in a way that minimizes the DB size, and reduces jumping across the DB
("chasing pointers and thrashing CPU cache").

To understand how this is done efficiently, let's look at
[`getpwnam_r(3)`][getpwnam_r] at a high level. This API call accepts a username
and returns the following user information:

```
struct passwd {
    char   *pw_name;       /* username */
    char   *pw_passwd;     /* user password */
    uid_t   pw_uid;        /* user ID */
    gid_t   pw_gid;        /* group ID */
    char   *pw_gecos;      /* user information */
    char   *pw_dir;        /* home directory */
    char   *pw_shell;      /* shell program */
};
```

Turbonss implements this call, among others, and takes the following steps to
resolve a username to a `struct passwd*`:

- Open the DB (using `mmap`) and interpret its first 64 bytes as a
  `struct Header*`. The header stores offsets to the sections of the file. This
  needs to be done once, when the NSS library is loaded.
- Hash the username using a perfect hash function. A perfect hash function
  returns a number `n ∈ [0,N-1]`, where N is the total number of users.
- Jump to the `n`'th location in the `idx_name2user` section, which contains
  the index `i` to the user's information.
- Jump to the location `i` of section `Users`, which stores the full user
  information.
- Decode the user information (which is all in a contiguous memory block) and
  return it to the caller.

In total, that's one hash for the username (~150ns), two pointer jumps within
the DB file (to sections `idx_name2user` and `Users`), and, now that the
user record is found, a `memcpy` for each field.

The turbonss DB file is `mmap`-ed, making it simple to jump across the file
using pointer arithmetic. This also reduces memory usage, as the mmap'ed
regions are shared. Turbonss reads do not consume any heap space.

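The lookup path above can be sketched in C. This is a minimal illustration,
not turbonss code: `toy_user`, `toy_hash`, `toy_idx_name2user` and
`toy_getpwnam` are hypothetical names, the records are plain structs rather
than the packed on-disk form, and `toy_hash` is only collision-free for the
three sample names (it stands in for a real perfect hash).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified record; the real on-disk layout is packed. */
struct toy_user {
    const char *name;
    uint32_t uid;
};

/* "Users" section: records in input order. */
static const struct toy_user toy_users[] = {
    {"alice", 1000}, {"bob", 1001}, {"carol", 1002},
};

/* Toy stand-in for a perfect hash: unique for the three sample names only. */
static uint32_t toy_hash(const char *key, uint32_t n)
{
    return (uint32_t)(unsigned char)key[0] % n;
}

/* "idx_name2user" section: hash slot -> position in toy_users.
 * toy_hash maps 'a' -> 1, 'b' -> 2, 'c' -> 0. */
static const uint32_t toy_idx_name2user[] = {2, 0, 1};

/* Resolve a username; returns NULL when the name is not in the DB. */
const struct toy_user *toy_getpwnam(const char *name)
{
    assert(name != NULL);
    uint32_t n = (uint32_t)(sizeof toy_users / sizeof toy_users[0]);
    uint32_t slot = toy_hash(name, n);         /* one hash */
    uint32_t i = toy_idx_name2user[slot];      /* jump 1: index section */
    const struct toy_user *u = &toy_users[i];  /* jump 2: Users section */
    /* Unknown names also hash to some slot, so compare before returning. */
    return strcmp(u->name, name) == 0 ? u : NULL;
}
```

Looking up `dave` lands in the same slot as `alice`; the final comparison is
what turns that into a miss instead of a wrong answer.
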
Tight packing places some constraints on the underlying data:

- Maximum database size: 4GB.
- Permitted length of username and groupname: 1-32 bytes.
- Permitted length of shell and home: 1-64 bytes.
- Permitted comment ("gecos") length: 0-255 bytes.
- Username, groupname, gecos and shell must be utf8-encoded.

Checking out and building
-------------------------

```
$ git clone --recursive https://git.sr.ht/~motiejus/turbonss
```

Alternatively, if you forgot `--recursive`:

```
$ git submodule update --init
```

And run tests:

```
$ zig build test
```

Other commands will be documented as they are implemented.

This project uses [git subtrac][git-subtrac] for managing dependencies. They
work just like regular submodules, except all the refs of the submodules are in
this repository. Repeat after me: all the submodules are in this repository.
So if you have a copy of this repo, dependencies will not disappear.

Remarks on `id(1)`
------------------

A known implementation runs id(1) at ~250 rps sequentially on ~20k users and
~10k groups. Our rps target is much higher.

To better reason about the trade-offs, it is useful to understand how `id(1)`
is implemented, in rough terms:

- look up the user by name ([`getpwnam_r(3)`][getpwnam_r]).
- get all gids for the user ([`getgrouplist(3)`][getgrouplist]). Note: it is
  actually using `initgroups_dyn`, which accepts a uid and is very poorly
  documented.
- for each additional gid, get the `struct group*`
  ([`getgrgid_r(3)`][getgrgid_r]).

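As a rough illustration of those three steps, the sketch below uses the
non-reentrant libc variants (`getpwnam`, `getgrouplist`, `getgrgid`) for
brevity. `id_like` is a hypothetical helper, not part of turbonss or
coreutils, and the real `id(1)` does more (effective ids, error reporting).

```c
#define _DEFAULT_SOURCE /* exposes getgrouplist(3) on glibc */
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Prints "uid=... gid=... groups=..." for one user, roughly like id(1).
 * Returns the number of groups, or -1 if the user is unknown or on error. */
int id_like(const char *username)
{
    assert(username != NULL);

    /* Step 1: look up the user by name. */
    struct passwd *pw = getpwnam(username);
    if (pw == NULL)
        return -1;
    uid_t uid = pw->pw_uid;
    gid_t gid = pw->pw_gid;

    /* Step 2: fetch all gids; retry once if the first buffer is too small. */
    int ngroups = 32;
    gid_t *gids = malloc((size_t)ngroups * sizeof *gids);
    if (gids == NULL)
        return -1;
    if (getgrouplist(username, gid, gids, &ngroups) == -1) {
        /* ngroups now holds the required number of entries. */
        gid_t *bigger = realloc(gids, (size_t)ngroups * sizeof *gids);
        if (bigger == NULL) {
            free(gids);
            return -1;
        }
        gids = bigger;
        if (getgrouplist(username, gid, gids, &ngroups) == -1) {
            free(gids);
            return -1;
        }
    }

    /* Step 3: resolve every gid to a group name (the hot path). */
    printf("uid=%lu(%s) gid=%lu groups=", (unsigned long)uid, username,
           (unsigned long)gid);
    for (int i = 0; i < ngroups; i++) {
        struct group *gr = getgrgid(gids[i]);
        printf("%s%lu(%s)", i == 0 ? "" : ",", (unsigned long)gids[i],
               gr != NULL ? gr->gr_name : "?");
    }
    printf("\n");
    free(gids);
    return ngroups;
}
```

Step 3 runs once per group, which is why the group lookup dominates the cost
for users with many memberships.
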
Assuming a user is in ~100 groups on average, reaching 10k id/s translates to
1M group lookups per second. We need to convert gid to a group index, and group
index to a group gid/name quickly.

Caveat: `struct group` contains an array of pointers to names of group members
(`char **gr_mem`). However, `id` does not use that information, resulting in
read amplification, sometimes by 10-100x. Therefore, if `argv[0] == "id"`, our
implementation of [`getgrgid_r(3)`][getgrgid_r] returns the `struct group*`
without the members. This speeds up `id` by about 10x on a known NSS
implementation.

Relatedly, because [`getgrgid_r(3)`][getgrgid_r] does not always need the group
members, the group members are stored in a different DB section, reducing the
`Groups` section and making more of it fit the CPU caches.

Turbonss header
---------------

The turbonss header looks like this:

```
OFFSET     TYPE     NAME                          DESCRIPTION
     0    [4]u8     magic                         always 0xf09fa4b7
     4       u8     version                       now `0`
     5      u16     bom                           0x1234
     7       u8     num_shells                    max value: 63.
     8      u32     num_users                     number of passwd entries
    12      u32     num_groups                    number of group entries
    16      u32     offset_bdz_uid2user
    20      u32     offset_bdz_groupname2group
    24      u32     offset_bdz_name2user
    28      u32     offset_idx                    offset to the first idx_ section
    32      u32     offset_groups
    36      u32     offset_users
    40      u32     offset_groupmembers
    44      u32     offset_additional_gids
```

`magic` is 0xf09fa4b7, and `version` must be `0`. All integers are
native-endian. `bom` is a byte-order-mark. It must resolve to `0x1234` (4660).
If that's not true, the file is being read with a different endianness than it
was created with. Turbonss files cannot be moved across different-endianness
computers. If that happens, turbonss will refuse to read the file.

Offsets are indices to further sections of the file, with zero being the first
block (pointing to the `magic` field). As all sections are aligned to 64 bytes,
the offsets are always pointing to the beginning of a 64-byte "block".
Therefore, all `offset_*` values could be `u26`. As `u32` is easier to
visualize with xxd, and the header block fits in 64 bytes anyway, we are
keeping them as `u32` for now.

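A loader could validate these fields roughly as follows. This is a sketch
under assumptions: `header_check`, `section_byte_offset` and the status names
are made up for illustration, the magic is assumed to appear in the file as
the bytes `f0 9f a4 b7` in that order, and `offset_*` values are assumed to
count 64-byte blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum header_status {
    HEADER_OK,
    HEADER_TOO_SHORT,
    HEADER_BAD_MAGIC,
    HEADER_BAD_VERSION,
    HEADER_WRONG_ENDIAN
};

enum { SECTION_ALIGN = 64 };

/* Validate the first block of a mapped DB (field offsets per the table). */
enum header_status header_check(const uint8_t *buf, size_t len)
{
    /* Assumption: the magic bytes are stored in this order. */
    static const uint8_t magic[4] = {0xf0, 0x9f, 0xa4, 0xb7};
    uint16_t bom;

    assert(buf != NULL);
    if (len < SECTION_ALIGN)
        return HEADER_TOO_SHORT;
    if (memcmp(buf, magic, sizeof magic) != 0)
        return HEADER_BAD_MAGIC;
    if (buf[4] != 0)
        return HEADER_BAD_VERSION;
    /* bom sits at the unaligned offset 5, so copy it instead of casting. */
    memcpy(&bom, buf + 5, sizeof bom);
    if (bom != 0x1234)
        return HEADER_WRONG_ENDIAN; /* written on an other-endian machine */
    return HEADER_OK;
}

/* Convert an offset_* value (counted in 64-byte blocks) to a byte offset. */
uint64_t section_byte_offset(uint32_t offset_blocks)
{
    return (uint64_t)offset_blocks * SECTION_ALIGN;
}
```

Reading `bom` in native byte order is the whole check: a file written on a
machine with the other endianness yields `0x3412` instead of `0x1234`.
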
Sections whose lengths can be calculated do not have a corresponding `offset_*`
header field. For example, `bdz_gid2group` comes immediately after the header,
and `idx_groupname2group` comes after `idx_gid2group`, whose offset is
`offset_idx`, and whose size can be calculated.

`num_shells` would fit in a `u6`; however, we would need 2 bits of padding (all
other fields are byte-aligned). If we instead do `u2` followed by `u6`, the
byte would look very unusual on a little-endian architecture. Therefore we will
just reject the DB if the number of shells exceeds 63.

Primitive types
---------------

`User` and `Group` entries are sorted by the order they were received in the
input file. All entries are aligned to 8 bytes. All `User` and `Group` entries
are referred to by their byte offset in the `Users` and `Groups` section
relative to the beginning of the section.

```
const PackedGroup = packed struct {
    gid: u32,
    // index to a separate structure with a list of members. The memberlist is
    // 2^5-byte aligned (32b), this is an index there.
    members_offset: u27,
    groupname_len: u5,
    // pseudocode: a groupname_len-sized string stored immediately after this
    // struct.
    groupname: []u8,
};

pub const PackedUser = packed struct {
    uid: u32,
    gid: u32,
    additional_gids_offset: u29,
    shell_here: bool,
    shell_len_or_idx: u6,
    home_len: u6,
    name_is_a_suffix: bool,
    name_len: u5,
    gecos_len: u8,
    // pseudocode: variable-sized array that will be stored immediately after
    // this struct.
    stringdata: []u8,
};
```

`stringdata` contains a few string entries:

- home.
- name (optional).
- gecos.
- shell (optional).

The first byte of home is stored right after the `gecos_len` field, and its
length is `home_len`. The same logic applies to all the `stringdata` fields:
there is a way to calculate their relative position from the length of the
fields before them.

Additionally, there are two "easy" optimizations:

- shells are often shared across different users, see the "Shells" section.
- `name` is frequently a suffix of `home`. For example, `/home/motiejus` and
  `motiejus`. In this case storing both name and home is wasteful. Therefore
  name has two options:
  1. `name_is_a_suffix=true`: name is a suffix of the home dir. Then `name`
     starts at the `home_len - name_len`'th byte of `home`, and ends at the
     same place as `home`.
  2. `name_is_a_suffix=false`: name begins right after the last byte of home,
     and its length is `name_len`.

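The two cases for locating `name` reduce to a little pointer arithmetic. The
sketch below assumes `home_len` and `name_len` have already been decoded to
byte counts (how the `u6` and `u5` fields map to lengths is not covered here),
and `user_name_start` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return a pointer to the first byte of `name` inside stringdata.
 * stringdata starts with home; lengths are decoded byte counts. */
const uint8_t *user_name_start(const uint8_t *stringdata, size_t home_len,
                               size_t name_len, bool name_is_a_suffix)
{
    assert(stringdata != NULL);
    if (name_is_a_suffix) {
        /* name shares its bytes with the tail of home */
        assert(name_len <= home_len);
        return stringdata + (home_len - name_len);
    }
    /* name is stored separately, immediately after home */
    return stringdata + home_len;
}
```

For `/home/motiejus` (14 bytes) and `motiejus` (8 bytes) the suffix case
returns offset 6, so the 8 name bytes are never stored twice.
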
Shells
------

Normally there is a limited number of separate shells even in huge user
databases. A few examples: `/bin/bash`, `/usr/bin/nologin`, `/bin/zsh` among
others. Therefore, "shells" have an optimization: they can be referenced from
an external list, or, if they are unique to the user, reside among the user's
data.

The 63 most popular shells (i.e. referred to by at least two User entries) are
stored externally in the "Shells" area. The less popular ones are stored with
userdata.

The Shells section consists of two sub-sections: the index and the blob. The
index is a list of structs which point to a location in the "blob" area:

```
const ShellIndex = struct {
    offset: u10,
    len: u6,
};
```

In the user's struct, `shell_here=true` signifies that the shell is stored with
userdata, and its length is `shell_len_or_idx`. `shell_here=false` means it is
stored in the `Shells` section, and its index is `shell_len_or_idx`.

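The dual meaning of `shell_len_or_idx` could be handled like this. The sketch
is illustrative only: `shell_index`, `shell_ref` and `resolve_shell` are
hypothetical names, and the index entries are assumed to be already decoded
into a plain byte offset and byte length (the units of the on-disk `u10` and
`u6` fields are not specified here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form of one ShellIndex entry (assumed units: bytes). */
struct shell_index {
    uint16_t offset; /* position of the shell inside the blob */
    uint8_t len;     /* length of the shell string */
};

/* A non-owning view of a shell string; ptr is NULL on a bad reference. */
struct shell_ref {
    const uint8_t *ptr;
    size_t len;
};

struct shell_ref resolve_shell(bool shell_here, uint8_t shell_len_or_idx,
                               const uint8_t *inline_shell,
                               const struct shell_index *index,
                               size_t num_shells, const uint8_t *blob)
{
    struct shell_ref r = {NULL, 0};

    if (shell_here) {
        /* The value is a length; the bytes live in the user's stringdata. */
        r.ptr = inline_shell;
        r.len = shell_len_or_idx;
        return r;
    }
    /* The value is an index into the shared Shells table. */
    assert(index != NULL && blob != NULL);
    if (shell_len_or_idx >= num_shells)
        return r; /* index out of range: treat as corrupt */
    r.ptr = blob + index[shell_len_or_idx].offset;
    r.len = index[shell_len_or_idx].len;
    return r;
}
```

Either way the caller gets a pointer and a length into the mapped file, so no
shell string needs to be copied until the result is handed back to libc.
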
Variable-length integers (varints)
----------------------------------

A varint is an efficiently encoded integer (packed for small values). It is the
same as [protocol buffer varints][varint], except that the largest possible
value is a `u64`. They compress integers well. Varints are used to store group
memberships.

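For reference, a protobuf-style varint stores 7 bits per byte, least
significant group first, with the high bit set on every byte except the last.
The function names below are illustrative and not taken from turbonss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v into out (up to 10 bytes needed); returns bytes written. */
size_t varint_encode(uint64_t v, uint8_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80); /* low 7 bits + continuation bit */
        v >>= 7;
    }
    out[n++] = (uint8_t)v; /* final byte: continuation bit clear */
    return n;
}

/* Decode one varint from buf[0..len); returns bytes consumed, 0 on error. */
size_t varint_decode(const uint8_t *buf, size_t len, uint64_t *value)
{
    uint64_t result = 0;

    assert(buf != NULL && value != NULL);
    for (size_t i = 0; i < len && i < 10; i++) {
        if (i == 9 && buf[i] > 1)
            return 0; /* would not fit in 64 bits */
        result |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if ((buf[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0; /* truncated input */
}
```

Values below 128 take one byte, so lists of small gids or offsets stay
compact; a full `u64` takes at most 10 bytes.
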
Group memberships
-----------------

There are two kinds of group membership lookups at play:

1. Given a group (gid/name), resolve the members' names (e.g. `getgrgid`).
2. Given a username, resolve user's group gids (for `initgroups(3)`).

When group's memberships are resolved in (1), the same call also requires other
group information: gid and group name. Therefore it makes sense to store a
pointer to the group members in the group information itself. However, the
memberships are not *always* necessary (see remarks about `id(1)`), therefore
the memberships will be stored separately, outside of the groups section.

Similarly, when user's groups are resolved in (2), they are not always necessary
(i.e. not part of `struct passwd*`), therefore the memberships themselves are
stored out of band.

`Groupmembers` and `Username2gids` store group and user memberships
respectively. Membership IDs are used in their entirety, not necessitating
random access, thus suitable for tight packing and varint encoding.

- For each group: a list of pointers (offsets) to User records, because
  `getgr*_r` returns pointers to member names.
- For each user: a list of gids, because `initgroups_dyn` (and friends)
  returns an array of gids.

An entry of `Groupmembers` and `Username2gids` looks like this piece of
pseudo-code:

```
const PackedList = struct {
    Length: varint,
    Members: [Length]varint,
};
const Groupmembers = PackedList;
const Username2gids = PackedList;
```

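Reading such an entry is a single forward pass: one varint for the length,
then that many varints for the members. The sketch below is not turbonss
code; `read_varint` and `packed_list_decode` are hypothetical names, and the
members are returned as raw integers without interpreting them as offsets or
gids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Protobuf-style varint reader; returns bytes consumed, 0 on error. */
static size_t read_varint(const uint8_t *buf, size_t len, uint64_t *value)
{
    uint64_t result = 0;

    for (size_t i = 0; i < len && i < 10; i++) {
        if (i == 9 && buf[i] > 1)
            return 0; /* would not fit in 64 bits */
        result |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if ((buf[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0; /* truncated input */
}

/* Decode Length, then Length members, into out[0..cap).
 * Returns the member count, or -1 if the input is truncated or out is
 * too small. */
long packed_list_decode(const uint8_t *buf, size_t len, uint64_t *out,
                        size_t cap)
{
    uint64_t count = 0;
    size_t pos;

    assert(buf != NULL && out != NULL);
    pos = read_varint(buf, len, &count);
    if (pos == 0 || count > cap)
        return -1;
    for (uint64_t i = 0; i < count; i++) {
        size_t used = read_varint(buf + pos, len - pos, &out[i]);
        if (used == 0)
            return -1;
        pos += used;
    }
    return (long)count;
}
```

Because the list is always consumed front to back, the lack of random access
into a varint stream costs nothing here.
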
Indices
-------

Now that we've sketched the implementation of `id(1)`, it's easier to see
which operations need to be fast; in order of importance:

1. lookup gid -> group info (this is on hot path in id) without members.
2. lookup username -> user's groups.
3. lookup uid -> user.
4. lookup groupname -> group.
5. lookup username -> user.

These indices can use perfect hashing like [bdz from cmph][cmph]: a perfect
hash maps a list of keys to a sequential list of integers. Perfect hashing
algorithms require some space, and take some time to calculate ("hashing
duration"). I've tested BDZ, which hashes `[][]u8` to a sequential list of
integers (not preserving order), and CHM, which preserves order. BDZ accepts an
optional argument `3 <= b <= 10`.

* BDZ requires 900KB (b=3), 338KB (b=7) or 306KB (b=10) for 1M values.
* Latency to resolve 1M keys: 170ms, 180ms and 230ms, respectively.
* Packed vs non-packed latency differences are not meaningful.

CHM retains order, however, 1M keys weigh 8MB. 10k keys are ~20x larger with
CHM than with BDZ, eliminating the benefit of preserved ordering: we can just
have a separate index.

None of the tested perfect hashing algorithms makes the distinction between
existing (in the initial dictionary) and new keys. In other words, HASH(value)
will be pointing to a number `n ∈ [0,N-1]`, regardless of whether the value was
in the initial dictionary. Therefore one must always confirm, after calculating
the hash, that the key matches what's been hashed.

`idx_*` sections are of type `[]PackedIntArray(u29)` and are pointing to the
respective `Groups` and `Users` entries (from the beginning of the respective
section). Since User and Group records are 8-byte aligned, `u29` is used:
2^29 positions of 8 bytes each cover the 4GB maximum DB size.

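To make the `len*29/8` sizes concrete, here is one way to store and read
29-bit values back to back. The bit order (least significant bit first within
each byte) is an assumption made for this sketch and may differ from what the
real `PackedIntArray` does; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { U29_BITS = 29 };

/* Bytes needed for n packed 29-bit values: ceil(n * 29 / 8). */
size_t packed_u29_size(size_t n)
{
    return (n * U29_BITS + 7) / 8;
}

/* Store value at position i, bit by bit (LSB-first, assumed layout). */
void packed_u29_set(uint8_t *bytes, size_t i, uint32_t value)
{
    size_t bit = i * U29_BITS;

    assert(bytes != NULL);
    assert(value < (UINT32_C(1) << U29_BITS));
    for (int b = 0; b < U29_BITS; b++, bit++) {
        uint8_t mask = (uint8_t)(1u << (bit % 8));
        if ((value >> b) & 1u)
            bytes[bit / 8] |= mask;
        else
            bytes[bit / 8] &= (uint8_t)~mask;
    }
}

/* Read the value at position i. */
uint32_t packed_u29_get(const uint8_t *bytes, size_t i)
{
    size_t bit = i * U29_BITS;
    uint32_t value = 0;

    assert(bytes != NULL);
    for (int b = 0; b < U29_BITS; b++, bit++)
        value |= (uint32_t)((bytes[bit / 8] >> (bit % 8)) & 1u) << b;
    return value;
}
```

Four entries occupy 15 bytes instead of the 16 that plain `u32` values would
need. A production reader would load whole words rather than single bits; the
loop form is used here only because it is easy to verify.
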
Complete file structure
-----------------------

Each section is padded to 64 bytes.

```
STATUS SECTION              SIZE             DESCRIPTION
✅     Header               48               see "Turbonss header" section
✅     bdz_gid              ?                bdz(gid)
✅     bdz_groupname        ?                bdz(groupname)
✅     bdz_uid              ?                bdz(uid)
✅     bdz_name             ?                bdz(username)
       idx_gid2group        len(group)*29/8  bdz->offset Groups
       idx_groupname2group  len(group)*29/8  bdz->offset Groups
       idx_uid2user         len(user)*29/8   bdz->offset Users
       idx_name2user        len(user)*29/8   bdz->offset Users
       idx_username2gids    len(user)*29/8   bdz->offset Username2gids
✅     ShellIndex           len(shells)*2    Shell index array
✅     ShellBlob            <= 4032          Shell data blob (max 63*64 bytes)
       Groups               ?                packed Group entries (8b padding)
✅     Users                ?                packed User entries (8b padding)
       Groupmembers         ?                per-group memberlist (32b padding)
       Username2gids        ?                Per-user gidlist entries (8b padding)
```

[git-subtrac]: https://apenwarr.ca/log/20191109
[cmph]: http://cmph.sourceforge.net/
[id]: https://linux.die.net/man/1/id
[nsswitch]: https://linux.die.net/man/5/nsswitch.conf
[data-oriented-design]: https://media.handmade-seattle.com/practical-data-oriented-design/
[getpwnam_r]: https://linux.die.net/man/3/getpwnam_r
[varint]: https://developers.google.com/protocol-buffers/docs/encoding#varints
[getpwent_r]: https://www.man7.org/linux/man-pages/man3/getpwent_r.3.html
[getgrouplist]: https://www.man7.org/linux/man-pages/man3/getgrouplist.3.html
[getgrgid_r]: https://www.man7.org/linux/man-pages/man3/getgrgid_r.3.html