
bring back additional_gids_offset

Motiejus Jakštys 2022-02-24 05:32:27 +02:00 committed by Motiejus Jakštys
parent c0afca00b0
commit 3bf1b3fc01
2 changed files with 39 additions and 30 deletions


@ -67,7 +67,7 @@ Tight packing places some constraints on the underlying data:
- Maximum database size: 4GB. - Maximum database size: 4GB.
- Permitted length of username and groupname: 1-32 bytes. - Permitted length of username and groupname: 1-32 bytes.
- Permitted length of shell and home: 1-64 bytes. - Permitted length of shell and home: 1-64 bytes.
- Permitted comment ("gecos") length: 0-1023 bytes. - Permitted comment ("gecos") length: 0-255 bytes.
- User name, groupname, gecos and shell must be utf8-encoded. - User name, groupname, gecos and shell must be utf8-encoded.
Checking out and building Checking out and building
@ -100,7 +100,7 @@ remarks on `id(1)`
------------------ ------------------
A known implementation runs id(1) at ~250 rps sequentially on ~20k users and A known implementation runs id(1) at ~250 rps sequentially on ~20k users and
~10k groups. Our target is 10k id/s for the same payload. ~10k groups. Our rps target is much higher.
To better reason about the trade-offs, it is useful to understand how `id(1)` To better reason about the trade-offs, it is useful to understand how `id(1)`
is implemented, in rough terms: is implemented, in rough terms:
@ -111,9 +111,9 @@ is implemented, in rough terms:
- for each additional gid, get the `struct group*` - for each additional gid, get the `struct group*`
([`getgrgid_r(3)`][getgrgid_r]). ([`getgrgid_r(3)`][getgrgid_r]).
Assuming a member is in ~100 groups on average, that's 1M group lookups per Assuming a member is in ~100 groups on average, reaching 10k id/s translates to
second. We need to convert gid to a group index, and group index to a group 1M group lookups per second. We need to convert gid to a group index, and group
gid/name quickly. index to a group gid/name quickly.
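As a back-of-the-envelope check of the figures above (assuming the lookups are performed sequentially, one after another):

$$
10^{4}\ \tfrac{\text{id}}{\text{s}} \times 10^{2}\ \tfrac{\text{lookups}}{\text{id}} = 10^{6}\ \tfrac{\text{lookups}}{\text{s}}
\quad\Longrightarrow\quad
\frac{1\ \text{s}}{10^{6}\ \text{lookups}} = 1\ \mu\text{s per lookup}
$$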
Caveat: `struct group` contains an array of pointers to names of group members Caveat: `struct group` contains an array of pointers to names of group members
(`char **gr_mem`). However, `id` does not use that information, resulting in (`char **gr_mem`). However, `id` does not use that information, resulting in
@ -193,13 +193,13 @@ const PackedGroup = struct {
pub const PackedUser = packed struct { pub const PackedUser = packed struct {
uid: u32, uid: u32,
gid: u32, gid: u32,
additional_gids_offset: u29,
shell_here: bool, shell_here: bool,
shell_len_or_idx: u6, shell_len_or_idx: u6,
home_len: u6, home_len: u6,
name_is_a_suffix: bool, name_is_a_suffix: bool,
name_len: u5, name_len: u5,
gecos_len: u10, gecos_len: u8,
padding: u3,
// pseudocode: variable-sized array that will be stored immediately after // pseudocode: variable-sized array that will be stored immediately after
// this struct. // this struct.
stringdata []u8; stringdata []u8;
@ -267,27 +267,26 @@ Group memberships
There are two group memberships at play: There are two group memberships at play:
1. Given a username, resolve user's group gids (for `initgroups(3)`). 1. Given a group (gid/name), resolve the members' names (e.g. `getgrgid`).
2. Given a group (gid/name), resolve the members' names (e.g. `getgrgid`). 2. Given a username, resolve user's group gids (for `initgroups(3)`).
When user's groups are resolved in (1), the additional userdata is not
requested (there is no way to return it). Therefore, it is reasonable to store
the user's memberships completely out-of-band, keyed by the hash of the
username.
When group's memberships are resolved in (2), the same call also requires other When group's memberships are resolved in (1), the same call also requires other
group information: gid and group name. Therefore it makes sense to store a group information: gid and group name. Therefore it makes sense to store a
pointer to the group members in the group information itself. However, the pointer to the group members in the group information itself. However, the
memberships are not *always* necessary (see remarks about `id(1)`), therefore memberships are not *always* necessary (see remarks about `id(1)`), therefore
the memberships will be stored separately, outside of the groups section. the memberships will be stored separately, outside of the groups section.
Similarly, when a user's groups are resolved in (2), they are not always necessary
(i.e. not part of `struct passwd*`), therefore the memberships themselves are
stored out of band.
`Groupmembers` and `Username2gids` store group and user memberships `Groupmembers` and `Username2gids` store group and user memberships
respectively. Membership IDs are used in their entirety — not necessitating respectively. Membership IDs are used in their entirety — not necessitating
random access, thus suitable for tight packing and varint encoding. random access, thus suitable for tight packing and varint encoding.
- For each group — a list of pointers (offsets) to User records, because - For each group — a list of pointers (offsets) to User records, because
`getgr*_r` returns an array of pointers to membernames. `getgr*_r` returns pointers to membernames.
- For each user — a list of gids, because `initgroups_dyn` (and friends) - For each user — a list of gids, because `initgroups_dyn` (and friends)
returns an array of gids. returns an array of gids.
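To illustrate why sequential access suits varint packing, here is a minimal decoding sketch. The text above only says "varint"; the LEB128-style layout (7 payload bits per byte, least-significant group first, high bit as a continuation flag) is an assumption, and `Decoded` and `decodeVarint` are hypothetical names, not part of the project.

```zig
const std = @import("std");

const Decoded = struct {
    value: u64, // decoded integer
    len: usize, // number of bytes consumed
};

/// Decodes one varint from the start of `buf`.
/// Assumed layout (not confirmed by the source): LEB128-style, 7 payload
/// bits per byte, high bit set on every byte except the last.
fn decodeVarint(buf: []const u8) error{ Truncated, Overflow }!Decoded {
    var value: u64 = 0;
    var scale: u64 = 1; // 128^i, used instead of a shift
    var i: usize = 0;
    // A u32 needs at most 5 groups of 7 bits.
    while (i < buf.len and i < 5) : (i += 1) {
        const byte = buf[i];
        value += @as(u64, byte & 0x7f) * scale;
        if ((byte & 0x80) == 0) {
            if (value > std.math.maxInt(u32)) return error.Overflow;
            return Decoded{ .value = value, .len = i + 1 };
        }
        scale *= 128;
    }
    if (i == 5) return error.Overflow;
    return error.Truncated;
}

test "walk a packed list of gids" {
    // Encodings of 1, 300 and 100000 under the assumed layout.
    const list = [_]u8{ 0x01, 0xAC, 0x02, 0xA0, 0x8D, 0x06 };
    const expected = [_]u64{ 1, 300, 100000 };

    var rest: []const u8 = &list;
    for (expected) |want| {
        const d = try decodeVarint(rest);
        try std.testing.expectEqual(want, d.value);
        rest = rest[d.len..];
    }
    try std.testing.expectEqual(@as(usize, 0), rest.len);
}
```

The list is only ever walked front to back, so the variable width of each element costs nothing: no element needs to be located by index.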
@ -303,8 +302,6 @@ const Groupmembers = PackedList;
const Username2gids = PackedList; const Username2gids = PackedList;
``` ```
A packed list is a list of varints.
Indices Indices
------- -------
@ -317,15 +314,10 @@ understand which operations need to be fast; in order of importance:
4. lookup groupname -> group. 4. lookup groupname -> group.
5. lookup username -> user. 5. lookup username -> user.
`idx_*` sections are of type `[]PackedIntArray(u29)` and are pointing to the
respective `Groups` and `Users` entries (from the beginning of the respective
section). Since User and Group records are 8-byte aligned, 3 bits are saved for
every element.
These indices can use perfect hashing like [bdz from cmph][cmph]: a perfect These indices can use perfect hashing like [bdz from cmph][cmph]: a perfect
hash hashes a list of bytes to a sequential list of integers. Perfect hashing hash hashes a list of bytes to a sequential list of integers. Perfect hashing
algorithms require some space, and take some time to calculate ("hashing algorithms require some space, and take some time to calculate ("hashing
duration"). I've tested BDZ, which hashes [][]u8 to a sequential list of duration"). I've tested BDZ, which hashes `[][]u8` to a sequential list of
integers (not preserving order) and CHM, which preserves order. BDZ accepts an integers (not preserving order) and CHM, which preserves order. BDZ accepts an
optional argument `3 <= b <= 10`. optional argument `3 <= b <= 10`.
@ -337,6 +329,16 @@ CHM retains order, however, 1M keys weigh 8MB. 10k keys are ~20x larger with
CHM than with BDZ, eliminating the benefit of preserved ordering: we can just CHM than with BDZ, eliminating the benefit of preserved ordering: we can just
have a separate index. have a separate index.
None of the tested perfect hashing algorithms distinguishes between keys that
were in the initial dictionary and keys that were not. In other words,
HASH(value) returns a number `n ∈ [0,N-1]` regardless of whether the value was
in the initial dictionary. Therefore, after calculating the hash, one must
always confirm that the key stored at that position matches the looked-up key.
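A small sketch of that confirmation step follows. `toyHash` is a hypothetical stand-in that happens to be collision-free for the three example names; it is not BDZ, CHM, or any cmph API, and `names` and `lookup` are likewise made up for illustration.

```zig
const std = @import("std");

// Keys of the "initial dictionary", stored at the slot their hash maps to.
const names = [_][]const u8{ "alice", "bob", "dave" };

// Hypothetical stand-in for a perfect hash: byte sum modulo N.
// It maps the three names above to 0, 1 and 2, and maps every other
// input to *some* slot in [0, N-1] as well.
fn toyHash(key: []const u8) usize {
    var sum: usize = 0;
    for (key) |c| sum += c;
    return sum % names.len;
}

// Returns the slot only if the key stored there equals the queried key.
fn lookup(key: []const u8) ?usize {
    const slot = toyHash(key);
    if (!std.mem.eql(u8, names[slot], key)) return null;
    return slot;
}

test "unknown keys still hash to a valid slot" {
    try std.testing.expectEqual(@as(?usize, 1), lookup("bob"));
    // "eve" lands on the slot of "dave"; only the comparison rejects it.
    try std.testing.expectEqual(@as(usize, 2), toyHash("eve"));
    try std.testing.expectEqual(@as(?usize, null), lookup("eve"));
}
```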
`idx_*` sections are of type `[]PackedIntArray(u29)` and point to the
respective `Groups` and `Users` entries (offsets from the beginning of the
respective section). Since User and Group records are 8-byte aligned, the low
3 bits of every offset are always zero and are not stored, so `u29` suffices.
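As a rough illustration of the `u29` encoding, the sketch below turns a stored index back into a byte offset. `byteOffset` is a hypothetical helper, not a function from the project.

```zig
const std = @import("std");

// Hypothetical helper: a stored u29 index counts 8-byte units, so the
// byte offset is the index with three zero bits appended.
fn byteOffset(idx: u29) u32 {
    return @as(u32, idx) << 3;
}

test "u29 indices cover a 4GB section in 8-byte steps" {
    try std.testing.expectEqual(@as(u32, 0), byteOffset(0));
    try std.testing.expectEqual(@as(u32, 40), byteOffset(5));
    // Largest index: the last 8-byte slot below 2^32 bytes.
    try std.testing.expectEqual(@as(u32, 0xFFFF_FFF8), byteOffset(std.math.maxInt(u29)));
}
```

With 2^29 possible indices of 8 bytes each, the addressable range is 2^32 bytes, which matches the 4GB maximum database size stated among the constraints.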
Complete file structure Complete file structure
----------------------- -----------------------


@ -6,16 +6,17 @@ const Allocator = std.mem.Allocator;
const ArrayList = std.ArrayList; const ArrayList = std.ArrayList;
const cast = std.math.cast; const cast = std.math.cast;
const PackedUserSize = @divExact(@bitSizeOf(PackedUser), 8);
pub const PackedUser = packed struct { pub const PackedUser = packed struct {
uid: u32, uid: u32,
gid: u32, gid: u32,
additional_gids_offset: u29,
shell_here: bool, shell_here: bool,
shell_len_or_idx: u6, shell_len_or_idx: u6,
home_len: u6, home_len: u6,
name_is_a_suffix: bool, name_is_a_suffix: bool,
name_len: u5, name_len: u5,
gecos_len: u10, gecos_len: u8,
padding: u3,
// blobLength returns the length of the blob storing string values. // blobLength returns the length of the blob storing string values.
pub fn blobLength(self: *const PackedUser) usize { pub fn blobLength(self: *const PackedUser) usize {
@ -107,7 +108,7 @@ pub const UserWriter = struct {
const home_len = try downCast(u6, user.home.len - 1); const home_len = try downCast(u6, user.home.len - 1);
const name_len = try downCast(u5, user.name.len - 1); const name_len = try downCast(u5, user.name.len - 1);
const shell_len = try downCast(u6, user.shell.len - 1); const shell_len = try downCast(u6, user.shell.len - 1);
const gecos_len = try downCast(u10, user.gecos.len); const gecos_len = try downCast(u8, user.gecos.len);
try validateUtf8(user.home); try validateUtf8(user.home);
try validateUtf8(user.name); try validateUtf8(user.name);
@ -117,13 +118,13 @@ pub const UserWriter = struct {
var puser = PackedUser{ var puser = PackedUser{
.uid = user.uid, .uid = user.uid,
.gid = user.gid, .gid = user.gid,
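// Placeholder: maximum u29 value. Parenthesized because `-` binds tighter
// than `<<` in Zig, so `1 << 29 - 1` would evaluate to `1 << 28`.
.additional_gids_offset = (1 << 29) - 1,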
.shell_here = self.shellIndexFn(user.shell) == null, .shell_here = self.shellIndexFn(user.shell) == null,
.shell_len_or_idx = self.shellIndexFn(user.shell) orelse shell_len, .shell_len_or_idx = self.shellIndexFn(user.shell) orelse shell_len,
.home_len = home_len, .home_len = home_len,
.name_is_a_suffix = std.mem.endsWith(u8, user.home, user.name), .name_is_a_suffix = std.mem.endsWith(u8, user.home, user.name),
.name_len = name_len, .name_len = name_len,
.gecos_len = gecos_len, .gecos_len = gecos_len,
.padding = 0,
}; };
try self.appendTo.appendSlice(std.mem.asBytes(&puser)); try self.appendTo.appendSlice(std.mem.asBytes(&puser));
@ -241,7 +242,13 @@ pub const UserReader = struct {
const testing = std.testing; const testing = std.testing;
test "PackedUser internal and external alignment" { test "PackedUser internal and external alignment" {
try testing.expectEqual(@bitSizeOf(PackedUser), @sizeOf(PackedUser) * 8); // External padding (PackedUserAlignmentBits) must be higher or equal to
// the "internal" PackedUser alignment. By aligning PackedUser we are also
// working around https://github.com/ziglang/zig/issues/10958 ; PackedUser
// cannot be converted from/to [@bitSizeOf(PackedUser)/8]u8;
// asBytes/bytesAsValue use @sizeOf, which is larger. Now we are putting no
// more than 1, but it probably could be higher.
try testing.expect(@bitSizeOf(PackedUser) - @sizeOf(PackedUser) * 8 <= 8);
} }
fn testShellIndex(shell: []const u8) ?u6 { fn testShellIndex(shell: []const u8) ?u6 {
@ -284,7 +291,7 @@ test "construct PackedUser section" {
.uid = 0, .uid = 0,
.gid = 4294967295, .gid = 4294967295,
.name = "n" ** 32, .name = "n" ** 32,
.gecos = "g" ** 1023, .gecos = "g" ** 255,
.home = "h" ** 64, .home = "h" ** 64,
.shell = "s" ** 64, .shell = "s" ** 64,
} }; } };