From 5ee8469ec55e95c08eded778963d3ec0ea17925c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Motiejus=20Jak=C5=A1tys?=
Date: Thu, 17 Mar 2022 06:10:39 +0100
Subject: [PATCH] tidy up the header structure

---
 README.md | 70 ++++++++++++++++++++++---------------------------------
 1 file changed, 28 insertions(+), 42 deletions(-)

diff --git a/README.md b/README.md
index 6d6e6c4..e1965e5 100644
--- a/README.md
+++ b/README.md
@@ -64,7 +64,6 @@ regions are shared. Turbonss reads do not consume any heap space.
 
 Tight packing places some constraints on the underlying data:
 
-- Maximum database size: 4GB.
 - Permitted length of username and groupname: 1-32 bytes.
 - Permitted length of shell and home: 1-64 bytes.
 - Permitted comment ("gecos") length: 0-255 bytes.
@@ -136,45 +135,32 @@ Turbonss header
 The turbonss header looks like this:
 
 ```
-OFFSET  TYPE   NAME                        DESCRIPTION
-     0  [4]u8  magic                       always 0xf09fa4b7
-     4  u8     version                     now `0`
-     5  u16    bom                         0x1234
-        u8     num_shells                  max value: 63.
-     8  u32    num_users                   number of passwd entries
-    12  u32    num_groups                  number of group entries
-    16  u32    offset_bdz_uid2user
-    24  u32    offset_bdz_name2user
-    20  u32    offset_bdz_groupname2group
-    28  u32    offset_idx                  offset to the first idx_ section
-    32  u32    offset_groups
-    36  u32    offset_users
-    40  u32    offset_groupmembers
-    44  u32    offset_additional_gids
+OFFSET  TYPE   NAME                   DESCRIPTION
+     0  [4]u8  magic                  f0 9f a4 b7
+     4  u8     version                0
+     5  u8     bigendian              0 for little-endian, 1 for big-endian
+     6  u8     nblocks_shell_blob     max value: 63
+     7  u8     num_shells             max value: 63
+     8  u32    num_groups             number of group entries
+    12  u32    num_users              number of passwd entries
+    16  u32    nblocks_bdz_gid        bdz_gid section block count
+    20  u32    nblocks_bdz_groupname
+    24  u32    nblocks_bdz_uid
+    28  u32    nblocks_bdz_username
+    32  u64    nblocks_groups
+    40  u64    nblocks_users
+    48  u64    nblocks_groupmembers
+    56  u64    nblocks_usergids
 ```
 
 `magic` is 0xf09fa4b7, and `version` must be `0`. All integers are
-native-endian. `bom` is a byte-order-mark. It must resolve to `0x1234` (4460).
-If that's not true, the file is consumed in a different endianness than it was
-created at. Turbonss files cannot be moved across different-endianness
-computers. If that happens, turbonss will refuse to read the file.
+native-endian. `nblocks_*` is the count of blocks of a particular section; this
+helps calculate the offsets to all sections.
 
-Offsets are indices to further sections of the file, with zero being the first
-block (pointing to the `magic` field). As all sections are aligned to 64 bytes,
-the offsets are always pointing to the beginning of an 64-byte "block".
-Therefore, all `offset_*` values could be `u26`. As `u32` is easier to
-visualize with xxd, and the header block fits to 64 bytes anyway, we are
-keeping them as u32 now.
-
-Sections whose lengths can be calculated do not have a corresponding `offset_*`
-header field. For example, `bdz_gid2group` comes immediately after the header,
-and `idx_groupname2group` comes after `idx_gid2group`, whose offset is
-`offset_idx`, and size can be calculated.
-
-`num_shells` would fit to u6; however, we would need 2 bits of padding (all
-other fields are byte-aligned). If we instead do `u2` followed by `u6`, the
-byte would look very unusual on a little-endian architecture. Therefore we will
-just reject the DB if the number of shells exceeds 63.
+Some numbers, like `nblocks_shell_blob`, `num_shells`, would fit to smaller
+number of bytes. However, interpreting `[2]u6` with `xxd(1)` is harder than
+interpreting `[2]u8`. Therefore we are using the space we have to make these
+integers byte-wide.
 
 Primitive types
 ---------------
@@ -345,14 +331,14 @@ the hash, that the key matches what's been hashed.
 respective `Groups` and `Users` entries (from the beginning of the respective
 section). Since User and Group records are 8-byte aligned, `u29` is used.
 
-Complete file structure
+Database file structure
 -----------------------
 
 Each section is padded to 64 bytes.
 
 ```
 SECTION              SIZE           DESCRIPTION
-Header               48             see "Turbonss header" section
+header               64             see "Turbonss header" section
 bdz_gid              ?              bdz(gid)
 bdz_groupname        ?              bdz(groupname)
 bdz_uid              ?              bdz(uid)
@@ -361,12 +347,12 @@ idx_gid2group len(group)*4 bdz->offset Groups
 idx_groupname2group  len(group)*4   bdz->offset Groups
 idx_uid2user         len(user)*4    bdz->offset Users
 idx_name2user        len(user)*4    bdz->offset Users
-shellIndex           len(shells)*2  shell index array
-shellBlob            <= 4032        shell data blob (max 63*64 bytes)
+shell_index          len(shells)*2  shell index array
+shell_blob           <= 4032        shell data blob (max 63*64 bytes)
 groups               ?              packed Group entries (8b padding)
 users                ?              packed User entries (8b padding)
-groupMembers         ?              per-group delta varint memberlist (no padding)
-userGids             ?              per-user delta varint gidlist (no padding)
+groupmembers         ?              per-group delta varint memberlist (no padding)
+user_gids            ?              per-user delta varint gidlist (no padding)
 ```
 
 Section creation order: