tidy up the header structure

Motiejus Jakštys 2022-03-17 06:10:39 +01:00 committed by Motiejus Jakštys
parent d526f1fab8
commit 5ee8469ec5


@@ -64,7 +64,6 @@ regions are shared. Turbonss reads do not consume any heap space.
 Tight packing places some constraints on the underlying data:
-- Maximum database size: 4GB.
 - Permitted length of username and groupname: 1-32 bytes.
 - Permitted length of shell and home: 1-64 bytes.
 - Permitted comment ("gecos") length: 0-255 bytes.
@@ -136,45 +135,32 @@ Turbonss header
 The turbonss header looks like this:
 ```
 OFFSET  TYPE   NAME                        DESCRIPTION
-     0  [4]u8  magic                       always 0xf09fa4b7
-     4  u8     version                     now `0`
-     5  u16    bom                         0x1234
-        u8     num_shells                  max value: 63.
-     8  u32    num_users                   number of passwd entries
-    12  u32    num_groups                  number of group entries
-    16  u32    offset_bdz_uid2user
-    24  u32    offset_bdz_name2user
-    20  u32    offset_bdz_groupname2group
-    28  u32    offset_idx                  offset to the first idx_ section
-    32  u32    offset_groups
-    36  u32    offset_users
-    40  u32    offset_groupmembers
-    44  u32    offset_additional_gids
+     0  [4]u8  magic                       f0 9f a4 b7
+     4  u8     version                     0
+     5  u8     bigendian                   0 for little-endian, 1 for big-endian
+     6  u8     nblocks_shell_blob          max value: 63
+     7  u8     num_shells                  max value: 63
+     8  u32    num_groups                  number of group entries
+    12  u32    num_users                   number of passwd entries
+    16  u32    nblocks_bdz_gid             bdz_gid section block count
+    20  u32    nblocks_bdz_groupname
+    24  u32    nblocks_bdz_uid
+    28  u32    nblocks_bdz_username
+    32  u64    nblocks_groups
+    40  u64    nblocks_users
+    48  u64    nblocks_groupmembers
+    56  u64    nblocks_usergids
 ```

 `magic` is 0xf09fa4b7, and `version` must be `0`. All integers are
-native-endian. `bom` is a byte-order-mark. It must resolve to `0x1234` (4460).
-If that's not true, the file is consumed in a different endianness than it was
-created at. Turbonss files cannot be moved across different-endianness
-computers. If that happens, turbonss will refuse to read the file.
+native-endian. `nblocks_*` is the count of blocks of a particular section; this
+helps calculate the offsets to all sections.

-Offsets are indices to further sections of the file, with zero being the first
-block (pointing to the `magic` field). As all sections are aligned to 64 bytes,
-the offsets are always pointing to the beginning of an 64-byte "block".
-Therefore, all `offset_*` values could be `u26`. As `u32` is easier to
-visualize with xxd, and the header block fits to 64 bytes anyway, we are
-keeping them as u32 now.
-
-Sections whose lengths can be calculated do not have a corresponding `offset_*`
-header field. For example, `bdz_gid2group` comes immediately after the header,
-and `idx_groupname2group` comes after `idx_gid2group`, whose offset is
-`offset_idx`, and size can be calculated.
-
-`num_shells` would fit to u6; however, we would need 2 bits of padding (all
-other fields are byte-aligned). If we instead do `u2` followed by `u6`, the
-byte would look very unusual on a little-endian architecture. Therefore we will
-just reject the DB if the number of shells exceeds 63.
+Some numbers, like `nblocks_shell_blob`, `num_shells`, would fit to smaller
+number of bytes. However, interpreting `[2]u6` with `xxd(1)` is harder than
+interpreting `[2]u8`. Therefore we are using the space we have to make these
+integers byte-wide.

 Primitive types
 ---------------
@@ -345,14 +331,14 @@ the hash, that the key matches what's been hashed.
 respective `Groups` and `Users` entries (from the beginning of the respective
 section). Since User and Group records are 8-byte aligned, `u29` is used.

-Complete file structure
+Database file structure
 -----------------------

 Each section is padded to 64 bytes.

 ```
 SECTION              SIZE           DESCRIPTION
-Header               48             see "Turbonss header" section
+header               64             see "Turbonss header" section
 bdz_gid              ?              bdz(gid)
 bdz_groupname        ?              bdz(groupname)
 bdz_uid              ?              bdz(uid)
@@ -361,12 +347,12 @@ idx_gid2group len(group)*4 bdz->offset Groups
 idx_groupname2group  len(group)*4   bdz->offset Groups
 idx_uid2user         len(user)*4    bdz->offset Users
 idx_name2user        len(user)*4    bdz->offset Users
-shellIndex           len(shells)*2  shell index array
-shellBlob            <= 4032        shell data blob (max 63*64 bytes)
+shell_index          len(shells)*2  shell index array
+shell_blob           <= 4032        shell data blob (max 63*64 bytes)
 groups               ?              packed Group entries (8b padding)
 users                ?              packed User entries (8b padding)
-groupMembers         ?              per-group delta varint memberlist (no padding)
-userGids             ?              per-user delta varint gidlist (no padding)
+groupmembers         ?              per-group delta varint memberlist (no padding)
+user_gids            ?              per-user delta varint gidlist (no padding)
 ```

 Section creation order:
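As an illustration of how the `nblocks_*` counts and the 64-byte section padding in the hunks above combine into offsets, here is a minimal Python sketch. It is not taken from the commit. The helpers `nblocks` and `section_offsets` and all block counts are hypothetical. Only the first four sections are listed, because the diff does not show the full section table.

```python
BLOCK = 64  # every section is padded to a multiple of 64 bytes


def nblocks(nbytes: int) -> int:
    """Number of 64-byte blocks needed to hold nbytes, rounded up."""
    return (nbytes + BLOCK - 1) // BLOCK


def section_offsets(sections):
    """Map section name to byte offset, given (name, block count) pairs in file order."""
    offsets = {}
    pos = 0  # running position, counted in blocks
    for name, count in sections:
        offsets[name] = pos * BLOCK
        pos += count
    return offsets


# Hypothetical block counts; the header always occupies exactly one block.
sections = [
    ("header", 1),
    ("bdz_gid", 2),
    ("bdz_groupname", 3),
    ("bdz_uid", 1),
]
offsets = section_offsets(sections)
print(offsets)

# Sections whose size follows from an entry count need no header field:
# an index over 3 groups takes 3*4 bytes, which pads to one block.
print(nblocks(3 * 4))
# The shell blob is capped at 63 blocks, that is 63*64 = 4032 bytes.
print(nblocks(4032))
```

Storing block counts rather than absolute offsets means a reader has to walk the sections in order. In exchange, each header value depends only on its own section.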