
tidy up the header structure

main
Motiejus Jakštys 2022-03-17 06:10:39 +01:00 committed by Motiejus Jakštys
parent d526f1fab8
commit 5ee8469ec5
1 changed files with 28 additions and 42 deletions


@ -64,7 +64,6 @@ regions are shared. Turbonss reads do not consume any heap space.
Tight packing places some constraints on the underlying data:
- Maximum database size: 4GB.
- Permitted length of username and groupname: 1-32 bytes.
- Permitted length of shell and home: 1-64 bytes.
- Permitted comment ("gecos") length: 0-255 bytes.
@ -136,45 +135,32 @@ Turbonss header
The turbonss header looks like this:
```
OFFSET TYPE  NAME                        DESCRIPTION
0      [4]u8 magic                       always 0xf09fa4b7
4      u8    version                     now `0`
5      u16   bom                         0x1234
       u8    num_shells                  max value: 63.
8      u32   num_users                   number of passwd entries
12     u32   num_groups                  number of group entries
16     u32   offset_bdz_uid2user
24     u32   offset_bdz_name2user
20     u32   offset_bdz_groupname2group
28     u32   offset_idx                  offset to the first idx_ section
32     u32   offset_groups
36     u32   offset_users
40     u32   offset_groupmembers
44     u32   offset_additional_gids
OFFSET TYPE  NAME                        DESCRIPTION
0      [4]u8 magic                       f0 9f a4 b7
4      u8    version                     0
5      u8    bigendian                   0 for little-endian, 1 for big-endian
6      u8    nblocks_shell_blob          max value: 63
7      u8    num_shells                  max value: 63
8      u32   num_groups                  number of group entries
12     u32   num_users                   number of passwd entries
16     u32   nblocks_bdz_gid             bdz_gid section block count
20     u32   nblocks_bdz_groupname
24     u32   nblocks_bdz_uid
28     u32   nblocks_bdz_username
32     u64   nblocks_groups
40     u64   nblocks_users
48     u64   nblocks_groupmembers
56     u64   nblocks_usergids
```
`magic` is 0xf09fa4b7, and `version` must be `0`. All integers are
native-endian. `bom` is a byte-order-mark. It must resolve to `0x1234` (4660).
If that's not true, the file is consumed in a different endianness than it was
created at. Turbonss files cannot be moved across different-endianness
computers. If that happens, turbonss will refuse to read the file.
native-endian. `nblocks_*` is the count of blocks of a particular section; this
helps calculate the offsets to all sections.
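
To make the new 64-byte layout concrete, here is a minimal Python sketch that decodes it with the standard `struct` module. The names `Header` and `parse_header` are hypothetical and not part of turbonss. Using the `bigendian` byte to pick the byte order is an assumption of this sketch; a real reader may instead refuse a file whose endianness differs from the host.

```python
import struct
from collections import namedtuple

MAGIC = bytes([0xF0, 0x9F, 0xA4, 0xB7])
HEADER_SIZE = 64

# Field names follow the header table, in offset order.
Header = namedtuple(
    "Header",
    "magic version bigendian nblocks_shell_blob num_shells "
    "num_groups num_users "
    "nblocks_bdz_gid nblocks_bdz_groupname nblocks_bdz_uid "
    "nblocks_bdz_username "
    "nblocks_groups nblocks_users nblocks_groupmembers nblocks_usergids",
)


def parse_header(buf):
    """Decode the first 64 bytes of a database (hypothetical helper)."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("header is shorter than 64 bytes")
    if buf[0:4] != MAGIC:
        raise ValueError("bad magic")
    if buf[4] != 0:
        raise ValueError("unsupported version")
    if buf[5] not in (0, 1):
        raise ValueError("invalid bigendian flag")
    # Assumption: honor the flag rather than rejecting foreign-endian files.
    order = ">" if buf[5] == 1 else "<"
    # 4s = magic, 4 x u8, 6 x u32, 4 x u64: 4 + 4 + 24 + 32 = 64 bytes.
    hdr = Header._make(struct.unpack(order + "4sBBBB6I4Q", buf[:HEADER_SIZE]))
    if hdr.num_shells > 63 or hdr.nblocks_shell_blob > 63:
        raise ValueError("num_shells and nblocks_shell_blob must be <= 63")
    return hdr
```

Because every field is byte-aligned and there is no padding between fields, the format string maps one-to-one onto the table rows.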
Offsets are indices to further sections of the file, with zero being the first
block (pointing to the `magic` field). As all sections are aligned to 64 bytes,
the offsets are always pointing to the beginning of a 64-byte "block".
Therefore, all `offset_*` values could be `u26`. As `u32` is easier to
visualize with xxd, and the header block fits to 64 bytes anyway, we are
keeping them as u32 now.
Sections whose lengths can be calculated do not have a corresponding `offset_*`
header field. For example, `bdz_gid2group` comes immediately after the header,
and `idx_groupname2group` comes after `idx_gid2group`, whose offset is
`offset_idx`, and size can be calculated.
`num_shells` would fit to u6; however, we would need 2 bits of padding (all
other fields are byte-aligned). If we instead do `u2` followed by `u6`, the
byte would look very unusual on a little-endian architecture. Therefore we will
just reject the DB if the number of shells exceeds 63.
Some numbers, like `nblocks_shell_blob` and `num_shells`, would fit in fewer
bits than a full byte. However, interpreting `[2]u6` with `xxd(1)` is harder
than interpreting `[2]u8`. Therefore we use the space we have to make these
integers byte-wide.
Primitive types
---------------
@ -345,14 +331,14 @@ the hash, that the key matches what's been hashed.
respective `Groups` and `Users` entries (from the beginning of the respective
section). Since User and Group records are 8-byte aligned, `u29` is used.
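
The arithmetic behind `u29` can be sketched as follows. This assumes the stored value counts 8-byte units from the start of the section, which is how the sentence above reads; the function names are hypothetical.

```python
RECORD_ALIGN = 8          # User and Group records are 8-byte aligned
U29_MAX = (1 << 29) - 1   # largest value a u29 can hold


def index_to_byte_offset(idx):
    """Convert a u29 record index into a byte offset within its section."""
    if not 0 <= idx <= U29_MAX:
        raise ValueError("index does not fit in u29")
    return idx * RECORD_ALIGN


def byte_offset_to_index(offset):
    """Inverse mapping; only 8-byte-aligned offsets are representable."""
    if offset < 0 or offset % RECORD_ALIGN != 0:
        raise ValueError("offset is not 8-byte aligned")
    idx = offset // RECORD_ALIGN
    if idx > U29_MAX:
        raise ValueError("offset is beyond the u29 range")
    return idx
```

Under that assumption, 2^29 indices of 8 bytes each span 2^32 bytes, which lines up with the 4GB maximum database size.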
Complete file structure
Database file structure
-----------------------
Each section is padded to 64 bytes.
```
SECTION             SIZE          DESCRIPTION
Header              48            see "Turbonss header" section
header              64            see "Turbonss header" section
bdz_gid             ?             bdz(gid)
bdz_groupname       ?             bdz(groupname)
bdz_uid             ?             bdz(uid)
@ -361,12 +347,12 @@ idx_gid2group len(group)*4 bdz->offset Groups
idx_groupname2group len(group)*4  bdz->offset Groups
idx_uid2user        len(user)*4   bdz->offset Users
idx_name2user       len(user)*4   bdz->offset Users
shellIndex          len(shells)*2 shell index array
shellBlob           <= 4032       shell data blob (max 63*64 bytes)
shell_index         len(shells)*2 shell index array
shell_blob          <= 4032       shell data blob (max 63*64 bytes)
groups              ?             packed Group entries (8b padding)
users               ?             packed User entries (8b padding)
groupMembers        ?             per-group delta varint memberlist (no padding)
userGids            ?             per-user delta varint gidlist (no padding)
groupmembers        ?             per-group delta varint memberlist (no padding)
user_gids           ?             per-user delta varint gidlist (no padding)
```
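
As a rough illustration of how the `nblocks_*` counts and the 64-byte padding combine into section offsets, the Python sketch below walks the sections in table order. The function names are hypothetical. It assumes that a `bdz_username` section follows `bdz_uid` (that row is not visible in this hunk), and that the `idx_*` and `shell_index` sizes are derived from `num_groups`, `num_users` and `num_shells` as the SIZE column suggests.

```python
BLOCK = 64  # every section is padded to a multiple of 64 bytes


def nblocks(nbytes):
    """Number of 64-byte blocks needed to hold nbytes."""
    return (nbytes + BLOCK - 1) // BLOCK


def section_offsets(h):
    """Map each section name to its starting block.

    `h` is a dict of header fields. Returns (offsets, total_blocks).
    """
    sizes = [
        ("header", 1),
        ("bdz_gid", h["nblocks_bdz_gid"]),
        ("bdz_groupname", h["nblocks_bdz_groupname"]),
        ("bdz_uid", h["nblocks_bdz_uid"]),
        ("bdz_username", h["nblocks_bdz_username"]),  # assumed position
        ("idx_gid2group", nblocks(h["num_groups"] * 4)),
        ("idx_groupname2group", nblocks(h["num_groups"] * 4)),
        ("idx_uid2user", nblocks(h["num_users"] * 4)),
        ("idx_name2user", nblocks(h["num_users"] * 4)),
        ("shell_index", nblocks(h["num_shells"] * 2)),
        ("shell_blob", h["nblocks_shell_blob"]),
        ("groups", h["nblocks_groups"]),
        ("users", h["nblocks_users"]),
        ("groupmembers", h["nblocks_groupmembers"]),
        ("user_gids", h["nblocks_usergids"]),
    ]
    offsets = {}
    block = 0
    for name, count in sizes:
        offsets[name] = block  # section starts where the previous one ended
        block += count
    return offsets, block
```

Multiplying a block index by 64 gives the byte offset of that section in the file.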
Section creation order: