// Package rootfs extracts all layers of a Docker container image to a single
// tarball. It goes through all layers in order and copies every file to the
// destination archive.
//
// While copying, it applies the rules described below.
//
// == Non-directory will be copied only once ==
//
// A non-directory will be copied only once, and only from its last
// occurrence. I.e. if file /a/b was found in layers 0 and 2, only the file
// from layer 2 will be used.
// Directories will always be copied, even if there are duplicates. This is
// to avoid a situation like this:
//   layer0:
//     ./dir/
//     ./dir/file1
//   layer1:
//     ./dir/
//     ./dir/file1
// In theory, the directory from layer 1 takes precedence, so a tarball like
// this could be created:
//   ./dir/         (from layer1)
//   ./dir/file1    (from layer1)
// However, imagine the following:
//   layer0:
//     ./dir/
//     ./dir/file1
//   layer1:
//     ./dir/
// Then, if directories were also copied only from their last occurrence, the
// resulting tarball would have:
//   ./dir/file1    (from layer0)
//   ./dir/         (from layer1)
// Which would mean `untar` would try to untar a file to a directory which
// was not yet created. Therefore directories will be copied to the resulting
// tar in the order they appear in the layers.
//
// == Special files: .dockerenv ==
//
// .dockerenv is present in all docker containers, and is likely to remain
// so. If you do `docker export <container>`, the resulting tarball will
// have this file. rootfs will not add it. You are welcome to append one
// yourself.
//
// == Special files: opaque files and dirs (.wh.*) ==
//
// From mount.aufs(8)[1]:
//
//     The whiteout is for hiding files on lower branches. Also it is applied to
//     stop readdir going lower branches. The latter case is called ‘opaque
//     directory.’ Any whiteout is an empty file, it means whiteout is just an
//     mark. In the case of hiding lower files, the name of whiteout is
//     ‘.wh.<filename>.’ And in the case of stopping readdir, the name is
//     ‘.wh..wh..opq’. All whiteouts are hardlinked, including ‘<writable branch
//     top dir>/.wh..wh.aufs’.
//
// My interpretation:
//
// 1. a file/hardlink called `.wh..wh..opq` means that directory contents from
// the layers below the mentioned file should be ignored. Higher layers may add
// files on top. Ambiguity: should the directory from the lower layers be
// removed? I am assuming yes, but this assumption is baseless.
//
// 2. if file/hardlink `.wh.([^/]+)` is found, $1 should be deleted from the
// current and lower layers.
//
// Note: the man page says whiteouts are hardlinks, but in practice they may be
// regular files. This implementation will match either.
//
// == Tar format ==
//
// Since we do care about long filenames and large file sizes (>8GB), we are
// using "classic" GNU Tar. However, at least NetBSD pax is known to have
// problems reading it[2].
//
// [1]: https://manpages.debian.org/unstable/aufs-tools/mount.aufs.8.en.html
//
// [2]: https://mgorny.pl/articles/portability-of-tar-features.html
package rootfs
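
// The declarations below are a minimal, illustrative sketch of two rules from
// the package comment: the copy-once rule for non-directories and the
// whiteout naming scheme. They are NOT necessarily how this package
// implements them. All names here (entry, planEntries, whiteoutTarget,
// isOpaqueMarker, whiteoutPrefix, opaqueMarker) are hypothetical and exist
// only for illustration. The sketch works on plain path lists; it does not
// read or write tar archives and does not apply whiteouts.

import (
	"path"
	"strings"
)

const (
	whiteoutPrefix = ".wh."         // hypothetical constant: prefix of a whiteout marker
	opaqueMarker   = ".wh..wh..opq" // hypothetical constant: opaque directory marker
)

// entry is a simplified stand-in for a tar header: a path plus a flag that
// tells whether the path is a directory.
type entry struct {
	name  string
	isDir bool
}

// planEntries returns, for each layer, the entries that would be copied to
// the output. Directories are always kept, in layer order. A non-directory
// is kept only in the last layer in which it occurs.
func planEntries(layers [][]entry) [][]entry {
	// First pass: remember the last layer index of every non-directory.
	last := make(map[string]int)
	for i, layer := range layers {
		for _, e := range layer {
			if !e.isDir {
				last[e.name] = i
			}
		}
	}

	// Second pass: keep all directories, and non-directories only at
	// their last occurrence.
	out := make([][]entry, len(layers))
	for i, layer := range layers {
		for _, e := range layer {
			if e.isDir || last[e.name] == i {
				out[i] = append(out[i], e)
			}
		}
	}
	return out
}

// isOpaqueMarker reports whether name is an opaque directory marker.
func isOpaqueMarker(name string) bool {
	return path.Base(name) == opaqueMarker
}

// whiteoutTarget reports whether name is a whiteout marker of the form
// ".wh.<filename>" and, if so, returns the path that it hides. The opaque
// directory marker is not treated as a regular whiteout.
func whiteoutTarget(name string) (string, bool) {
	dir, base := path.Split(name)
	if base == opaqueMarker || !strings.HasPrefix(base, whiteoutPrefix) {
		return "", false
	}
	target := strings.TrimPrefix(base, whiteoutPrefix)
	if target == "" {
		return "", false
	}
	return dir + target, true
}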