Partial recovery of damaged files from ZFS
Since ZFS keeps checksums of all metadata and all the data, it is possible to know precisely which parts of the file are damaged.
However, things are more complicated than just reading the file data, computing a checksum, and cross-checking.
To keep track of where the file data is on the disk, ZFS uses block pointers (broadly similar to what ext2 does).
Block pointers themselves are grouped into blocks.
If there are too many block pointers to fit into a single block, ZFS adds a level of indirection.
The first level of block pointers references blocks of second-level block pointers,
and second-level block pointers reference the actual data.
The entire layout looks like this:
[Figure: typical block pointer and data reference diagram.]
The above example only shows three levels of indirection.
ZFS can use up to eight levels for large files.
Each block can point either to the file data or to the pointers of the next level.
In the diagram, the top level (root block) contains three pointers to the middle level,
and the middle level contains nine pointers to nine data blocks.
Each block is protected with a checksum, so you can determine if it is damaged.
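The arrangement described above can be sketched as a toy model in Python. This is not the ZFS on-disk format: the names Block, BlockPointer, point_to, and is_intact are made up for illustration, SHA-256 stands in for whichever checksum the pool actually uses, and the sketch assumes that the checksum of a block is kept in the pointer that references it.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class BlockPointer:
    target: "Block"   # the block this pointer references
    checksum: bytes   # expected checksum of that block's contents


@dataclass
class Block:
    data: bytes = b""                             # file data (leaf blocks)
    pointers: list = field(default_factory=list)  # BlockPointers (pointer blocks)

    def contents(self) -> bytes:
        # Toy simplification: a pointer block's contents are the checksums it stores.
        if self.pointers:
            return b"".join(p.checksum for p in self.pointers)
        return self.data


def point_to(block: Block) -> BlockPointer:
    """Create a pointer that records the checksum of the block as it is now."""
    return BlockPointer(block, hashlib.sha256(block.contents()).digest())


def is_intact(ptr: BlockPointer) -> bool:
    """Recompute the checksum of the target and compare it with the stored one."""
    return hashlib.sha256(ptr.target.contents()).digest() == ptr.checksum


# Nine data blocks, three middle blocks with three pointers each, one root block.
data_ptrs = [point_to(Block(data=bytes([i]) * 16)) for i in range(9)]
middle_ptrs = [point_to(Block(pointers=data_ptrs[i:i + 3])) for i in (0, 3, 6)]
root_ptr = point_to(Block(pointers=middle_ptrs))

# Simulate damage to one data block: only its own checksum stops matching.
data_ptrs[4].target.data = b"garbage"
damaged = [i for i, p in enumerate(data_ptrs) if not is_intact(p)]
```

After the simulated damage, damaged holds only index 4, while the middle blocks and the root block still verify.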
Even if parts of this arrangement are damaged, something can still be salvaged.
Damaging one data block does not affect anything except that data block.
Damaging a pointer block affects everything below this pointer block.
As the damage point moves closer to the tree's root,
the amount of data lost per single block damage increases.
The data itself may still be intact,
but there is no way to know where it is if one of the pointers is damaged.
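The effect of damage at different depths can be illustrated with a small sketch. The tree is the nine-block example from the diagram, written as a plain dictionary; the block names and the lost_data_blocks function are hypothetical and exist only for this illustration.

```python
# Pointer blocks map to the blocks they reference; data blocks have no entry.
TREE = {
    "root": ["m0", "m1", "m2"],
    "m0": ["d0", "d1", "d2"],
    "m1": ["d3", "d4", "d5"],
    "m2": ["d6", "d7", "d8"],
}


def data_blocks_under(tree, name):
    """List every data block reachable from the given block."""
    children = tree.get(name)
    if children is None:          # a data block
        return [name]
    return [leaf for child in children for leaf in data_blocks_under(tree, child)]


def lost_data_blocks(tree, root, damaged):
    """List data blocks that cannot be recovered given a set of damaged blocks."""
    if root in damaged:
        # A damaged pointer block hides everything below it;
        # a damaged data block loses only itself.
        return data_blocks_under(tree, root)
    children = tree.get(root)
    if children is None:
        return []
    return [leaf for child in children
            for leaf in lost_data_blocks(tree, child, damaged)]


lost_one_data = lost_data_blocks(TREE, "root", {"d4"})
lost_one_middle = lost_data_blocks(TREE, "root", {"m1"})
lost_root = lost_data_blocks(TREE, "root", {"root"})
```

One damaged data block costs one block, one damaged middle block costs three, and a damaged root block costs all nine.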
Klennet ZFS Recovery has two options for dealing with partially damaged files when copying them out.
ZFS Recovery can fill the damaged areas either with zeros or with the ASCII string "DAMAGED" repeated over and over.
The latter option is useful if you plan a subsequent manual review – the filler stands out in any view, so you immediately know what you are looking at.
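The two filler options can be sketched as follows. This is an illustration of the idea, not the actual Klennet ZFS Recovery code: the function names are invented, damaged blocks are represented by None, and the sketch assumes that the pattern restarts at the beginning of every damaged block.

```python
from itertools import cycle, islice


def make_filler(length, pattern=None):
    """Return `length` bytes of zeros, or of `pattern` repeated and truncated."""
    if not pattern:
        return bytes(length)                        # all zeros
    return bytes(islice(cycle(pattern), length))    # pattern repeated over and over


def assemble_file(blocks, block_size, pattern=None):
    """Concatenate blocks, replacing unreadable ones (None) with filler."""
    out = bytearray()
    for block in blocks:
        out += block if block is not None else make_filler(block_size, pattern)
    return bytes(out)


blocks = [b"0123456789", None, b"abcdefghij"]
zero_filled = assemble_file(blocks, 10)
marked = assemble_file(blocks, 10, pattern=b"DAMAGED")
```

With ten-byte blocks, the damaged block in marked reads DAMAGEDDAM, which is easy to spot in a hex or text viewer, while zero_filled contains ten zero bytes in the same place.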
Filed under: ZFS.
Created Saturday, June 15, 2019