How much can be recovered from a single-disk ZFS pool?
                        
                        
    
    I ran some more tests, continuing a previous write-up on 
    the recoverability of ZFS.
    Test setup
    
        The test is simple: create a single-disk ZFS pool, fill it with data, and make a disk image.
        Then, start overwriting data with zeros, working from the beginning towards the end of the disk.
        At some preset points, pause the overwrite and attempt recovery (using Klennet ZFS Recovery build 735).
        From the recovery output, measure two parameters:
    
    
        - The number of files recovered with correct checksums. 
          Since the files are all of roughly similar size, the number of files also reflects the total 
          size of recoverable data.
        - The number of file names recovered. 
          ZFS stores directories separately from file location metadata, 
          so it is interesting to watch file names gradually disappear.
        
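The overwrite step can be sketched in a few lines of Python. This is only an illustration of the procedure described above, not the tool actually used for the tests: the function name `zero_fill_prefix`, the chunk size, and the idea of working on an image file rather than on the disk itself are all assumptions. Calling it repeatedly on a copy of the image with increasing fractions gives the "preset points" at which recovery is attempted.

```python
import os

CHUNK = 1024 * 1024  # assumed chunk size: write zeros 1 MiB at a time


def zero_fill_prefix(image_path, fraction):
    """Overwrite the first `fraction` (0.0 to 1.0) of an image file with zeros.

    Hypothetical helper: works in place on a regular file and
    returns the number of bytes that were zeroed.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    size = os.path.getsize(image_path)
    target = int(size * fraction)
    zeros = bytes(CHUNK)
    # "r+b" opens for in-place update without truncating the file
    with open(image_path, "r+b") as f:
        written = 0
        while written < target:
            n = min(CHUNK, target - written)
            f.write(zeros[:n])
            written += n
    return target
```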
        I tested two datasets, one small and one close to 1 TB.
        The numbers for these two datasets are as follows:
    
    
        | | Small sample | Large sample |
        |---|---|---|
        | Pool size | 17.3 GB | 899 GB |
        | Size of files | 12 GB | 683 GB |
        | Percent full | 68% | 77% |
        | Number of files | 2966 | 129400 |
    
    Results
    
        The charts show the percentage of files recovered vs. the percentage of the entire pool space overwritten.
    
    
        - The brown line is the number of recovered files;
        - the green line is the number of recovered file names;
        - the red dashed vertical line marks how much disk space is occupied;
        - the black dashed diagonal line marks a hypothetical one-to-one relationship, 
          where overwriting one percent of disk space causes a loss of exactly one percent of files.
        
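To make the diagonal reference line concrete, here is a minimal sketch of how a measured point can be compared against it. The function names are hypothetical, and the sketch assumes the diagonal is simply "100 minus the percentage overwritten", which is how the one-to-one relationship is described above. The file totals would come from the table in the test setup; the recovered counts would come from each recovery run.

```python
def percent_recovered(recovered, total):
    """Share of the original files (or file names) recovered, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * recovered / total


def one_to_one_baseline(percent_overwritten):
    """Recovery percentage predicted by the hypothetical diagonal line."""
    return 100.0 - percent_overwritten


def margin_over_baseline(recovered, total, percent_overwritten):
    """Positive when recovery beats the one-to-one line, negative when below it."""
    return percent_recovered(recovered, total) - one_to_one_baseline(percent_overwritten)
```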
 
    
        Overwritten vs. recovered data, small test.
    
     
    
        Overwritten vs. recovered data, large test.
    
    
        What do these charts show? 
    
    
        - The more of the pool you overwrite, the more data is lost. That's not entirely unexpected.
        - Once about one-third of the pool has been overwritten, the last usable set of metadata is gone; 
          after that, no useful data can be recovered.
        - How many files remain recoverable depends on the filesystem layout, that is, 
          on the history of the filesystem and its write patterns.
        - You can recover the entire pool if the damage is limited to about 1% of the pool space.
        
        For practical purposes, this means:
    
    
        - If only the partition tables are deleted, everything can still be recovered.
        - If you overwrite the pool with a new blank pool, you can recover everything.
        - You need to overwrite about one-third of the disk space to make a pool entirely unrecoverable.
        
Filed under: Benchmarks, ZFS.
    
    Created Monday, February 25, 2019