Total Combined Size of a ZFS Snapshot

2020-08-28

Since I started using FreeBSD and ZFS last year, I've always wanted to determine the total space used by a recursive snapshot. After recursively taking a snapshot, the used space can be displayed for each individual dataset, but not as a single total.
The reason for this odd behaviour is simple: while snapshots can be recursively created and deleted, they, unlike the filesystems themselves, do not actually depend on one another. Technically, what looks like one big snapshot across a ZFS hierarchy is a collection of independent snapshots that share the same name. For instance, a snapshot can be deleted on one filesystem but kept on a descendant; that is not possible with the filesystems themselves.
A snapshot, unlike a filesystem, does not own or contain its seeming descendants. Therefore, the "used" space of a snapshot does not include the space used by its incarnations on descendant filesystems.
However, since it is possible to recursively create and destroy such snapshots, I figured it would be useful to know the combined space used.

Update (2021-01-02)

I have to add this, because I did not explain it properly: the "used space" I am calculating here is the space unique to a snapshot, i.e. the space that will be freed up when deleting it. Space shared with other snapshots is not counted.
Adding up the sizes this method returns for individual snapshots will not yield the space used by all snapshots; that information is stored in the 'usedbysnapshots' property.
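
To make the distinction concrete, here is a minimal sketch of how the total held by all snapshots could be read instead. It assumes that 'usedbysnapshots' is reported per dataset and does not include descendants, so the values are summed over the hierarchy; the function name total_snapshot_space is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: total space held by ALL snapshots at and below
# a dataset, in bytes.
# Assumption: usedbysnapshots covers only the dataset's own snapshots,
# so the per-dataset values have to be added up.
total_snapshot_space() {
    # -H: no header, -r: recurse, -p: exact byte values
    zfs list -Hrp -o usedbysnapshots "$1" \
    | awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Usage: total_snapshot_space zroot
```

This number will usually be larger than the sum of the individual snapshot sizes, because it also counts blocks that several snapshots share.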

Solution

Once I finally tackled the issue, this is how I solved it:

zfs list -Hrp -t snap -o name,used zroot \
| grep 2020-08-23_00-00-00 \
| rev | cut -wf1 | rev \
| paste -sd+ - | bc

This returns the combined size, in bytes, of the snapshot "zroot@2020-08-23_00-00-00" on zroot and all of its descendants.

Explanation

  1. Recursively list all snapshots of all descendants
  2. grep all incarnations of one snapshot on all datasets
  3. Get the last field of every line, using whitespace as the separator (-w, an option of FreeBSD's cut)
    cut cannot select the last field directly, so each line is reversed before and after being fed into cut
  4. Concatenate the lines with "+" as the separator, so bc understands what to do
  5. Add the numbers to obtain the combined size
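
The steps above can also be collapsed into a single awk call. This is only a sketch of an alternative, not the pipeline used above: it relies on the -H flag making zfs list separate its columns with tabs, and it compares the part after the "@" exactly rather than grepping for a substring. The function name sum_snapshot_used is made up for this example, and the listing is read from stdin so the function can be tried without a pool.

```shell
#!/bin/sh
# Hypothetical helper: sum the "used" column of
#   zfs list -Hrp -t snap -o name,used <dataset>
# for every snapshot whose name after the "@" equals $1.
sum_snapshot_used() {
    awk -F '\t' -v snap="$1" '
        # locate the "@" that separates dataset and snapshot name
        { n = index($1, "@") }
        # add the byte count only for exact snapshot name matches
        n > 0 && substr($1, n + 1) == snap { total += $2 }
        # awk uses floating point, so totals are exact up to 2^53 bytes
        END { printf "%.0f\n", total }
    '
}

# Usage:
# zfs list -Hrp -t snap -o name,used zroot | sum_snapshot_used 2020-08-23_00-00-00
```

Because the snapshot name is compared exactly, a snapshot whose name merely contains the search string is not counted, which grep would include.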

Shell Script

With the main issue solved, I wrapped the above command into a shell script to add flexibility. It will loop through all snapshots of a given dataset and display a neat list of all their individual sizes in a human-readable way.
The script is available on GitHub.
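
For illustration, the general idea of such a listing can be sketched as follows. This is not the script from GitHub, just a rough approximation: it groups the recursive listing by snapshot name and prints one total per name in binary units. Unlike the description above, it also reports snapshots that exist only on descendants. The function name snapshot_totals and the output layout are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read the output of
#   zfs list -Hrp -t snap -o name,used <dataset>
# from stdin and print one human-readable total per snapshot name.
snapshot_totals() {
    awk -F '\t' '
        # convert a byte count to a short string such as "2.0 KiB"
        function human(bytes,    units, i) {
            split("B KiB MiB GiB TiB PiB", units, " ")
            i = 1
            while (bytes >= 1024 && i < 6) { bytes /= 1024; i++ }
            return sprintf("%.1f %s", bytes, units[i])
        }
        {
            n = index($1, "@")
            if (n == 0) next                 # skip lines without a snapshot
            snap = substr($1, n + 1)
            if (!(snap in total)) order[++count] = snap   # keep first-seen order
            total[snap] += $2
        }
        END {
            for (i = 1; i <= count; i++)
                printf "%-30s %s\n", order[i], human(total[order[i]])
        }
    '
}

# Usage:
# zfs list -Hrp -t snap -o name,used zroot | snapshot_totals
```

Reading the listing once and grouping in awk avoids calling zfs list once per snapshot, which matters on pools with many snapshots.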