ZFS Filesystem – Bulk Remove Large Directory Without Recursion

directory filesystems recursive rm zfs

I want to remove the contents of a ZFS dataset's subdirectory. It's a large amount of data. For the pool "nas", the path is /nas/dataset/certainFolder

$ du -h -d 1 certainFolder/
1.2T    certainFolder/

Rather than waiting for rm -rf certainFolder/, can't I just destroy the handle to that directory so that it's overwritable (even by the same directory name, if I choose to recreate it)?

For example, not knowing much about ZFS internals, specifically how it journals its files, I wonder whether I could access that journal/map directly and remove the right entries, so that the directory would no longer be displayed. The space that directory holds would have to be removed from some kind of audit as well.

Is there an easy way to do this, even on an ext3 filesystem? Or is that already what the recursive remove command has to do in the first place, i.e. dig through and edit journals?

I'm just hoping to do something like kill thisDir, where it simply removes some kind of ID and, poof, the directory no longer shows up in ls -la. The data is obviously still there on the drive, but the space will now be reused (overwritten), because ZFS is just that cool?

I mean, I think ZFS is really that cool, so how can we do it, ideally? (rubbing hands together) 🙂

My specific use case (besides my love for ZFS) is management of a backup archive. The data is pushed via FreeFileSync (AWESOME PROG) from Windows boxes across SMB to the ZFS pool. When I run rm -rf /nas/dataset/certainFolder in a PuTTY terminal, it stalls, and the terminal is unusable for a long time. I then have to open another terminal to continue. That gets old, plus it's no fun to monitor the rm -rf; it can take hours.

Maybe I should just run the command in the background, e.g. with &, and have it print to stdout; that might be nice. More realistically, following the thoughts in the response from @Gilles, I could recreate the dataset in a few seconds: zfs destroy nas/dataset; zfs create -p -o compression=on nas/dataset
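As a rough illustration of the "just release the handle with &" idea, here is a minimal sketch. The helper name bg_rm and the default log path are made up for this example; it only uses standard nohup and rm, and it does not make the deletion itself any faster, it just frees up the terminal.

```shell
#!/bin/sh
# bg_rm DIR [LOGFILE]
# Hypothetical helper: run "rm -rf" in the background so the terminal stays usable.
bg_rm() {
    target=$1
    log=${2:-/tmp/bg_rm.$$.log}   # assumed default log location

    # Refuse to run without an explicit target.
    [ -n "$target" ] || { echo "usage: bg_rm DIR [LOGFILE]" >&2; return 1; }

    # nohup keeps rm running if the terminal session is closed;
    # any error messages from rm end up in the log file.
    nohup rm -rf -- "$target" >"$log" 2>&1 &

    echo "rm running as PID $!, errors go to $log"
}
```

It could be called as bg_rm /nas/dataset/certainFolder, after which the same terminal remains free for other work while rm runs.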

Best Answer

Tracking freed blocks is unavoidable in any decent file system, and ZFS is no exception. There is, however, a simple way under ZFS to get a nearly instantaneous directory deletion by "deferring" the underlying cleanup. It is technically very similar to Gilles' suggestion, but it is inherently reliable and requires no extra code.

If you create a snapshot of your file system before removing the directory, the directory removal will be very fast because none of the blocks under it need to be freed; they all remain referenced by the snapshot. You can then destroy the snapshot in the background, and the space will be gradually recovered.

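d=yourPoolName/BackupRootDir/hostNameYourPc/somesubdir
# Snapshot first so the data blocks stay referenced; rm then only unlinks entries.
zfs snapshot "${d}@quickdelete" && {
    # Assumes the dataset is mounted at /${d} (the default mountpoint).
    rm -rf "/${d}/certainFolder"
    # Destroying the snapshot in the background is what actually frees the space.
    zfs destroy "${d}@quickdelete" &
}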
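
The same steps can be wrapped in a small function that asks ZFS for the dataset's mountpoint instead of assuming it is /<dataset name>. This is only a sketch: the function name quick_delete and the fixed snapshot name quickdelete are choices made for this example, and it assumes no snapshot of that name already exists on the dataset.

```shell
#!/bin/sh
# quick_delete DATASET SUBDIR
# Hypothetical wrapper around the snapshot / rm / destroy sequence above.
quick_delete() {
    dataset=$1
    subdir=$2
    snap="${dataset}@quickdelete"

    # Never run rm against the dataset root by accident.
    [ -n "$subdir" ] || { echo "refusing to run with an empty subdir" >&2; return 1; }

    # Look up where the dataset is actually mounted.
    mnt=$(zfs get -H -o value mountpoint "$dataset") || return 1
    case $mnt in
        /*) ;;   # an absolute path, usable
        *)  echo "no usable mountpoint for $dataset: $mnt" >&2; return 1 ;;
    esac

    # Snapshot first so the blocks stay referenced while rm unlinks the entries.
    zfs snapshot "$snap" || return 1
    rm -rf -- "${mnt}/${subdir}"

    # The space is only reclaimed once the snapshot is gone; do that in the background.
    zfs destroy "$snap" &
}
```

For the layout in the question this would be called as quick_delete nas/dataset certainFolder.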