Deduplication Scripts Using Btrfs CoW

btrfs deduplication

There are plenty of deduplication tools for Linux; see e.g. this wiki page.

Almost all of these scripts either only detect duplicates and print the duplicate file names, or remove duplicates by hardlinking them to a single copy.

With the rise of btrfs there is another option: creating a CoW (copy-on-write) copy of a file (like cp --reflink=always). I have not found any tool that does this. Is anyone aware of a tool that does?

Best Answer

I wrote bedup for this purpose. It combines incremental btree scanning with CoW deduplication. It is best used with Linux 3.6, where you can run:

sudo bedup dedup
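
To illustrate the general idea only (this is not how bedup is implemented), here is a minimal Python sketch. It groups files by size and SHA-256 digest, then replaces a duplicate with a reflink clone made by cp --reflink=always. The function names find_duplicates and reflink_replace are hypothetical, and the sketch assumes GNU cp and a filesystem that supports reflinks, such as btrfs. It does not guard against files changing between hashing and replacement, and it does not preserve ownership.

```python
import hashlib
import os
import shutil
import subprocess
import tempfile
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root with identical content.

    Hypothetical helper: files are grouped by size first, so only
    same-sized files are hashed. Empty files and symlinks are skipped.
    """
    by_size = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for size, paths in by_size.items():
        if size == 0 or len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for p in paths:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)


def reflink_replace(source, duplicate):
    """Replace duplicate with a CoW clone of source.

    Assumes GNU cp; --reflink=always makes cp fail instead of falling
    back to a full copy when the filesystem cannot share extents.
    """
    target_dir = os.path.dirname(os.path.abspath(duplicate))
    fd, tmp = tempfile.mkstemp(dir=target_dir)
    os.close(fd)
    try:
        subprocess.run(
            ["cp", "--reflink=always", "--", source, tmp], check=True
        )
        # Keep the duplicate's own mode and timestamps (not ownership).
        shutil.copystat(duplicate, tmp)
        # Atomic rename within the same directory.
        os.replace(tmp, duplicate)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

A caller could loop over each group returned by find_duplicates, keep the first path as the source, and pass the remaining paths to reflink_replace. On a filesystem without reflink support, cp exits with an error and the duplicate is left untouched.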