Does WinRAR detect duplicate files?

winrar

I have a directory with subdirectories, and a lot of duplicate files in them. If I move everything to a single rar archive, will WinRAR detect the duplicate files, or will every copy be stored and add to the size of the rar archive?

Best Answer

WinRAR 5.00 introduced the new RAR5 archive format, and this feature is one of its many improvements. From the WinRAR help:

Save identical files as references

If this option is enabled, WinRAR analyzes the file contents before starting archiving. If several identical files larger than 64 KB are found, the first file in the set is saved as usual file and all following files are saved as references to this first file. It allows to reduce the archive size, but applies some restrictions to resulting archive. You must not delete or rename the first identical file in archive after the archive was created, because it will make extraction of following files using it as a reference impossible. If you modify the first file, following files will also have the modified contents after extracting. Extraction command must involve the first file to create following files successfully.

It is recommended to use this option only if you compress a lot of identical files, will not modify an archive later and will extract an archive entirely, without necessity to unpack or skip individual files. If all identical files are small enough to fit into compression dictionary, solid archiving can provide more flexible solution than this option.

Supported for RAR 5.0 archives only.
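To get a rough idea of how much this option might save before archiving, you can look for identical files yourself. The sketch below is not WinRAR's actual algorithm, just an illustration of the idea described in the help text: it walks a directory, considers only files larger than 64 KB, groups them by size, and then compares SHA-256 digests. The function names (`file_digest`, `find_duplicates`, `redundant_bytes`) are made up for this example.

```python
import hashlib
import os
from collections import defaultdict

# The quoted help text says only identical files larger than 64 KB are considered.
MIN_SIZE = 64 * 1024


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, min_size=MIN_SIZE):
    """Return a list of groups; each group is a list of paths with identical content."""
    # Cheap first pass: only files of equal size can be identical.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            size = os.path.getsize(path)
            if size > min_size:
                by_size[size].append(path)

    # Second pass: hash only the files that share a size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def redundant_bytes(groups):
    """Bytes taken by every copy after the first one in each group."""
    return sum(os.path.getsize(g[0]) * (len(g) - 1) for g in groups)
```

Calling `redundant_bytes(find_duplicates(folder))` on your folder gives an estimate of what storing duplicates as references could save with "Store" compression. The real saving may differ, since this only approximates what WinRAR does.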

My quick test on a folder that contains 320,000 files (Baldur's Gate Trilogy with a lot of mods):

RAR4 archive format, compression method set to "Store": 26.1 GB (28,053,815,768 bytes)

RAR5 archive format, compression method set to "Store" and "Save identical files as references" turned on: 23.9 GB (25,722,664,097 bytes)

So I was able to save about 8.3% (roughly 2.2 GB) without using any compression at all!
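The size of the saving can be checked directly from the two byte counts above. Note that the percentage depends on the baseline: the difference is a smaller fraction of the original RAR4 archive than of the deduplicated RAR5 one. The variable names here are just for illustration.

```python
rar4_bytes = 28_053_815_768  # "Store", no deduplication
rar5_bytes = 25_722_664_097  # "Store" with identical files saved as references

saved = rar4_bytes - rar5_bytes
pct_of_original = 100 * saved / rar4_bytes  # saving relative to the RAR4 archive
pct_of_result = 100 * saved / rar5_bytes    # same difference relative to the RAR5 archive

print(f"saved {saved:,} bytes")                    # saved 2,331,151,671 bytes
print(f"{pct_of_original:.1f}% of the RAR4 size")  # 8.3% of the RAR4 size
print(f"{pct_of_result:.1f}% of the RAR5 size")    # 9.1% of the RAR5 size
```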
