If you are like me, you have tons of archives lying around in different formats (zip, tar, rar, tgz, tar.bz2, etc.). In cleaning up my stuff, I have decided to basically leave my archives alone (and generally access their contents via archivemount or avfs). As a brief aside, for the most part I will not be writing to these archives, but on occasion I or an application may write configuration files, index files, description files, stray files … to these archives.
However, I would like to have a preferred format that I convert these archives to when I clean up an archive. Some factors for this archive format are clear: it should be easy to convert other formats to this format, preferably directly; accessing files in the archive should not have significant overhead; and size is of some consideration but not major, as long as an archive is not twice as big as the same files when extracted.
Now I am not so naive as to expect people to reply with "the best archive format is …"; rather, I hope to get some ideas of the pros and cons of various archive formats that you might use in this situation.
Best Answer
In the Unix world tar is the de-facto archive format. There are of course other formats that can be both read and written, but tar is the go-to format any time you want to bundle up files.
Your real question seems to be what compression system to use. Compression is always a trade off between speed and compression ratio. Where you take the speed hit can also vary, some being efficient to decompress but taking a long time to compress, some the other way around.
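To get a feel for that trade-off on your own data, here is a minimal sketch using only the Python standard library. The sample data and the `measure` helper are made up for illustration, and `zlib` stands in for gzip since both use DEFLATE. Timings will vary a lot with the machine and the kind of files involved, so treat it as a starting point rather than a benchmark.

```python
import bz2
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Return (name, compressed_size, compress_seconds, decompress_seconds)."""
    start = time.perf_counter()
    packed = compress(data)
    middle = time.perf_counter()
    unpacked = decompress(packed)
    end = time.perf_counter()
    if unpacked != data:
        raise ValueError(f"{name} did not round-trip the data")
    return name, len(packed), middle - start, end - middle


# Hypothetical sample data: repetitive text, which compresses well.
# Replace this with the bytes of a real file to get meaningful numbers.
sample = b"config_key = some fairly repetitive value\n" * 20000

codecs = [
    ("deflate (as used by gzip)", zlib.compress, zlib.decompress),
    ("bzip2", bz2.compress, bz2.decompress),
    ("lzma (as used by xz)", lzma.compress, lzma.decompress),
]

results = [measure(name, comp, decomp, sample) for name, comp, decomp in codecs]

print(f"original size: {len(sample)} bytes")
for name, size, c_secs, d_secs in results:
    print(f"{name}: {size} bytes, compress {c_secs:.4f}s, decompress {d_secs:.4f}s")
```

Running this against a few representative files from your own archives should show both where the size savings are and which direction (compressing or decompressing) costs the time.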
You should use whatever works best for you. No compression at all means easy access and updating of your archive. It also means version control and backup systems like rsync are able to look deeper into your data and make more efficient incremental backups. On the other hand, lots of compression keeps the size down. Formats like gzip and bzip2 are the most commonly used lossless compression formats, but some others like lzma and 7z exist. Many of these tools also include options for different compression ratios using the same algorithm.
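As a rough illustration of the last two points, the sketch below writes the same files into a plain tar and into gzip-compressed tars at two compression levels, using Python's `tarfile` module. The file names, contents, and the `build_tar` helper are invented for the example; everything happens in memory so nothing is written to disk.

```python
import io
import tarfile


def build_tar(files, mode, **options):
    """Build a tar archive in memory and return its bytes.

    files   -- dict mapping member name to bytes content
    mode    -- tarfile write mode, e.g. "w" (no compression) or "w:gz"
    options -- extra keyword arguments passed to tarfile.open
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode, **options) as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# Hypothetical archive contents, chosen to be highly compressible.
files = {
    "notes.txt": b"the same line over and over\n" * 5000,
    "index.cfg": b"entry = value\n" * 5000,
}

plain = build_tar(files, "w")                     # tar only, no compression
fast = build_tar(files, "w:gz", compresslevel=1)  # gzip, fastest setting
small = build_tar(files, "w:gz", compresslevel=9)  # gzip, strongest setting

print(f"plain tar:     {len(plain)} bytes")
print(f"tar.gz lvl 1:  {len(fast)} bytes")
print(f"tar.gz lvl 9:  {len(small)} bytes")

# Any of the three can be read back the same way; "r:*" detects compression.
with tarfile.open(fileobj=io.BytesIO(small), mode="r:*") as tar:
    member_names = tar.getnames()
print(member_names)
```

The plain tar is the largest but is the one whose contents remain directly visible to tools that diff or deduplicate at the byte level; the compressed variants trade that away for size.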