How to combine / merge zip files

zip

For the last several months I have copied several data folders to zip files at weekly intervals. Now I'd like to combine those zip files into one zip file, because most of the contents of the existing zip files are just different versions of the same data files.

So if a file appears in more than one of the existing zip files, I'd like the newest version to be in the new zip file being created. Of course if a file appears in only one existing zip file, then I want it in the final zip file also.

I'm trying to avoid having to unzip them one by one to a working folder, overwriting data from older zip files with data from newer zip files, and then rezipping everything into a new zip file.

From what I understand pkzip would combine the zip files themselves, but is there a dependable and fast free method anyone can tell me about?

Best Answer

You won't like it, but unzipping everything into a working folder in the right order and then zipping the result is the most effective way.

Otherwise you will end up with a lot of wasted CPU cycles:

  • assume your result goes to 'first.zip'
  • every file from '2.zip', '3.zip' etc. has to be unzipped and then zipped again into 'first.zip'
  • if '2.zip' contains a file 'foobar.txt' and '3.zip' contains another 'foobar.txt', merging them the way you describe means that file gets compressed once for every archive it appears in
  • the table of contents (central directory) of a .zip is at the end of the file: if you update a file in the middle of the .zip, the whole file has to be rewritten

So, imho, just use 'unzip' wisely:

% mkdir all
% for x in *.zip ; do unzip -d all -o -u "$x" ; done
% zip -r all.zip all

The order of the unzipping is important. I don't know the naming pattern of your zip files, but I would extract the newest zip file first. The '-u' option of unzip overwrites a file only if the copy in the archive is newer than the one on disk, and creates files that are not already there. Older copies in the remaining archives are then skipped rather than extracted and overwritten later, so you unzip only the newest files and zip the result only once.
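
If the zip names don't sort newest first on their own, one option is to order the archives by their modification time before extracting. The following is only a rough sketch of that idea: `newest_first` and `extract_newest_first` are made-up helper names, and it assumes the archive names contain no newlines and that each archive's modification time still reflects when the backup was made (copying the archives around may have changed that).

```shell
# Sketch only: newest_first and extract_newest_first are hypothetical helper
# names, not part of unzip. Assumes archive names contain no newlines and that
# each archive's modification time reflects when the backup was made.

# Print the given paths one per line, most recently modified first.
newest_first() {
    ls -1td -- "$@"
}

# Extract every archive into the directory given as the first argument,
# newest archive first, so that -u can skip the older copies.
extract_newest_first() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    newest_first "$@" | while IFS= read -r archive; do
        # -o: never prompt; -u: extract only files that are missing on disk
        # or newer than the copy already on disk
        unzip -q -o -u "$archive" -d "$dest" </dev/null
    done
}
```

With these defined, `extract_newest_first all *.zip` would stand in for the loop, followed by the same `zip -r all.zip all` as before.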