Compressing many similar large images

7-zip, compression, images, video, zip

I'm dealing with a large archive of satellite images of the Earth, each one taken 15 minutes apart over the same area, so they are quite similar to each other. Two consecutive ones look like this:
[image: two consecutive satellite frames]

Video algorithms do very well compressing multiple similar images. However, these images are too large for video (10848×10848), and using a video encoder would discard the images' metadata, so extracting the frames and restoring the metadata afterwards would be cumbersome even if I got a video encoder to work with frames that large.

To run some tests I reduced the 96 images of one day to 1080×1080 pixels, totaling 40.1 MB, and tried different compressors with the following results (the kind of commands involved is sketched after the list):

  1. zip: 39.8 MB
  2. rar: 39.8 MB
  3. 7z : 39.6 MB
  4. tar.bz2: 39.7 MB
  5. zpaq v7.14: 38.3 MB
  6. fp8 v2: 32.5 MB
  7. paq8pxd v45: 30.9 MB
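
For reference, archives like the first five can be created with commands along these lines (illustrative only; the switches shown are typical maximum-compression settings, not necessarily the exact ones used for the numbers above, and fp8/paq8pxd are omitted because their options vary by build):

zip -9 -r day.zip *.jpg          # standard deflate
rar a -m5 day.rar *.jpg          # rar at its best compression level
7z a -mx=9 day.7z *.jpg          # 7-zip, LZMA2 at maximum
tar cjf day.tar.bz2 *.jpg        # bzip2 over a tar stream
zpaq add day.zpaq *.jpg          # zpaq journaling archiver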

The last three are supposed to take much better advantage of context, and indeed they do better than the traditional compressors, but the compression ratio is still pretty poor compared with mp4 video, which can get the set down to 15 MB or even less while preserving image quality.

However, none of the algorithms used by those utilities seems to exploit the similarity between images the way video compression does. In fact, using packJPG, which compresses each image separately, the whole set gets down to 32.9 MB, quite close to fp8 and paq8pxd, but without taking any advantage at all of the similarities between images (since each image is compressed individually).

In another experiment, I calculated in Matlab the difference of the two images above, and it looks like this:

[image: difference between the two frames]

Compressing both original images (219.5 + 217.0 = 436.5 kB in total) with fp8 gets them down to 350.0 kB (80%), but compressing one of them plus the difference image (saved as a jpg of the same quality, taking 122.5 kB) results in 270.8 kB (62%), so again (as the mp4 and packJPG comparisons already suggested) fp8 doesn't seem to take much advantage of the similarities. Even with rar, one image plus the difference does better than fp8 on the two originals: rar gets them down to 333.6 kB (76%).

I guess there must be a good compression solution for this problem, as I can envision many applications. Besides my particular case, many professional photographers have lots of similar shots from sequential shooting, time-lapse sequences, etc., all cases that would benefit from such compression.

Also, I don't require lossless compression, at least not for the image data (the metadata must be preserved).

So…
Is there a compression method that does exploit the similarities between the images being compressed?

The two images of the above test can be downloaded here, and the 96 images of the first test here.

Best Answer

I don't know of specific software that does this, but there is some research on the subject. For example, see the articles Compressing Sets of Similar Images by Samy Ait-Aoudia, Abdelhalim Gabis, Amina Naimi, and Compressing sets of similar images using hybrid compression model by Jiann-Der Lee, Shu-Yen Wan, Chemg-Min Ma, Rui-Feng Wu.

On a more practical level, you could extend your subtraction technique, for example by writing a script that uses ImageMagick to compute the difference between consecutive images, saving the result as a jpeg (or a compressed png if you want it lossless). You'll get one base image and a set of compressed "delta" images that should be much smaller. To compute the difference using ImageMagick:

convert image2.png image1.png -compose MinusSrc -composite -depth 24 -define png:compression-filter=2 -define png:compression-level=9 -define png:compression-strategy=1 difference-2-1.png

To reconstruct the second image by adding the difference back:

convert image1.png difference-2-1.png -compose Plus -composite image2-reconstructed.png

(You can do the same using jpg instead and save a lot of space).
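
To apply this to a whole sequence, a small shell script can store every frame after the first as a delta against its predecessor, and later rebuild the frames. This is only a minimal sketch assuming the frames are PNGs whose names sort in chronological order (e.g. frame-000.png, frame-001.png, ...); adapt the file names and the png/jpg choice to your data:

#!/bin/bash
# Encode: keep the first frame as-is, store every later frame as a delta
# against its predecessor using the same MinusSrc composite as above.
set -e
mkdir -p deltas
frames=(frame-*.png)
cp "${frames[0]}" deltas/base.png
for ((i=1; i<${#frames[@]}; i++)); do
  convert "${frames[i]}" "${frames[i-1]}" -compose MinusSrc -composite -depth 24 -define png:compression-filter=2 -define png:compression-level=9 -define png:compression-strategy=1 "deltas/delta-$i.png"
done

# Decode: add each delta back onto the previously rebuilt frame.
cp deltas/base.png rebuilt-0.png
for ((i=1; i<${#frames[@]}; i++)); do
  convert "rebuilt-$((i-1)).png" "deltas/delta-$i.png" -compose Plus -composite "rebuilt-$i.png"
done

Note that MinusSrc clips differences that go negative, and jpg deltas are lossy, so small errors can accumulate along the chain of reconstructions; since lossless image data isn't required here that may be acceptable, but diffing every frame against the single base frame avoids the accumulation at the cost of somewhat larger deltas.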
