Linux vs macOS – Why GZip Produces Different Compressed Results

gziplinuxmacos

I have a few thousand files that are individually GZip compressed (passing of course the -n flag so the output is deterministic). They then go into a Git repository. I just discovered that for 3 of these files, Gzip doesn't produce the same output on macOS vs Linux. Here's an example:

macOS

$ cat Engine/Extras/ThirdPartyNotUE/NoRedist/EnsureIT/9.7.0/bin/finalizer | shasum -a 256
0ac378465b576991e1c7323008efcade253ce1ab08145899139f11733187e455  -

$ cat Engine/Extras/ThirdPartyNotUE/NoRedist/EnsureIT/9.7.0/bin/finalizer | gzip --fast -n | shasum -a 256
6e145c6239e64b7e28f61cbab49caacbe0dae846ce33d539bf5c7f2761053712  -

$ cat Engine/Extras/ThirdPartyNotUE/NoRedist/EnsureIT/9.7.0/bin/finalizer | gzip -n | shasum -a 256
3562fd9f1d18d52e500619b4a5d5dfa709f5da8601b9dd64088fb5da8de7b281  -

$ gzip --version
Apple gzip 272.250.1

Linux

$ cat Engine/Extras/ThirdPartyNotUE/NoRedist/EnsureIT/9.7.0/bin/finalizer | shasum -a 256
0ac378465b576991e1c7323008efcade253ce1ab08145899139f11733187e455  -

$ cat Engine/Extras/ThirdPartyNotUE/NoRedist/EnsureIT/9.7.0/bin/finalizer | gzip --fast -n | shasum -a 256
10ac8b80af8d734ad3688aa6c7d9b582ab62cf7eda6bc1a0f08d6159cad96ddc  -

$ cat Engine/Extras/ThirdPartyNotUE/NoRedist/EnsureIT/9.7.0/bin/finalizer | gzip -n | shasum -a 256
cbf249e3a35f62a4f3b13e2c91fe0161af5d96a58727d17cf7a62e0ac3806393  -

$ gzip --version
gzip 1.6
Copyright (C) 2007, 2010, 2011 Free Software Foundation, Inc.
Copyright (C) 1993 Jean-loup Gailly.
This is free software.  You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.

Written by Jean-loup Gailly.

How is this possible? I thought the GZip implementation was completely standard?

UPDATE: Just to confirm that macOS and Linux versions do produce the same output most of the time, both OSes output the same hash for:

$ echo "Vive la France" | gzip --fast -n | shasum -a 256
af842c0cb2dbf94ae19f31c55e05fa0e403b249c8faead413ac2fa5e9b854768  -

Best Answer

Note that the compression algorithm (Deflate) in GZip is not strictly bijective. To elaborate: For some data, there's more than one possible compressed output depending on the algorithmic implementation and used parameters. So there's no guarantee at all that Apple GZip and gzip 1.6 will return the same compressed output. These outputs are all valid GZip streams, the standard just guarantees that every of these possible outputs will be decompressed to the same original data.

Related Question