How to create a zip file v2.0

zip

How can I create a zip file v2.0?

It seems OpenDocument files are zip files v2.0:

$ file foo.odt
foo.odt: OpenDocument Text
$ hexdump -C -n 16 foo.odt
00000000  50 4b 03 04 14 00 00 08  00 00 03 0d 47 42 5e c6  |PK..........GB^.|
00000010

The fifth byte is 0x14.

But if I unzip foo.odt and zip it back into bar.odt, I get a v1.0 zip file:

$ unzip -d foo foo.odt
$ cd foo/
$ zip -0 -X ../bar.odt mimetype
$ zip -r ../bar.odt * -x mimetype
$ file ../bar.odt
bar.odt: Zip archive data, at least v1.0 to extract
$ hexdump -C -n 16 ../bar.odt
00000000  50 4b 03 04 0a 00 00 00  00 00 00 90 46 42 5e c6  |PK..........FB^.|
00000010

The fifth byte is 0x0a.

zip (2.32), Debian (6.0)

Best Answer

Edit: The question has since been updated, so my earlier remark ("You do not get a v0.1 but v1.0.") no longer applies.

The version does not describe how "capable" the file is; it is the minimum version required to extract that particular file from the archive.

This is not the overall version for the archive!

One difference here is that e.g. OO tags every file with the same version requirement, namely that of the file in the document (the archive as a whole) with the highest requirement.

That is, each file has a zip header that specifies the minimum version required to extract it. From the above we typically have:

  archive-files    PackType  Zip-Required  OO-Header  `zip`-header
+------------------------------------------------------------------+
| mimetype         Store     1.0           2.0        1.0          |__ foo.odt
| content.xml      Deflate   2.0           2.0        2.0          |
+------------------------------------------------------------------+

So OO sets the required version to 2.0 even where 1.0 would do. This does not, however, affect the ability to open the document. (A manually zipped file opens fine in OO even though mimetype is tagged with v1.0.)

Versions

foo.odt:

1400   Version needed to extract.
0008   General Purpose
0000   Compression method

Version needed to extract: the lower byte, here 0x14, is decoded by integer division and modulus by 10:

Major: 0x14 / 0x0a = 2
Minor: 0x14 % 0x0a = 0

Aka Version 2.0

The higher byte 0x00 indicates what the file is compatible with. If zero, then it is compatible with MS-DOS (FAT, FAT32, VFAT). Else it is specified by a mapping. E.g. if I use zip with no options on my system I get a 0x03 which indicates Unix. 0x0a is NTFS etc.
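
The arithmetic above can be sketched in a few lines of Python. The function name decode_version_field and the small HOSTS table are illustrative choices made for this example, not part of any zip tool, and the table lists only the three host values mentioned here.

```python
import struct

# Host codes mentioned above (upper byte); the spec defines more values.
HOSTS = {0x00: "MS-DOS (FAT/VFAT/FAT32)", 0x03: "Unix", 0x0A: "NTFS"}

def decode_version_field(raw: bytes):
    """Split a 2-byte version field into (major, minor, host name)."""
    version, host = struct.unpack("<BB", raw)  # lower byte first, then upper byte
    major, minor = divmod(version, 10)
    return major, minor, HOSTS.get(host, f"unknown (0x{host:02x})")

print(decode_version_field(bytes.fromhex("1400")))  # field as seen in foo.odt
print(decode_version_field(bytes.fromhex("0a00")))  # field as seen in bar.odt
```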

Version 2.0 indicates: (4.4.3.2 Current minimum feature versions)

* File is a folder (directory)
* File is compressed using Deflate compression
* File is encrypted using traditional PKWARE encryption

In your zipped file you have:

bar.odt:

0a00   Version needed to extract.
0000   General Purpose
0000   Compression method


Major: 0x0a / 0x0a = 1
Minor: 0x0a % 0x0a = 0

Aka version 1.0


Version 1.0 is simply the default value.

File Hierarchy and minimum version

The reason you see version 1.0 under Version needed to extract is that what you are actually looking at is the zip header for the file mimetype. This file is not deflated but stored with no compression, so version 1.0 is enough to extract it. This, however, is not the overall version of the archive. If you look further down, you'll find version 2.0 as soon as you reach a file saved with deflate. You can check by e.g.:

hexdump -v -e '/1 "%02x "' bar.odt | grep -o '50 4b 03 04 .\{6\}'

This should give you something like:

50 4b 03 04 0a 00 
50 4b 03 04 0a 00 
...
50 4b 03 04 14 00 
50 4b 03 04 14 00 
50 4b 03 04 0a 00 
50 4b 03 04 14 00 
...
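
A grep for the signature can also match bytes that happen to occur inside file data. As a hedged alternative, assuming Python's standard zipfile and struct modules, the sketch below takes each entry's offset from the central directory and reads the version field of its local header directly. The function name local_versions is made up for this example.

```python
import struct
import zipfile

LOCAL_HEADER = struct.Struct("<4sHHH")  # signature, version needed, flags, method

def local_versions(fh):
    """Return (name, version needed, method) from each local file header.

    fh is a seekable binary file object containing the archive.
    """
    rows = []
    for info in zipfile.ZipFile(fh).infolist():
        fh.seek(info.header_offset)  # offset of this entry's local header
        sig, version, _flags, method = LOCAL_HEADER.unpack(fh.read(LOCAL_HEADER.size))
        if sig != b"PK\x03\x04":
            raise ValueError(f"no local header at offset {info.header_offset}")
        rows.append((info.filename, version & 0xFF, method))
    return rows

if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            for name, version, method in local_versions(fh):
                print(f"{version // 10}.{version % 10}  method={method}  {name}")
```

Run against bar.odt, this should list the stored mimetype entry and the deflated entries with their respective versions, one line per file.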

Central directory file header

Each file also has a central directory file header, which carries some extra fields. You can list these with:

hexdump -v -e '/1 "%02x "' foo.odt | grep -o '50 4b 01 02.\{16\}'

(Remember to reverse 50 4b 01 02 to 02 01 4b 50 if hexdump -n 4 foo.odt shows the bytes in that order.)

This typically gives:

                  ____________ Version required (2.0)
                  |   |
50 4b 01 02 14 00 14 00 00 
50 4b 01 02 14 00 14 00 00 
50 4b 01 02 14 00 14 00 08
            |___| 
              |      
              +-------------- Version supported by packing application. v2.0

For the file created by zip you could get e.g.:

                  ____________ Version required for this file (2.0)
                  |   |
50 4b 01 02 1e 03 14 00 00
            |___| 
              |      
              +-------------- Version supported by packing 
                              application. v3.0
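
Python's zipfile module exposes these central directory fields as attributes of ZipInfo (create_version, create_system and extract_version), so the same information can be read without hexdump. The following is only a sketch; the function name central_versions is an illustrative choice.

```python
import zipfile

def central_versions(path_or_file):
    """Return (name, made-by version, made-by host code, version needed) per entry."""
    with zipfile.ZipFile(path_or_file) as zf:
        return [
            (i.filename, i.create_version, i.create_system, i.extract_version)
            for i in zf.infolist()
        ]

if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        for name, made_by, host, needed in central_versions(path):
            print(f"made by {made_by // 10}.{made_by % 10} (host {host}), "
                  f"needs {needed // 10}.{needed % 10}: {name}")
```

Host code 0 corresponds to MS-DOS and 3 to Unix, matching the upper byte shown in the dumps above.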

General purpose (and other flags set in odt files)

This is a bit-flag field. Zip stores multi-byte values in little-endian byte order, so the bytes 00 08 in the file make up the value:

0x0800 = 0000 1000 0000 0000
              |
              +---------------- 11 => File names and comments MUST be 
                                      stored as Utf-8.
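
As a small worked check: the two flag bytes appear in the file as 00 08, and reading them as a little-endian 16-bit integer gives 0x0800, in which bit 11 is set. The variable names below are only for illustration.

```python
raw = bytes.fromhex("0008")             # the flag bytes as they appear in foo.odt
flags = int.from_bytes(raw, "little")   # low byte comes first in the file
utf8_names = bool(flags & (1 << 11))    # bit 11: names and comments are UTF-8

print(f"flags = 0x{flags:04x}, bit 11 set: {utf8_names}")
# prints: flags = 0x0800, bit 11 set: True
```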

LibreOffice, at least, also differs from plain zip output in a few further ways.

mimetype is always first and should not be compressed. This helps various software identify the file and its content, because the type string then sits in plain text at a fixed offset. For example:

$ hexdump -C -s 38 -n 39 foo.odt

00000026  61 70 70 6c 69 63 61 74  69 6f 6e 2f 76 6e 64 2e  |application/vnd.|
00000036  6f 61 73 69 73 2e 6f 70  65 6e 64 6f 63 75 6d 65  |oasis.opendocume|
00000046  6e 74 2e 74 65 78 74                              |nt.text|

While zip typically saves an entry for every directory, OO saves a directory entry only if the directory is empty. Thus:

zip:

Thumbnails/
Thumbnails/thumbnail.png
META-INF/
META-INF/manifest.xml

oo:

Thumbnails/thumbnail.png
META-INF/manifest.xml

And so on ...
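
Putting these layout points together, here is a hedged sketch of packing an ODF-like archive with Python's zipfile module: mimetype first and stored, everything else deflated, and no directory entries. The function name build_odf_like and the dict-based input are assumptions made for this example. As a simplification it drops all directory entries, so an empty directory would need an explicit entry added. As far as I know, CPython's zipfile tags every entry with version 2.0 or higher, stored ones included, but treat that as an observation about the implementation rather than a documented guarantee.

```python
import io
import zipfile

def build_odf_like(files: dict) -> bytes:
    """Pack {archive name: bytes}; files must contain a "mimetype" key."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # mimetype goes first and uncompressed so its content sits at offset 38.
        zf.writestr("mimetype", files["mimetype"], compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            if name == "mimetype" or name.endswith("/"):
                continue  # simplification: no directory entries are written
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()

if __name__ == "__main__":
    blob = build_odf_like({
        "mimetype": b"application/vnd.oasis.opendocument.text",
        "META-INF/manifest.xml": b"<manifest/>",
    })
    print(blob[:16].hex(" "))  # compare with the hexdump of foo.odt
```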
