I would do something like this (zsh syntax):
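unz() (
  # Work in a fresh directory named after the archive (minus its extension).
  tmp=$(TMPDIR=. mktemp -d -- ${${argv[-1]:t:r}%.tar}.XXXXXX) || exit
  print -r >&2 "Extracting in $tmp"
  cd -- $tmp || exit
  # A relative archive path is now one level off, so adjust it.
  [[ $argv[-1] = /* ]] || argv[-1]=../$argv[-1]
  (set -x; "$@"); ret=$?
  # Look at no more than the first two entries, hidden ones included.
  files=(*(ND[1,2]))
  case $#files in
    (0) print -r >&2 "No file created"
        rmdir -v "../$tmp";;
    (1) mv -v -- $files .. && rmdir -v ../$tmp;;
    (*) mv -vT ../$tmp ../$tmp:r;;
  esac && exit $ret
)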
That is:
- create a directory in any case
- run the command
- depending on how many files the command generated:
  - remove that directory (if it didn't create any file)
  - if it created only one file/dir, move it one level up and discard our directory
  - otherwise, attempt to strip the random string from the end of our temp directory.
This way, you can do:
unz unzip foo.zip
unz tar xf foo.tar.gz
It assumes that the last argument to the extracting command is the file to extract. It also assumes GNU tools for the -v options. On non-GNU systems, you can remove those and possibly do the logging by hand. mv -T is also GNU-specific, and is there to force mv to attempt a rename only.
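For readers without zsh, the same three-way logic can be sketched in Python. This is an illustrative port under assumptions, not the function above: the name `tidy_extract` and its `extract` parameter are hypothetical, it calls a Python extractor instead of running an arbitrary command, and unlike mv it refuses to overwrite an existing target.

```python
import shutil
import tempfile
from pathlib import Path


def tidy_extract(archive, extract=shutil.unpack_archive):
    """Unpack `archive` below the current directory without scattering files.

    `extract(archive_path, dest_dir)` is any callable that unpacks into
    dest_dir (hypothetical parameter; defaults to shutil.unpack_archive).
    Returns the path that ends up holding the content, or None if empty.
    """
    archive = Path(archive).resolve()
    # "foo.zip" -> "foo", "foo.tar.gz" -> "foo"
    stem = archive.name.rsplit(".", 1)[0]
    if stem.endswith(".tar"):
        stem = stem[: -len(".tar")]
    # Always extract into a fresh directory first.
    tmp = Path(tempfile.mkdtemp(prefix=stem + ".", dir=".")).resolve()
    extract(str(archive), str(tmp))
    entries = list(tmp.iterdir())
    if not entries:
        # Nothing was created: remove our directory again.
        tmp.rmdir()
        return None
    if len(entries) == 1:
        # Single file or directory: move it one level up, drop our directory.
        target = tmp.parent / entries[0].name
        if target.exists():
            return entries[0]  # do not overwrite; leave it where it is
        entries[0].rename(target)
        tmp.rmdir()
        return target
    # Several entries: try to strip the random suffix from the directory name.
    target = tmp.parent / stem
    if target.exists():
        return tmp
    tmp.rename(target)
    return target
```

Calling `tidy_extract("foo.zip")` would leave either a single extracted file, a `foo` directory, or (if `foo` is already taken) the randomly named directory.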
When searching for a single file in a large archive, unzip uses method 1, which you can see using strace:
open("dataset.zip", O_RDONLY) = 3
ioctl(1, TIOCGWINSZ, 0x7fff9a895920) = -1 ENOTTY (Inappropriate ioctl for device)
write(1, "Archive: dataset.zip\n", 22Archive: dataset.zip
) = 22
lseek(3, 943718400, SEEK_SET) = 943718400
read(3, "\340P\356(s\342\306\205\201\27\360U[\250/2\207\346<\252+u\234\225\1[<\2310E\342\274"..., 4522) = 4522
lseek(3, 943722880, SEEK_SET) = 943722880
read(3, "\3\f\225P\\ux\v\0\1\4\350\3\0\0\4\350\3\0\0", 20) = 20
lseek(3, 943718400, SEEK_SET) = 943718400
read(3, "\340P\356(s\342\306\205\201\27\360U[\250/2\207\346<\252+u\234\225\1[<\2310E\342\274"..., 8192) = 4522
lseek(3, 849346560, SEEK_SET) = 849346560
read(3, "D\262nv\210\343\240C\24\227\344\367q\300\223\231\306\330\275\266\213\276M\7I'&35\2\234J"..., 8192) = 8192
stat("rand-28.txt", 0x559f43e0a550) = -1 ENOENT (No such file or directory)
lstat("rand-28.txt", 0x559f43e0a550) = -1 ENOENT (No such file or directory)
stat("rand-28.txt", 0x559f43e0a550) = -1 ENOENT (No such file or directory)
lstat("rand-28.txt", 0x559f43e0a550) = -1 ENOENT (No such file or directory)
open("rand-28.txt", O_RDWR|O_CREAT|O_TRUNC, 0666) = 4
ioctl(1, TIOCGWINSZ, 0x7fff9a895790) = -1 ENOTTY (Inappropriate ioctl for device)
write(1, " extracting: rand-28.txt "..., 37 extracting: rand-28.txt ) = 37
read(3, "\275\3279Y\206\223\217}\355W%:\220YNT\0\257\260z^\361T\242\2\370\21\336\372+\306\310"..., 8192) = 8192
unzip opens dataset.zip, seeks to the end, then seeks to the start of the requested file in the archive (rand-28.txt, at offset 849346560) and reads from there.
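The same access pattern can be reproduced in a few lines of Python. This is a minimal sketch, not unzip's code: the function name `read_member_by_seek` is made up for illustration, the standard `zipfile` module stands in for the central directory lookup, and only stored and deflated members are handled.

```python
import struct
import zipfile
import zlib

LOCAL_HEADER = struct.Struct("<4s5H3L2H")  # fixed 30-byte local file header


def read_member_by_seek(fileobj, name):
    """Return the uncompressed bytes of one member, touching only its region."""
    # The central directory (located from the end of the file) gives the offset.
    with zipfile.ZipFile(fileobj) as zf:
        info = zf.getinfo(name)
    # Jump straight to the member's local header, as seen in the trace.
    fileobj.seek(info.header_offset)
    fields = LOCAL_HEADER.unpack(fileobj.read(LOCAL_HEADER.size))
    signature, name_len, extra_len = fields[0], fields[9], fields[10]
    if signature != b"PK\x03\x04":
        raise ValueError("no local file header at the recorded offset")
    fileobj.seek(name_len + extra_len, 1)  # skip file name and extra field
    raw = fileobj.read(info.compress_size)
    if info.compress_type == zipfile.ZIP_STORED:
        return raw
    if info.compress_type == zipfile.ZIP_DEFLATED:
        return zlib.decompress(raw, -15)  # raw deflate stream, no zlib header
    raise NotImplementedError("only stored and deflated members are handled")
```

No data belonging to other members is read, which is why extracting one file from a large archive stays cheap.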
The central directory is found by scanning the last 65557 bytes of the archive; see the code starting here:
/*---------------------------------------------------------------------------
Find and process the end-of-central-directory header. UnZip need only
check last 65557 bytes of zipfile: comment may be up to 65535, end-of-
central-directory record is 18 bytes, and signature itself is 4 bytes;
add some to allow for appended garbage. Since ZipInfo is often used as
a debugging tool, search the whole zipfile if zipinfo_mode is true.
---------------------------------------------------------------------------*/
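To make the tail scan concrete, here is a rough Python sketch of the idea described in that comment. It is a simplification under assumptions: the names are illustrative, ZIP64 and multi-disk archives are ignored, and the last occurrence of the signature is taken at face value even though a comment could in principle contain the same bytes.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD = struct.Struct("<4s4H2LH")  # 22-byte record, followed by the comment
TAIL = 65557  # how many trailing bytes to inspect, as in the comment above


def find_end_of_central_directory(fileobj):
    """Locate the end-of-central-directory record near the end of the file."""
    fileobj.seek(0, 2)
    size = fileobj.tell()
    start = max(0, size - TAIL)
    fileobj.seek(start)
    tail = fileobj.read()
    # Search backwards for the record signature.
    pos = tail.rfind(EOCD_SIGNATURE)
    if pos < 0 or len(tail) - pos < EOCD.size:
        raise ValueError("end-of-central-directory record not found")
    fields = EOCD.unpack_from(tail, pos)
    return {
        "eocd_offset": start + pos,
        "entries": fields[4],         # total number of central directory entries
        "cd_size": fields[5],         # size of the central directory in bytes
        "cd_offset": fields[6],       # where the central directory starts
        "comment_length": fields[7],
    }
```

With `cd_offset` and `cd_size` in hand, a reader can load the central directory and look up the offset of any member.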
Edit: OK. Notice the question has been updated, so this no longer applies: "You do not get a v0.1 but v1.0."
The version does not say how "capable" the file is, but what minimum version is required to extract that file from within the archive.
This is not the overall version for the archive!
One difference here is that, e.g., OO tags all files with the same version requirement, namely that of the file in the document (the archive as a whole) with the highest requirement.
That is, each file has a zip header that specifies the minimum version required to extract it. By the above we typically have:
So OO sets the required flag to 2.0 even though it is 1.0. This does not, however, affect the ability to open the document. (It's OK to open a manually zip'ed file in OO even though mimetype is tagged with v1.0.)
Versions
Version needed to extract, here the lower byte, 0x14, is translated by dividing and taking the modulus by 10: 0x14 = 20, 20 / 10 = 2 and 20 % 10 = 0.
Aka version 2.0.
The higher byte, 0x00, indicates what the file is compatible with. If zero, it is compatible with MS-DOS (FAT, FAT32, VFAT). Otherwise it is specified by a mapping. E.g. if I use zip with no options on my system I get 0x03, which indicates Unix. 0x0a is NTFS, etc.
Version 2.0 indicates: (4.4.3.2 Current minimum feature versions)
In your zip'ed file you have:
Aka version 1.0.
Version 1.0 is simply the default value.
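The decoding described in this section fits in a few lines of Python. This is only a sketch: `decode_version_field` and `HOST_SYSTEMS` are names invented here, and the table lists just the three host values mentioned above.

```python
# Host values mentioned in the text; anything else is reported as "other".
HOST_SYSTEMS = {
    0x00: "MS-DOS (FAT, FAT32, VFAT)",
    0x03: "Unix",
    0x0A: "NTFS",
}


def decode_version_field(value):
    """Split a 16-bit version field into ("major.minor", host description)."""
    low, high = value & 0xFF, value >> 8
    # The lower byte encodes the version as major * 10 + minor.
    major, minor = divmod(low, 10)
    host = HOST_SYSTEMS.get(high, f"other (0x{high:02x})")
    return f"{major}.{minor}", host
```

For example, `decode_version_field(0x0014)` yields version "2.0" with the MS-DOS host, and a lower byte of 0x0a yields version "1.0".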
File Hierarchy and minimum version
The reason you see version 1.0 under Version needed to extract is that what you actually see is the zip header for the file mimetype. This file is not deflated but stored with no compression. Thus you only need version 1.0 to extract that file. This, however, is not the overall version of the archive. If you look further down you'll find version 2.0 as soon as you find a file saved with deflating. You can check by e.g.:
Should give you something like:
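One way to see the per-file requirement without reading hex dumps is Python's standard `zipfile` module, whose `ZipInfo.extract_version` attribute exposes this field. The sketch below is illustrative (the function name is made up) and reports the values recorded in the central directory, which are not necessarily identical to those in the local headers.

```python
import zipfile


def version_needed_per_member(path_or_file):
    """List (filename, "major.minor" needed to extract, method) per member."""
    methods = {zipfile.ZIP_STORED: "stored", zipfile.ZIP_DEFLATED: "deflated"}
    rows = []
    with zipfile.ZipFile(path_or_file) as zf:
        for info in zf.infolist():
            # extract_version holds major * 10 + minor.
            major, minor = divmod(info.extract_version, 10)
            method = methods.get(info.compress_type, str(info.compress_type))
            rows.append((info.filename, f"{major}.{minor}", method))
    return rows
```

Running it over an .odt and over a file created with zip should show whether every member carries the same requirement or whether stored members such as mimetype are tagged lower.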
Central directory file header
There are some files with an extended header. You can list these by:
(Remember to reverse 50 4b ... to 02 01 4b 50 if hexdump -n 4 foo.odt says so.)
By this you'll typically get:
On the zip-created file you could get e.g.:
General purpose (and other flags set in odt files)
This is a bit flag. As your file is big-endian / Motorola, the flag becomes:
LibreOffice, at least, applies further modifications when saving. mimetype is always first and should not be compressed. This is to help various software identify the file and its content. By this one can e.g.:
$ hexdump -C -s 38 -n 39 foo.odt
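The offset 38 in that command is the 30-byte local file header plus the 8-byte name "mimetype". A small Python sketch of the same identification check follows. It assumes, as the text describes, that mimetype is the first member, is stored uncompressed, and has no extra field; the function name is hypothetical.

```python
import struct

LOCAL_HEADER = struct.Struct("<4s5H3L2H")  # fixed 30-byte local file header


def odf_mimetype(fileobj):
    """Return the media type stored at the start of the file, or None."""
    fileobj.seek(0)
    header = fileobj.read(LOCAL_HEADER.size)
    if len(header) < LOCAL_HEADER.size:
        return None
    fields = LOCAL_HEADER.unpack(header)
    signature, method, size = fields[0], fields[3], fields[8]
    name_len, extra_len = fields[9], fields[10]
    if signature != b"PK\x03\x04":
        return None
    name = fileobj.read(name_len)
    # Only trust the content if it is the uncompressed "mimetype" member.
    if name != b"mimetype" or method != 0 or extra_len != 0:
        return None
    return fileobj.read(size).decode("ascii")
```

This is the kind of cheap check that becomes possible when mimetype comes first and is not compressed.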
While zip typically saves all directories, OO saves a directory only if it is empty. Thus:
zip:
oo:
And so on ...
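To compare the two behaviours on your own files, you can list which members are explicit directory entries. A minimal sketch using the standard `zipfile` module (the function name is illustrative):

```python
import zipfile


def directory_entries(path_or_file):
    """Return the names of members that are explicit directory entries."""
    with zipfile.ZipFile(path_or_file) as zf:
        return [info.filename for info in zf.infolist() if info.is_dir()]
```

Directories that exist only implicitly, as a prefix of some file name, do not appear in the result.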