Unix unzip is failing but Mac Archive Utility works

archive, file format, zip

I have a bunch of files with a .zip extension that I cannot seem to extract on my HPC:

$ unzip RowlandMetaG_part1.zip
Archive:  RowlandMetaG_part1.zip
warning [RowlandMetaG_part1.zip]:  13082642473 extra bytes at beginning or within zipfile
  (attempting to process anyway)
error [RowlandMetaG_part1.zip]:  start of central directory not found;
  zipfile corrupt.
  (please check that you have transferred or created the zipfile in the
  appropriate BINARY mode and that you have compiled UnZip properly)

The size of the zip file itself is 17377631766 bytes.
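
As a quick sanity check on these numbers, the sketch below compares the file size with 4294967295 (2^32 - 1), the largest value a classic zip size or offset field can hold without the Zip64 extensions. The helper name needs_zip64 is hypothetical, and the sketch assumes a shell with 64-bit integer arithmetic (typical on a 64-bit Linux server). It only shows that the archive is past the 4 GiB mark; it does not prove that missing Zip64 support is the cause of the error.

```shell
# Values copied from the unzip output and the file size quoted above.
size=17377631766
extra=13082642473
limit=4294967295   # 2^32 - 1, largest value a non-Zip64 size/offset field can hold

# Hypothetical helper: succeed if a byte count is too large for a classic zip field.
needs_zip64() {
    [ "$1" -gt "$limit" ]
}

if needs_zip64 "$size"; then
    echo "archive is larger than 4 GiB; Zip64 support is required"
fi

# How many bytes remain once the reported "extra bytes" are subtracted.
echo "size minus extra bytes: $((size - extra))"
```

The remainder is close to 2^32, which would be consistent with a 32-bit offset wrapping around, although that interpretation is an assumption.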

However, when I download the file to my Mac and double-click it, the Archive Utility app is able to unpack it (it contains a directory with about 200 gzipped files inside).

The place that generated the file says:

The files are simply zipped here on our local lab PC running Windows, then uploaded to Dropbox…most people don’t have any problems with them and many can directly download the links I give them using the Linux wget command directly into their servers, then unzip there (the Linux utility can usually handle PC-zipped files).

I'm not sure whether the fact that the files come from Dropbox is relevant, but I used curl -LO to download them (I also tried wget, which doesn't change anything), and the files show up with ?dl=1 at the end of the file name. That said, when I download from Dropbox to my Mac, unzip still fails with the same error.

My question: is there any way to get this to unzip on the server? Is there some software that will accomplish the same thing that Archive Utility.app does, or some other way of determining which unzipping method to use?

EDIT: Based on the comments, here is some additional information:

$ file RowlandMetaG_part1.zip
RowlandMetaG_part3.zip: Zip archive data, at least v2.0 to extract
$ zip --version
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
This is Zip 3.0 (July 5th 2008), by Info-ZIP.

Also, I did try tar, but without success.

$ tar -xvf RowlandMetaG_part1.zip
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Archive contains `l@\022\t1\fjp\024uP\020' where numeric off_t value expected
tar: Archive contains `\024\311\032b\234\254\006\031' where numeric mode_t value expected
tar: Archive contains `\312\005hЈ\2138vÃ\032p' where numeric time_t value expected
# etc...

And I end up with crap in the directory like this:

$ ls
???MK??%b???mv?}??????@*??TZ?S?? ??????+??}n>,!???ӟw~?i?(??5?#?ʳ??z0?[?Ed?@?쑱??lT?d???A??T???H??
,??Y??:???'w,??+?ԌU??Wwxm???e~??ZJ]y??ˤ??4?SX?=y$Ʌ{N\?P}x~~?T?3????y?????'

Best Answer

There is a chance that, although the file name ends with ".zip", it is not actually a zip file.

You can confirm whether it is a zip file, and at the same time determine the actual file format, with the file utility:

file RowlandMetaG_part1.zip
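
If file is not available on the server, the leading "magic" bytes can be inspected directly. The following is a minimal sketch, not a replacement for file: sniff_format is a hypothetical helper name, and the list of signatures is deliberately incomplete. It also only looks at the first four bytes, so an archive with unrelated data in front of it would be reported as unknown.

```shell
# Hypothetical helper: guess an archive format from the first four bytes of a file.
sniff_format() {
    # Read the first 4 bytes as lowercase hex, e.g. "504b0304" for "PK\003\004".
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        504b0304|504b0506|504b0708) echo zip ;;    # "PK" signatures
        1f8b*)                      echo gzip ;;
        425a68*)                    echo bzip2 ;;  # "BZh"
        fd377a58)                   echo xz ;;
        *)                          echo unknown ;;
    esac
}

# Usage: sniff_format RowlandMetaG_part1.zip
```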

Once the file format is determined, you can use the appropriate tool to extract it.
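
As a rough illustration of picking a tool from the detected format, the sketch below maps a MIME type, as printed by file -b --mime-type, to the name of a commonly used extractor. The function name tool_for_mime is hypothetical, the mapping is an assumption about which tools are installed, and exact MIME strings can differ between versions of file.

```shell
# Hypothetical helper: map a MIME type to the name of a tool that usually
# extracts that format. It only prints the name and does not run anything.
tool_for_mime() {
    case "$1" in
        application/zip)                     echo unzip ;;
        application/gzip|application/x-gzip) echo gunzip ;;
        application/x-bzip2)                 echo bunzip2 ;;
        application/x-xz)                    echo unxz ;;
        application/x-tar)                   echo tar ;;
        application/x-7z-compressed)         echo 7z ;;
        *)                                   echo unknown ;;
    esac
}

# Usage: tool_for_mime "$(file -b --mime-type RowlandMetaG_part1.zip)"
```

Printing the tool name rather than running it keeps the decision visible, which is useful when the detected format is a surprise.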