JPEG Repair – Suggestions on How to Fix Corrupt Files

jpeg

( Warning, sample images are slightly NSFW; tattoo work shows a little bit of boob. )

Previously I kept a Mac OS X sparse-bundle dmg in my DropBox folder and used DropBox on two computers, adding files to the secure sparse-bundle as I went. Then one day I went to my other computer, DropBox noticed that changes had been made, and it did what it should: it synced the sparse-bundle.

Both sparse-bundle images were then corrupted. Apparently the Mac OS X resource fork data or other meta data was lost, or is not stored. I only use DropBox for simple formats now, and won't trust it for multiple computer sync beyond .txt, .jpg, .png and similar. Somehow I was able to restore some files, pick and poke at the "bands" in the sparse-bundle and at least get the sparse-bundle to open and pass the security check. About 90% of the files are in good shape; the remaining ~10% I have so far been unable to open at all.

Sample images used in the tests below:
http://dl.dropbox.com/u/340087/drops/05.14.11/DSC02322-bad.jpg
http://dl.dropbox.com/u/340087/drops/05.14.11/DSC02322-good.jpg

Here is what I know so far:

DSC02322-bad.jpg
DSC02322-good.jpg

No app will open the 'DSC02322-bad.jpg' file. Even terminal commands like cat, head, tail and others show it as if it were empty.

<pre>MD5 (DSC02322-bad.jpg) = 052b34d47cbcc64c208230d84098ceb3
MD5 (DSC02322-good.jpg) = a37995df587f7f146b8950199ecae544</pre>

( The md5 checksums differ, so the files are indeed different. )

File sizes are the same:

    -rw-r--r--@  1 me  staff  2178036 May 26  2009 DSC02322-bad.jpg
    -rw-r--r--@  1 me  staff  2178036 May 14 23:26 DSC02322-good.jpg

A more detailed listing of the files still shows the same file size, but the meta data is different:

    2128 -rw-r--r--@ 1 me  staff  2178036 May 26  2009 DSC02322-bad.jpg
        com.apple.metadata:kMDItemWhereFroms        179
    2128 -rw-r--r--@ 1 me  staff  2178036 May 14 23:26 DSC02322-good.jpg
        com.apple.metadata:kMDItemWhereFroms        145
        com.apple.quarantine         42

Here is the output of the file command for both files:

    $file DSC02322-*
    DSC02322-bad.jpg:  data
    DSC02322-good.jpg: JPEG image data, EXIF standard 2.21

This baffles me entirely:
I copy the file to 'z-copy.txt', poke it with file, then cat it. The output appears empty, yet the file size is still 2178036 bytes.

    me@macbook:\ $cp DSC02322-bad.jpg z-copy.txt
    me@macbook:\ $file z-copy.txt
    z-copy.txt: data
    me@macbook:\ $cat z-copy.txt
    $ls -la z-copy.txt
    -rw-r--r--@ 1 me  staff  2178036 May 15 00:01 z-copy.txt

Other things I tried:

    $cat DSC02322-bad.jpg > try-again.txt
    -rw-r--r--@  1 me  staff  2178036 May 14 23:53 DSC02322-bad.jpg
    -rw-r--r--+  1 me  staff  2178036 May 15 00:04 try-again.txt
    $cat try-again.txt
    ( Returns an empty line )

Neither file has a resource fork; DropBox seems to have nuked those, and the data forks of both 'appear' identical to me. The only way I can tell the files are different is the md5 mismatch and a diff; the file size and other normal file info make them 'appear' identical. Duplicating the file does nothing helpful. I am pretty sure I only need to add back some arbitrary jpg header data, but I am not certain which tools are best for this task, or what data to add. I do have backups of these particular files, and they are not all that important, but there are others, not all of them jpgs, such as flv files and proprietary documents, that I need to get back and no longer can. Many are too old to even be recoverable by DropBox.

Playing around in various apps that can read jpg file data, the one error I get that makes a little sense is:

    Could not place the document ‘DSC02322-bad.jpg’ because a JPEG marker segment
    length is too short (the file may be truncated or incomplete).

As a last-ditch effort I made a small 1×1 px jpg and appended/prepended its data to the bad file, hoping the header data would let me open the file, copy it into a new document, and save a clean copy. So far this has not worked.

I have uploaded the file to imgur and a few other image sharing services; they all error out with a corrupt file message. I have not had a chance to try reading the file with php or some other scripting language that may give me more specific error messages. Graphic Converter gets the closest: it tells me the file is corrupt, actually shows me the data fork, and lets me try to guess a best-case scenario for opening the file. I still get nothing in the end.

That is about all I am good for in today's troubleshooting. I haven't put a ton of effort into it as of yet, but it will soon become a war which I will have to win 🙂 Any suggestions?

I would be happy to know what CLI app is able to at least show me what these 2178036 bytes are.

Ah well, looks like diff at least is being somewhat smart, not that it helps me a heck of a lot at this point:

    $diff DSC02322-*
    Binary files DSC02322-bad.jpg and DSC02322-good.jpg differ

Suggestions? Thanks all.

Best Answer

Your DSC02322-bad.jpg file is 2178036 nul bytes (i.e. 2178036 single byte zeros):

$ hexdump DSC02322-bad.jpg 
0000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0213bf0

Your DSC02322-good.jpg file is 2178036 bytes of JPEG data. There's nothing to repair in DSC02322-bad.jpg as there's nothing left.
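
If there are many files to triage, the same check can be scripted instead of reading hexdump output by eye. The following is a minimal Python sketch, not a recovery tool: the function name `is_all_nul` and the chunk size are my own choices for illustration, and all it does is report whether a file consists entirely of zero bytes.

```python
def is_all_nul(path, chunk_size=1 << 20):
    """Return True if the file is non-empty and every byte in it is 0x00."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file
            seen_data = True
            # If the number of nul bytes is less than the chunk length,
            # the chunk contains at least one real data byte.
            if chunk.count(b"\x00") != len(chunk):
                return False
    # An empty file is reported as False: it has no bytes at all.
    return seen_data
```

Calling `is_all_nul("DSC02322-bad.jpg")` would be expected to return `True` for a file like the one dumped above, and `False` for any file that still holds some real data and might therefore be worth a closer look.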

By the way, the file command says "data" when it doesn't know what else to say.
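
For context, file recognises most formats by looking for known signature bytes, and a JPEG file starts with the bytes FF D8 FF. As a rough illustration only (this is a simplification, not how file is actually implemented, and `looks_like_jpeg` is a name I made up), a signature check might be sketched like this:

```python
# FF D8 is the JPEG Start Of Image marker; the third FF begins the next marker.
JPEG_SIGNATURE = b"\xff\xd8\xff"

def looks_like_jpeg(path):
    """Return True if the file starts with the JPEG signature bytes."""
    with open(path, "rb") as f:
        return f.read(len(JPEG_SIGNATURE)) == JPEG_SIGNATURE
```

A file full of nul bytes fails this check on the very first byte, which is consistent with file falling back to "data". Passing the check only says the first three bytes look right; it does not prove the rest of the image is intact.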
