Recover in-memory Pages data from failed hibernation wakeup

hibernate, memory, pages, snow leopard

My girlfriend's MacBook crashed while attempting to restore from a hibernation image. The progress bar stopped at ~10%, after which we restarted the computer for a normal startup.

The hibernated memory image had an unsaved document open in Pages, which we'd like to recover. There is a sleepimage in /private/var/vm, which I assume is the hibernation image that never got restored correctly. We backed up this file so that it can't be overwritten.

We tried strings sleepimage | grep known_substring, but it returned nothing. grep -a known_substring sleepimage also found nothing, so I'm assuming that Pages didn't keep the text data in memory as plain text.

Edit: After reading this answer on Binary grep, I tried perl -ln0777e 'print unpack("H*",$1), "\n", pos() while /(null_padded_substring)/g' sleepimage, which was also fruitless. I padded the substring with nulls in order to attempt a match for UTF-16 text. Then I tried with .* globs between each character; still no dice.

So Pages probably doesn't store text in any common encoding in memory. I would need to find a translation rule between an ASCII string and the Pages data representation; I'm thinking maybe some kind of Objective-C string buffer. To me it seems very weird to store character data as anything other than a sequence of characters, but this seems to be what Pages is doing.

If you have any idea on how to figure out the in-memory representation of text inside Pages, it might be very helpful in solving this problem. Maybe I can dump and read the process memory in some simple way?

Another possible solution is simpler — I'm assuming it is somehow possible to reboot the computer from this sleepimage, but I can't find any documentation as to how you would proceed with that. Some other users (macrumors) seem to have encountered this, but for all the forum questions I've found, none of them have responses.

The OS X version is Snow Leopard, 10.6.8.

Complex suggestions involving programming are welcome. I do C and Python.

Thank you.

Best Answer

Update with pictures:

  • the loobsdpkdbik identifier I first mentioned isn't one; it just happened to be before my text the first time I tried it.

  • part of the text seems to get "lost" (i.e. not saved in one continuous memory stretch) and this may worsen with RAM usage

  • you may not be able to recover meaningful text from the sleepimage
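Since the text may not sit in one continuous stretch, searching for a long known phrase can miss it entirely. A rough sketch of an alternative, assuming the text is stored as ASCII characters each followed by a 0x00 byte (as described at the end of this answer): dump every run of such characters and piece the fragments together by hand. The function name utf16le_runs and the min_chars threshold are made up for illustration, and non-ASCII characters such as the typographic apostrophe will break a run.

```python
import mmap
import re
import sys


def utf16le_runs(data, min_chars=4):
    """Yield (offset, text) for each run of at least `min_chars` printable
    ASCII characters where every character is followed by a 0x00 byte."""
    # One "character" here is a printable ASCII byte plus a null byte.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("utf-16-le")


if __name__ == "__main__" and len(sys.argv) == 2:
    # Map the image read-only so it is not loaded into RAM all at once.
    with open(sys.argv[1], "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for offset, text in utf16le_runs(mm, min_chars=8):
            print(f"{offset:#x}\t{text}")
```

Run it against the backed-up copy of the sleepimage and redirect the output to a file; expect a lot of unrelated strings from other applications, so grepping the output for a word you remember is probably the quickest way in.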

Now my original text (with a typo in the 1st paragraph, sorry Mr. Matisse):

Hidden Gems: MoMa’s Abby Aldrich Rockefeller Sculpture Garden, designed by Philip Johnson in 1953, is a spectacular urban oasis with its reflecting pools and beautiful landscaping. This outdoor gallery is installed with changing displays of outdoor sculpture, including works by Aristide Maillol, Alexander Calder, Henri Maisse, Pablo Picasso, and Richard Serra.

While visiting the new painting and sculpture galleries at MoMa, be sure to traverse the staircase bridging the forth and fifth floors in order to see Henri Matisse’s monumental image of joy and energy, Dance (1909). The painting was originally intended to hang in the stair hall of a Russian palace in Moscow.

And the recovered text:

Hidden Gems: Ma s Abby Aldrich Rockeller Sculpre Gn, desigd by Phip John 1953, is spectacular ursithtseflecting pools autifulandscapg. This outdoor gallery is italled with changing displays of outor sculpre, includg workby Aristide Maillol, Alexander Calder, Henri Maisse, Pabloicasso, anchard Sea.

While ving the new paintg sculpture gallies at Ma, be sure to traver t stase bridging the forth fth flrsn ordeto s Henri Matse s mtal imagof joy and ey, Dan (19). The painting waorinally intded to hg t stair hall of Rsian palace Moscow.

And the screenshots:

Original text in Pages

Recovered text from sleepimage


It seems that for an (unsaved) Pages document, (almost) all characters in your text are separated by 0x00 in memory, so STRING becomes S.T.R.I.N.G, with . standing for 0x00. That layout is consistent with UTF-16 text. So you can either search for that pattern (I can recommend 0xED as a graphical front end), or search for loobsdpkdbik, which seems to be (part of) an identifier that comes 5 bytes before the text (at least in the one case I looked at).
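If you would rather script the search than use a hex editor, here is a minimal Python sketch. It assumes the 0x00-separated layout is little-endian UTF-16, so that encoding the known substring with Python's "utf-16-le" codec produces exactly the S.T.R.I.N.G byte pattern. The function name find_utf16le is made up for illustration.

```python
import mmap
import sys


def find_utf16le(path, needle):
    """Return the byte offsets at which `needle`, encoded as UTF-16LE
    (each ASCII character followed by 0x00), occurs in the file at `path`."""
    pattern = needle.encode("utf-16-le")  # "AB" -> b"A\x00B\x00"
    offsets = []
    with open(path, "rb") as f:
        # Map the file read-only so a multi-gigabyte image is not read into RAM.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(pattern)
            while pos != -1:
                offsets.append(pos)
                pos = mm.find(pattern, pos + 1)
    return offsets


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python find_utf16le.py sleepimage known_substring
    for offset in find_utf16le(sys.argv[1], sys.argv[2]):
        print(hex(offset))
```

Keep the substring short (a single distinctive word), since longer phrases are more likely to be split across non-adjacent memory regions. The printed offsets can then be opened in a hex editor to read the surrounding bytes.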