Excel – How to convert scanned handwritten tables to Excel spreadsheets

Until now, my grandparents have handwritten their financial records. Their non-cursive handwriting is neater and more legible than the handwriting in the pictures below. After they scan each page, can Excel 2019 automatically convert the scanned image to an Excel spreadsheet? Even if OCR recognizes the text and numbers, arranging each piece of text and each number into the right cell by hand would take too much time.

Here's the second picture's source. This 2016 Reddit post yields nothing helpful.

[Image 1: scan of a handwritten table]

[Image 2: scan of a handwritten table]

Best Answer

I have to agree with music2myear’s answer.

With any computer and software you are likely to have access to, there is no practical way to go automatically from handwritten records to Excel.

There are at least three difficult tasks:

  1. Distinguishing "content" from non-content.
  2. Recognizing the layout and translating that to cell locations.
  3. Recognizing the handwritten characters and translating them to text.

Consumer software and online services are available that do a reasonable job of converting machine-printed text in a clean table format to a spreadsheet file. But even the best can be far from perfect, and that covers only the second task: assigning text to the right cell based on its position.
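To give a feel for what "assigning text to the right cell based on its position" involves, here is a minimal Python sketch. It assumes an OCR step has already produced each word with the center of its bounding box, and that the grid line positions are known; the `assign_cells` function, the tuple layout, and the sample numbers are all made up for illustration and don't correspond to any particular product.

```python
from bisect import bisect_right

def assign_cells(words, col_edges, row_edges):
    """Place OCR'd words into grid cells by the center of each bounding box.

    words: list of (text, x_center, y_center) tuples (hypothetical OCR output).
    col_edges / row_edges: sorted pixel positions of the grid lines.
    Returns a dict mapping (row, col) to a list of texts.
    Words whose center falls outside the grid are skipped.
    """
    n_rows = len(row_edges) - 1
    n_cols = len(col_edges) - 1
    cells = {}
    for text, x, y in words:
        # Index of the last grid line at or before the coordinate.
        col = bisect_right(col_edges, x) - 1
        row = bisect_right(row_edges, y) - 1
        if 0 <= row < n_rows and 0 <= col < n_cols:
            cells.setdefault((row, col), []).append(text)
    return cells

# Made-up example: a 2x2 grid, 100 px per column and 40 px per row.
words = [("Rent", 30, 15), ("1200", 150, 18), ("Food", 25, 60), ("310", 160, 62)]
grid = assign_cells(words, col_edges=[0, 100, 200], row_edges=[0, 40, 80])
```

This only works because the sample grid is perfectly regular and every word sits well inside its cell. A skewed scan, a wobbly hand-drawn line, or an entry that spills across a boundary breaks the simple lookup, which is where the real tools start making mistakes.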

When you look at those images, your brain is very good at sorting out what is "preprinted form", what is content, what is noise, and which human markings aren't relevant. You can recognize how things are aligned, and what goes with what based on context. To the computer, everything that isn't the background color is "something". Figuring out which of that is important to you, and which of it could be some kind of character to be translated, is extremely difficult. And if the content overlaps preprinted lines, that introduces breaks and missing data that the computer can't easily handle.
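The "everything that isn't the background color is something" point can be seen in a toy Python sketch of thresholding, which is a common first step in image processing. The pixel values, the `foreground_mask` helper, and the tolerance are invented for illustration.

```python
def foreground_mask(gray, background=255, tolerance=40):
    """Mark every pixel that differs noticeably from the background as foreground."""
    return [[abs(px - background) > tolerance for px in row] for row in gray]

# Tiny made-up grayscale patch: 255 is white paper, lower values are darker.
patch = [
    [255, 250, 120, 255],   # 120: preprinted grid line
    [255,  30, 120, 255],   # 30: pen stroke
    [200, 255, 120,  90],   # 200: faint smudge, 90: stray slash
]
mask = foreground_mask(patch)
```

In the resulting mask, the grid line, the pen stroke, the smudge, and the stray slash are all simply `True`. Nothing at this stage says which of them is the content you care about.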

Take your images, for example. The first image is a lost cause. Much of it ignores the lines and layout. You would have the additional task of separating and removing the preprinted grid from the content. In the second image, the content is mostly within the bounds of the grid, but there are lots of stray markings (slashes, underlines, etc.) that would require cleanup.
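As a rough illustration of why removing a preprinted grid is not free, here is a naive Python sketch that erases any row of a black-and-white mask that is mostly "ink", on the assumption that such a row is a ruled line. The `erase_horizontal_lines` helper, the 0.8 threshold, and the tiny mask are made up; real tools use more careful methods.

```python
def erase_horizontal_lines(mask, min_fill=0.8):
    """Blank out rows of a binary mask that are mostly foreground (likely ruled lines)."""
    cleaned = []
    for row in mask:
        fill = sum(row) / len(row)          # fraction of the row that is ink
        cleaned.append([0] * len(row) if fill >= min_fill else list(row))
    return cleaned

# Made-up 5x6 mask: row 2 is a ruled line, column 3 is a pen stroke crossing it.
mask = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
cleaned = erase_horizontal_lines(mask)
stroke = [row[3] for row in cleaned]   # the pen stroke after line removal
```

The ruled line is gone, but so is the part of the pen stroke that crossed it: `stroke` now has a gap in the middle. That gap is the kind of break that later confuses character recognition.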

The toughest part, though, is recognizing handwriting and converting that to computer text. For image 1, even humans would have trouble figuring out what some of that is, and it would involve a lot of guessing based on context and familiarity with the words. In image 2, most of the numbers aren't too bad, but the text would be a problem.

If your grandparents' records are non-cursive, neat, legible, consistent, and similar to machine printing, OCR might do a "reasonable" job on them. But you would still have a lot of cleanup.

For perspective, the US Postal Service has some of the most advanced handwriting recognition, which it uses to read addresses on mailpieces so they can be sorted with automated equipment. The only way they are able to do it is because the addresses are in a prescribed structure and format, and they know every possible address ahead of time. The objective is more to match the handwritten addresses to viable candidates than to get every character right.

There is a ton of redundancy. If you can only decipher half of the characters, there still may be only one or a few possible matches. Even with that, a substantial portion requires human intervention. When it's done and the mail gets to the carrier for delivery, the carrier knows the addresses and names on their route, and they check it all to ensure that the addresses weren't misinterpreted.
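The value of that redundancy is easy to demonstrate with a small Python sketch using the standard library's `difflib`. This is not how the Postal Service's system works; it is only an illustration of matching a badly read string against a known list. The addresses and the `match_address` helper are invented.

```python
from difflib import get_close_matches

# Hypothetical master list of every valid address.
known_addresses = [
    "12 OAK STREET SPRINGFIELD",
    "12 OAK AVENUE SPRINGFIELD",
    "48 ELM STREET SPRINGFIELD",
    "7 MAPLE COURT SHELBYVILLE",
]

def match_address(ocr_text, candidates, cutoff=0.6):
    """Return up to three known addresses that resemble the OCR output, best first."""
    return get_close_matches(ocr_text.upper(), candidates, n=3, cutoff=cutoff)

# Several characters misread: 8 -> B, M -> N, E -> F, I -> 1.
garbled = "4B ELN STRFET SPRINGF1ELD"
matches = match_address(garbled, known_addresses)
best = matches[0] if matches else None
```

Even with four wrong characters, the best match is the intended Elm Street entry, because only a handful of strings are valid at all. Free-form financial records have no such list to fall back on, so every misread character stays wrong until a person fixes it.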

That's the level of handwriting OCR with state-of-the-art technology and an extremely controlled range of possibilities to compare against. Your task needs to translate every character. You don't have a master list of all the words that could legitimately be in those records (other than a dictionary of the entire language). OCR would require so much cleanup that it would be faster to simply read the records and type them into Excel. That's not an unusual task, and professional data entry people can do it pretty quickly and inexpensively.
