Excel HTML Import – Prevent Values from Converting to Dates

conversionhtmlimportmicrosoft excelmicrosoft-excel-2010

I have some HTML files that contain tables, which I need to perform some analysis on.

I can open them in Excel, and it preserves all the table formatting and layout (which is what I want).

The problem is that it, by default, formats all cells as "general". This means Excel's "smart" data conversion kicks in, which, as has been noted my many on stackexchange in the past, causes all kinds of problems when codes and names show up as dates and get converted to a number.

There are ways to get around this when importing from plaintext, forcing Excel to bring up a wizard which allows you to change the import format from "general" to "text". How do I make Excel treat everything as text for an HTML file?

Is there perhaps some way I can change a global Excel setting that stops the general format from converting dates? Or is there some way specifically for opening html files that will stop the "general" format from being applied?

Best Answer

I would use the Power Query Add-In for this. Power Query can read HTML files (local or web). It looks for tables so there will need to be some consistency in the HTML structure. Once the HTML table has been read, it will try to automatically detect dates - you can override this and convert columns manually.

Related Question