Excel – How to stop Excel from eating the delicious CSV files and excreting useless data

csvmicrosoft excel

I have a database which tracks sales of widgets by serial number. Users enter purchaser data and quantity, and scan each widget into a custom client program. They then finalize the order. This all works flawlessly.

Some customers want an Excel-compatible spreadsheet of the widgets they have purchased. We generate this with a PHP script which queries the database and outputs the result as a CSV with the store name and associated data. This works perfectly well too.

When opened in a text editor such as Notepad or vi, the file looks like this:

"Account Number","Store Name","S1","S2","S3","Widget Type","Date"
"4173","SpeedyCorp","268435459705526269","","268435459705526269","848 Model Widget","2011-01-17"

As you can see, the serial numbers are present (in this case twice, not all secondary serials are the same) and are long strings of numbers.
When this file is opened in Excel, the result becomes:

Account Number  Store Name  S1           S2  S3           Widget Type       Date
4173            SpeedyCorp  2.68435E+17      2.68435E+17  848 Model Widget  2011-01-17

As you may have observed, the serial numbers are enclosed by double quotes. Excel does not seem to respect text qualifiers in .csv files. When importing these files into Access, we have zero difficulty. When opening them as text, no trouble at all. But Excel, without fail, converts these files into useless garbage. Trying to instruct end users in the art of opening a CSV file with a non-default application is becoming, shall we say, tiresome. Is there hope? Is there a setting I've been unable to find? This seems to be the case with Excel 2003, 2007, and 2010.

Best Answer

But Excel, without fail, converts these files into useless garbage.

Excel is useless garbage.

Solution

I would be a little surprised if any client wanting your data in an Excel format was unable to change the visible formatting on those three columns to "Number" with zero decimal places or to "text." But let's assume that a short how-to document is out of the question.

Your options are:

  1. Toss a non numeric, not whitespace character into your serial numbers.
  2. Write out an xls file or xlsx file with some default formatting.
  3. Cheat and output those numbers as formulas ="268435459705526269","",="268435459705526269" (you can also do ="268435459705526269",,="268435459705526269" saving yourself 2 characters). This has the advantage of displaying correctly, and probably being generally useful, but subtly broken (as they are formulas).

Be careful with option 3, because some programs (including Excel & Open Office Calc), will no longer treat commas inside ="" fields as escaped. That means ="abc,xyz" will span two columns and break the import.

Using the format of "=""abc,xy""" solves this problem, but this method still limits you to 255 characters because of Excel's formula length limit.

Related Question