Excel – Removing Line Breaks From CSV for Proper Excel Import

batchcsvmicrosoft excelnotepadtext formatting

So I've created a batch script that outputs some text to a CSV. When I open the CSV in Notepad, it looks like this, which is how I want it.

173030,8,2793,2821244
173032,1,2793,2820692
173033,1,2793,2821120
173035,2,2793,2820788

However, in Excel it appears as:

173030   8
2793
2821244

173032   1
2793
2820692

173033   1
2793
2821120

173035   2
2793
2820788

The point being that there are some unwanted line breaks. It appears that Notepad only recognize a CRLF line break, but there are some other line breaks that are either only CR or LF that Excel sees that Notepad doesn't. What's the best way to take care of this? Since I'm already creating it via a Batch script, it'd be best if there's a Batch command that could remove the unwanted line breaks. But if there's maybe an excel setting or something I can change, that could work too.

Best Answer

If your example is accurate, you may not be creating something that Excel really can make sense of. I would recommend trying to put a row containing headings as the first row to see if that helps.

Also check your line endings to make sure you are adding \r\n (e.g. Windows line endings) rather than just \n - Excel should cope with the later but it can be hard to know.

Finally, make certain that your batch output is not including spurious characters. For example, are you sure that the numbers don't have leading or trailing characters such as a binary code <32?

UPDATE:

OK, that file you shared is not a valid CSV. You must take the odd characters out. I've just opened it in the Brackets text editor and the lines are split there. That is why Excel is doing the same.

On the first line for example, after the number 8, you have 13 spaces followed by a carriage return (\n)

UPDATE 2: Options for cleaning up the output. Here are a few possibilities that you could choose from depending on your knowledge, background and the allowed capabilities on your desktop:

  • PowerShell - you can use the features of PS to fairly easily cleanse the data and output to a valid CSV (PS has native CSV capabilities). That may mean learning PS of course.
  • Node.JS - my favourite tool for all platforms! A cross-platform JavaScript host that lets you run both command line and "server" processes. It has good native file system access and has vast arrays of libraries for doing pretty much everything. Great if you already know JavaScript from web development or want to minimise the number of languages you have to deal with. Also you can use node-webkit for creating cross-platform UI's with the power of Node behind them.
  • Windows Scripting Host - also capable of running JavaScript (well JScript anyway which is close). You can also use VBScript if you must. Can be run from the command line as long as it hasn't been disabled by zealous admins. You can use the same code in a HTA if you need a UI.
  • Google Refine - a very useful ETL tool for tidying up and transforming data. A bit long in the tooth now and rarely updated but it works. Free.
  • Excel PowerQuery - an addin from Microsoft that is hugely useful for ETL as long as the L is Excel! I use this all the time now. Free, part of Microsoft's BI tools.

There are many others of course, these are just a few as examples.

Related Question