Unable to open CSV file

csv, launch-services, numbers, textedit

I have a reasonably sized CSV file (40 megabytes). I know that some apps don't work with certain encodings and throw an error if the encoding isn't ASCII. But it's alarming that even TextEdit.app is unable to open this file.

First I tried Numbers.app. I made sure this isn't a Launch Services error by opening the file from within the app. It wouldn't load; the progress bar always gets stuck halfway. When I opened the CSV file with TextEdit.app, it wouldn't load either. Not even Google Sheets could open it. The file is just 40 megabytes and contains only ASCII characters. Running the file command in Terminal.app returns the following message:

file.csv: ASCII text, with very long lines.

I am able to open the file in Visual Studio Code but not in TextEdit.app. How can I fix it? I have already stripped any \r characters from the file.

Update: Running wc -l file.csv returns 176831. My system is a 2016 MacBook Pro with 16 GB RAM.

Best Answer

It is quite common for GUI programs to be unable to handle large text files. Although 40 MB doesn't sound large by today's standards, the file might take up a lot more space in memory depending on how the application is written, and GUI applications often aren't the most efficient ones.

You might want to split the text file into several smaller ones using the terminal. First, check whether you can open the file using less filename.csv in the Terminal, and whether the characters read fine. If not, the file might be corrupted, and that might be the issue.

For the actual splitting, try using something like this in the terminal:
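Since file reports "very long lines", it may also help to look at the line count and the longest line before picking a chunk size. The following is a minimal sketch, not part of the original answer; the function name inspect_csv is made up for illustration, and it assumes the file is plain ASCII as described in the question.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): print the number of lines
# and the length of the longest line in a text file.
inspect_csv() {
    local file="$1"
    # One pass over the file: track the longest line seen so far.
    awk '{ if (length($0) > max) max = length($0) }
         END { printf "lines=%d longest=%d\n", NR, max }' "$file"
}

# Usage: inspect_csv file.csv
```

If the longest line turns out to be very large, a handful of huge rows rather than the overall file size might be what the GUI apps are choking on.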

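#!/bin/bash
# Split a large text file into numbered chunks of N lines each.
N=10000 # Number of lines per file
i=1     # First line of the current chunk
j=0     # Index appended to each output filename
filename="hugefile.csv"
extension=.csv
# Count the lines once instead of re-running wc on every iteration.
total=$(($(wc -l < "$filename")))
while [ "$i" -le "$total" ]
do
    newfilename="$(basename "$filename" "$extension")$j$extension"
    echo "$newfilename: $i"
    # Print lines i through i+N-1, so consecutive chunks do not overlap.
    sed -n "$i,$((i+N-1))p" "$filename" > "$newfilename"
    j=$((j+1)); i=$((i+N))
done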

Copy and paste that into a plain text document (e.g. TextEdit in plain text mode or nano in the Terminal) and name it split.sh or something similar. Customize the parameters N and filename as needed: enter the desired number of lines per file in N=... and the name of your source file in filename="...". Then run it with bash split.sh from the directory that contains the source file. This will generate as many files in your current directory as are needed to cover all the lines of the source file, in smaller files of N lines each. The files will have a number appended, e.g. hugefile0.csv, hugefile1.csv, and so on.

Now you should be able to open each of these files in your desired application. It's often easier to work with smaller portions of one large file than with the whole file at once. You could even open the resulting CSV files in Numbers one after another and copy the lines from each file into one large Numbers document. That way the importer probably won't hang the way it does on the full file.
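One thing to watch for: if the CSV starts with a header row, only the first chunk will contain it. The sketch below is one possible variant that repeats the header at the top of every chunk. It is an illustration under assumptions, not the method from the script above: the function name split_csv_with_header and the output prefix are made up, the first line is assumed to be a header, and no field is assumed to contain an embedded newline.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): split a CSV into chunks of
# n data rows each, writing <prefix>0.csv, <prefix>1.csv, ... and copying
# the header row to the top of every chunk.
split_csv_with_header() {
    local file="$1" n="$2" prefix="$3"
    awk -v n="$n" -v prefix="$prefix" '
        NR == 1 { header = $0; next }      # remember the header row
        (NR - 2) % n == 0 {                # every n data rows, start a new chunk
            if (out != "") close(out)      # avoid running out of open files
            out = sprintf("%s%d.csv", prefix, chunk)
            chunk++
            print header > out
        }
        { print > out }                    # append the current data row
    ' "$file"
}

# Usage: split_csv_with_header hugefile.csv 10000 hugefile
```

This reads the source file only once, whereas calling sed inside a loop rescans it for every chunk. If no header handling is needed, the standard split utility (split -l 10000 hugefile.csv) also cuts a file into pieces of a fixed number of lines.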

In case you get any errors regarding sed or awk, that may be because macOS ships the BSD versions of sed and awk, which differ in some options from the GNU versions found on most Linux systems. In that case, you might need to install GNU sed and awk from something like MacPorts or Homebrew.
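To see which sed you are actually running, something like the following could be used. This is a rough sketch; the function name sed_flavor is made up, and it relies on GNU sed printing a version line that starts with "sed (GNU sed)" while the sed bundled with macOS rejects the --version option.

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): report whether the sed
# found on PATH identifies itself as GNU sed.
sed_flavor() {
    # GNU sed answers --version with a line starting "sed (GNU sed)";
    # other implementations print an error or something else entirely.
    if sed --version 2>/dev/null | grep -q '^sed (GNU sed)'; then
        echo "gnu"
    else
        echo "other"
    fi
}

# Usage: sed_flavor
```

With Homebrew, the gnu-sed and gawk packages install the GNU tools under the names gsed and gawk, so the script would need to call those names instead of sed and awk.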