I have a reasonably sized CSV file (40 megabytes). I know that some apps won't work with certain encodings, and that they will throw an error if the encoding isn't ASCII. But it's alarming that even TextEdit.app is unable to open this file.
First I tried Numbers.app. I made sure this isn't a Launch Services error by opening the file from within the app. It wouldn't load; the progress bar always gets stuck halfway. When I open this CSV file with TextEdit.app, it doesn't load either, and neither does Google Sheets. The file is just 40 megabytes and has only ASCII characters. Running the file command in Terminal.app returns the following message:
file.csv: ASCII text, with very long lines
I am able to open the file using Visual Studio Code but not in TextEdit.app. How can I fix it? I have already cleaned the file of any \r characters.
Update: Running wc -l file.csv returns 176831. My system is a 2016 MacBook Pro with 16 GB RAM.
Best Answer
It is quite common for GUI programs to be unable to handle large text files. Although 40 MB doesn't sound large by today's standards, it might bloat up to a lot more in memory depending on how the application is written, and GUI applications often aren't the most efficient ones.
You might want to split the text file into multiple smaller ones using the terminal. First, check whether you can open the file using
less filename.csv
in the Terminal, and whether the characters read fine. If not, the file might be corrupted, and that might be the issue.
For the actual splitting, try using something like this on the terminal:
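A minimal sketch of what such a script could look like, assuming awk does the splitting. The parameter names N and filename and the numbered output names follow the description in this answer; the helper name split_by_lines, the example values, and the assumption that the filename has an extension are for illustration only.

```shell
#!/bin/sh
# Sketch only: split a text file into numbered chunks of N lines each.

N=20000                  # desired number of lines per output file (example value)
filename="hugefile.txt"  # source file to split (example name)

# split_by_lines FILE LINES
# Writes FILE's lines into <name>0.<ext>, <name>1.<ext>, ... in the
# current directory, LINES lines per file (the last file may be shorter).
# Assumes FILE has an extension such as .txt or .csv.
split_by_lines() {
    name=$(basename "$1")
    base="${name%.*}"    # file name without its extension
    ext="${name##*.}"    # extension without the dot
    awk -v n="$2" -v base="$base" -v ext="$ext" '
        (NR - 1) % n == 0 {              # first line of a new chunk
            if (out != "") close(out)    # close the previous chunk
            out = sprintf("%s%d.%s", base, chunk++, ext)
        }
        { print > out }                  # append the line to the current chunk
    ' "$1"
}

# Only run the split when the source file is actually there.
if [ -f "$filename" ]; then
    split_by_lines "$filename" "$N"
else
    echo "source file not found: $filename" >&2
fi
```

With the example values above, a 176831-line hugefile.txt would end up as hugefile0.txt through hugefile8.txt.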
Copy and paste that into a plain text document (e.g. TextEdit in plain text mode or nano on the Terminal) and name it split.sh or something similar. Customize the parameters N and filename as needed, e.g. enter the desired number of lines per file in N=... and the filename of your source file as filename="...". This will generate the necessary number of files in your current directory to cover all the lines of the source file in smaller files of N lines each. The files will have a number appended, e.g. hugefile0.txt to hugefile9.txt or something like that.
Now you should be able to open each of these files in your desired application. It's often desirable to work with smaller portions of one large file than with the whole file at once. You could even open the resulting CSV files in Numbers one after another and copy the lines from each file into one large Numbers document. That way the importer probably won't hang on such a large file.
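Before importing the pieces, it may be worth checking that no lines were lost in the split. The following is a small sketch of such a check; the helper name count_lines and the file names hugefile.txt and hugefile0.txt, hugefile1.txt, ... are assumptions that match the example names used above.

```shell
#!/bin/sh
# Sketch only: compare the line count of the source with the sum of its chunks.

# count_lines FILE...
# Prints the total number of lines across all given files.
count_lines() {
    cat "$@" | wc -l | tr -d ' '   # tr strips the padding macOS wc adds
}

src="hugefile.txt"   # assumed source file name

if [ -f "$src" ]; then
    # hugefile[0-9]*.txt matches the numbered chunks but not the source itself.
    if [ "$(count_lines "$src")" = "$(count_lines hugefile[0-9]*.txt)" ]; then
        echo "all lines accounted for"
    else
        echo "line counts differ" >&2
    fi
fi
```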
In case you get any errors regarding sed or awk, that's because the sed and awk that ship with macOS are different from the regular (GNU) sed and awk. In that case, you might need to install GNU sed and awk from something like MacPorts or Homebrew.
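As a sketch of how a script could prefer the GNU tools when they are present, assuming Homebrew, whose gnu-sed and gawk formulae install the commands as gsed and gawk. The helper name pick_tool and the variables SED and AWK are hypothetical.

```shell
#!/bin/sh
# Sketch only: use the GNU variant of a tool if it is installed.
# With Homebrew, the GNU versions could be installed via:
#   brew install gnu-sed gawk

# pick_tool NAME
# Prints "gNAME" if that command exists, otherwise NAME itself.
pick_tool() {
    if command -v "g$1" >/dev/null 2>&1; then
        echo "g$1"
    else
        echo "$1"
    fi
}

SED=$(pick_tool sed)   # gsed when available, else the system sed
AWK=$(pick_tool awk)   # gawk when available, else the system awk

echo "using $SED and $AWK"
```

A script would then call "$SED" and "$AWK" instead of sed and awk.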