MySQL – 11.2 GB text file import into MySQL or PostgreSQL

Tags: import, mysql, postgresql

I am a bit out of my depth here, perhaps, but I submit this query hoping someone has experience importing a large (11.2 GB) tab-delimited text file (2.2 GB as a tar.gz archive) into either MySQL 5.5.41 or PostgreSQL 9.3.6. Double precision will be required for the fields, as it is spatial data (latitude, longitude and elevation).

I have MySQL and PostgreSQL set up on Ubuntu 14.04, using phpMyAdmin and phpPgAdmin to interact with the servers. However, I realise that command-line interaction will probably be better.

I have read around a bit, and there seems to be a way to split the data for MySQL (SQLDumpSplitter2 – http://www.rusiczki.net/2007/01/24/sql-dump-file-splitter/).

Any suggestions or alternative options greatly appreciated.

Thanks
Paul

Best Answer

For PostgreSQL, I would recommend using pgloader.

pgloader leaves the rows that it cannot import in a separate text file, so you can take a look at what went wrong, fix the errors, and finish importing the data. COPY simply aborts when it encounters a row it doesn't expect, which is likely to happen several times with a very big file.

pgloader can also be set up to load data in parallel, but I'm not sure how much help it would be if you're just loading a single table.
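If you go the pgloader route, a minimal command file for a tab-delimited load might look something like the sketch below. The file path, database name, table name and columns are all assumptions for illustration; check the pgloader documentation for the exact options your version supports.

    LOAD CSV
         FROM '/data/points.txt'
         INTO postgresql:///spatialdb?tablename=points
         WITH fields terminated by '\t'

     BEFORE LOAD DO
         $$ CREATE TABLE IF NOT EXISTS points (
                latitude  double precision,   -- spatial data needs double precision
                longitude double precision,
                elevation double precision
            ); $$;

Running it with something like pgloader points.load should load the file and leave any rejected rows in the reject files pgloader reports in its summary.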

For MySQL, the standard LOAD DATA LOCAL INFILE is as fast as you're going to get (as far as I know). It spits out a warning for each row that it cannot import, so you can go back and look at that data later.
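As a sketch, assuming the same hypothetical table, column and file names as above, and that local_infile is enabled on both the client and the server:

    LOAD DATA LOCAL INFILE '/data/points.txt'
    INTO TABLE points
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n'
    (latitude, longitude, elevation);

    -- review rows that were skipped or had values coerced
    SHOW WARNINGS;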

Other than that, 11.2 GB isn't too bad. Be sure to drop all existing indexes on the table you're importing into, and recreate them afterwards.
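For example, with a hypothetical index named idx_points_latlon on the target table:

    -- before the import
    DROP INDEX idx_points_latlon ON points;   -- MySQL
    -- DROP INDEX idx_points_latlon;          -- PostgreSQL equivalent

    -- after the import has finished
    CREATE INDEX idx_points_latlon ON points (latitude, longitude);

Building the index once over the full table is much cheaper than maintaining it row by row during the load.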

I've imported text files of comparable size into both databases without a problem. Loading data and creating indexes might take a few hours, so just leave it overnight.