MySQL – How to migrate a large blob table from MySQL to PostgreSQL

migration, MySQL, postgresql

I'm now in the process of migrating my MySQL database to PostgreSQL. Almost everything went fine (well, after lots of googling for the correct mysqldump params etc.) except for one table I have – actually the most important table in my app.

Table structure is very simple:

mysql> show create table samples;
.. skipped ...
CREATE TABLE `samples` (
      `File_ID` int(11) NOT NULL,
      `File` longblob,
      PRIMARY KEY (`File_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=binary

but it is very large (> 20 GB).

I've tried the --hex-blob parameter of mysqldump, but PostgreSQL does not accept data in that format when I feed the resulting dump file to psql. Another option I've tried is the --tab option, to get a tab-separated dump and then load it into PostgreSQL with the COPY command, but --hex-blob does not work together with --tab, and PostgreSQL still rejects the dump file, complaining that there are invalid characters in it.

I'd be very glad to get any advice on this matter, although I'm starting to think that writing a custom migration tool is not a bad idea after all…

Best Answer

I think the simplest way is to use the --hex-blob switch on mysqldump and restore with psql, using decode(string text, format text). It's not quite that simple, though, because you need to modify the produced dump slightly (with sed or awk) to wrap the hex literals in that decode function. For example:

mysqldump -u root -p --skip-quote-names --hex-blob --skip-triggers \
    --compact --no-create-info mysql samples > prepg.dump

sed "s/0x\([0-9,A-F]*\))/decode('\1','hex'))/g" prepg.dump > pg.dump
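To illustrate what the substitution does, here is the same sed expression applied to a single hypothetical INSERT line (a one-byte blob, 0xFF) in the form mysqldump --hex-blob would emit it:

```shell
# One hypothetical line of the --hex-blob dump, piped through the same
# sed substitution as above: the 0x... hex literal before the closing
# parenthesis is rewritten into a PostgreSQL decode(...,'hex') call.
echo "INSERT INTO samples VALUES (1,0xFF);" \
  | sed "s/0x\([0-9,A-F]*\))/decode('\1','hex'))/g"
# prints: INSERT INTO samples VALUES (1,decode('FF','hex'));
```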

Then, in a psql session:

CREATE TABLE samples
(
    file_id integer PRIMARY KEY,
    file bytea
);

\i 'path/to/pg.dump'