PostgreSQL – Compress a JSON String Stored in PostgreSQL, e.g. with MessagePack

compression, encoding, json, postgresql, postgresql-11

JSON strings are currently stored in a text column of a PostgreSQL 11 table. For example, a row can have a text field asks containing the string:

{"0.000295":1544.2,"0.000324":1050,"0.000325":40.1,"0.000348":0}

Question: Is it possible to store it in a format that consumes less space? Spending some CPU cycles to serialize/deserialize the JSON string is an acceptable trade-off for using less storage space. The JSON data does not need to be searchable. The JSON object keys are almost always different across rows.

I am particularly interested in using JSON encoding/compression algorithms like MessagePack with zlib, but how can we use this when inserting the record into the PostgreSQL table?

Note: I am using Node.js with Knex.js to communicate with PostgreSQL, and currently convert JSON objects to strings with Node's JSON.stringify function.

Best Answer

Individual PostgreSQL fields over about 2000 bytes are automatically compressed as part of TOAST storage. However, this compression is not very good, and it is local to just that one PostgreSQL value, so it can't compress out repetition that occurs between values (like the same keys being used repeatedly, but only once in each JSON structure). You might want to look at ZSON for compression that uses the whole column as compression context (I've never used it myself).

I am particularly interested in using JSON encoding/compression algorithms like MessagePack with zlib, but how can we use this when inserting the record into the PostgreSQL table?

You could always compress it on the client, then pass it to and from the database as bytea. Of course, you then can't do JSON manipulation inside the database, and working with bytea is generally unpleasant.
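As a minimal sketch of that approach (not the only way to do it), assuming a table named orders with an integer id and a bytea column asks_packed (both names are illustrative), the @msgpack/msgpack package for the MessagePack encoding, and Node's built-in zlib module for compression, the client-side round trip with Knex could look roughly like this:

// Assumed table "orders" with an integer "id" and a bytea column "asks_packed";
// both names are placeholders for whatever your schema actually uses.
import { encode, decode } from "@msgpack/msgpack";
import { deflateSync, inflateSync } from "node:zlib";
import knex from "knex";

const db = knex({ client: "pg", connection: process.env.DATABASE_URL });

// Write path: MessagePack-encode the object, zlib-compress it, insert as bytea.
async function insertAsks(asks: Record<string, number>): Promise<void> {
  const packed = Buffer.from(encode(asks));   // MessagePack bytes (Uint8Array -> Buffer)
  const compressed = deflateSync(packed);     // zlib-compressed Buffer
  await db("orders").insert({ asks_packed: compressed });
}

// Read path: the pg driver returns bytea as a Buffer, so reverse the two steps.
async function readAsks(id: number): Promise<Record<string, number>> {
  const row = await db("orders").where({ id }).first();
  return decode(inflateSync(row.asks_packed)) as Record<string, number>;
}

Note that deflateSync/inflateSync block the event loop, so the asynchronous zlib functions may be preferable for large payloads, and for very small objects like the example above the zlib and MessagePack framing overhead can eat much of the savings, so it's worth measuring against your real data.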

If you can find an implementation of MessagePack in Perl or Python, you could use the corresponding procedural languages (PL/Perl or PL/Python) to do the compression and decompression inside the database.