How to Dump to CSV Valid JSON with PostgreSQL

csvjsonpostgresql

I am trying to dump some data as json with:

\copy (SELECT json_build_object('areaSqft',area_sqft, 'titleNos',title_nos, 'buildingIds',building_ids, 'geometry',geom_json) FROM my_data) to my_data.csv with delimiter ',' csv header

what I am expecting is a valid json per row, but what I get is:

"{""areaSqft"": 214.394254595041, ""geometry"": {""type"": ""MultiPolygon"", ""coordinates"": [[[[0.000015, 51.449107], [0.000154, 51.441108], [0.000238, 51.44111], [0.00024, 51.441052], [0.000137, 51.441051], [0.000041, 51.441049], [0.000015, 51.441107]]]]}, ""titleNos"": [""ZB78669""], ""buildingIds"": [7521141, 9530393, 7530394]}"

There are extra " as first and last character and "" around instead of single ".

How can I get a valid json stripping unnecessary quotes?

Best Answer

The quotes around fields and inside fields are part of the CSV format, and they're required here, because, according the CSV spec:

  1. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes

  2. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote

I think that you don't want or need CSV in your case. Just take the output of SELECT, with the unaligned format of psql

=# \pset format unaligned 
Output format is unaligned.

=# select json_build_object('foo', 1, bar, 2) AS myjson
   from (values (E'xy\zt'), ('ab,cd')) as b(bar);

myjson
{"foo" : 1, "xyzt" : 2}
{"foo" : 1, "ab,cd" : 2}
(2 rows)

You may also use \g output.json instead of the semi-colon at the end of the query to have psql redirect the results of that query to a file, and \pset tuples_only to remove headers and footers.