I'm trying to build a query to aggregate together multiple columns in a legacy table stored in a similar structure as below:
CREATE TEMPORARY TABLE foo AS
SELECT * FROM ( VALUES
(1,'Router','Networking','Sale',NULL),
(2,NULL,'Router','Networking','Sale'),
(3,NULL,NULL,'Networking','Sale'),
(4,NULL,NULL,NULL,NULL)
) AS t(id,tag_1,tag_2,tag_3,tag_4);
An example of NOT WHAT I WANT
This is an example of the query I want to build:
SELECT ID, json_build_array(Tag_1, Tag_2, Tag_3, Tag_4) AS tags
FROM table
The problem is that the above query adds the NULL values from the rows to the array:
ID Tags
--------------------------------------------------
1 ['Router', 'Networking', 'Sale', null]
2 [null, 'Router', 'Networking', 'Sale']
3 [null, null, 'Networking', 'Sale']
4 [null, null, null, null]
I want to avoid having to write an overly complicated CASE WHEN
statement to filter out the NULLs and I'm still new to working PostgreSQL's JSON datatypes. Is there anyway I can avoid including NULLs when building a JSON array in Postgres?
Best Answer
I would suggest not using a JSON array, and instead using the native SQL array syntax which is likely much faster and more efficiently stored. It's also stronger typed. The JSON array is "possibly-heterogeneously-typed" per the docs.
I also wouldn't do this routinely. I would alter the table's schema to have an
ARRAY
(preferably SQL) on the table itself to store the tags without ever storing null in columns. This can put you down the path of correcting the schema.Building arrays
Strictly-typed PostgreSQL array
Just use the ARRAY literal constructor.
A JSON array
Filtering Nulls without manual
coalesce
Strictly-typed PostgreSQL array
You can easily filter nulls in a single pass by wrapping the above in
array_remove
.JSON array