I'm trying to deal with a JSON column that does not have an explicit JSON schema or convention. There are varying keys within the JSON. Some keys do not exist, some keys do exist but empty string values.
For example:
CREATE TABLE json_table (
id int,
json_data json
);
INSERT INTO json_table (id, json_data) VALUES
(1, '{"foo" : "bar", "baz": "biz"}');
(2, '{"foo" : "", "baz": "biz"}');
(3, '{"hello" : "world"}');
(4, '{"hello" : "world2", "foo" : "bar2", "baz" : "" }');
-
Is it possible to query this table and select/yield a subset of
json_data
with only key-value pairs that are non-empty strings? -
Is it possible to query this table and select records that contain JSON keys that are within a given list? (I essentially want to find records with
json_data
that have the key "foo")
For #2 I do have a solution that selects only records where json_data
keys are within a regex pattern:
SELECT tmp.*
FROM (
SELECT id,
ARRAY(SELECT json_object_keys(json_data))::text AS keys
FROM json_table
) tmp
WHERE tmp.keys LIKE ANY(ARRAY['%foo%', '%bar%']);
This yields:
"id","keys"
1,"{foo,baz}"
2,"{foo,baz}"
4,"{hello,foo,baz}"
But there's a caveat! I'm returning records that DO have the keys, but empty string values! Is there a way to incorporate a solution to #1 into this and pre-filter the JSON to only "valid" key-value pairs?
Best Answer
You can remove the empty values during aggregation. And use an array operator in the final select to get only rows that contain a specific key:
If you want the complete JSON values, rather than just the keys, you can use something like this:
(Note that I am using
jsonb
for the result in this case as that is much more flexible to handle. In general it's recommended to prefer jsonb over json these days)If you need to cleanup the json frequently, you might want to consider writing a function for that: