In Postgres 12.3 I am having trouble building the correct SQL query that can join a table from a deeply nested array in a jsonb object with rows from another table.
The deeply nested array contains strings and optionally an array of strings. These arrays should be flattened and then used to join in the other table.
Here is a DB Fiddle with the below data ready to test:
https://www.db-fiddle.com/f/6tMMNn3JsgkbehCptw4xEx/0
Details
I need a SQL query given the below tables that will return from order_batches
the batches which contain orders where state does not equal 'picked' or 'packed'. With the given table data below only order_batches
with ID 1
should be returned.
Essentially I need the query to look at each row in order_batches
, for all the tags->'bins'
arrays, find and flatten any order_numbers
, and join (or sub-select?) the orders
table on those order_numbers
. The where clause should retrieve only those batches that include an order with a state that does not match 'picked' or 'packed'.
Expressing this in pseudo sql might look like:
SELECT order_batches.*
FROM order_batches
JOIN orders ON order_batches.tags->'bins'->__all array elements__->'order_numbers' = orders.number
WHERE orders.state IS NOT IN ('picked', 'packed')
Desired Output
The output that I am looking for is equal to the result of this query:
SELECT * from order_numbers WHERE id = 1
╔════╦══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║ id ║ tags ║
╠════╬══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╣
║ 1 ║ {"bins":[{"order_numbers":["555111111","555111112",["555111114","555111115"]]},{"order_numbers":["555111116","555111117"]}]} ║
╚════╩══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
Resources
I have tried searching for an answer, but did not find one exactly like mine. These results have helped, but I'm still not sure how to get it right:
- https://stackoverflow.com/questions/49766873/how-to-join-on-the-values-returned-from-json-array-elements-in-postgresql
- PostgreSql : flatten json array data
Test Data
Here is the db schema and test data:
create table "order_batches" (id bigint primary key, tags jsonb);
insert into "order_batches"
(id, tags)
values
(1, '{
"bins": [
{
"order_numbers": [
"555111111",
"555111112",
[
"555111114",
"555111115"
]
]
},
{
"order_numbers": [
"555111116",
"555111117"
]
}
]
}'),
(2, '{
"bins": [
{
"order_numbers": [
"555111118",
"111111111"
]
}
]
}');
create table "orders" (id bigint primary key, "number" varchar(255), state varchar(255));
INSERT INTO "orders" (id, "number", state) VALUES
(1, '555111111', 'picked'),
(2, '555111112', 'picked'),
(3, '555111114', 'picked'),
(4, '555111115', 'ready'),
(5, '555111116', 'ready'),
(6, '555111117', 'ready'),
(7, '555111118', 'picked'),
(8, '111111111', 'packed');
Attempts
Here are some queries I have tried in my attempt to get it working and to understand the problem better:
SELECT ob.id, j.nums->'order_numbers'
FROM order_batches ob
CROSS JOIN lateral jsonb_array_elements(tags->'bins') j(nums)
JOIN orders o on o.number = j.nums->'order_numbers'
WITH numbers AS(
SELECT jsonb_array_elements("tags"->'bins') AS numbers FROM order_batches
)
SELECT * FROM numbers JOIN orders o ON o.number = numbers.numbers
SELECT
case jsonb_typeof(numbers)
when 'string' then numbers->>0
when 'array' then jsonb_array_elements_text(numbers)
end as number
FROM order_batches, jsonb_array_elements(order_batches.tags->'bins') as numbers
Thanks in advance for any help with this problem. At this point I do not think it would be worth revising the data structure. I'm hoping a SQL expert can whip up some magic to help me (and others) out here.
Best Answer
This is relatively easy to solve in Postgres 12, because the new SQL JSON/Path feature does the heavy lifting of iterating through (nested) arrays.
The JSON path expression
'$.bins[*].order_numbers[*]
iterates through all array elements in theorder_numbers
array - including nested arrays. Thejsonb_path_exists()
then checks if there is a match with thenumber
from theorders
table.The number is passed as a "json parameter" to the
jsonb_path_exists()
function and the$nr
placeholder references that. The name of that parameter must match the key of the JSON value passed as the third parameter of thejsonb_path_exists
function.I don't think this is going to be very fast though as the way the
jsonb_path_exists()
function is used, can't make use of an index as far as I know.Online example