Say I have the following table structure:
jobs
+----+------------+
| id | some_field |
+----+------------+
| 1 | some_val |
| 2 | some_val |
| 3 | some_val |
+----+------------+
events
+----+--------+----------------------------------------+
| id | job_id | payload (JSONB) |
+----+--------+----------------------------------------+
| 1 | 1 | {'type':1, 'a':'some_val', ...} |
| 2 | 1 | {'type':2, 'c':'some_other_val1', ...} |
| 3 | 2 | {'type':1, 'a':'some_other_val2', ...} |
| 4 | 2 | {'type':1, 'a':'some_other_val3', ...} |
| 5 | 3 | {'type':1, 'a':'some_other_val4', ...} |
| 6 | 3 | {'type':1, 'a':'some_other_val5', ...} |
| 7 | 3 | {'type':2, 'c':'some_other_val6', ...} |
| 8 | 3 | {'type':3, 'd':'some_other_val7', ...} |
+----+--------+----------------------------------------+
I want to create a query/view that …
- … selects one row for each job IF it has an event with
payload->>'type' = 2
- … in that row, shows values from other events that are associated with this job.
- … in that row, shows values from other events that are associated with this job if that event type occurs multiple times.
For example, with the above data I'd like to select all jobs that have an event of type 2
, and find out the value of the d
field from events with type 3
and some aggregate (say, COUNT()
) of the a
field from events with type 1
. The result would be:
Results
+--------+------------------+----------+
| job_id | d | COUNT(a) |
+--------+------------------+----------+
| 1 | NULL | 1 |
| 3 | some_other_val7 | 2 |
+--------+------------------+----------+
As a bonus: Each job
can have only one event of type 2
, so the basic query can be structured like:
SELECT
job.id
FROM
event
LEFT JOIN
job ON event.job_id = job.id
WHERE
event.payload ->> 'type' = 2;
How can I get the result table? Do I need to do sub queries for each field/event type, which would result in relatively bad performance, or can I use some GROUP BY job_id
on the event table and maybe join the JSONs? (Note that my data is more complex and has many more fields in the payload that I would like to select).
Here is a DB Fiddle with the example data and my test query:
CREATE TABLE job ( id SERIAL PRIMARY KEY, some_field TEXT ); CREATE TABLE event ( id SERIAL PRIMARY KEY, job_id INTEGER, payload JSONB ); INSERT INTO job (some_field) VALUES ('val1'), ('val2'), ('val3'); INSERT INTO event (job_id, payload) VALUES (1, '{"type":1, "a":"some_val"}'::json), (1, '{"type":2, "c":"some_other_val1"}'::json), (2, '{"type":1, "a":"some_other_val2"}'::json), (2, '{"type":1, "a":"some_other_val3"}'::json), (3, '{"type":1, "a":"some_other_val4"}'::json), (3, '{"type":1, "a":"some_other_val5"}'::json), (3, '{"type":2, "c":"some_other_val6"}'::json), (3, '{"type":3, "d":"some_other_val7"}'::json); SELECT job.id, job.some_field FROM event LEFT JOIN job ON event.job_id = job.id WHERE event.payload ->> 'type' = '2';
id | some_field -: | :--------- 1 | val1 3 | val3
db<>fiddle here
Best Answer
As per comments, you need a timestamp field and I've added it to your sample data. Now IMHO you should first unnest your jsonb data. I've used a SUM(CASE for this purpose and it calculates too the minimum timestamp value.
This is the result:
db<>fiddle here
Starting at this point you can easily filter it by type and get the desired result.