PostgreSQL – Gather values from different related rows with JSONB fields

json postgresql query

Say I have the following table structure:

job
+----+------------+
| id | some_field |
+----+------------+
|  1 |  some_val  |
|  2 |  some_val  |
|  3 |  some_val  |
+----+------------+

event
+----+--------+----------------------------------------+
| id | job_id |            payload (JSONB)             |
+----+--------+----------------------------------------+
|  1 |      1 | {"type":1, "a":"some_val", ...}        |
|  2 |      1 | {"type":2, "c":"some_other_val1", ...} |
|  3 |      2 | {"type":1, "a":"some_other_val2", ...} |
|  4 |      2 | {"type":1, "a":"some_other_val3", ...} |
|  5 |      3 | {"type":1, "a":"some_other_val4", ...} |
|  6 |      3 | {"type":1, "a":"some_other_val5", ...} |
|  7 |      3 | {"type":2, "c":"some_other_val6", ...} |
|  8 |      3 | {"type":3, "d":"some_other_val7", ...} |
+----+--------+----------------------------------------+

I want to create a query/view that …

… selects one row for each job IF it has an event with payload->>'type' = '2'
… in that row, shows values from other events that are associated with this job.
… in that row, shows an aggregate of the values from other events that are associated with this job if that event type occurs multiple times.

For example, with the above data I'd like to select all jobs that have an event of type 2, and find out the value of the d field from events with type 3 and some aggregate (say, COUNT()) of the a field from events with type 1. The result would be:

Results
+--------+------------------+----------+
| job_id |        d         | COUNT(a) |
+--------+------------------+----------+
|      1 |  NULL            |        1 |
|      3 |  some_other_val7 |        2 |
+--------+------------------+----------+

As a bonus: Each job can have only one event of type 2, so the basic query can be structured like:

SELECT
  job.id
FROM
  event
LEFT JOIN 
    job ON event.job_id = job.id
WHERE
  event.payload ->> 'type' = '2';

How can I get the result table? Do I need a subquery for each field/event type, which would probably perform relatively badly, or can I use a GROUP BY job_id on the event table and maybe combine the JSON payloads? (Note that my real data is more complex and has many more fields in the payload that I would like to select.)

Here is a DB Fiddle with the example data and my test query:

CREATE TABLE job (
      id SERIAL PRIMARY KEY,
      some_field TEXT
);

CREATE TABLE event (
      id SERIAL PRIMARY KEY,
      job_id INTEGER,
      payload JSONB
);

INSERT INTO job (some_field) VALUES ('val1'), ('val2'), ('val3');
INSERT INTO event (job_id, payload) VALUES 
    (1, '{"type":1, "a":"some_val"}'::json),
    (1, '{"type":2, "c":"some_other_val1"}'::json),
    (2, '{"type":1, "a":"some_other_val2"}'::json),
    (2, '{"type":1, "a":"some_other_val3"}'::json),
    (3, '{"type":1, "a":"some_other_val4"}'::json),
    (3, '{"type":1, "a":"some_other_val5"}'::json),
    (3, '{"type":2, "c":"some_other_val6"}'::json),
    (3, '{"type":3, "d":"some_other_val7"}'::json);

SELECT job.id, job.some_field 
FROM event 
LEFT JOIN  job 
     ON event.job_id = job.id WHERE event.payload ->> 'type' = '2';

id | some_field
-: | :---------
 1 | val1      
 3 | val3


Best Answer

As per the comments, you need a timestamp field, so I've added one (ts) to your sample data. In my opinion, you should first flatten the JSONB data into columns. I've used SUM(CASE ...) for this purpose, and the query also calculates the minimum timestamp value.

SELECT 
    event.job_id, 
    job.some_field,
    event.payload->>'type' AS type, 
    SUM(CASE WHEN event.payload->>'a' IS NOT NULL THEN 1 ELSE 0 END) AS a,
    SUM(CASE WHEN event.payload->>'b' IS NOT NULL THEN 1 ELSE 0 END) AS b,
    SUM(CASE WHEN event.payload->>'c' IS NOT NULL THEN 1 ELSE 0 END) AS c,
    SUM(CASE WHEN event.payload->>'d' IS NOT NULL THEN 1 ELSE 0 END) AS d,
    MIN((event.payload->>'ts')::timestamp) AS min_ts
FROM   
    event
JOIN
    job
    ON event.job_id = job.id
GROUP BY 
    event.job_id, 
    job.some_field,
    event.payload->>'type'
ORDER BY
    1, 3;

This is the result:

job_id | some_field | type |  a |  b |  c |  d | min_ts             
-----: | :--------- | :--- | -: | -: | -: | -: | :------------------
     1 | val1       | 1    |  1 |  0 |  0 |  0 | 2004-10-19 10:23:54
     1 | val1       | 2    |  0 |  0 |  1 |  0 | 2002-10-19 10:23:54
     2 | val2       | 1    |  2 |  0 |  0 |  0 | 2003-10-19 10:23:54
     3 | val3       | 1    |  2 |  0 |  0 |  0 | 2004-10-19 10:23:54
     3 | val3       | 2    |  0 |  0 |  1 |  0 | 2005-10-19 10:23:54
     3 | val3       | 3    |  0 |  0 |  0 |  1 | 2006-10-19 10:23:54


Starting at this point you can easily filter it by type and get the desired result.
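
As a minimal sketch of that last step (not part of the original answer), the per-type values can also be collected in a single pass over event with aggregate FILTER clauses, assuming PostgreSQL 9.4 or later and the job/event sample data from the question. The alias e and the output column name count_a are illustrative choices.

```sql
-- One row per job that has at least one event of type 2.
SELECT
    e.job_id,
    -- value of "d" taken from the type-3 event(s); NULL when there is none
    MAX(e.payload->>'d')   FILTER (WHERE e.payload->>'type' = '3') AS d,
    -- number of type-1 events that carry an "a" value
    COUNT(e.payload->>'a') FILTER (WHERE e.payload->>'type' = '1') AS count_a
FROM event AS e
GROUP BY e.job_id
-- keep only jobs for which some event has type 2
HAVING bool_or(e.payload->>'type' = '2')
ORDER BY e.job_id;
```

With the sample data from the question this returns job 1 (d is NULL, count_a = 1) and job 3 (d = some_other_val7, count_a = 2), which matches the requested result table. Further payload fields can be added as extra aggregate columns without a subquery per event type.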

Related Solutions

PostgreSQL JSON – Create Multi-Key JSON Object with Integers or Floats

Update

Given your sample data and desired result, you can get it with the json_object_agg function.

CREATE TABLE test (str text, num double precision);
INSERT INTO test(str, num) VALUES ('a', 1.0);
INSERT INTO test(str, num) VALUES ('b', 2.3);

select json_object_agg(str, num)
from test;

| json_object_agg        |
| :--------------------- |
| { "a" : 1, "b" : 2.3 } |


You can use the row_to_json function.

create table foo(f1 int, f2 varchar(10), f3 float, f4 decimal(18,2));
insert into foo values (2, 'name', 2.345, 23.23);

select row_to_json(f)
from   (select f1,f2,f3,f4 from foo) f;

| row_to_json                                |
| :----------------------------------------- |
| {"f1":2,"f2":"name","f3":2.345,"f4":23.23} |


PostgreSQL – Converting Undefined JSONB Fields into a Row Dynamically

My question: Is there any way to select all the keys of a jsonb column without specifying the keys manually?

No, because a query cannot return a result set whose columns are not known in advance. However, if the set of keys is stable (the table is not receiving new writes), you can generate a dynamic SQL statement.

SELECT FORMAT(
  $$ SELECT * FROM %I.%I CROSS JOIN LATERAL jsonb_to_record(%I) AS rs(%s); $$,
  'public',
  'v',
  'jsonb',
  array_to_string(
                (SELECT ARRAY(SELECT DISTINCT col FROM v CROSS JOIN LATERAL jsonb_object_keys(jsonb) AS t(col) ORDER BY col)), ' text , '
  ) || ' text'
);

Then run that query, or run \gexec in psql.

                                                    format                                                    
--------------------------------------------------------------------------------------------------------------
  SELECT * FROM public.v CROSS JOIN LATERAL jsonb_to_record(jsonb) AS rs(a text , b text , c text , d text); 
(1 row)

test=# \gexec
 id |             jsonb              | a | b | c | d 
----+--------------------------------+---+---+---+---
  1 | {"a": "4", "b": "5"}           | 4 | 5 |   | 
  2 | {}                             |   |   |   | 
  3 | {"a": "8", "c": "9", "d": "9"} | 8 |   | 9 | 9
(3 rows)

You may or may not want to build some type inference with jsonb_typeof to map back to PostgreSQL types. Keep in mind that JSON has a single number type, so you cannot recover whether a value was originally an integer, but you should be able to store the numbers as double precision.
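
A rough sketch of such type inference, assuming the same table v and its jsonb column as in the example above. The mapping in the CASE expression (number to double precision, boolean to boolean, everything else to text) is an illustrative assumption, not a fixed rule.

```sql
-- For each key, pick a column type based on the JSON types actually stored.
SELECT
    col,
    CASE
        -- every value under this key is a JSON number
        WHEN bool_and(jsonb_typeof(val) = 'number')  THEN 'double precision'
        -- every value under this key is a JSON boolean
        WHEN bool_and(jsonb_typeof(val) = 'boolean') THEN 'boolean'
        -- mixed types, strings, nulls, arrays and objects fall back to text
        ELSE 'text'
    END AS pg_type
FROM v
CROSS JOIN LATERAL jsonb_each(jsonb) AS t(col, val)
GROUP BY col
ORDER BY col;
```

The resulting (col, pg_type) pairs could then replace the hard-coded ' text' suffixes when building the column list for jsonb_to_record.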
