PostgreSQL – Performance Differences Between jsonb and Composite Types

composite-typesdatabase-designdatatypespostgresqlpostgresql-10

What considerations are involved in choosing between a jsonb column and a composite type column of the same structure?

For example, consider a column like that used in the Postgres documentation:

CREATE TYPE inventory_item AS (
name            text,
supplier_id     integer,
price           numeric
);

What are the tradeoffs involved between this approach vs a jsonb column mirroring this structure?

For example, I suspect that the composite type won't require storing the key names for every record, whereas the jsonb type would require this.

Best Answer

The TYPE inventory_item was defined in the question (same as in the guide), so we need only to define tables tc with composite (ROW) type, and tj with JSONb type.

INSERT TIME

-- drop table tc; drop table tj;
CREATE TABLE tc (id serial, x inventory_item);
CREATE TABLE tj (id serial, x JSONb);

EXPLAIN ANALYSE 
  INSERT INTO tc(x) VALUES 
    (ROW('fuzzy dice', 42, 1.99)),
    (ROW('test pi', 3, 3.1415))
; -- Execution Time: try1 0.386 ms; try2 0.559 ms; try3 0.102 ms; ... 
EXPLAIN ANALYSE 
  INSERT INTO tj(x) VALUES 
    ('{"name":"fuzzy dice", "supplier_id":42, "price":1.99}'::jsonb),
    ('{"name":"test pi", "supplier_id":3, "price":3.1415}'::jsonb)
; -- Execution Time: try1 0.343; try2 0.355 ms; try3 0.112 ms; ...

Of course, we need loops, etc. something por complex to test... But seems "comparable" times, no big difference.

SELECT NATIVE TIME

Only retrieving the original datatype. Need good benchmark, but lets imagine something simple only to check big differences.

EXPLAIN ANALYSE   SELECT x, i FROM tc, generate_series(1,999999) g(i); 
EXPLAIN ANALYSE   SELECT x, i FROM tj, generate_series(1,999999) g(i); 

no difference again. Both with "Execution Time: ~460".

EXPLODE TIME

EXPLAIN ANALYSE
  SELECT i, id, (x).name, (x).supplier_id, (x).price
  FROM tc, generate_series(1,999999) g(i)
; -- Execution Time: ~490 ms 
EXPLAIN ANALYSE
  SELECT i, tj.id, t.* 
  FROM tj, generate_series(1,999999) g(i),
       LATERAL jsonb_populate_record(null::inventory_item, tj.x) t  
; -- Execution Time: ~650 ms

Seems that is very fast to transform JSONb-object into SQL-row! Seems a binary cast: we can suppose that the function jsonb_populate_record maps the JSONb types to SQL using by inventory_item internal definition.

And it is faster tham composite table.

Explode and calculate something

EXPLAIN ANALYSE
  SELECT i, (x).supplier_id+i, (x).price+0.01
  FROM tc, generate_series(1,999999) g(i)
; -- Execution Time: ~800 ms

EXPLAIN ANALYSE
  SELECT i,  t.supplier_id+i, t.price+0.01
  FROM tj, generate_series(1,999999) g(i),
       LATERAL jsonb_populate_record(null::inventory_item, tj.x) t  
; -- Execution Time: ~620 ms

Perhaps ~150 ms to calculations, so same expected time... There are some error in the example above, need better benchmark to check real difference.


Check comparative time to cast from text.

EXPLAIN ANALYSE -- (supposed to) cast from binary
  SELECT i, id, x->>'name' as name, 
         (x->'supplier_id')::int as supplier_id, (x->'price')::float as price
  FROM tj, generate_series(1,999999) g(i)
; -- Execution Time: ~1600 ms 

EXPLAIN ANALYSE -- cast from text
  SELECT i, id, x->>'name' as name, 
         (x->>'supplier_id')::int as supplier_id, (x->>'price')::float as price
  FROM tj, generate_series(1,999999) g(i)
; -- Execution Time: ~1600 ms 

Long and same times. Seems that (x->'supplier_id')::int it is only a sugar syntax for (x->>'supplier_id')::int or (x->'supplier_id')::text::int.

PS: this answer is also a complement for this other question, about "Binary to binary cast with JSONb".