Postgresql – Loop through key/value pairs of a jsonb object in postgresql function

jsonplpgsqlpostgresql

I am trying to create a function in Postgres that can loop through each key/value pair in a given jsonb object.

create or replace function myFunction
(input jsonb)
returns jsonb as $$
BEGIN

    // foreach(key in input)
    //       do some math operation on its corresponding value

    returns input;

END; $$

The argument input is expected to be a jsonb object, such as {"a":1, "b":2, "c":3}.

I want to loop through each key/value pair in the object. As you can tell from my comment, I used to write backend code with more general-purpose programming languages like c, java, etc. So I am not good at SQL. I have tried to search online, but they talk about how to loop through jsonb array instead of object. So really stuck here.

Best Answer

Use jsonb_each(jsonb) or jsonb_each_text(jsonb) in a FOR loop like:

CREATE OR REPLACE FUNCTION my_function(input jsonb)
  RETURNS jsonb
  LANGUAGE plpgsql AS  -- language declaration required
$func$
DECLARE
   _key   text;
   _value text;
BEGIN
    FOR _key, _value IN
       SELECT * FROM jsonb_each_text($1)
    LOOP
       -- do some math operation on its corresponding value
       RAISE NOTICE '%: %', _key, _value;
    END LOOP;

    RETURN input;
END
$func$;

Call:

SELECT my_function('{"a":1, "b":2, "c":3}');

Declare variable of table type in PL/pgSQL

Note that for many problems where people used to programming languages like c, java, etc. would tend to use a loop, there is a superior set-based solution in SQL around the corner. Side-by-side code examples:

Related Solutions

PostgreSQL – Return unique combinations of columns based on where clause

The execution plan shown does not seem to match the big SELECT DISTINCT query because the Sort and Unique steps are missing. Anyway you are correct than when retrieving ~50% of a table, index don't help. The best strategy is a big sequential scan of the main table and only fast hardware helps with that.

For the 2nd part of the question:

How would I go about selecting only the unique combinations of adjacent columns? Is this too complicated a task to perform through a database query? Would it speed up the query?

To remove duplicate combinations of adjacent columns, the structure of the resultset should be changed so that each output row has only one couple of adjacent columns along with their corresponding dimensions in the parallel coordinates graph. Well, except that the dimension for the 2nd column is not necessary since it's always the dimension for the other column plus one.

In one single query, this could be written like this:

WITH logs as (
  SELECT log_time_mapped, syslog_priority_mapped, 
     operation_mapped, message_code_mapped, protocol_mapped, 
     source_ip_mapped, destination_ip_mapped, 
     source_port_mapped, destination_port_mapped, 
     destination_service_mapped, direction_mapped, 
     connections_built_mapped, connections_torn_down_mapped, 
     hourofday_mapped, meridiem_mapped 
  FROM firewall_logs_mapped 
  WHERE operation = 'Built')
SELECT DISTINCT 1, log_time_mapped, syslog_priority_mapped FROM logs
UNION ALL
SELECT DISTINCT 2, syslog_priority_mapped, operation_mapped FROM logs
UNION ALL
SELECT DISTINCT 3, operation_mapped, message_code_mapped FROM logs
UNION ALL
...etc...
SELECT DISTINCT 14,  hourofday_mapped, meridiem_mapped FROM logs
;

The first SELECT DISTINCT subquery extracts the lines to draw between dimensions 1 and 2, the next subquery between dimensions 2 and 3, and so on. DISTINCT eliminates duplicates, so the client side doesn't have to do it. The UNION ALL concatenates the results without any further processing.

However it's a heavy query and it's dubious that it would be any faster than what you're already doing.

The WITH subquery is likely to gets slowly materialized on disk, so it might be interesting to compare the execution time with this other form repeating the same condition:

SELECT DISTINCT 1, log_time_mapped, syslog_priority_mapped
   FROM firewall_logs_mapped WHERE operation = 'Built'
UNION ALL
SELECT DISTINCT 2, syslog_priority_mapped, operation_mapped
   FROM firewall_logs_mapped WHERE operation = 'Built'
UNION ALL
SELECT DISTINCT 3, operation_mapped, message_code_mapped
   FROM firewall_logs_mapped WHERE operation = 'Built'
...etc...
;

Postgresql – Update Postgres Table with Array of JSON

If you know the desired column names (be it the same or different from the keys in the JSON structure, you can use json[b]_to_recordset():

SELECT * FROM jsonb_to_recordset('[
    { "col1": "a", "col2": 1, "col3": 1, "col4": "one"},
    { "col1": "b", "col2": 2, "col3": 2, "col4": "two"},
    { "col1": "c", "col2": 3, "col3": 3, "col4": "three"},
    { "col1": "d", "col2": 4, "col3": 4, "col4": "four"}
]'::jsonb) AS t (col1 text, col2 integer, col3 integer, col4 text);

 col1 │ col2 │ col3 │ col4  
──────┼──────┼──────┼───────
 a    │    1 │    1 │ one
 b    │    2 │    2 │ two
 c    │    3 │    3 │ three
 d    │    4 │    4 │ four

As the documentation tells us,

Note: In json_populate_record, json_populate_recordset, json_to_record and json_to_recordset, type coercion from the JSON is "best effort" and may not result in desired values for some types. JSON keys are matched to identical column names in the target row type. JSON fields that do not appear in the target row type will be omitted from the output, and target columns that do not match any JSON field will simply be NULL.

If you already have a table to work with, json_populate_recordset() is an even better solution:

CREATE TABLE inputtable (col1 text, col2 integer, col3 integer, col4 text);

SELECT * FROM jsonb_populate_recordset(NULL::yourtable, '[
    { "col1": "a", "col2": 1, "col3": 1, "col4": "one"},
    { "col1": "b", "col2": 2, "col3": 2, "col4": "two"},
    { "col1": "c", "col2": 3, "col3": 3, "col4": "three"},
    { "col1": "d", "col2": 4, "col3": 4, "col4": "four"}
]'::jsonb);

 col1 │ col2 │ col3 │ col4  
──────┼──────┼──────┼───────
 a    │    1 │    1 │ one
 b    │    2 │    2 │ two
 c    │    3 │    3 │ three
 d    │    4 │    4 │ four

Now updating the table itself may be done like this:

WITH source AS (SELECT * FROM jsonb_populate_recordset [...])
UPDATE yourtable
   SET col1 = s.col1, col2 = s.col2
  FROM source AS s
 WHERE col3 = s.col3;

In case it seems slow, it might make sense to not use the CTE, but a subquery in the FROM clause instead.

Best Answer

Related Solutions

PostgreSQL – Return unique combinations of columns based on where clause

Postgresql – Update Postgres Table with Array of JSON

Related Question