Postgresql – Returning case-sensitive input of case-insensitive array match

case sensitivejsonpostgresqlpostgresql-9.4

Taking a JSON table with some simple data:

CREATE TABLE jsonthings(d JSONB NOT NULL);
INSERT INTO jsonthings(d) VALUES('{"id":1,"email":"Test1@test.com"}');
INSERT INTO jsonthings(d) VALUES('{"id":2,"email":"Test2@test.com"}');
INSERT INTO jsonthings(d) VALUES('{"id":3,"email":"Test3@test.com"}');

I have a requirement to obtain the rows which case-insensitively match a set of email addresses:

SELECT d FROM jsonthings
WHERE (d->>'email')::CITEXT = ANY(ARRAY['TEST1@TEST.COM','test3@test.com']::CITEXT[]);

However, the data I return should contain the email addresses in the case in which they were passed to the query rather than the case in which they are stored within the database. For example, the above query provides the results:

{"id": 1, "email": "Test1@test.com"}
{"id": 3, "email": "Test3@test.com"}

But I need it to return the results:

{"id": 1, "email": "TEST1@TEST.COM"}
{"id": 3, "email": "test3@test.com"}

How can I obtain the matching item in the array and use it in the results?

Best Answer

First of all, you should mention that you are using the additional module citext that provides the case insensitive text data type citext, which is not strictly needed here. The solution below only requires standard Postgres 9.4.

Key elements are jsonb_each_text() to unnest jsonb values into key/value pairs and json_object_agg() to re-assemble the JSON value with changed parts:

SELECT j.id, json_object_agg(d.key, CASE WHEN d.key = 'email'
                                         THEN t.email ELSE d.value END)::jsonb AS d
FROM   jsonthings j
JOIN   unnest('{TEST1@TEST.COM,test3@test.com}'::text[]) t(email)
                                                ON lower(t.email) = lower(j.d->>'email')
     , jsonb_each_text(j.d) d
GROUP  BY 1;

Result:

id | d
---+---------------------------------------
1  | {"id": "1", "email": "TEST1@TEST.COM"}
3  | {"id": "3", "email": "test3@test.com"}

How?

(0.) Of yourse your table has a primary key you did not mention. Let's call it id.

  1. Unnest the text array into individual rows and JOIN to the 'email' value of the jsonb column using lower() on both sides to make it case insensitive.

  2. Decompose the jsonb column (only identified rows!) into key/value pairs with jsonb_each_text().

  3. Replace original 'email' values with the search string with a CASE statement.

  4. Finally aggregate everything back to a json object with json_object_agg() and cast to jsonb.

Aside: For a simple structured form like you present in the example, a normalized schema with scalar Postgres data types would be much more efficient: