Pattern matching and operators
Full text search is not the right tool for pattern matching (and possibly even fuzzy, fault tolerant input). Typically, trigram-similarity search with the %
operator is the superior approach here. You need to install the additional module pg_trgm
once per database:
Possibly combine that with LIKE
(also supported by trigram indexes), and text search like you had in mind. Related answer where text search and similarity are combined:
Overview of pattern-matching functionality in Postgres:
Database design
json
is a very bad choice for your intended functionality. It is very inefficient to have to plow through huge columns holding everything in a semi-structured big string (json
).
Consider normalizing your schema, which would result in superior performance (when done properly).
If you are hellbent on json
, consider the new jsonb
in the upcoming (and soon to be released) Postgres 9.4. More (last paragraph):
And be sure to use functional indexes like:
Further optimization is likely possible. But that would require more detailed information including table definitions, cardinalities, sample data and typical queries.
A jsonb GIN index with the jsonb_path_ops
operator class would be an option.
But to just index (multiple) application #>> '{*, tour_id, id}')
values per jsonb
field (pseudocode, the *
is illegal), I would write a tiny, IMMUTABLE
SQL function to extract and cast the UUID fields to an UUID array (uuid[]
) - and create an expression GIN index on that. Should be considerably smaller and hence faster:
CREATE OR REPLACE FUNCTION f_extract_tour_ids(jsonb)
RETURNS uuid[] AS
$func$
SELECT ARRAY(SELECT (jsonb_array_elements($1) #>> '{tour_id, id}')::uuid)
$func$ LANGUAGE sql IMMUTABLE;
What was missing in my original answer: there is no GIN operator class for uuid[]
, yet (up to Postgres 10). See this thread on pgsql-hackers.
Update: Starting with Postgres 11, the additional module btree_gin provides the missing operator class.
You have to create it yourself (once per database):
CREATE OPERATOR CLASS _uuid_ops DEFAULT
FOR TYPE _uuid USING gin AS
OPERATOR 1 &&(anyarray, anyarray),
OPERATOR 2 @>(anyarray, anyarray),
OPERATOR 3 <@(anyarray, anyarray),
OPERATOR 4 =(anyarray, anyarray),
FUNCTION 1 uuid_cmp(uuid, uuid),
FUNCTION 2 ginarrayextract(anyarray, internal, internal),
FUNCTION 3 ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
FUNCTION 4 ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
STORAGE uuid;
Note (quoting the manual):
The user who defines an operator class becomes its owner. Presently,
the creating user must be a superuser.
See:
Then create the index:
CREATE INDEX payments_application_tour_ids_gin_idx ON payments
USING gin (f_extract_tour_ids(application));
Match the expression and use an indexable array operator in your queries to be applicable, like:
SELECT *
FROM payment
WHERE f_extract_tour_ids(application) @> '{d61802ff-3eec-4a72-97ca-832f51b96bf0}'::uuid[];
As you can see, you can search for any number of UUIDs at once with this query.
Or, if you can't (or won't) create the operator class, you can work with the text
representation of the UUID values - which is considerably bigger and hence more expensive.
Adapt f_extract_tour_ids(jsonb)
like this:
CREATE OR REPLACE FUNCTION f_extract_tour_ids(jsonb)
RETURNS text[] AS
$func$
SELECT ARRAY(SELECT (jsonb_array_elements($1) #>> '{tour_id, id}'))
$func$ LANGUAGE sql IMMUTABLE;
And the query:
SELECT *
FROM payment
WHERE f_extract_tour_ids(application) @> '{d61802ff-3eec-4a72-97ca-832f51b96bf0}'::text[];
Be sure to match the UUID format exactly.
Related:
Best Answer
Assuming you meant to ask:
I'm using this SQL query to retrieve the row where
application #>> '{0, tour_id, id}'
equals the provided UUID... which already carries half an answer.
If your outer JSON array only has a single element like your example value implies, use this simple query:
Note how I replaced the (assumingly) unneeded fuzziness of
like '%a43d38d4-9bfb-4ffa-8704-e536f04b0c60%'
with simpler and faster
= 'a43d38d4-9bfb-4ffa-8704-e536f04b0c60'
.Else (for any number of elements in the outer JSON array):
That's the
jsonb
containment operator@>
, which can be supported with a generic index or a more specializedjsonb_path_ops
index. Instructions and links:The manual hints: