I suggest a MATERIALIZED VIEW
with unnested values and a trigram index as search tools.
If you don't need nested values, hstore
may be even better for you. Use the function svals(hstore)
to unnest hstore
values.
You need to install the additional module hstore
once per database:
CREATE EXTENSION hstore;
Table
CREATE TABLE person AS
SELECT * FROM (
VALUES
(1, 'Joe Doe', hstore 'toy=>car, color=>red')
, (2, 'Jane Doe', 'food=>hamburguer, color=>blue')
) t(person_id, name, preferences);
Materialized view
CREATE MATERIALIZED VIEW person_pref AS
SELECT p.person_id, j.preference -- just pref
FROM person p, svals(p.preferences) j(preference);
This is an implicit CROSS JOIN LATERAL
to the set-returning function svals()
.
Trigram Index
You need to install the additional module pg_trgm
once per database:
CREATE EXTENSION pg_trgm;
Then:
CREATE INDEX person_pref_j_trgm_idx ON person_pref_j
USING gin (preference gin_trgm_ops);
Details:
Query
SELECT *
FROM person p
WHERE EXISTS (
SELECT 1
FROM person_pref pp
WHERE pp.person_id = p.person_id
AND pp.preference ILIKE '%burg%'
);
Be aware that this is pretty fast.
If you have nested values or numeric
or boolean
values, jsonb
may be more efficient. You can do the same as above with jsonb_each_text(jsonb)
:
CREATE TABLE person AS
SELECT * FROM (
VALUES
(1, 'Joe Doe', jsonb '{"toy": "car", "color": "red"}')
, (2, 'Jane Doe', '{"food": "hamburguer", "color": "blue"}')
) t(person_id, name, preferences);
CREATE MATERIALIZED VIEW person_pref AS
SELECT p.person_id, j.key, j.preference -- incl. key or just pref?
FROM person_j p, jsonb_each_text(p.preferences) j(key, preference);
Same index, same query. You might want to add the key to the MV and search for that, too:
CREATE INDEX person_pref_trgm_idx ON person_pref_j
USING GIN (key gin_trgm_ops, preference gin_trgm_ops);
I often query a 170K-row table using InnoDB FULLTEXT. (Avg: 300 words.) I typically get the answer of a few dozen rows in under 200ms. That 200ms drops to below 20ms if the cache is warm.
I don't think the number of documents in the table matters as much as the number of documents returned. If you need to fetch the text for 100 documents (after searching the index), that could take a full second to randomly fetch from a cheap disk.
Beware of benchmarks; they may be delivering only document ids, not the text; and they may be assuming everything is cached in RAM. Since you have only a few MB, the caching is a valid assumption for you.
Best Answer
Like our expert ypercube mentioned, you can use
LIKE
if you know the phrase to be searched. Otherwise you can also use something calledCHARINDEX
which I personally find a very useful command in SQL. I'll give you a small example. Suppose you have a keyword and you want to search that keyword in a column or a group of column, you can do it pretty easily.This works on variables too very well. Let me know if this resolves your query.
Now if you had to apply the same search on multiple columns, you could write: