This is not really a use case for full text search because full text relies on stemming the text and parsing the chunks into tokens. As you can see from keywords
, '580h' is parsed as its own word because there's no language in which '580' is a "stem" of '580h'. You'd probably be better off with regular expression matching.
Here's a query that I worked up for you:
SELECT id, title
FROM stickers WHERE
(title ~* '580')
AND
(title ~* 'case')
ORDER BY id
If you're looking for alternatives other than JSON, and you just need arrays of text name-value pairs collections, you could simply use PostgreSQL's arrays:
CREATE TYPE mydoc AS (content_type text, names text[], values text[]);
CREATE TABLE documentData(id int primary key, documentName text, docs mydoc[]);
INSERT INTO documentData(id,documentName,docs) VALUES (1, 'DocumentA', '{"(Table,\"{ColA,ColB,ColC}\",\"{val1,val2,val3}\")","(Table,\"{ColA,ColB,ColC}\",\"{val4,val5,val6}\")"}'::mydoc[])
SELECT d.id, d.documentName, dd.*
FROM documentData d
LEFT JOIN LATERAL unnest(d.docs) dd ON (true);
This won't get a significant advantage over JSON, other than portability to older PostgreSQL versions, but unnest
and =ANY
type of operations are often much simpler than the JSON functions, so it may be easier to query and manipulate, e.g.:
SELECT d.id, d.documentName, dd.content_type, pos, name, dd.values[pos] AS value
FROM documentData d
LEFT JOIN LATERAL unnest(d.docs) dd ON (true)
LEFT JOIN LATERAL unnest(dd.names) WITH ORDINALITY AS y (name, pos) ON (true)
WHERE dd.content_type = 'Table' AND name ~ '^Col[AB]';
Best Answer
You should definitely use a different column per language.
The main reason is that different languages have different stop words and stemming rules, so if you index something with
to_tsvector('spanish', ...)
, you will not always find it with ato_tsquery('english', ...)
and vice versa:Even better would be not to create a column per language, but only GIN indexes on
to_tsvector('english', (tags->'tags'->'en'))
andto_tsvector('spanish', (tags->'tags'->'es'))
. For example:Then you can use the first index with