Trigram similarity and distance operators put more weight on leading matches (prefix) automatically and to a lesser extent on trailing matches (suffix), due to the way trigrams are extracted from strings. The manual:
Each word is considered to have two spaces prefixed and one space
suffixed when determining the set of trigrams contained in the string.
There is more, including examples. Read the manual. And consider the demos in my fiddle at the bottom of this answer.
Test setup
Based on your table definition and example:
CREATE TABLE search (id int PRIMARY KEY, search_on text, comment text);
INSERT INTO search (id, search_on, comment) VALUES
( 1, 'abc123456789', 'leading')
, ( 2, '123abc456789', 'nested')
, ( 3, '123456789abc', 'trailing')
, ( 4, 'abc123abc456', 'leading, nested 1x')
, ( 5, '123abc456abc', 'trailing,nested 1x')
, ( 6, 'abcabcabc123', 'leading, nested 2x')
, ( 7, '123abcabcabc', 'trailing nested 2x')
, ( 8, '1abcabcabc23', 'nested 3x')
, (10, 'abc12' , 'leading short')
, (11, '12abc' , 'trailing short')
, (12, '1abc2' , 'nested short');
CREATE INDEX index_search_search_on ON search USING gist (search_on gist_trgm_ops);
Not using your odd type bpchar
(blank padded character type) for id
- and I suggest you don't either. text
or varchar
should serve better:
Queries
We need a low threshold for the demo:
SET pg_trgm.similarity_threshold = .01; -- show weak matches, too
Demonstrating the built-in bias in your favor:
SELECT *, search_on <-> 'abc' AS distance
FROM search
WHERE search_on % 'abc'
ORDER BY search_on <-> 'abc';
id | search_on | comment | distance
-: | :----------- | :----------------- | :-------
10 | abc12 | leading short | 0.571429
6 | abcabcabc123 | leading, nested 2x | 0.7
11 | 12abc | trailing short | 0.75
4 | abc123abc456 | leading, nested 1x | 0.769231
1 | abc123456789 | leading | 0.785714
7 | 123abcabcabc | trailing nested 2x | 0.818182
5 | 123abc456abc | trailing,nested 1x | 0.857143
3 | 123456789abc | trailing | 0.866667
12 | 1abc2 | nested short | 0.888889
8 | 1abcabcabc23 | nested 3x | 0.916667
2 | 123abc456789 | nested | 0.9375
As you can see, leading matches have more weight. But it's still just a relative bias.
To make this absolute:
... at the beginning of my column's text, I'd want that factored into the ordering to come higher ...
SELECT *, search_on <-> 'abc' AS distance, search_on ILIKE 'abc%' AS prefix_match
FROM search
WHERE search_on % 'abc'
ORDER BY search_on NOT ILIKE 'abc%' -- prefix matches first
, search_on <-> 'abc'; -- then sort by distance
id | search_on | comment | distance | prefix
-: | :----------- | :----------------- | :------- | :-----
10 | abc12 | leading short | 0.571429 | t
6 | abcabcabc123 | leading, nested 2x | 0.7 | t
4 | abc123abc456 | leading, nested 1x | 0.769231 | t
1 | abc123456789 | leading | 0.785714 | t
11 | 12abc | trailing short | 0.75 | f
7 | 123abcabcabc | trailing nested 2x | 0.818182 | f
5 | 123abc456abc | trailing,nested 1x | 0.857143 | f
3 | 123456789abc | trailing | 0.866667 | f
12 | 1abc2 | nested short | 0.888889 | f
8 | 1abcabcabc23 | nested 3x | 0.916667 | f
2 | 123abc456789 | nested | 0.9375 | f
I chose the expression search_on NOT ILIKE 'abc%'
to still sort NULL values last. Equivalent: search_on ILIKE 'abc%' DESC NULLS LAST
. Related:
You could sort trailing matches in a similar fashion or combine both:
...
ORDER BY search_on NOT ILIKE 'abc%' -- prefix matches first
, search_on NOT ILIKE '%abc' -- suffix matches next
, search_on <-> 'abc'; -- then sort by distance
db<>fiddle here
BTW 1: Full Text Search also supports prefix matching.
BTW 2: The "C" collation COLLATE "C"
allows plain btree index support for prefix matches.
Related:
Best Answer
Check if this is what you're looking for [I didn't fully understand your first OR / AND condition. I assumed it was just an OR].
(The inclusion of number_of_matching_tags and rank columns is only to clarify results)