Sql-server – Ranking not correct in sql server 2012 full text search

full-text-searchsql serversql-server-2012

I am having a table like below:

CREATE TABLE [dbo].[test_table](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [name] [varchar](50) NULL,
    [area] [varchar](50) NULL,
    [designation] [varchar](100) NULL,
 CONSTRAINT [PK_test_table] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)

id is the primary key in test_table and this table is having full text indexes on name, area and designation columns.

Now I am using a full text query like below:

SELECT test_table.*, ftt.RANK FROM test_table 

INNER JOIN 

FREETEXTTABLE(test_table, (name,area,designation), 'ranjit trail engineer') as ftt
ON
ftt.[KEY]=test_table.id

ORDER BY ftt.RANK DESC

I get the results like this:

id,    name,           area,          designation ,      rank

84829,"Ranjit Barasa","marine drive","software engineer",24

85323,"Jimmy David","hayes road",   "software engineer", 13

85223, "Arthur Philip","center trail","Quality Engineer",13

As you can see above, row 3 with name as 'Arthur Philip' is having same rank as row 2 with name 'Jimmy David' though row 3 has more matches.(row 3 matched with word 'engineer' and 'trail' but still has same rank as row 2)

Can anyone help on how to solve this issue?

Best Answer

Since you are using FREETEXTTABLE the ranking is very loose because its interpretation of the string is loose.

The text you supplied for row 2 only has one direct match ("Engineer"), but row 3 has 2 matches("trail" and "Engineer").

But you need to understand the difference between FREETEXT and FREETEXTTTABLE operators and the more precise CONTAINS and CONTAINSTABLE operators.

See https://msdn.microsoft.com/en-us/library/ms176078.aspx which explains in part:

• Generates inflectional forms of the words (stemming).

• Identifies a list of expansions or replacements for the terms based on matches in the thesaurus.

Because FREETEXT freely interprets your data using a thesaurus it is likely that "trail" and "road" are interpreted as equivalent in meaning and therefore the two rows get the same score.

If you need more precise answers you should use CONTAINSTABLE.