Sql-server – Full-text search in SQL Server on multiple tables fails

full-text-searchsql-server-2008-r2

tl;dr version: A full-text search using FREETEXTABLE that worked fine on one column in one table produces 100% false postitives and 100% false negatives when the same steps are followed but the full-text index is on several columns in a view made of two base tables (the original table and a new one). Research reveals contradictory claims that a full-text search can be run against multiple tables if they are joined in a view, and that the search can only involve a single base table when a view is queried.

We are running SQL Server 2008 R2. As part of a larger project, we are creating a searchable database of recipes for a customer. The customer provided data in XML format purchased from a third party. Based on the wireframe of the application, we imported the data and what data was relevant. DDL in pertinent part and a subset of the data:

CREATE TABLE recipe (
    id INT IDENTITY(1,1) NOT NULL,
    title NVARCHAR(255) NOT NULL,
    descrip NVARCHAR(4000) NOT NULL,
    prep_time NVARCHAR(127),
    ease_of_prep NVARCHAR(25)
    CONSTRAINT PK_recipe PRIMARY KEY CLUSTERED (id)
);

INSERT INTO recipe (title, descrip, prep_time, ease_of_prep)
VALUES ('Aromatic Rice Pudding', 'Store-bought rice pudding with some simple stir-ins.', '5 minutes', 'Easy'),
    ('Lemon Chicken Stir-Fry', 'Spiked with lots of zesty lemon, this stir-fry has a colorful mix of snow peas, carrots and scallions. Substitute any thinly sliced vegetables like bell peppers or celery.', '40 minutes', 'Easy'),
    ('Salsa-Roasted Salmon', 'Fire up the food processor, add a few simple ingredients, and you’ve got a vibrant-tasting salsa in minutes. Substitute other fish, chicken or turkey for the salmon—adjust the roasting time accordingly.', '10 minutes', 'Easy');

I was not told what the search criteria would be, but assumed more than one field and that full-text would yield results more in tune with what the customer was looking for. DDL for FTC and full-text index:

CREATE FULLTEXT CATALOG [ftc_default]WITH ACCENT_SENSITIVITY = OFF
AS DEFAULT
AUTHORIZATION [dbo];

CREATE FULLTEXT INDEX ON recipe KEY INDEX PK_recipe ON (ftc_default) WITH (CHANGE_TRACKING AUTO);

ALTER FULLTEXT INDEX ON recipe ADD title;
ALTER FULLTEXT INDEX ON recipe ENABLE;

I wrote the following stored procedure (optional but useful for testing):

CREATE PROCEDURE test_fulltext
    @terms NVARCHAR(2000) = NULL

AS
BEGIN

    IF ISNULL(@terms, '') = ''

        SELECT id, title, descrip
        FROM recipe;

    ELSE
        SELECT r.id, r.title, r.descrip
        FROM recipe r 
        INNER JOIN FREETEXTTABLE(recipe, title, @terms) kt
            ON r.idid = kt.[KEY]
        ORDER BY RANK DESC;

END

This worked as advertised. A search on 'chili-rubbed salsa' would return about a dozen recipes, with "Chili-Rubbed Steaks & Pan Salsa" as the top recipe and others with decreasing relevance down to "Paprika-Herb Rubbed Chicken". Just from eyeballing the titles, there appeared to be no false positives.

I updated the project manager to let him know that recipes were searchable on title and if the customer wanted any other fields included, I could add them in. He then asked why we weren't searching on cuisine type, theme, health considerations, and others. Ugh. We weren't told to import that data. Back to square one.

The other data wouldn't fit into the recipe table without denormalizing it, so we imported it into a new table. Entire DDL and a subset of data:

CREATE TABLE recipe_search (
    ID INT IDENTITY(1,1) NOT NULL,
    recipe_id INT NOT NULL,
    term_type NVARCHAR(25) NULL,
    term_value NVARCHAR(50) NULL
    CONSTRAINT PK_recipe_search PRIMARY KEY CLUSTERED (id)
);

ALTER TABLE recipe_search 
WITH CHECK ADD CONSTRAINT FK_recipe_recipe_id 
FOREIGN KEY (recipe_id) REFERENCES recipe (id);

INSERT INTO recipe-search (recipe_id, term_type, term_value)
VALUES (1, 'Course', 'Dessert'),
    (1, 'Course', 'Snacks'),
    (1, 'Cuisine', 'Middle Eastern'),
    (1, 'Cusines', 'Mediterranean'),
    (1, 'Dish Type', 'Desserts'),
    (1, 'Health Consideration', 'Healthy Weight'),
    (1, 'Season', 'Winter'),
    (1, 'Season', 'Fall'),
    (1, 'Style/Theme', 'Vegetarian'),
    (1, 'Technique', 'No Cook'),
    (2, 'Course', 'Dinner'),
    (2, 'Cuisine', 'Asian'),
    (2, 'Dish Type', 'Main Dish'),
    (2, 'Health Consideration', 'Low Calorie'),
    (2. 'Health Consideration', 'Low Carb'),
    (2, 'Main Ingredient', 'Chicken'),
    (2, 'Technique', 'Stir-fry'),
    (3, 'Course', 'Dinner'),
    (3, 'Cuisine', 'Southwestern'),
    (3, 'Cuisine', 'Mexican'),
    (3, 'Cuisine', 'American'),
    (3, 'Dish Type', 'Main Dish'),
    (3, 'Health Consideration', 'Low Carb'),
    (3, 'Season', 'Summer'),
    (3, 'Technique', 'Bake'),
    (3, 'Roast', 'Food Processor');

Then, having read that full-text search can only work on one table, and that to search multiple tables one must create a view, I did so:

CREATE VIEW dbo.vw_recipe_search
WITH SCHEMABINDING
AS
    SELECT rs.id search_id, r.id recipe_id, r.title, r.descrip, rs.term_value, r.ease_of_prep
    FROM dbo.recipe r
    INNER JOIN dbo.recipe_search rs ON r.id = rs.recipe_id
    WHERE r.active = 1;
GO

CREATE UNIQUE CLUSTERED INDEX idx_searchid ON vw_recipe_search (search_id);
GO

CREATE FULLTEXT INDEX ON vw_recipe_search
KEY INDEX idx_searchid ON (ftc_fu_default) WITH (CHANGE_TRACKING AUTO);
GO

ALTER FULLTEXT INDEX ON vw_recipe_search ADD (descrip);
GO
ALTER FULLTEXT INDEX ON vw_recipe_search ADD (term_value);
GO
ALTER FULLTEXT INDEX ON vw_recipe_search ADD (title);
GO
ALTER FULLTEXT INDEX ON vw_recipe_search ADD (ease_of_prep);
GO
ALTER FULLTEXT INDEX ON vw_recipe_search ENABLE;

And altered the SP (EDIT: I have updated the code below to indicate the actual changes I had made):

ALTER PROCEDURE test_fulltext
    @terms NVARCHAR(2000) = NULL

AS
BEGIN

    IF ISNULL(@terms, '') = ''

        SELECT id, title, descrip
        FROM recipe;

    ELSE
        SELECT v.*
    FROM vw_recipe_search v 
    INNER JOIN FREETEXTTABLE(vw_recipe_search, (title, descrip, term_value, ease_of_prep), @terms) kt
        ON v.recipe_id = kt.[KEY]
    ORDER BY v.recipe_id DESC;

END

And now the results of the search are complete nonsense. A search on "salsa", for example, finds five unique recipes, none of which contains the word "salsa" or any reasonable thesaurus synonym in any of the indexed fields; none of the types of dishes one associates with salsa are found. (One of the hits does contain the word "sauce", which I suppose might be a thesaurus synonym for "salsa".) Two of the five recipes are the first two recipes in the dataset above. Meanwhile neither salsa-roasted salmon, nor any other recipe which contains the word "salsa", is returned by the search. I have verified that the full-text catalog and index are both created as expected, and that the full-text catalog has the same number of items in it as the number of rows returned from the view (about 4000, even though it reports being 0 MB in size).

An additional annoyance: If a recipe meets the criteria, then ALL of its records in the view are selected. It's almost as though it is always one of the columns in the table recipe that leads to the hit, so all the joined records in recipe_search are included.

In looking into this bizarre reversal, I came across the following link: "When querying a view, only one full-text indexed base table can be involved."

Question #1: Which is correct–can you, or can you not, query multiple tables in a full-text search?
Question #2: Have I left something out in trying to create a multi-table full-text search, and if not, what might be the issue here?

Best Answer

Hi we can query multiple tables in a full-text search by using views ,but we some advantages and disadvantages . Here in your case you created view dbo.vw_recipe_search it seems your are not using it in altered procedure when i query it i am getting results as per intended

  SELECT *
        FROM dbo.vw_recipe_search r 
        INNER JOIN FREETEXTTABLE(vw_recipe_search, *, 'salsa peas') kt
            ON r.search_id = kt.[KEY]
        ORDER BY RANK DESC;

The problem in this approach is ranking and weightage of each column , like suppose if u want to give high weight to title column than descrip you have to do a lot of work in order by rank column . We used Levenstein string comparision function to order for better results set