Full-Text Index on VARBINARY Column – How to Find Strings with Dots in SQL Server 2012

full-text-searchsql serversql-server-2012

On a SQL Server 2012 SP3 CU8 server instance, we have PDF files that have been inserted into a table called OBJECT_FILE created as follows:

CREATE TABLE [dbo].[OBJECT_FILE](
    [FILE_ID] [int] NOT NULL,
    [FILE_FIN] [int] NOT NULL,
    [OBJ_FILE_IDX_DOCTYPE] [varchar](3) NULL,
    [FILE_TYPE] [smallint] NOT NULL,
    [FILE_TXT] [varbinary](max) NULL,
    [FILE_TXT_SIZE] [int] NULL DEFAULT ((0)),
    CONSTRAINT [PK_DM_OBJECT_FILE] PRIMARY KEY CLUSTERED (
        [FILE_ID] ASC
    )
    WITH (
        PAD_INDEX = OFF, 
        STATISTICS_NORECOMPUTE = OFF, 
        IGNORE_DUP_KEY = OFF, 
        ALLOW_ROW_LOCKS = ON, 
        ALLOW_PAGE_LOCKS = ON
    ) ON [INDEXFG]
) ON [INDEXFG] TEXTIMAGE_ON [BLOBFG]

We also created a FullText Index that is built on FILE_TXT column, which is a VARBINARY(MAX), created as follows:

CREATE FULLTEXT INDEX 
ON OBJECT_FILE (
    FILE_TXT TYPE COLUMN OBJ_FILE_IDX_DOCTYPE
) 
KEY INDEX PK_DM_OBJECT_FILE 
WITH STOPLIST=OFF;

In that table, we are willing to insert the contents of old documents. In these contents, we will find a unique identifier inserted in the past which follows a naming convention:

CORP-MMM-<three_digits_0_to_9>.<three_digits_0_to_9>

When I try to do a lookup using CONTAINS built-in function on a particular identifier that I know it exists, it doesn't give me back any results. I use following WHERE clause:

WHERE CONTAINS(FILE_TXT,'CORP-MMM-456.245')

although a lookup with following WHERE clause provides me multiple results:

WHERE CONTAINS(FILE_TXT,'CORP-MMM-456.*')

and the record in question can be found with LIKE:

WHERE FILE_TXT LIKE '%CORP-MMM-456.245%' 

My question is:

Why can't I get a specific record regarding my WHERE clause and how could I make it work ?

Best Answer

Sorry this might not be an answer but I cannot put this in comment. Based on the comment by OP @Jefferson (as an answer to my comment) I could not reproduce the behavior. I can get a record back with where clause exactly matching the value.

I tested this with following code (copied and modified from here)

CREATE TABLE Department 
   (DepartmentID INT IDENTITY CONSTRAINT DepartmentPK PRIMARY KEY, 
    NAME NVARCHAR(50),
    File_txt NVARCHAR(50));
GO
INSERT INTO Department (NAME, File_txt) VALUES ('OP1', 'CORP-MMM-456.245');
INSERT INTO Department (NAME, File_txt) VALUES ('OP2', 'CORP-MMM-456.246');
INSERT INTO Department (NAME, File_txt) VALUES ('OP3', 'CORP-MMM-456.247');
INSERT INTO Department (NAME, File_txt) VALUES ('OP4', 'CORP-MMM-456.245.246');

Go
CREATE FULLTEXT CATALOG ComplexFTS AS DEFAULT;
GO
CREATE FULLTEXT INDEX ON Department(File_txt LANGUAGE 1033 /* 0 = Neutral, 1033 = American English  */)  
KEY INDEX DepartmentPK WITH (STOPLIST = OFF); /* Or use (stoplist = Off) for no stoplist */
GO
WAITFOR DELAY '00:00:05';
GO
SELECT * FROM Department WHERE CONTAINS (File_txt,'"CORP-MMM-456.*"');
SELECT * FROM Department  WHERE CONTAINS(File_txt,'CORP-MMM-456.245')
GO

This is what I get as my result set.

enter image description here

Clean up code.

DROP FULLTEXT INDEX ON Department;
GO
DROP FULLTEXT CATALOG ComplexFTS;
GO
DROP TABLE Department;
GO