Sql-server – get similar Full Text Search functionality without using Full-Text Search

full-text-searchsql server

Normally, I wouldn't post this kind of question here, but I can't seem to find anyone who has actually used SQL Server Full Text Search, so I'm posting here.

I'm creating a solution for a client that involves searching an Item catalog. The number of items is currently around 56k items. This number could reach 200k in a few years, so I'm going to safely assume 500k max items.

Here are my requirements?

Searching needs to be fast.

Need to be able to do a keyword search that will search multiple columns (i.e. Material, Feature, VendorItemName, ItemName, etc…)

An item may have a value of 'Aluminum' in all 4 of the columns mentioned above. Multiple matches for an item would 'rank' higher than an item matching on only one column.

Word Inflection: i.e. Search for PE should return Polyethelene material items, or search for RND (industry standard) should return items with Shape of Round.

Results will be weighted bases upon matches, number of matches, and other criteria that may be handled in code.

It seems Full-Text search will handle a lot of this, but I'm not sure it's the right approach. I don't have any single columns with a lot of text (like a description field), which from what I understand, Full-Text search is really designed for. I've considered creating a table that contains a concatenation of all the columns data combined into a single column. I would use this column for searching only, and then use the returned index keys to join on my actual Item table.

When I start to consider these kinds of solutions, I begin to wonder if perhaps I'm better off not using Full-Text search.

On one hand, it seems like overkill. But on the other hand, it has a lot of features that I could use (thesaurus comes to mind).

So, should I continue to look into using SQL Server's Full-Text Search, or try and come up with my own solution.

Best Answer

Native full-text can handle all of the features you've mentioned.

Whether or not it will be the best solution for your particular domain is a tough call. Rather than attempt to hand-roll an alternative, spike a proof of concept using SQL FTS and repeat the exercise with one or more open source alternatives.

Lucene is the obvious contender to compare and comes with the "Stack Overflow search is now 81% less crappy" endorsement of the SO team.