Sql-server – SQL Server: compare names ignoring non-alpha

sql serversql-server-2008-r2

I have a database table with a text column "title".
New rows are inserted all the time, and every time I would like to check for potential duplicates on this column.

I define duplicate by "having the same sequence of alpha characters". So, for example, if I already have the string "Lisa's Nights":

  • "LISASNIGHTS" is duplicate, because case doesn't matter, spaces don't
    matter, apostrophe doesn't matter
  • "Lisa's Night" is not duplicate because the final 's' is missing.

Of course I could store an altered version of the title and search there. But is there any other solution?

Best Answer

You can do it on the fly without storing the stripped down name, but that would mean recalculating the strings (e.g. Lisa's Nights, Lisa's Night, LISASNIGHTS) each time a new entry should to be checked against an existing string.

On the other hand, if your table also has a column for the STRIPPEDDOWNNAME, you can leverage SQL Server's constraints. Create a UNIQUE CONSTRAINT (or a UNIQUE INDEX) on the STRIPPEDDOWNNAME column. Then you will calculate the stripped down name only once, when you insert it into the table. With the UNIQUE CONSTRAINT the SQL Server will immediately let you know if an inserted string is a duplicate.

You can then capture the error and respond as appropriate to your application.

I encourage you to let the SQL Server manage the duplicate check whenever possible.