I am in a SQL Server 2008 environment.
I am trying to use pattern matching in the WHERE
clause to find rows where a certain column's value contains characters that are not alpha-numeric, underscore, dash, period, or space.
This is my code, and sample data, but I am not getting the results expected.
In the example data, I want to return rows 7, 8, 9, and 12, but I am getting rows 5 and 6.
If this isn't the best way to achieve the goal, I am open to hearing other methods.
I am not in an environment where I can implement regex so my solution limited to out-of-the-box functionality.
CREATE TABLE PATTERN_TEST
(
ID INT NOT NULL,
STRING NVARCHAR(40) NOT NULL
)
INSERT INTO PATTERN_TEST
SELECT 1, 'string' UNION
SELECT 2, 'STRING' UNION
SELECT 3, 'string space' UNION
SELECT 4, 'STRING SPACE' UNION
SELECT 5, 'string-dash' UNION
SELECT 6, 'string-dash space' UNION
SELECT 7, 'string "otherchar"' UNION
SELECT 8, 'string "other char"' UNION
SELECT 9, '"string"' UNION
SELECT 10, 'string_underscore' UNION
SELECT 11, 'string_underscore space' UNION
SELECT 12, '"'
;
SELECT * FROM PATTERN_TEST WHERE STRING LIKE '%[^a-zA-Z0-9_ -.]%';
Best Answer
This will work as well:
Tested at rextester.com
The only difference is that dash (
-
) is put at the end of the[...]
pattern. It has to do with the special meaning of the dash character (A-Z
is interpreted as "any character fromA
toZ
").In your pattern the last three characters (space-dash-dot:
-.
) are interpreted as "any character from space () to dot (
.
)", and the results are not the wanted/expected.You could also escape the dash:
Note though that using
ESCAPE
may adversely affect cardinality estimation.See the MSDN documentation for some details:
LIKE
:To be clear: special characters need escaping when they are used as non-special. Inside [], the three (^, -, ]) are special. Putting the dash at the end is more like a hack.