Sql-server – Extract street address from string with multiple delimiters and positions

sql serversql-server-2005string manipulation

Trying to pull just the Street addresses out of this data:

CREATE TABLE foo(Places varchar(50));
INSERT foo(Places) VALUES
('MARKET @123 NORTH RD'),
('HARDWARE @600 Main RD ;W1'),
('MAIN AV / NORTH RD'),
('500 NORTH RD @LIBRARY'),
('500 ANYSTREET ; *** SIDEWALK****'),
('MARKET @123 NORTH RD'),
('700 ANYSTREET'),
('(088.12345,088.12345) ;75 SOUTH RD');

I've trying to do use CASE, SUBSTRING and (CHAR/PAT)INDEX… But I'm just not doing this correctly.

Select 
[Address] = CASE WHEN PATINDEX('%@[0-9]%' , Places) = 0 THEN     (SUBSTRING(Places, 0, CHARINDEX('@',Places)))
                WHEN PATINDEX('%@[a-z]%' , Places) = 0 THEN     (SUBSTRING(Places, CHARINDEX('@',Places) + 1, LEN(Places)))
                WHEN PATINDEX('%;[0-9]%' , Places) = 0 THEN (SUBSTRING(Places, CHARINDEX(';',Places) + 1, LEN(Places)))
                WHEN PATINDEX('[0-9]%' , Places) = 1 THEN (SUBSTRING(Places, 0, CHARINDEX(';',Places))) 
                WHEN PATINDEX('%/%' , Places) > 0 THEN Places END

From dbo.foo

What is the better way to grab that data? I've looked at string selects for finding email addresses, but seem to fall flat again when dealing with the ';'

TIA!

edit for desired output, table containing:

123 NORTH RD 
600 Main RD
MAIN AV / NORTH RD
500 NORTH RD 
500 ANYSTREET
123 NORTH RD 
700 ANYSTREET
7575 SOUTH RD

Best Answer

Try this.

;With cte as
(
Select 
    places,
    StartHere = 
        CASE 
            WHEN PATINDEX('%@[0-9]%', places) > 0 THEN PATINDEX('%@[0-9]%', places) + 1     -- @ plus number, start here
            WHEN PATINDEX('%;[0-9]%', places) > 0 THEN PATINDEX('%;[0-9]%', places) + 1     -- ; plus number, start here
            ELSE 0
        END
    ,EndHere = 
        CASE
            WHEN PATINDEX('%@[a-z]%', places) > 0 THEN PATINDEX('%@[a-z]%', places)   -- @plus letter, stop here        
            WHEN PATINDEX('%@[0-9]%', places) > 0 AND PATINDEX('%;%', places) > 0 THEN  PATINDEX('%;%', places)  --@ plus number stop on semicolon if there
            WHEN PATINDEX('%;[0-9]%', places) = 0 AND PATINDEX('%;%', places) > 0 THEN PATINDEX('%;%', places)   --no semi plus number, so stop on semicolon if there
            ELSE len(places) + 1
        END
FROM
    foo
)
    SELECT 
        places, 
        StartHere
        ,EndHere
        ,SUBSTRING(places, StartHere, EndHere - StartHere)
    FROM 
        cte