Sql-server – How to separate name string by the spaces in the name in T-SQL

sql serverstringsubstringt-sql

I have a list of names, such as: Doe John James, Doe Jane and Doe Henry Albert Sr

I need to split those out into the four component parts of the name (if the name has four parts) to put each of them in their own column (last name, first name, middle name, suffix), while allowing for names that may or may not have a middle name or suffix.

I've got it figured out for everything but the third instance, where there is a suffix.

For first name I have:

case
    when LEN(LTRIM(rtrim(@name))) - LEN(replace(ltrim(rtrim(@name)),' ','')) = 2
        then LEFT(SUBSTRING(LTRIM(RTRIM(@name)), CHARINDEX(' ', LTRIM(RTRIM(@name))) + 1, LEN(LTRIM(RTRIM(@name)))), CHARINDEX(' ', SUBSTRING(LTRIM(RTRIM(@name)), CHARINDEX(' ', LTRIM(RTRIM(@name))) + 2, LEN(LTRIM(RTRIM(@name))))))
    else
        SUBSTRING(@name, CHARINDEX(' ', @name) + 1, LEN(@name))
end as firstName

It's pretty crazy, but it works…

For middle name I have:

case
    when LEN(LTRIM(rtrim(@name))) - LEN(replace(ltrim(rtrim(@name)),' ','')) = 2
        then
            REVERSE(SUBSTRING(REVERSE(LTRIM(RTRIM(@name))),1,CHARINDEX(' ',REVERSE(LTRIM(RTRIM(@name))))))
    else ''
end as middleName

Not quite as complex as the first name.

Since the last name is at the beginning that's the easiest:

SUBSTRING(LTRIM(RTRIM(@name)),1,CHARINDEX(' ',LTRIM(RTRIM(@name)))-1) as lastName

But I can't figure out how to allow for names with a suffix (Jr, Sr, III, etc).

Best Answer

Does something like this help you:

With split(id, n, start, pos) as(
    SELECT id, 1, 0, CHARINDEX(' ', fullname) FROM @names
    UNION ALL
    SELECT n.id, n+1, pos+1, CHARINDEX(' ', fullname, pos+1) 
    FROM @names n
    INNER JOIN split s ON n.id = s.id
    WHERE CHARINDEX(' ', fullname, start+1) > 0
)
SELECT id
    , lastName = MAX(CASE WHEN v.n = 1 THEN v.val END)
    , firstname = MAX(CASE WHEN v.n = 2 THEN v.val END)
    , middleName = MAX(CASE WHEN v.n = 3 THEN v.val END)
    , title = MAX(CASE WHEN v.n = 4 THEN v.val END)
FROM (
    SELECT n.id, s.n
        , SUBSTRING(n.fullname
                , s.start
                , CASE WHEN s.pos = 0 THEN LEN(n.fullname)+1 ELSE s.pos END -s.start
            )
    FROM @names n
    INNER JOIN split s ON n.id = s.id
) as v(id, n, val)
GROUP BY id
--OPTION(MAXRECURSION  4)
;
  • The recursive CTE get the start and end position of each name along with a value n from 1 to up to 4.
  • Based on the value of n the GROUP BY put them in their respective columns.

On a big table, it may not be very efficient though.

Output:

id  lastName    firstname   middleName  title
0   Doe         John        James       NULL
1   Doe         Jane        NULL        NULL
2   Doe         Henry       Albert      Sr
3   Bart        Simpson     NULL        NULL

Data:

declare @names table(id int identity(0, 1), fullname varchar(200), first varchar(50), middle varchar(50), last varchar(50), title varchar(50));
INSERT INTO @names(fullname) values
    ('Doe John James')
    , ('Doe Jane')
    , ('Doe Henry Albert Sr')
    , ('Bart Simpson')
;