Sql-server – Splitting different parts of a string

sql serverstring

I have an SQL Server database where there is a cell with a delimiter (\) separated string. An example of this string would be:

category_path
=============
RootCategory\Middle Category\Child Category\Child of child category
RootCategory\Middle Category\Other Child Category\

There are a lot of category paths like this. I would like to parse all of them into an other table:

category_name            parent_path
====================================
RootCategory             null
Middle Category          RootCategory
Child Category           RootCategory\Middle Category
Child of child category  RootCategory\Middle Category\Child Category
Other Child Category     RootCategory\Middle Category

I have left out the duplicates here, but the result could contain duplicates (I will need to cursor over the rows of this table later, and I can call distinct there).

There are a lot of examples around the net where a string is simply splitted. (Here for example) I was thinking about writing a query where I split the string, cursor over the results and accumlate the parent_path in each step, but that seems to be very suboptimal.

Is there a solution where I wouldn't have to declare so many cursors for a simple string?

Best Answer

Splitting a string with a recursive CTE is usually a bad choice but this is a case where I at least would consider a solution using that.

If you used a faster string split technique you would have to preserve the order of items and rebuild parent_category using for xml. That might still be faster than using a recursive CTE but here is the recursive stuff anyway.

SQL Fiddle

MS SQL Server 2008 Schema Setup:

create table YourTable
(
  category_path nvarchar(max)
);

insert into YourTable values
('RootCategory\Middle Category\Child Category\Child of child category'),
('RootCategory\Middle Category\Other Child Category\');

Query 1:

with C as
(
  select left(category_path, charindex('\', category_path+'\') - 1) as category_name,
         stuff(category_path, 1, charindex('\', category_path+'\'), '') as category_path,
         cast(null as nvarchar(max)) as parent_path
  from YourTable
  union all
  select left(category_path, charindex('\', category_path+'\') - 1),
         stuff(category_path, 1, charindex('\', category_path+'\'), ''),
         coalesce(parent_path+'\', '')+category_name
  from C
  where category_path <> ''
)
select distinct
       category_name,
       parent_path
from C;

Results:

|           CATEGORY_NAME |                                 PARENT_PATH |
-------------------------------------------------------------------------
|          Child Category |                RootCategory\Middle Category |
| Child of child category | RootCategory\Middle Category\Child Category |
|         Middle Category |                                RootCategory |
|    Other Child Category |                RootCategory\Middle Category |
|            RootCategory |                                      (null) |