Sql-server – Unrolling a self-referencing table referred to by another table…

sql servert-sql

What's a proper way to "unroll" a self-referencing table with a variable depth and concatenate the levels with an associated value from another table?

table 1: FILENAME

filename      folder_id  size
-----------------------------  
toplevel1.txt 1          1  
onedeep1.txt  2          2  
twodeep.txt   3          4   
toplevel2.txt 1          4    
onedeep2.txt  4          3  

table 2: FOLDERS

folder_id  folder_name  parent_folder_id  
----------------------------------------  
0          ROOT         NULL    
1          temp         0
2          subdir1      1
3          subsubdir    2  
4          subdir2      1  

Desired output:

path                                size  
----------------------------------------  
temp\toplevel1.txt                  1  
temp\subdir1\onedeep1.txt           2  
temp\subdir1\subsubdir\twodeep.txt  4  
temp\toplevel2.txt                  4  
temp\subdir2\onedeep2.txt           3  

Have tried (and failed) using while loops, ctes, and cursors. Seems relatively straight forward, but I can't get a handle on the variable depth of the paths…

Best Answer

Use a recursive CTE to expand the folder names to full paths:

WITH
  folderpaths AS
  (
    SELECT
      folder_id,
      folder_name,
      parent_folder_id,
      folder_path = CAST(folder_name AS varchar(1000))
    FROM
      dbo.Folders AS f
    WHERE
      parent_folder_id = 0

    UNION ALL

    SELECT
      c.folder_id,
      c.folder_name,
      c.parent_folder_id,
      folder_path = CAST(p.folder_name + '\' + c.folder_name AS varchar(1000))
    FROM
      dbo.Folders AS c
      INNER JOIN folderpaths AS p ON c.parent_folder_id = p.folder_id
  )
...

For your example, the CTE will produce this output:

folder_id  folder_name  parent_folder_id  folder_path
---------  -----------  ----------------  ------------------
1          temp         0                 temp
2          subdir1      1                 temp\subdir1
4          subdir2      1                 temp\subdir2
3          subsubdir    2                 subdir1\subsubdir

As you can probably guess now, you can just join the Filename table to the CTE and use its folder_path column to get the path column of your output:

WITH
  folderpaths AS
  (
    SELECT
      folder_id,
      folder_name,
      parent_folder_id,
      folder_path = CAST(folder_name AS varchar(1000))
    FROM
      dbo.Folders AS f
    WHERE
      parent_folder_id = 0

    UNION ALL

    SELECT
      c.folder_id,
      c.folder_name,
      c.parent_folder_id,
      folder_path = CAST(p.folder_name + '\' + c.folder_name AS varchar(1000))
    FROM
      dbo.Folders AS c
      INNER JOIN folderpaths AS p ON c.parent_folder_id = p.folder_id
  )
SELECT
  path = fp.folder_path + '\' + fn.filename,
  fn.size
FROM
  dbo.Filenames AS fn
  INNER JOIN folderpaths AS fp ON fn.folder_id = fp.folder_id
;

This is the output:

path                             size
-------------------------------  ----
temp\toplevel1.txt               1
temp\subdir1\onedeep1.txt        2
subdir1\subsubdir\twodeep.txt    4
temp\toplevel2.txt               4
temp\subdir2\onedeep2.txt        3

A live demo of this solution is available at Rextester.