Sql-server – How to break down RELATIONAL data into HIERARCHY data structure

hierarchysql serversql-server-2008

I have a query that returns the following dataset:

RELATIONid MAPid  D1id   D2id  D3id
4999       4999   626    1250    7 
5000       5000   626    1250    8

For the next step, I need to bind those datasets into treeview (hierarchy structure). I need to transform this dataset into the following:

Nodeid   ParentNodeid  Header
626       null           D1
1250      626            D2
7         1250           D3
8         1250           D3 

How I can achieve those structure from the original dataset?


I have one more favor to ask, the data structure a (little bit) more complex than the previous one. Let's say I have sample dataset like this:

RELATIONid MAPid  D1id   D2id  D3id
4999       4999   626    1250    7 
5000       5000   626    1250    8
5001       5001   627    1300    10 
5002       5002   627    1300    12 
5003       5003   628    1400    15 

From the following dataset, we have 3 MainParent: 626, 627, 628 and the transformation (cross apply) output expectation will be like this:

Nodeid   ParentNodeid  Header
626       null           D1
1250      626            D2
7         1250           D3
8         1250           D3
627       null           D1
1300      627            D2
10        1300           D3
12        1300           D3
628       null           D1
1400      628            D2
15        1400           D3

Please note that the data are ordered per ParentNode followed by its node data.

Best Answer

A CROSS APPLY would seem perfect for the job:

SELECT
  v.Nodeid,
  v.ParentNodeid,
  v.Header
FROM
  dbo.atable
  CROSS APPLY
  (
    VALUES
    (D1id, NULL, 'D1'),
    (D2id, D1id, 'D2'),
    (D3id, D2id, 'D3')
  ) AS v (Nodeid, ParentNodeid, Header)
;

For every row of the source dataset, CROSS APPLY produces three, using the VALUES row constructor, explicitly specifying which column of the original set goes into which new column.

Now, the above will return duplicates if some pairs in your source repeat. You can suppress them with DISTINCT:

SELECT DISTINCT
  v.Nodeid,
  v.ParentNodeid,
  v.Header
...

In order for the transformed set to follow the order of D1id ASC, D2id ASC, D3id ASC, you could include those columns in the output and use them for sorting:

SELECT DISTINCT
  t.D1id,
  t.D2id,
  t.D3id,
  v.Nodeid,
  v.ParentNodeid,
  v.Header
FROM
  dbo.atable AS t
  CROSS APPLY
  (
    VALUES
    (t.D1id, NULL  , 'D1'),
    (t.D2id, t.D1id, 'D2'),
    (t.D3id, t.D2id, 'D3')
  ) AS v (Nodeid, ParentNodeid, Header)
ORDER BY
  t.D1id ASC,
  t.D2id ASC,
  t.D3id ASC
;

The reason you have to include them in SELECT is because, when you have DISTINCT, you may only sort by columns in the SELECT clause. Naturally, the result set will include the three extra columns as well. If you do not want them in the output, you can use the above as a derived table: your outer SELECT would pull only the three required columns and sort by the other three:

SELECT
  Nodeid,
  ParentNodeid,
  Header
FROM
(
  SELECT DISTINCT
    t.D1id,
    t.D2id,
    t.D3id,
    v.Nodeid,
    v.ParentNodeid,
    v.Header
  FROM
    dbo.atable AS t
    CROSS APPLY
    (
      VALUES
      (t.D1id, NULL  , 'D1'),
      (t.D2id, t.D1id, 'D2'),
      (t.D3id, t.D2id, 'D3')
    ) AS v (Nodeid, ParentNodeid, Header)
) AS s
ORDER BY
  D1id ASC,
  D2id ASC,
  D3id ASC
;

Alternatively, you could use GROUP BY instead of DISTINCT and thus return the sorted three-column set without a derived table:

SELECT
  v.Nodeid,
  v.ParentNodeid,
  v.Header
FROM
  dbo.atable AS t
  CROSS APPLY
  (
    VALUES
    (t.D1id, NULL  , 'D1'),
    (t.D2id, t.D1id, 'D2'),
    (t.D3id, t.D2id, 'D3')
  ) AS v (Nodeid, ParentNodeid, Header)
GROUP BY
  t.D1id,
  t.D2id,
  t.D3id,
  v.Nodeid,
  v.ParentNodeid,
  v.Header
ORDER BY
  t.D1id ASC,
  t.D2id ASC,
  t.D3id ASC
;