Sql-server – Better defining a parent-child relationship on a table

database-designhierarchynormalizationsql server

Our team is developing a new system and it has a Warehouse table that must have a hierarchical relationship between entries since a customer can have 1st, 2nd, 3rd and Nth degree warehouses.

We first thought that this was the best way to do it. By creating a Parent and a Child column on the table and then making Parent the FK to Child (on the same table) but friend DBA told us that there is a different way.

So, for instance, we had thought about making a table like this:

CREATE TABLE Warehouse(
filterId INT PRIMARY KEY IDENTITY,
parentId INT NOT NULL,
active BOOL DEFAULT 1,
warehouseName VARCHAR(30) NOT NULL,
FOREIGN KEY parentId REFERENCES Warehouse(filterId)

| FILTERID   | PARENTID | ACTIVE | WAREHOUSENAME |
--------------------------------------------------
|          1 |        0 |      1 | N WAREHOUSE 1 |
|          2 |        1 |      1 | ROOM 1        |
|          3 |        1 |      1 | ROOM 2        |
|          4 |        1 |      1 | ROOM 3        |
|          5 |        1 |      1 | ROOM 4        |
|          6 |        2 |      1 | SHELF 1       |
|          7 |        2 |      1 | SHELF 2       |
|          8 |        2 |      1 | SHELF 3       |
|          9 |        2 |      1 | SHELF 4       |
|         10 |        2 |      1 | SHELF 5       |
|         11 |        2 |      1 | SHELF 6       |
|         12 |        2 |      0 | SHELF 7       |
|         13 |        3 |      1 | BOX 1         |
|         14 |        3 |      0 | BOX 2         |
|         15 |        3 |      1 | BOX 3         |
|         16 |        3 |      1 | BOX 4         |
--------------------------------------------------

But this DBA said that we should work on something like this:

CREATE TABLE Warehouse(
filterId VARCHAR(20) PRIMARY KEY,
active BOOLEAN DEFAULT 1,
warehouseName VARCHAR(30) NOT NULL)
| FILTERID   | ACTIVE | WAREHOUSENAME |
---------------------------------------
| 1.00       |      1 | N WAREHOUSE 1 |
| 1.01       |      1 | ROOM 1        |
| 1.02       |      1 | ROOM 2        |
| 1.03       |      1 | ROOM 3        |
| 1.04       |      1 | ROOM 4        |
| 1.01.01    |      1 | SHELF 1       |
| 1.01.02    |      1 | SHELF 2       |
| 1.01.03    |      1 | SHELF 3       |
| 1.01.04    |      1 | SHELF 4       |
| 1.01.05    |      1 | SHELF 5       |
| 1.01.06    |      1 | SHELF 6       |
| 1.01.07    |      0 | SHELF 7       |
| 1.01.01.01 |      1 | BOX 1         |
| 1.01.01.02 |      0 | BOX 2         |
| 1.01.01.03 |      1 | BOX 3         |
| 1.01.01.04 |      1 | BOX 4         |
---------------------------------------

Which way should we follow?

On one hand we have a relationship between parentId and filterId; on the other we do not have any relationship, but it is easy to understand where does the entry belong.

Not all the warehouses are of the same level – that's what I mean by degree. They should be ordered like: For Customer A, 1 is parent of 1.1 and 1.2, 2 is parent of 2.1, 3 is parent of 3.1 and 4 is parent of 4.1 and 4.2 – but 1, 2, 3 and 4 are of the same level (1st level), and 1.1, 1.2, 2.1, 3.1, 4.1 and 4.2 are also of the same level (2nd level) but not of the same parents.

Best Answer

There are several models for hierarchical data in relational databases. These include

  1. Adjacency List. This is your suggestion above.
  2. Path Enumeration. This is the DBA's suggestion.
  3. Nested Set
  4. Closure Table

Each has a different insert, delete, neighbour-read and set-read characteristic. You will have to determine which, on average, is best for your application performance and development needs. For example, nested sets is great for reading a whole sub-tree but is expensive for inserts, and can be difficult to comprehend. Adjacency list is simple to understand and code, read/write is fast but frequent, large recursive queries can be expensive.

Since you're using SQL Server be aware of the hierarchyid data type.

I have found Joe Celko's book "Trees and Hierarchies in SQL for Smarties" to be most readable and informative.

There's a discussion on Stackoverflow which discusses this widely.