Modelling Hierarchical attributes

database-designhierarchytree

I am trying to figure out the data model for a retailer.

The retailer has several stores across the country and they are modeled using the following hierarchy:

Channel -> Zone -> City -> Store

Each store contains several articles. And each article has attributes like

  • Activation flag (this indicates the presence of article)
  • Price
  • Supplier
  • Warehouse

Now, the retailer can set these attributes at any level in the hierarchy. Consider the following cases:

  • Setting price for an article at channel level will apply it to all stores.
  • The price set at a higher level can be overridden at any other level. For e.g., at a city level for just the stores in a city or for a particular store.
  • This applies to all attributes listed above.

As of now, they have modeled it using RDBMS by defining global rules at the top of hierarchy and calling out exceptions separately as individual rows. Say, price table, will have price set for an article at Channel level and any changes at any level will be specified separately. Obviously, this is not very efficient when fetching the attributes at the store level.

Sample Data

Assume Channel, Zone, City and Store are collectively called entities. Channel will have ids ranging >= 4000, Zone >= 3000, City >= 2000 and stores range from 1 to 1000.

A subset of the hierarchy relationship data is given below:

Channel   | Zone     | City    | Store |
----------+----------+------------------
4001      | 3001     | 2001    | 13    |
4001      | 3001     | 2001    | 14    |
4001      | 3001     | 2002    | 15    |
4001      | 3002     | 2003    | 16    |
4001      | 3003     | 2006    | 74    |

Price Table

ArticleID | EntityID | Price
----------+----------+----------
12345     | 4001     | 2.5
12345     | 2003     | 2.9
12345     | 74       | 3.0

Here, the price 2.5 for channel 4001 will be set for the article in all stores. The next two rows set the price exceptions in some stores. The second price 2.9 set for city 2003 will be applicable only for the article in store 16 as per the hierarchy relationship mentioned above. And the third row directly sets price 3.0 for the article in store 74.

Hope this gives an idea of current model. So, can you please suggest a better way to store this?

Best Answer

You could formalize the approach that you're already using and make it more efficient to both maintain and query. Consider an ERD like so:

ERD

Note that all of your hierarchy is flattened into a single set of columns. There is an involuted relationship so that each child entity in the hierarchy can point to it's immediate parent. I've also included left and right visitation numbers which would really help you with your price lookups. Lastly there is a Level column to store the rank within the hierarchy.

If you're not familiar with visitation numbers please have a look at my answer to this question, where I go into quite a bit of detail.

So why is this better than what you're doing now?

  1. You only track each item in the hierarchy once. Instead of having 4001 show up in the channel column many, many times, you just have one record with 4001.

  2. The structure is flexible and allows for insertion of new levels at a later time. It also allows for uneven trees, say for example ZONE 3001 is subdivided into three SUBZONEs and that cities are under the subzones there (but go direct to zone outside of 3001).

  3. All of your prices are in exactly one place (which doesn't need any reading logic that goes outside of normal SQL queries. Instead of having to have logic that joins a price to different Entity tables depending on what range the EntityID is in, you have one consistent price retrieval query.

  4. You don't need coded meaning in your identifiers so you won't run into trouble when you get too many entities at any one level, thus breaking your coding scheme.

  5. The price retrieval query is dead simple:


-- Look up the left and right numbers for the store in question:
declare @StoreLeft int
declare @StoreRight int
select @StoreLeft = H.left, @StoreRight = H.right
from HIERARCHY H where H.ID = @DesiredStoreId

-- Look up the lowest level (most granular) price for this store:
select TOP 1
  P.price_amt
, P.currency
from PRICE P
  inner join HIERARCHY H
    on H.Left <= @StoreLeft and H.right >= @StoreRight
  inner join ARTICLE A
    on P.article_id = A.id
where A.id = @TheArticleInQuestion
order by H.Level ASC   -- ASC or DESC depends on how you count levels.