Storing different countries’ administrative divisions

database-design

I'm designing a data base which should relate users to their
administrative division for statistical purposes, considering that
every country has its own type of divisions. Chile, for example,
is divided in regions which are divided in provinces which are divided
in communes, while Angola is divided in provinces which are divided in
municipalities which are divided in communes. I thought a solution
could be to have a column called first_administrative_division
in the user table, which will refer to a
first_administrative_division table, which will have a
second_administrative_division column which will refer to a
third_administrative_division table which will have a column
which will refer to the country table. But reading
this
I didn't only found that there are countries which have only two
administrative divisions, and in other cases, four; some cases
are even more complex, like Argentina and Brazil, which have
different kind of internal divisions, divided in different ways. So,
I'm crying right now.

I don't want to cover the entire world, countries will be added as
they are needed. Is it posible to modelate such thing?

Thanks beforehand.

Best Answer

One way to store different levels of hierarchy is by creating a hierarchical table:

ID    Type
---   ---------------------------
1     Country
2     Province/State
3     County
4     City
5     Regions
6     Communes
7     Municipality

Then you can create your divisions and list them appropriately. This allows you to flexibly assign divisions and the parent divisions.

ID    Division    Parent    Type
----- ----------- --------- -----
1     Chile                 1
2     Region-A    1         5
3     Province-A  2         2
4     Commune-A   3         6
5     Angola                1
6     Province-A  5         2
7     Municip-A   6         7
8     Commune-A   7         6

The benefits is that different levels of hierarchy can be assigned in this design for different top level divisions. People can be associated with any level of hierarchy.

ID    Person      Division
----- ----------- ---------
1     Tom         4 (Commune-A of Province-A of Region-A of Chile)
2     Dick        7 (Municip-A of Province-A of Angola)
3     Harry       1 (Chile)

The challenge is to create a user defined function or stored procedure or utilize hierarchy data structures (e.g. SQL Server's Hierarchy ID) to extract different pieces of information for the statisticians. For example:

Show sales by highest level of hierarchy of each salesperson and list the level and the name of that level (e.g. Harry sold $45M of widget A in Chile (level 1 of 1) in comparison to Dick who sold $40M of the same widget in Municip-A (level 3 of 4)).

Note that Angola's hierarchy is 4 levels deep and Dick is functioning on 1 level above the bottom-most level. Stats can be generated by different depths.

Show sales for level 1 of each salesperson (e.g. sales for Chile, Angola etc.). Show sales for level 2 (e.g. sales for Province-A if Chile, or Municip-A if Angola etc.). Show all sales for a level named Province, or even more complex - show all sales for Provinces of countries who have it, or Municipalities of countries who have it.

The denormalized data warehouse structure can also be used, like you mentioned in the question. In a dimensional table, add sale person's name and assign level1, level1type, level2, level2type etc. fields. Tom's level 1 would be Chile and type would be Country. Level 2 would be Region-A and type would be Region etc. This kind of structure can be easy for data extraction if statisticians are looking to hone down on level 1 or 2 or such. We have just added to complexity of adding more columns as we discover hierarchy of different countries but the benefit is that the data is easy to visualize. A simple select statement can help us understand how Tom is functioning.

Sit down with statisticians and find out what they like to see and how they would like to use the data. Based on their ideas you might be able to build a better, perhaps a hybrid system.