Sql-server – Purpose of Data Warehousing

data-warehousesql server

I have been reading up on Data Warehousing and its purpose. The main purpose seems to be to track historical data and whatever changes may have occurred over time.

The DW book I am reading gives a number of reasons for using a DW schema…

1) Normalized schema may have 1000's of tables

2) Naming conventions may not be enforced

3) Data may be in multiple databases

4) Data quality issues

5) Historical data requirements

I am now working on a project where it has been suggested to me to use DW schema. However, none of the above 5 reasons apply to my situation.

1) Data is in region of 10 tables

2) Columns are named correctly
3) All data is in a single database

4) No reported Data Quality issues

5) No requirements for historical data

So, is tracking historical data the main reason to use DW?

Best Answer

As a BI consultant, my view on datawarehousing is that it provides (primarily) non-technical users with an easily accessible set of facts and dimensions.

Often, you'll see the following features in a data warehouse:

  • denormalized dimensions,
  • simplified fact tables and measures,
  • pre-aggregated balances,
  • pre-accumulated amounts over some time dimension,
  • often with some business logic (distributions, calculations, statistics) already applied to the data,
  • dimension data organized as a hierarchy where originally wasn't,
  • etc

All of this serves to simplify the understanding of, and accessibility to, the business data for people who aren't too comfortable writing SQL queries or using pivot table tools. Some of the greatest pitfalls in data warehousing and BI involve users not understanding the structure or meaning of the data, producing incorrect results and subsequently not trusting the BI solution.