When to choose an attribute vs. a new dimension

data-warehousedimensional-modelingfacttableslowly-changing-dimension

I am fully aware of what is a fact, attribute and dimension. However, I'm quite confused to which traits I should choose for dimensions vs attributes of that dimension. For instance, I'm building a hospital data warehouse and gender could be a dimension. However, it can also be an attribute of the "patient" dimension.

I am aware that ONE advantage of choosing gender as a dimension is that I can save on space (i.e. there could be millions of patients, and thus millions of attributes). But, what if gender changes? How can I track that? Same with other attributes: diagnoses can be a dimension AND it could be an attrbiute. If I choose it as a dimension, how can I track a patient's flow/change of diagnoses??

Thank you so much

Best Answer

To make gender a freestanding "dimension" is, more likely than not, an attempt to normalize your dimensional model. Unless you have attributes associated with gender then how is it a dimension? Is it possible to have a conformed gender dimension?

As for diagnosis are you modeling the process or the values? Very different requirements and issues to deal with. I would imagine a series of dimensions and facts to accurately keep track of the diagnosis process.

Disk space... considering our modern computing environment, space is your last optimization/design concern. If it truly is, you should know this for certain.

what if gender changes? How can I track that?

SCD Type 2 Dimension

how can I track a patient's flow/change of diagnoses?

Transactional or Accumulating Snapshot Fact table(s)