Non numeric attributes in fact table (to track data source)

data-warehousefacttablestar-schema

I am creating a data warehouse using star schemas, following the Kimball methodology. One of the rules is only put numeric values in fact tables.

I am considering putting an "id" in a fact table, in order to track the source of the data that the fact was inserted from. Users may upload data in a variety of ways, I am required to support a use case where a user has made a mistake in their data upload and wishes to redact it. I am considering therefore putting job and row ids on all uploaded data and keeping the raw data uploaded (prior to ETL from a staging area). This way the users will be able to see a log of uploads, and can redact them and replace with correct data if need be.

Is it really bad to put these ids in the fact table? An alternative would be to put them in a special tracking dimension. However, this dimension will have one row per fact table row, so does it not make sense to put them in the fact table?

Best Answer

I will typically have my business key in the fact table so I can easily track back to the source system for any questions. I usually will put a unique constraint on it to ensure the granularity of the fact table is the same as the business key.

Related Question