Dimensional Modeling of a Loan Approval Process and Fact Grain(s)

data-warehousedimensional-modelingfacttable

Context:

I'm modeling for a lending company and I'm modeling our loan approval process. I have some basic dimensions relative to our business figured out:

  • Time(day)
  • Account
  • Opportunity
  • Channel

And I and getting to the modeling of fact table(s) around the loan approval and credit decisioning process. An opportunity will get one or more "Credit Reviews" which have a series of steps they go through before arriving at a "Credit Decision". My instinct is to model the fact grain at the credit review step level or maybe the credit review level (or have one fact for each grain).

I'm working with some analysts who are used to using a “canned report”, one flattened table extracted from the source system that is at the opportunity level, and finds a "best credit review", and do a lot of their reporting from that.

I'm trying to explain my credit approval fact and how it would work and they think it's "overly complicated" and seem to want a fact table that is one-to-one with the opportunity, which is a dimension. And contains information on only the "best" credit review. In my mind that seems odd. I haven't run across a fact table that is at the same granularity (one-to-one) with a dimension. Have you ever heard of this? It feels like they are just trying to reproduce what they are used to, a non-dimensionally modeled data mart table at the opportunity level.

I'm trying to explain the utility of having a fact table that is at a higher granularity than they are used to in order to support many questions about the data rather than one, but I seem to be running into a wall getting this concept across. I’m not sure who is right here.

My question is this: Is it every appropriate, say in this situation, to model a fact that has the same cardinality as a dimension (one-to-one) with a core dimension? What would you recommend for this situation?

Best Answer

Dimensions that are 1-1 with facts are not unheard of. The problem with such dimensions is their size, which won't get any smaller just because you introduce a fact table that's even more detailed. Whether you truly need a dimension at the grain of an individual opportunity is a question I'd think about.

You don't mention measures in your question at all, and they're rather important for identifying the fact tables. For instance I'd expect the "Opportunity" (though I'm accustomed to calling it a Loan application) to have the amounts requested and approved. How would you calculate the total amount approved and the number of applications approved/total in the model you're thinking of?

I find it weird that you're attemting to convince the potential users that they need more functionality than they think they do; in my experience it's usually the other way around. How much domain experience do you have? If you're sure the use-cases you've come up with are real, you should be able to find a stakeholder who agrees with you. If you can't, you probably should postpone implementation until someone needs it.

(Unless your architecture is such that adding a more detailed fact table will require unreasonably large effort if not done upfront, e.g. you're planning to ETL source data directly to the dimensional data mart, throwing away all the data that doesn't fit - but that's an argument against that architecture, not against the "Opportunity" fact table, IMO.)

An "accumulating snapshot" fact table is commonly mentioned when modelling a process in a dimensional paradigm (1, 2).