Normalization – Does Customer Table with Last Order ID Violate Normalization?

normalization

So I'm looking at a customers table that has a relationship with an orders table. There is a trigger in place to set the last_order_id field of the customers table whenever a order it placed. Does having a last_order_id violate normalization rules? Specifically 3rd normal form? And what would be a better methodology for this type of work?

Best Answer

No it does not violate normalization rules because NF forms are defined in terms of dependency rules that involve ONLY A SINGLE TABLE.

Obviously, it creates a form of redundancy in your database, because as has been said, the "last order" can be derived by looking at the ORDERS table, but that is a form of redundancy that the traditional normal forms do not seek to address. Traditional normal forms up to BCNF consider functional dependencies only, and since it is presumably indeed the case that your customerid will unambiguously determine the corresponding last_order (*), having the last_order in the customers table is even the only place where the NF forms will say it belongs.

(*) if it can be the case that a customer has no orders at all, then there will be nothing to put in this column. You will probably use NULL in that case, and there are those who say that this is in itself a violation of 1NF. The more traditional version of normalization theory glosses over NULL cases completely.

EDIT

I noticed I have forgotten to answer the bonus question "what would be a better methodology for this type of work". IMO, the proper way, at least theoretically, to address these kinds of situation is by using materialized views. Define the aggregate query that computes the last_order for each customer as a view, and have the DBMS keep a physical record of its value ("materialize the view"). This way, the redundancy in the design has been brought under control (it is now entirely managed by the DBMS itself). Unfortunately, few DBMS's will be able to support this for views that involve aggregation. The rulesets of what is possible and what isn't, in this area, for a given DBMS, are rather diverse and extremely ad-hoc.

The trigger you say you already have is the next best thing if materialized view isn't an option. Or remove the redundancy altogether and just recompute the last_order from the ORDERS table whenever it is needed, if that is feasible given the applicable performance requirements/expectations.