Mysql – Is a 5+ column primary key bad for large (100 million+) table

database-designMySQLprimary-key

I was reading about some real life DB issues, and one project had a 100 million row plus table that had 5 columns as its primary. I'm thinking this is bad, but can anyone tell me exactly why?

The table was kind of a micro rollup/aggregation table, so the 5 columns were like (day, market_id, product_id…). At first I thought that a 5 column primary key wasn't ideal, but the more I thought, I couldn't really come up with a good reason why it was bad.

This was in a late night discussion with half the company engineers. Someone just mentioned this was a bad design, one senior engineer agreed, but no one really jumped in as to why. Thus trying to research the matter for myself!

Best Answer

There are performance issues with very complex primary keys. And It may not be defending against duplication as well as a simpler primary key might.

However, there is one design pattern that frequently yields tables with a primary key made up of six or so components. It's star schema fact tables. If the fact table of a star schema has six dimensions, then the primary key will have six components. I've never seen a fact table with no declared primary key, and I think it's well worth the overhead, even though the ETL process still has to be quite carefully written.

Some reporting databases imitate the pattern of star schema even if it's not explicitly designed that way.

100 million + rows is not overly big for a fact table, especially with today's big data.