In designing a MySQL schema, should I avoid using an enum
type altogether, as I have been reading really strong, but favored opinions on this subject, such as:
- Stackoverflow post on efficiency of enum vs a join
enum
is data, not code- Blog: Reason why MySQL's enum data type is evil
- Right time to use an
enum
Or, is there a choice to be made specific to the use case in hand? If there is, what decision framework should I use, i.e. when to choose one of the two options?
For instance:
Currency
: It's better to use a table join, since currency codes to be supported could change with time (since modification has been emphasized in many posts as anti-enum, but then what could not change with time?).PaymentStatus
: It's better to useenum
type with values like,SUCCESS
,FAILED
,CANCELLED
.TierPlans
: It's better to use a table join, as it would give the flexibility to add columns like monthlyPricing, annualPricing, discountRate, etc. in the future. (I am currently usingenum
instead, with values likeFREE
,PRO
,MASTER
; hadn't thought of pricing yet -_- ).
I am finding that if I stress enough, I could find a table join being a better choice always, mutation safe route. Is increasing number of tables or table joins a symptom of bad design? How should I approach
design in general, and design in such use cases specifically?
Note: I had asked this question on SO here. I am posting it here since I see that there is a closing request there, and I desperately need an answer to this confusing roadblock of mine. Understandably this might not be a good question to ask here too. I'd rather edit, if you may guide!
Best Answer
(Yes, there have been many heated discussions of ENUM vs TINYINT vs VARCHAR, both here and on stackoverflow.com. I have put my 2-cents in on some of them. I am addressing this Question to discuss nuances of the examples brought up.)
Currency is a tricky one. When, for example, a devaluation occurs, does the currency_code change?
If so, then is the question is about how often the list of values is likely to change.
If it does not change, then you may have a mess for the data -- Is "XYZ", the old xyz currency, or the 'new'?
With things like country_code (with well-established 2-letter codes) and currency_code (3 letter standard), etc, I recommend using things like
Rationale:
ENUM
orTINYINT
.JOIN
except when you want the extra info. (JOINing
is most costly when you need to start from multiple Dimension tables in an attempt to look up rows in the Fact table. AJOIN
is reasonably cheap going the other direction.)