Mysql – Best practice for when the ideal number of columns is unknown

database-designMySQL

I'm currently making a MySQL database to list products for an e-commerce site. Each product will have several sizes available which will each have their own dimensions and measurements (e.g. length, width, thickness) and pricing. However, different products, even within the same category, have different numbers of dimensions (some will have 2 widths at different sections, etc.) and there isn't necessarily a maximum number of measurements required to uniquely ID a product size. The options I can see are:

Create a table every time a product is created to account for how many size measurements it needs. Obviously a poor way to handle this.
Create a table with an arbitrarily large number of "measurement" columns. Leaves a lot of nulls if it's too large, and requires that I manually edit the table structure later if it's too small.
Create a table with two CSV varchars: one to identify pseudo-column names, the other for values. Sorting and searching by any value but the first would be largely impossible to my knowledge, but the information is all there.
Create two tables: one for the individual product size code (linking to the product of course), and another with some kind of "key-value" setup: one column for the product code, another for the name of the measurement (e.g. length), and a third for the value of that measurement. Necessitates more joins to get the same information as the other solutions, plus an extra column for an "index" if I want to keep sizing attributes in a particular order (or the size key could be an enum, I suppose).

The last solution seems most reasonable, but is there a better way to go about this?

Best Answer

Will you ever search by those parameters accross different products? It does not seem very probable to me.

If you do not need to manipulate those in sql, only retrieve them for any given product, then store it as xml/json/anything in one blob column (or dynamic column on MariaDB). You then can get one product with all variants and maybe "filter" them by user input in application code (there won't probably be hundreds of them).

But if you have to search/aggregate etc, then do the second table for them and use join or another query to retrieve them as needed. Join is fast, it is a core function of relational DB.

Related Solutions

Database design for part-number generator

First, I don't see any problem with doing this in the db. In fact we are moving this direction with LedgerSMB. The big caution is that you want to be prepared for business rule changes. Of course your old partnumbers will be stable. but SKU's are stable for a long time usually unless you are Frito Lay and then it just makes trouble for everyone else (they change UPC's it seems every time a word on their packaging changes).

What we've done in LedgerSMB is to essentially store a template, and then we preprocess and increment. That isn't quite what you are looking at doing, but it may help to see how we have looked at this in the past and where we are going.

Our templates look something like:

<?lsmb field_name ?>somechars<?lsmb another_field?>0001102

Leading zeros are preserved on increment. Inside the db, we have a stored procedure which increments the number, keeping leading zeros and passes the rest up to the application. I don't see how we can do full preprocessing of this sort using invoice numbers and the like but for partnumbers the goal eventually is to move all that preprocessing into the database.

Now in your case, you probably want to have the parts tied together for easier management. I understand the arguments against this. I would suggest putting all these fields you want to aggregate in the table representing the parts. Then you can integrate these with a user defined function or something (that's how I'd do it in PostgreSQL) which you can use to populate a missing part number.

In this case your table structure will look very different:

product_group(id-- 0000-9999, etc)
fabric (id ....)
etc....
part (partnumber, productgroup_id, fabric_id, etc)

I don't see another sane way of doing it. You are going to have to track each of these things you want to use separately, put them in the parts table, and then generate your partnumber based on that.

Mysql – Database design for an E-commerce website

If you know most of the attributes your products will have ahead of time, you could hard-code them into the tables for different product types, such as:


books
-----
  id
  price
  title
  author
  isbn
  etc....

mobile_device
-------------
  id
  price
  size
  colour
  model
  etc...

You could also try something like this:


base_product
------------
  id
  base_price
  product_type

book_product_attributes
-----------------------
  base_product_id (FK to base_product)
  title
  author

mobile_dev_product_attributes
-----------------------------
  base_product_id (FK to base_product)
  model
  colour

This way, you can group your common attributes in a base product table, and more specific attributes are stored in other tables when necessary.

This will work if your product types are mostly static. On the other hand, if you have no idea what product types or attributes you'll need in the future, and entity-attribute-value based system might work better. The idea behind that model is that you have a table for the different attributes you might have, a table for the different entities in your system, and a table for the attribute values that an entity has. Example:

entities
--------
  id

attributes
----------
  id
  name

values
------
  id
  entity_id
  attribute_id
  value

The data might look like this:


entities
ID
----
  1

attributes
ID |  name   
----------
1  |  title
2  |  author


values
ID  | attribute_id   | entity_id   | value
---------------------------------------------------------
1   | 1              | 1           | "Great Expectations"
2   | 2              | 1           | "Charles Dickens"

This data very briefly describes a product which has two attributes: title, and author, with the valus "Great Expectations" and "Charles Dickens", respectively.

Be aware that using an EAV database can make queries very awkward. With enough data and the wrong model design, it's also possible to run into performance problems. The example I gave was very simple but more realistic designs tend to be more complicated than that. It can take time to get this kind of database correct and it's not always the best solution.

Best Answer

Related Solutions

Database design for part-number generator

Mysql – Database design for an E-commerce website

Related Question