MySQL – Best way to store different types of data in a single column, which should be query friendly

Tags: database-design, multi-tenant, MySQL, performance, query-performance

I'm working on a project in which users can create their own web services from a UI, simply by selecting a number of fields (max 200) and defining a datatype for each field.
I'm considering the following table structure for this:

ServiceMaster
+-----------+-------------+-------------+
| ServiceId | ServiceName | FieldsCount |
+-----------+-------------+-------------+
| 1         | Service1    | 2           |
+-----------+-------------+-------------+
| 2         | Service3    | 3           |
+-----------+-------------+-------------+

ServiceDetails
+------------------+-----------+--------+---------+-------+-------------+
| ServiceDetailsId | ServiceId | Field1 | Field2  | Field3| Field...200 |
+------------------+-----------+--------+---------+-------+-------------+
| 1                | 1         | 5      | Active  | NULL  | NULL ...    |
+------------------+-----------+--------+---------+-------+-------------+
| 2                | 2         | High   | 9.0     | 7     | NULL ...    |
+------------------+-----------+--------+---------+-------+-------------+
| 3                | 1         | 2      | Running | 7     | NULL ...    |
+------------------+-----------+--------+---------+-------+-------------+

I'm not showing the DataTypeDetails table here, to keep the example clean and simple, but I do have a table that holds the datatype of each field.
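To make the structure concrete, a minimal DDL sketch of the two tables described above (column types and sizes are my assumptions for illustration):

```sql
-- Sketch of the proposed schema. All FieldN columns are VARCHAR; the
-- separate DataTypeDetails table (not shown) records each field's
-- declared type so the application knows how to interpret the strings.
CREATE TABLE ServiceMaster (
    ServiceId   INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ServiceName VARCHAR(100) NOT NULL,
    FieldsCount TINYINT UNSIGNED NOT NULL   -- max 200 fields
);

CREATE TABLE ServiceDetails (
    ServiceDetailsId BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ServiceId        INT UNSIGNED NOT NULL,
    Field1           VARCHAR(255),
    Field2           VARCHAR(255),
    Field3           VARCHAR(255),
    -- ... up to Field200
    FOREIGN KEY (ServiceId) REFERENCES ServiceMaster (ServiceId)
);
```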
As the structure above shows, I could define VARCHAR (or any string datatype) for all fields, but then, when applying filters to the table, I would have to CAST each field's data back to its original datatype, e.g. 9.0 (float) or 7 (int).
I think typecasting every column in the filters will slow queries down considerably. This table may hold billions of records or more.
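The kind of filter I'm worried about looks like this (column and value are hypothetical):

```sql
-- Field2 is a VARCHAR that actually holds floats for ServiceId 2, so
-- the cast runs on every row and defeats any index on Field2.
SELECT *
FROM   ServiceDetails
WHERE  ServiceId = 2
  AND  CAST(Field2 AS DECIMAL(10,2)) > 8.5;
```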

So is there any alternative way to achieve the same thing with better performance?

Best Answer

This is basically a question about a multitenant application; it can also be thought of as a particular case of an Entity Attribute Value (EAV) scenario.

There are several approaches you can take, beyond your current one:

  1. Have one ServiceDetails table per user/service (per tenant), with fields defined according to that tenant's needs. Your queries will use ServiceDetails1 ... ServiceDetailsN tables instead of just one. Instead of keeping your DataTypeDetails table, you can read the database metadata (information_schema) when needed. This is the most query-friendly way of doing things.
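A sketch of this per-tenant approach (table and column names are assumptions, modelled on the service with ServiceId 2 in the question):

```sql
-- One table per service, with columns typed exactly as that tenant
-- defined them; no casting needed in filters.
CREATE TABLE ServiceDetails2 (
    ServiceDetailsId BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    Priority VARCHAR(10),     -- e.g. 'High'
    Score    DECIMAL(4,1),    -- e.g. 9.0
    Quantity INT              -- e.g. 7
);

-- The field definitions can then be read from the database metadata
-- instead of a hand-maintained DataTypeDetails table:
SELECT COLUMN_NAME, DATA_TYPE
FROM   information_schema.COLUMNS
WHERE  TABLE_SCHEMA = DATABASE()
  AND  TABLE_NAME   = 'ServiceDetails2';
```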

  2. Instead of having field1, field2, ..., field200, have a much larger collection of typed fields: field1_text, field1_integer, field1_float, field1_timestamp, field2_..., and use the variant of each column that has the proper type for your specific case. You will have an enormous number of NULLs, and probably a lot of indexes. This uses only one (very large and sparse) table. It is not normally the best approachch, but it is a possibility.
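This typed-variant layout might look as follows (a sketch only; for any given service, just one variant per field is ever non-NULL):

```sql
-- One typed column per field and per supported type; the application
-- writes to, and filters on, the variant matching the field's datatype.
CREATE TABLE ServiceDetails (
    ServiceDetailsId BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ServiceId        INT UNSIGNED NOT NULL,
    Field1_text      VARCHAR(255),
    Field1_integer   BIGINT,
    Field1_float     DOUBLE,
    Field1_timestamp DATETIME,
    Field2_text      VARCHAR(255),
    Field2_integer   BIGINT,
    Field2_float     DOUBLE,
    Field2_timestamp DATETIME
    -- ... and so on, up to Field200_*
);

-- Filters then hit a correctly typed (and indexable) column directly:
SELECT * FROM ServiceDetails
WHERE  ServiceId = 2 AND Field2_float > 8.5;
```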

  3. Use a database that allows more flexible structures (such as JSON) for variable schemas. PostgreSQL's JSONB datatype, which can be indexed, would be a good fit for this scenario, even though you would still need to cast types.
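Since the question is about MySQL: MySQL 5.7+ also has a native JSON type, so a rough sketch of this option in MySQL itself (names and the indexed field are assumptions):

```sql
-- All per-service fields live in one JSON document per row.
CREATE TABLE ServiceDetails (
    ServiceDetailsId BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    ServiceId        INT UNSIGNED NOT NULL,
    Fields           JSON
);

INSERT INTO ServiceDetails (ServiceId, Fields)
VALUES (2, '{"field1": "High", "field2": 9.0, "field3": 7}');

-- Extracting a value still involves a cast, but a generated column
-- over the JSON path can be indexed for frequently filtered fields:
ALTER TABLE ServiceDetails
  ADD COLUMN field2_val DECIMAL(10,2)
      AS (CAST(Fields->>'$.field2' AS DECIMAL(10,2))) STORED,
  ADD INDEX idx_field2 (field2_val);
```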
