mysql – Using JSON to Store Multiple Attributes in MySQL

database-designjsonMySQL

Typically we have the table designed like this

UserId  FirstName LastName EmailId  MobileNumber Age Sex City

Where UserId is going to be the primary key

Is it good to have multiple columns combined into one single column?
i.e what if i have the above table structure changed to one below

UserId  EmailId UserDetails

Where the UserDetails will contain all the user details in a json format.

When we are querying, we can bring the data and build the user object from json string and get the desired results using Linq to Sql queries

What are the problems we might face if we follow this approach?

Best Answer

This way if you need to filter/order by any part of UserDetails, you cannot use index.

So you can fetch ALL rows to your app and do it there - that means:

  • reading everything in sql (slow disks)
  • moving it to app (slow network)
  • manage in app (lot of memory)

Or you can "walk" all rows in sql using LIKE or something. That will still mean reading entire table but at least it will give you only somehow relevant rows where there is nontrivial chance that it will really match after "deserializing" the JSON so you skip latter two of previous three points.

Linq2Sql is just that - it normally builds queries (hopefully quite efficient ones, but you can sure break that). You probably meant Linq to objects/collections.

Those stand if you need to search "all" (means big enough subset) users by Age/Name etc. In that case having the data in separate columns (and properly indexed) is really needed for anything reasonable (if you will never have more than few hundreds of users, you can probably do it any way and not notice anything, but if you get to tens of thousands or even millions, it will really start to matter).

But - if you know that even with millions of users you will always need to search only specific userId/emailId, or small set of them (say one specific school class in a table of all students of US) and never to it without those limits, then JSON won't hurt you and table will probably be smaller without those indexes you are not going to use.

Same if you never search by those at all, but that does not seem probable (even if you do not now, you probably will in the future - if management asks you for statistics of first names per city, you have a problem with JSON).