Mysql – Large scale, hard to model design advice needed

database-designMySQL

I would love to hear some opinions or thoughts on a MySQL database design.

Basically, I have a tomcat server which receives different types of data from about 1000 systems out in the field. Each of these systems is unique, and will be reporting unique data.

The data sent can be categorized as frequent, and infrequent data. The infrequent data is only sent about once a day and doesn't change much; it is basically just configuration based data.

Frequent data is sent every 2-3 minutes while the system is turned on, and represents the current state of the system.

This data needs to be stored in a database for each system, and be accessible at any given time from a PHP page. Essentially for any system in the field, a PHP page needs to be able to access all the data on that client system and display it. In other words, the database needs to show the state of the system.

The information itself is all text-based, and there is a lot of it. The config data (that doesn't change much) is key-value pairs and there are currently about 100 of them.

My idea for the design was to have 100+ columns, and 1 row for each system to hold the config data. But I am worried about having that many columns, mainly because it isn't too future proof if I need to add columns in the future. I am also worried about insert speed if I do it that way. This might blow out to a 2000 row x 200 column table that gets accessed about 100 times a second so I need to cater for this in my initial design.

I am also wondering, if there are any design philosophies out there that cater for frequently- and seldom-changing data based on the engine. This would make sense as I want to keep INSERT/UPDATE time low, and I don't care too much about the SELECT time from PHP.

I would also love to know how to split up data. If frequently changing data can be categorised in a few different ways should I have a bunch of tables, representing the data and join them on selects? I am worried about this because I will probably have to make a report to show common properties between all systems (i.e. show all systems with a certain condition).

I hope I have provided enough information here for someone to point me in the right direction, any help on the matter would be great. Or if someone has done something similar and can offer advice I would be very appreciative.

Best Answer

Store the data in flat text files which are uniquely named based on each system, not in a database.

Databases (particularly relational ones) excel at storing data in tables with a consistant schema, relating those tables to one another and querying them in arbitrary ways. What you are describing is (based on my reading) basically a web page to display text files. There is only one way to access the data (via what is effectively a unique key, the system name), there don't appear to be any relationships, and the data will vary considerably from system to system (hence your issue with lots of columns).

So I don't see much benefit in using a relational database (a document database such as RavenDB, CouchDB or MongoDB may be a better fit, but even they are likely to be overkill for what you describe).

If you're really keen on using MySQL, just store the contends of the file in a table you can lookup via a name:

system_name varchar(255) NOT NULL Primary Key
file_type char(2) NOT NULL = 'F' or 'IF' (for frequent or infrequent)
data Longtext NULL

And parse your textual data in PHP (or simply display it).

You could store your data exactly as you receive it, or process it into XML, JSON or YAML to standardise it.


I just noticed the show common properties requirement.

Perhaps you could have columns in your table for those common properties, but not all the properties. You can quickly query the common properties while still being able to view the full data. (I don't have direct experience with any of the above document databases, but I believe all of them will let you index arbitrary fields in your documents, which will be much more flexible than a fixed number of columns).

If you really want to stay with MySQL, you may want to look into the Entity-Attribute-Value pattern. (IMO, EAV is an anti-pattern, but it should be OK just for the common properties).

Of course, if display time isn't really a big worry, just opening and parsing 1000 text files isn't going to take too long (I'd expect ~10 seconds).