MySQL Database Design – Tracking Patient Stats Over Time

database-designinnodbMySQLperformanceschema

I want to create a database which allows to record patient reading data points like: weight, height, pressure, etc., over time.

I found a database schema like the one discussed in this Q & A which is looks similar to my needs. I'm wondering if such design offers a good performance for a large number of users?

For example, if we're planning to track 10 parameters for 10 000 users at least 20 times a year – it's around 2 000 000 rows/per year.

Available engine – InnodDB, MyISAM.

DB healthstats
TABLE user
    memberID (int*, auto increment, unsigned, primary key)
    name (varchar, 50)
    gender (char, 1)
    birthdate (date)

TABLE reading
    readingID (int*, auto increment, unsigned, primary key)
    memberID (int*, FK: TABLE user)
    date (datetime)

TABLE stat
    statID (int*, auto increment, unsigned, primary key)
    readingID (int*, FK: TABLE reading)
    type (varchar, 3)*
    value (decimal 4.1)*

Best Answer

  • Does readingID need to be ZEROFILL?
  • Does birthdate need to be DATETIME?
  • Use InnoDB.
  • Do not use EAV! 10 columns of stats is no big deal, and you can use appropriate datatypes for each column.

That is, have only 2 tables -- One for Users, one for Readings.

If you graph things, please use dates for the x-axis. I saw one clinic that evenly spaced the readings across the x-axis. That's just plain bad.

For weight you could pick FLOAT or DECIMAL(4,1) or SMALLINT UNSIGNED (4/3/2 bytes, sufficient for pounds or kg -- but be consistent on what you store)

For Gender, is using CHAR, do CHAR(1) CHARACTER SET ascii and allow more than the usual choices (these days!).

If you want to disallow two sets of measurements in a single day, get rid of readingID and have PRIMARY KEY(memberID, date). This will have the side effect of making looking up all the info about a person more efficient.

2M rows/year is "medium sized"; not a problem. By getting rid of stat, you are down to 200K rows/year, and the entire dataset will be closer to "tiny". The database would be perhaps 8MB/year. (Versus perhaps 100MB/year for EAV with stats.)

The insert rate would be 200K/year. MySQL can easily handle 200K/day, so the server would be very idle.

With my suggestions, should able to handle even a million rows without further tweaking. (With EAV, you would be struggling long before that.)

Why or why not EAV

EAV shines when there is a lot of variation in attributes. Your application has a fixed set of attributes -- Everyone has a pulse rate; at worst you might store NULL if that test was not performed.

If, on the other hand, you choose to store the hundred(?) readings you can get from blood samples, ... Usually blood work focuses on a small number of possible tests. That is, the data is very sparse.

For blood work, I would recommend having a column of JSON text. (Depending on whether you are using a recent MySQL or MariaDB, there might be a JSON datatype, or just a TEXT datatype; the effect is similar.) In this column you would list only the readings that you have information for. Meanwhile, the 'regular' readings (pulse, temp, etc) would have their own columns. You might have a 'real' column that says "there is some bloodwork reading that is out-of-range" as a clue to look into it.

Yes, JSON has hassles when rummaging inside it. But I claim that EAV is even worse -- for size, for speed, for coding.

In your Comment you mentioned adding 10-20 new readings. The necessary ALTER TABLE .. ADD COLUMN .. NULL, ADD COLUMN .. NULL; would be a one-time nuisance. But I still prefer it to either EAV or JSON.

Another thing to note: INDEXes are not always useful. When you have values that are mostly missing (columns, EAV, or JSON), they may be costly to search. If you would like to show us some of the queries, we can discuss further.

For the simple query of "SELECT * FROM Readings WHERE userID = 123" to build graphs, then the optimal schema is a single Readings table with or without JSON, and without EAV. And I gave the optimal PRIMARY KEY.