I've got a very simple table currently containing millions of data points, each a price on a given date:
CREATE TABLE data_point (
    id INT PRIMARY KEY,
    symbol_id INT,
    x DATE,
    y DOUBLE
);
This is a time-series database, and for this release I'll keep data with a daily (or less frequent) period. But if I later need to add a time component, would it be better to change the data_point table and convert x to DATETIME?
If this did happen, all of the existing data would initially remain unchanged; only future data (and not all of it) would be saved with a time.
Or would it be better to add a second table, with a one-to-one relationship to data_point, which held the time data?
Performance is more critical than space.
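For what it's worth, the in-place conversion could look like this. This is a MySQL-style sketch using the table and column names from the question; existing rows would simply acquire a midnight time component.

```sql
-- Convert the date column in place (MySQL-style syntax).
-- Existing DATE values become midnight timestamps,
-- e.g. '2023-01-05' is read back as '2023-01-05 00:00:00'.
ALTER TABLE data_point
    MODIFY COLUMN x DATETIME;

-- Future intraday points can then carry a time of day
-- (hypothetical example values):
-- INSERT INTO data_point (id, symbol_id, x, y)
-- VALUES (1000001, 42, '2023-01-05 14:30:00', 101.25);
```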
This leads on to an associated question about this table. Which is better in terms of performance for huge amounts of data, when all the queries on data_point will be joins from symbol? (Assume an INT primary key would get filled up within the app's expected lifetime.)
CREATE TABLE symbol (
    id INT PRIMARY KEY,
    name VARCHAR(100)
);
CREATE TABLE data_point (
    id BIGINT PRIMARY KEY,
    symbol_id INT,
    x DATE,
    y DOUBLE,
    CONSTRAINT FOREIGN KEY (symbol_id)
        REFERENCES symbol(id)
);
or
CREATE TABLE data_point (
    symbol_id INT,
    x DATE,
    y DOUBLE,
    CONSTRAINT PRIMARY KEY (symbol_id, x),
    CONSTRAINT FOREIGN KEY (symbol_id)
        REFERENCES symbol(id)
);
Best Answer
When in doubt, keep it simple: just use DATETIME. I can think of two reasons to split the TIME component off into another table, but neither is commonly applicable here. If you do need to add time later, you can add a DATETIME column, populate it gradually, and then drop the old DATE field when possible.

You've stated that performance is more important than space, but I really wonder whether six bytes for DATETIME2(0..2) is going to make much difference over three bytes for DATE. You also need to value your own time, and the time of everyone who tries to understand your system in the future.

Your question about the PK really should be asked separately. However, I concur with Rick James that, in this case, you probably don't need a synthetic key (your id field); just index on symbol_id and date. Assuming you're clustering on the PK, consider putting date first, so you can easily query date ranges. If you primarily filter (not just join) by symbol_id, then put it first in the index instead. If you do both, cluster on {date, symbol_id} and add a secondary index on symbol_id.
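Putting that last piece of advice together, the schema might look like this. This is my sketch of the suggestion, assuming MySQL/InnoDB, where the primary key is the clustered index:

```sql
CREATE TABLE data_point (
    symbol_id INT NOT NULL,
    x DATE NOT NULL,
    y DOUBLE,
    -- Clustered on (x, symbol_id): date-range scans read contiguous rows.
    PRIMARY KEY (x, symbol_id),
    -- Secondary index for filtering or joining by symbol alone.
    KEY idx_symbol (symbol_id),
    CONSTRAINT fk_symbol FOREIGN KEY (symbol_id)
        REFERENCES symbol(id)
);
```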