MySQL – best practice for recurring calendar events

MySQL, schema

I'm a PHP programmer, not a DBA, so I'm not an SQL pro. I have searched both the Programmers and DBA Stack Exchange sites, but the answers are either too specific or inconclusive.

My question is very simple: I need to create a calendar. Just a calendar, like Google Calendar, with normal and recurring events.

To summarize what I have learned in my research about recurring events, there appear to be 2 main schools:

1. Create a "parent" event and an occurrences table with one row for each repetition of the event.
2. Create an events table (1 row per recurring event) and a second table to store repetition patterns (generally 1 or 2 rows per event).

There are good and bad things in each of the 2 solutions:

1. Easy to query, easy to create exceptions (like changing only 1 occurrence in the series), easy to store complex patterns, BUT it creates a lot of rows, is hard to export (e.g. to iCal), and can easily cause data integrity issues.
2. Easy to store complex patterns, clean, easy to export, BUT difficult to query.

When I say hard or easy to query, I mean querying for a specific time range. For example, given 2 timestamps, I want to query all events (or occurrences) for this month.

The complexity of the recurring events should also be very granular. For example, repeat every 2 months on the 2nd Saturday, or repeat every 2 weeks on Friday and Monday, or repeat every Friday to Monday of the first week of every other month.

To summarize, I am building something very basic here, and I'm sure there is a well-known DB schema for it. Let's call it a best practice.

So, what is the best schema for storing those events?

BTW: I use MySQL.

Best Answer

Separate the data from the processing. A database contains data; application code does the processing.

In your situation there are effectively two sets of data:

Raw data (especially for recurring events, such as "4pm every 3rd Tuesday of the month until next June")
Calendar events, such as the dozens of entries that the above item implies.

Consider starting with two tables, one for each of those. Then write the messy code that translates the definition of a recurring event into the individual entries. (Note: That is code, not SQL.)
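
As a rough illustration of that translation step, here is a minimal Python sketch (the same idea ports directly to PHP). It handles only one kind of rule, "the n-th given weekday of every month until some date", and the function names and parameters are made up for this example; a real calendar needs many more rule types.

```python
from datetime import date, datetime, time, timedelta


def nth_weekday_of_month(year, month, weekday, n):
    """Return the n-th given weekday (Mon=0 .. Sun=6) of a month, or None if it does not exist."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7      # days from the 1st to the first such weekday
    day = first + timedelta(days=offset + 7 * (n - 1))
    return day if day.month == month else None


def expand_monthly_nth_weekday(start, until, weekday, n, at):
    """Yield one datetime per month for the rule, within [start, until] inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (until.year, until.month):
        day = nth_weekday_of_month(year, month, weekday, n)
        if day is not None and start <= day <= until:
            yield datetime.combine(day, at)
        month += 1
        if month > 12:
            year, month = year + 1, 1


# "4pm every 3rd Tuesday of the month", January through June 2024 (dates chosen for illustration)
occurrences = list(expand_monthly_nth_weekday(
    start=date(2024, 1, 1), until=date(2024, 6, 30),
    weekday=1, n=3, at=time(16, 0)))
```

Each value in `occurrences` would become one row in the second table.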

Obviously, if there is the option to change a recurring event, there needs to be a way to make potentially massive changes to the second table. This implies an id for the recurring event that is stored in the second table. A FOREIGN KEY might be the way to go. (I prefer to do such myself.)
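
One way to picture that "massive change" is to delete every generated row for the rule and insert the regenerated ones in a single transaction. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (recurrence_rules, occurrences), the column names, and the helper functions are all assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute(
    "CREATE TABLE recurrence_rules (id INTEGER PRIMARY KEY, definition TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE occurrences (
        id        INTEGER PRIMARY KEY,
        rule_id   INTEGER NOT NULL REFERENCES recurrence_rules (id),
        starts_at TEXT NOT NULL
    )
""")
conn.execute("CREATE INDEX occ_rule ON occurrences (rule_id)")

conn.executemany("INSERT INTO recurrence_rules (id, definition) VALUES (?, ?)",
                 [(1, "3rd Tuesday, 16:00"), (2, "one-off meeting")])
conn.executemany("INSERT INTO occurrences (rule_id, starts_at) VALUES (?, ?)",
                 [(1, "2024-01-16 16:00:00"), (1, "2024-02-20 16:00:00"),
                  (2, "2024-01-10 09:00:00")])
conn.commit()


def replace_occurrences(conn, rule_id, new_starts):
    """Swap all generated rows of one rule for a freshly expanded set, atomically."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM occurrences WHERE rule_id = ?", (rule_id,))
        conn.executemany(
            "INSERT INTO occurrences (rule_id, starts_at) VALUES (?, ?)",
            [(rule_id, s) for s in new_starts])


def starts_for(conn, rule_id):
    rows = conn.execute(
        "SELECT starts_at FROM occurrences WHERE rule_id = ? ORDER BY starts_at",
        (rule_id,))
    return [r[0] for r in rows]


# The rule moved from 16:00 to 17:00, so regenerate its rows.
replace_occurrences(conn, 1, ["2024-01-16 17:00:00", "2024-02-20 17:00:00"])
```

Rows belonging to other rules are untouched, which is the point of storing the rule id in the second table.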

There are probably other tables, such as for 'normalizing' users, etc. (Do not normalize dates or datetimes.)

"What's scheduled for this afternoon?" That turns into a range query.

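A sketch of such a range query, again with sqlite3 standing in for MySQL and with a hypothetical occurrences table (in MySQL the two time columns would be DATETIME). The usual overlap test is "starts before the range ends AND ends after the range starts":

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE occurrences (
        id        INTEGER PRIMARY KEY,
        event_id  INTEGER NOT NULL,
        starts_at TEXT NOT NULL,   -- ISO 'YYYY-MM-DD HH:MM:SS' strings sort chronologically
        ends_at   TEXT NOT NULL
    )
""")
conn.execute("CREATE INDEX occ_start ON occurrences (starts_at)")
conn.executemany(
    "INSERT INTO occurrences (event_id, starts_at, ends_at) VALUES (?, ?, ?)",
    [
        (1, "2024-01-16 09:00:00", "2024-01-16 10:00:00"),  # morning only
        (2, "2024-01-16 11:30:00", "2024-01-16 12:30:00"),  # runs into the afternoon
        (3, "2024-01-16 16:00:00", "2024-01-16 17:00:00"),  # afternoon
        (4, "2024-01-17 13:00:00", "2024-01-17 14:00:00"),  # next day
    ],
)


def occurrences_between(conn, range_start, range_end):
    """Return event ids of all occurrences that overlap [range_start, range_end)."""
    rows = conn.execute(
        "SELECT event_id FROM occurrences"
        " WHERE starts_at < ? AND ends_at > ?"
        " ORDER BY starts_at",
        (range_end, range_start))
    return [r[0] for r in rows]


afternoon = occurrences_between(conn, "2024-01-16 12:00:00", "2024-01-16 18:00:00")
# afternoon == [2, 3]
```
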
Hint: If something is ongoing (e.g., vacationing for 2 weeks), break it into multiple daily events (14 of them, midnight to midnight) for the second table. Otherwise the SQL can get messy and inefficient.
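
The splitting itself is a few lines of application code. A minimal sketch, with a made-up function name and example dates; the first and last pieces are allowed to be partial days:

```python
from datetime import datetime, time, timedelta


def split_into_daily_pieces(start, end):
    """Cut [start, end) at every midnight and return the pieces as (start, end) pairs."""
    pieces = []
    cursor = start
    while cursor < end:
        next_midnight = datetime.combine(cursor.date() + timedelta(days=1), time.min)
        piece_end = min(next_midnight, end)
        pieces.append((cursor, piece_end))
        cursor = piece_end
    return pieces


# A two-week vacation becomes 14 midnight-to-midnight rows.
vacation = split_into_daily_pieces(datetime(2024, 7, 1), datetime(2024, 7, 15))
```

With one row per day, "what is happening on July 5?" only ever has to look at rows that start on July 5.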

How does all that feel?

Related Solutions

Mysql – Query for data that is not there

Why are you determining a member's absence through another member's presence?

I guess there must be an events table in your schema. And I would expect it to have a column called something like event_date. It's the event_date column that you should use to determine which events a member could visit (members.registration_date < events.event_date).

After getting the list of members and their respective valid events, you can anti-join it to the attendance table to see which member missed which (or how many) events.

Here's an example query illustrating what I just said. It implements the anti-join using LEFT JOIN + WHERE IS NULL check:

SELECT
  m.member_id,
  e.event_id
FROM       members    m
INNER JOIN events     e ON m.registration_date < e.event_date
LEFT  JOIN attendance a ON m.member_id = a.member_id
                       AND e.event_id  = a.event_id
WHERE a.member_id IS NULL
  AND ... /* some other conditions to filter members and/or events, if necessary */

Or here's another example showing how you could count total events available to a member and those actually attended by him/her:

SELECT
  m.member_id,
  COUNT(e.event_id) AS total_events,
  COUNT(a.event_id) AS attended_events
FROM       members    m
INNER JOIN events     e ON m.registration_date < e.event_date
LEFT  JOIN attendance a ON m.member_id = a.member_id
                       AND e.event_id  = a.event_id
WHERE /* some conditions as necessary */
GROUP BY m.member_id

Note that in the last example there's no IS NULL check. It's because that query needs to get all the events per member. Missed events are just not counted by the COUNT() function (because the corresponding a.event_id value contains NULL in those cases).

UPDATE

Since, as per your comment, the events are recurring (and thus have no fixed dates), I would suggest adding and using an event_calendar table, populated beforehand either manually or programmatically. Naturally, the events table in the above examples would be replaced with event_calendar.

Alternatively, if maintaining an event calendar is not an option, you could replace events in your queries with

(
  SELECT DISTINCT
    event_id,
    attendance_date AS event_date
    /* or, if they are not purely dates:
    CAST(attendance_date AS date) AS event_date
    */
  FROM attendance
) e

(basically, @ypercube's suggestion with dates thrown in).

MySQL schema for recurring events – Query by specific date

Let's say you made the table like this

CREATE TABLE TrainEvents
(
    id INT NOT NULL AUTO_INCREMENT,
    station_arr_id INT NOT NULL,
    station_dep_id INT NOT NULL,
    arr_dt DATETIME NOT NULL,
    dep_dt DATETIME NOT NULL,

    ...
    PRIMARY KEY (id),
    KEY station_arr_index (station_arr_id,arr_dt),
    KEY station_dep_index (station_dep_id,dep_dt)
);

Next, you insert into the table the event

Arriving From Station 21 on April 30, 2014
Arrived at Station 29 at 10:00 AM on April 30, 2014
Departing From Station 29 1 minute later

Here is that INSERT query

INSERT INTO TrainEvents (station_arr_id,arr_dt,station_dep_id,dep_dt)
VALUES (21,'2014-04-30 10:00:00',29,'2014-04-30 10:01:00');

OK, now let's see how you can query the following departure

All Departures 10:00 AM on April 30, 2014 from Station 29
Check Departures 15 minutes before and after

Here is that SELECT query

SET @MinuteWindow = 15;
SET @DepartureDT = '2014-04-30 10:00:00';
SET @DepartureDTMinBef = @DepartureDT - INTERVAL @MinuteWindow MINUTE;
SET @DepartureDTMinAft = @DepartureDT + INTERVAL @MinuteWindow MINUTE;
SELECT * FROM TrainEvents
WHERE station_dep_id = 29
AND dep_dt >= @DepartureDTMinBef
AND dep_dt <= @DepartureDTMinAft;

EPILOGUE

Given your initial idea, you could change the station_arr_id and station_dep_id into a single route_id and store the routes in a route table

CREATE TABLE TrainRoutes
(
    station_id INT NOT NULL AUTO_INCREMENT,
    station_name VARCHAR(128) NOT NULL,

    ...
    PRIMARY KEY (station_id),
    KEY station_name_ndx (station_name)
);
CREATE TABLE TrainEvents
(
    id INT NOT NULL AUTO_INCREMENT,
    route_id INT NOT NULL,
    arr_dt DATETIME NOT NULL,
    dep_dt DATETIME NOT NULL,

    ...
    PRIMARY KEY (id),
    KEY arr_index (route_id,arr_dt),
    KEY dep_index (route_id,dep_dt)
);

You could also make the event record arrivals and departures separately.

CREATE TABLE TrainRoutes
(
    station_id INT NOT NULL AUTO_INCREMENT,
    station_name VARCHAR(128) NOT NULL,

    ...
    PRIMARY KEY (station_id),
    KEY station_name_ndx (station_name)
);
CREATE TABLE TrainEvents
(
    id INT NOT NULL AUTO_INCREMENT,
    route_id INT NOT NULL,
    event_dt DATETIME NOT NULL,
    event_type TINYINT NOT NULL, # 1 for arrival, 2 for departure

    ...
    PRIMARY KEY (id),
    KEY event_index1 (route_id,event_dt,event_type),
    KEY event_index2 (route_id,event_type,event_dt)
);

I am just giving ideas; I'll leave it to you to implement. For some UI ideas, see the site I use to commute: http://as0.mta.info/mnr/schedules/sched_form.cfm . The results for the route you pick on that site show departures within a 5-hour window (2.5 hours before and 2.5 hours after).
