MySQL Computed Column counting other rows with attributes relative to this one

MySQLquerywindow functions

I have a table in a MySQL database that has information about passenger flights. I want to do some analysis of flight delays. I would like to add a calculated attribute to each row that indicates how many flights arriving at the current row's departure airport, by the same carrier, were delayed in the past 24 hours.

I haven't had any luck writing this query. I think that I can express my intent with:

-- Find a count of all rows
    count(*) where
-- Of the same carrier 
    a.FlightCarrier = thisRow.FlightCarrier
-- that arrived at our departure airport
    a.DestinationAirport = thisRow.OriginAirport
-- scheduled to arrive before ScheduledDepartureTimeUTC
    a.ScheduledArrivalTimeUTC < thisRow.ScheduledDepartureTimeUTC
-- scheduled to arrive after (ScheduledDepartureTimeUTC - 24h)
    a.ScheduledArrivalTimeUTC > (thisRow.ScheduledDepartureTimeUTC - 24h)
-- had a delayed arrival
    a.DelayedArrival = 1

I attempted to write this as a SQL statement:

alter table flight add column DelayedCarrierArrivals24H int as
(select count(*) from flight a where 
 a.FlightCarrier = FlightCarrier
 and a.DestinationAirport = OriginAirport
 and a.ScheduledArrivalTimeUTC < ScheduledDepartureTimeUTC
 and a.ScheduledArrivalTimeUTC > (ScheduledDepartureTimeUTC - (60 * 60 * 24))
 and a.DelayedArrival = 1) STORED;

However, MySQL balks at this with "You have an error in your SQL syntax… near 'select count(*) from flight a where a.FlightCarrier = FlightCarrier"

What's the 'right' way to create this column in MySQL? I had originally thought to use windowing functions but couldn't divine the right way to express my intent with those features, either.

Best Answer

The problem

https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html says:

Generated column expressions must adhere to the following rules. An error occurs if an expression contains disallowed constructs.

...

Subqueries, parameters, variables, stored functions, and user-defined functions are not permitted.

...

Your expression includes a subquery. Note that the error message pointed directly to the subquery: "... near 'select ..."

Generated columns are rather limited (for good reasons).

Suggestion 1

May I suggest that you do the computation when you are fetching data from the table. Consider writing a VIEW or Stored routine to hide it from the caller.

Suggestion 2

Perhaps this comes closer to what you are looking for?

SELECT  b.*, 
    (
        SELECT  count(*)
            from  flight a
            where  a.FlightCarrier = b.FlightCarrier
              and  a.DestinationAirport = b.OriginAirport
              and  a.ScheduledArrivalTimeUTC < b.ScheduledDepartureTimeUTC
              and  a.ScheduledArrivalTimeUTC >= (b.ScheduledDepartureTimeUTC - (60 * 60 * 24))
              and  a.DelayedArrival = 1 
    )
    FROM  flight AS b;

It won't be efficient due to the correlated subquery, but we can work on that. First, verify that it does what you need.

Suggestion 3

Better would be to have a column for that info, then do the computation separately as you are storing the b information.

INSERT INTO flight ( ..., DelayedCarrierArrivals24H)
    SELECT ...,    -- the values from your source
        ( SELECT ... ) AS DelayedCarrierArrivals24H;