Mysql – Difference data between adjacent rows

Tags: cursors, mysql, select

I have a table (called Visits) with a ReqTime DATETIME NOT NULL column and a Duration INT UNSIGNED column.

Duration should be set to the difference in ReqTime between two adjacent rows. It is not set when new rows are inserted into the table; it is calculated afterward.

I calculate the "duration" of a row R as the interval (in seconds) between R.ReqTime and N.ReqTime, where N is the next row (the first row inserted after the current row).

Thus every row R has a Duration value, except the last inserted row, for which N is undefined.

Here is pseudocode for updating the table with the correct Duration value (where R is the current row and N is the next row, inserted later):

UPDATE Visits SET R.Duration=TIMEDIFF(N.ReqTime, R.ReqTime) WHERE R.Duration IS NULL

Should I use cursors to solve this problem? Or are MIN/MAX/ORDER BY fine?

I am not yet comfortable with cursors.

MySQL 5.

Best Answer

This SQL works fine in SQL Server (with appropriate syntax modifications):

UPDATE Visits
SET Duration = TimeDiff(
         ( SELECT ReqTime FROM Visits N WHERE N.ReqTime > R.ReqTime ORDER BY ReqTime LIMIT 1)
        ,R.ReqTime
    )
FROM Visits R ;

Unfortunately, this syntax hits limitations of MySQL: there is no UPDATE ... FROM form, and MySQL restricts how the table being updated can be referenced again in a subquery. A common workaround is to rewrite the statement with a JOIN:

UPDATE Visits R
  JOIN Visits U
    ON U.pk = ( SELECT N.pk                 -- the Primary Key of the table 
                FROM Visits N 
                WHERE N.ReqTime > R.ReqTime 
                ORDER BY N.ReqTime 
                LIMIT 1
              )
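SET R.Duration = TIMESTAMPDIFF(SECOND, R.ReqTime, U.ReqTime)   -- whole seconds, to fit the INT UNSIGNED column
WHERE R.Duration IS NULL ;                                     -- only rows not yet calculated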
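
If you want to try the "next row" lookup without a MySQL server at hand, here is a minimal sketch using Python's built-in sqlite3 module. It is not MySQL: SQLite accepts the correlated subquery directly in the UPDATE, so no JOIN workaround is needed, and it uses strftime('%s', ...) where MySQL would use TIMESTAMPDIFF. The pk column and the sample timestamps are assumptions made up for illustration.

```python
import sqlite3

# In-memory stand-in for the Visits table (pk and the sample data are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Visits ("
    " pk INTEGER PRIMARY KEY,"
    " ReqTime TEXT NOT NULL,"
    " Duration INTEGER)"
)
conn.executemany(
    "INSERT INTO Visits (ReqTime) VALUES (?)",
    [
        ("2024-01-01 10:00:00",),
        ("2024-01-01 10:00:30",),
        ("2024-01-01 10:02:00",),
    ],
)

# For each row, find the smallest later ReqTime and store the gap in seconds.
# Inside the subquery the inner table is only visible as N, so Visits.ReqTime
# refers to the row currently being updated.
conn.execute(
    """
    UPDATE Visits
    SET Duration = (
        SELECT CAST(strftime('%s', N.ReqTime) AS INTEGER)
             - CAST(strftime('%s', Visits.ReqTime) AS INTEGER)
        FROM Visits AS N
        WHERE N.ReqTime > Visits.ReqTime
        ORDER BY N.ReqTime
        LIMIT 1
    )
    WHERE Duration IS NULL
    """
)

durations = [
    row[0] for row in conn.execute("SELECT Duration FROM Visits ORDER BY ReqTime")
]
print(durations)  # [30, 90, None]
```

The last row keeps a NULL Duration because it has no later row, which matches the definition in the question.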

Related Solutions

MySQL looking up more rows than needed (indexing issue)

Your indexes are fine for the two types of queries you mentioned.

This query will be satisfied by traversing the clustered index on the primary key...

[...] WHERE participant_id = x AND question_id = y AND given_answer_id = z;

...and this one is satisfied by the index on 'question_id':

[...] WHERE question_id = x;

The output of EXPLAIN SELECT is not telling you what you think it is telling you, because the value shown in rows is an estimate of the number of rows the server will need to consider, not the actual rows it will examine. For InnoDB these are based on index statistics.

rows

The rows column indicates the number of rows MySQL believes it must examine to execute the query.

For InnoDB tables, this number is an estimate, and may not always be exact.

Source: http://dev.mysql.com/doc/refman/5.5/en/explain-output.html#explain_rows

The optimizer gathers information about different possible query plans, and chooses the one with the lowest cost. The information shown in EXPLAIN is the information the optimizer gathered about the plan it selected.

When type is ref and key is not NULL, this means that the name listed in the key column is the name of the index that the optimizer has chosen to use to find the desired rows, so your query plan looks exactly as it should.

Note that Using index in the Extra column is often misread. A lot of people assume it means an index is being used, or that no index is being used when it doesn't appear, but neither is correct. Using index describes a special case called a "covering index"; it does not indicate whether an index is being used to locate the rows of interest.

It's possible that running ANALYZE [LOCAL] TABLE would cause the numbers in rows shown by EXPLAIN to differ, but this is a simple query and selecting this index is an obvious choice for the optimizer to make, so ANALYZE TABLE is unlikely to make any actual difference in performance.

It is possible, however, that your overall performance might see some marginal improvement with an occasional OPTIMIZE [LOCAL] TABLE, because you are not inserting rows in primary key order (as would be the case with an auto_increment primary key). On large tables this can be time-consuming because it rebuilds a new copy of the table, and, again, I wouldn't expect any significant change.

Sql-server – How to reduce the run time of stored procedure by getting rid of cursors

You want to increase the speed of your cursors? Wrap them in a transaction. If you are processing millions of records and don't want or need them all in one transaction, you can commit periodically to reduce resource usage.

I did this with a cursor that took an hour to run (this is an extreme case) and afterward it ran in 1 1/2 minutes.
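
To illustrate the idea, here is a rough sketch of a row-by-row loop wrapped in explicit transactions with a periodic commit. It uses Python's built-in sqlite3 module as a stand-in, not SQL Server or a T-SQL cursor; the items table, the doubled column, and BATCH_SIZE are all hypothetical names chosen for the example.

```python
import sqlite3

BATCH_SIZE = 1000  # hypothetical batch size; tune it for your workload

# isolation_level=None leaves transaction control to explicit BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, value INTEGER, doubled INTEGER)"
)

# Load some sample rows inside a single transaction.
conn.execute("BEGIN")
conn.executemany("INSERT INTO items (value) VALUES (?)", [(i,) for i in range(2500)])
conn.execute("COMMIT")

rows = conn.execute("SELECT id, value FROM items ORDER BY id").fetchall()

# Row-by-row processing, as a cursor loop would do it, but grouped into
# transactions that are committed every BATCH_SIZE rows.
commits = 0
conn.execute("BEGIN")
for n, (row_id, value) in enumerate(rows, start=1):
    conn.execute("UPDATE items SET doubled = ? WHERE id = ?", (value * 2, row_id))
    if n % BATCH_SIZE == 0:
        conn.execute("COMMIT")
        commits += 1
        conn.execute("BEGIN")
conn.execute("COMMIT")  # commit the final, possibly partial, batch
commits += 1

updated = conn.execute(
    "SELECT COUNT(*) FROM items WHERE doubled = value * 2"
).fetchone()[0]
print(updated, commits)  # 2500 3
```

Without the explicit BEGIN, each UPDATE would be committed on its own, which is typically where the per-row overhead comes from.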

I know that does not answer your question, but it might help you avoid doing the conversion until you're more familiar with SQL. The answer to your question is... experience. There is no magic site or book you can read; it just takes time learning to do more and more in a single statement.
