DB2 Select Statement – How to Select Every Week’s Log Record


Here is a simplified version of the table in question:

+--------------+-----------------+---------------------+
| DatabaseName | LogRecordNumber | LogTimestamp        |
+--------------+-----------------+---------------------+
| DB1          | 47378910        | 2014-11-12 10:40:00 |
| DB1          | 48268418        | 2014-11-12 10:41:00 |
| DB2          | 223781480       | 2014-11-12 10:40:00 |
| DB2          | 228976522       | 2014-11-12 10:41:00 |
+--------------+-----------------+---------------------+

We have a table that captures the log record number for each database every minute. The data goes back a full year, or less if the database is newer than that.

We have to watch the log record number to make sure it never exceeds the maximum value its bits can hold. Currently I'm tracking this with a simple query and adding the results to a spreadsheet.

Sample Query:

SELECT DatabaseName, MAX(LogRecordNumber), LogTimestamp
FROM MyDB.MyTable
WHERE LogTimestamp BETWEEN '2014-11-12-04...' AND '2014-11-12-08...'
GROUP BY DatabaseName, LogTimestamp

I would instead like to develop a report that a technician could run to determine, via a simple regression, when the log record numbers will run out of bits.

So I now need a SELECT statement that retrieves the LogRecordNumber for each interval of time where we have a log record. One log number per week would be enough to make a reasonable regression estimate of when we will run out of bits.

Again, the data can reach as far back as a year or only a few months, depending on the database, so hard-coding something like DATE(CURRENT_TIMESTAMP) - X WEEKS is not possible.

Is there any way to do what I'm suggesting with a query, or do I need to rethink this?

If it matters, the database in question is DB2.

Best Answer

It is possible to do this in SQL. There are really two problems here:

  1. Find 1 record per week for each database
  2. Calculate the average rate of consumption

For the first problem, you have a couple of ways to do this:

select databasename, logrecordnumber, logtimestamp
  from MyTable
 where logtimestamp > current timestamp - 42 days     -- only the last 6 weeks
   and dayofweek(logtimestamp) = 1                    -- 1 = Sunday in DB2
   and time(logtimestamp) between '11:59:30' and '12:00:30'

This looks only at records from the last 6 weeks, then finds those captured on Sunday (DAYOFWEEK() returns 1 for Sunday in DB2) at 12:00. (Using the BETWEEN predicate should handle minor differences in clocks across your database servers and/or instances where your script doesn't execute exactly on the minute.)

It may also be possible to write this using OLAP windowing functions, but I'm not sure it would be any more efficient.
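
For instance, here is an untested sketch with ROW_NUMBER() that keeps the first sample per database per ISO week. One caveat: pairing YEAR() with WEEK_ISO() can mis-bucket the few days around New Year whose ISO week belongs to the adjacent year, which is probably tolerable for a rough regression.

select databasename, logrecordnumber, logtimestamp
from (
    select databasename, logrecordnumber, logtimestamp,
           row_number() over (
               partition by databasename,
                            year(logtimestamp),       -- approximate; see caveat above
                            week_iso(logtimestamp)
               order by logtimestamp
           ) as rn
    from MyTable
) t
where rn = 1
order by databasename, logtimestamp

Unlike the dayofweek() filter above, this version does not depend on the collection script running at a particular time of day.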


For the second problem, you may want to look at the LAG() function so you can calculate the delta in the log record number between samples:

select
    databasename
   ,logrecordnumber
   ,logtimestamp
   ,logrecordnumber - lag(logrecordnumber, 1)
        over (partition by databasename order by logtimestamp) as logrecordnumber_growth
from
    MyTable
where
    logtimestamp > current timestamp - 42 days
    and dayofweek(logtimestamp) = 1                   -- 1 = Sunday in DB2
    and time(logtimestamp) between '11:59:30' and '12:00:30'

This example adds a calculation showing how much logrecordnumber has grown since the week before. (The first row for each database will have a NULL growth value, since there is no prior row to compare against.)

Doing the rest of the calculation (extrapolating to when LSNs might run out) is left as an exercise for the OP.
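
That said, a rough sketch of one way to finish it: average the weekly growth per database and divide the remaining headroom by it. The ceiling used below (2^64 - 1) is only a placeholder; substitute the actual maximum LSN for your DB2 version.

with weekly as (
    select databasename,
           logrecordnumber,
           logrecordnumber - lag(logrecordnumber, 1)
               over (partition by databasename order by logtimestamp) as growth
      from MyTable
     where logtimestamp > current timestamp - 42 days
       and dayofweek(logtimestamp) = 1
       and time(logtimestamp) between '11:59:30' and '12:00:30'
)
select databasename,
       max(logrecordnumber) as current_lsn,
       avg(growth) as avg_weekly_growth,              -- AVG() skips the NULL first row
       (18446744073709551615 - max(logrecordnumber))  -- placeholder ceiling: 2^64 - 1
           / nullif(avg(growth), 0) as weeks_remaining  -- NULLIF() guards against zero growth
  from weekly
 group by databasename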