How to Interpolate Dates Along a Line in Oracle

computed-columndate mathoracleoracle-12c

I have a road_condition table in Oracle 12c:

create table road_condition (
    cond_id number(5,0),
    road_id number(5,0),
    cond_date date,
    condition number(5,0)
);

insert into road_condition (cond_id,road_id,cond_date,condition)
values (1,100,to_date('01-NOV-84','DD-MON-RR'),18);
insert into road_condition (cond_id,road_id,cond_date,condition)
values (2,100,to_date('01-JAN-09','DD-MON-RR'),6);
insert into road_condition (cond_id,road_id,cond_date,condition)
values (3,100,to_date('19-JUN-12','DD-MON-RR'),4);
insert into road_condition (cond_id,road_id,cond_date,condition)
values (4,100,to_date('29-APR-15','DD-MON-RR'),4);
insert into road_condition (cond_id,road_id,cond_date,condition)
values (5,200,to_date('29-APR-92','DD-MON-RR'),20);
insert into road_condition (cond_id,road_id,cond_date,condition)
values (6,200,to_date('05-APR-17','DD-MON-RR'),3);
commit;

Resulting table data:

COND_ID ROAD_ID   COND_DAT CONDITION
------- -------   -------- ---------
      1     100   84-11-01        18
      2     100   09-01-01         6
      3     100   12-06-19         4
      4     100   15-04-29         4
      5     200   92-04-29        20
      6     200   17-04-05         3

Here is a graph of road #100:

graph of road 100

Condition in the table is categorized like this:

  • 20 to 15 is in Good condition and does not require any treatment
  • 14 to 11 requires an Overlay treatment
  • 10 to 8 requires a Full Depth Removal (FDR) treatment
  • 7 to 0 requires Reconstruction treatment

Condition is from 0 to 20, 20 being a perfect road.

When roads are inspected, the condition is stored as a whole number(integer), never as a decimal.

Question:

Rather than show the line in the graph as a single colour (blue), I would like to stripe the line as coloured categories.

I've mocked up road #100 in Excel:

Excel graph of road 100

Idea:

To stripe/categorize the data as described, I think I need to create dummy points (rows) at the beginning and end of each range along the line.

Perhaps the simplest way to do this would be to create points (rows) at all intervals of condition, instead of just at the beginning and end of each range.

The graphing software that I'm using creates a new line in the graph for each Y-column/series (similar to the way Excel does).

Therefore, I believe I need each category to be isolated in a separate column, so it can be drawn as a separate line (with a separate colour).

Additionally, there needs to be overlap between each category, so that the graphing software does not display gaps between lines.

How can I create the dummy rows (interpolate the dates)?


Update:

I have a related question here: Date as x-axis: How to multiply and divide dates?

Best Answer

Have fun.

with
  num_years as (select extract(year from max(cond_date)) - extract(year from min(cond_date)) +1 as years from road_condition where road_id = 100),
  generated_dates as (select add_months(trunc((select min(cond_date) from road_condition where road_id = 100), 'YYYY'), (rownum - 1) * 12) as cond_date from dual connect by level <= (select years from num_years)),
  generated_data as (select rc.cond_id, rc.road_id,nvl(rc.cond_date, gd.cond_date) as cond_date, rc.condition
    from generated_dates gd left join (select * from road_condition where road_id = 100) rc on (gd.cond_date = trunc(rc.cond_date, 'YYYY'))
  ),
  data1 as (select
      last_value(cond_id ignore nulls) over (order by cond_date) as cond_id,
      cond_id as original_cond_id,
      road_id, cond_date, condition,
      last_value(condition ignore nulls) over (order by cond_date) as s1,
      lead(condition ignore nulls) over (order by cond_date) as s2
    from generated_data
  ),
  data2 as (
    select
      cond_id, road_id, original_cond_id, cond_date, condition, s1, s2,
      count(*) over (partition by cond_id) + 1 as s,
      count(*) over (partition by cond_id order by cond_date rows between unbounded preceding and current row) as s0
   from data1
  ),
  data3 as (
    select original_cond_id as cond_id, road_id, cond_date, 
     round(nvl(condition, (1-s0/s)*(s1-s2)+s2)) as condition
    from data2
  ),
  final_data as (
    select cond_id, road_id, cond_date,
      case when condition between 15 and 20 then condition end as cond_good,
      case when condition between 11 and 14 then condition end as cond_overlay,
      case when condition between 8 and 10 then condition end as cond_fdr,
     case when condition between 0 and 7 then condition end as cond_recon
    from data3
  ),
  data_for_graph as 
  (
    select 
      cond_id, road_id, cond_date, cond_good,
      case when cond_good is not null and lead(cond_overlay) over (order by cond_date) is not null then cond_good else cond_overlay end as cond_overlay,
      case when cond_overlay is not null and lead(cond_fdr) over (order by cond_date) is not null then cond_overlay else cond_fdr end as cond_fdr,
      case when cond_fdr is not null and lead(cond_recon) over (order by cond_date) is not null then cond_fdr else cond_recon end as cond_recon
    from final_data
  )
select * from data_for_graph order by cond_date;

Output:

   COND_ID    ROAD_ID COND_DATE   COND_GOOD COND_OVERLAY   COND_FDR COND_RECON
---------- ---------- ---------- ---------- ------------ ---------- ----------
         1        100 1984-11-01         18                                   
                      1985-01-01         17                                   
                      1986-01-01         17                                   
                      1987-01-01         16                                   
                      1988-01-01         16                                   
                      1989-01-01         15                                   
                      1990-01-01         15           15                      
                      1991-01-01                      14                      
                      1992-01-01                      14                      
                      1993-01-01                      13                      
                      1994-01-01                      13                      
                      1995-01-01                      12                      
                      1996-01-01                      12                      
                      1997-01-01                      12                      
                      1998-01-01                      11                      
                      1999-01-01                      11         11           
                      2000-01-01                                 10           
                      2001-01-01                                 10           
                      2002-01-01                                  9           
                      2003-01-01                                  9           
                      2004-01-01                                  8           
                      2005-01-01                                  8          8
                      2006-01-01                                             7
                      2007-01-01                                             7
                      2008-01-01                                             6
         2        100 2009-01-01                                             6
                      2010-01-01                                             5
                      2011-01-01                                             5
         3        100 2012-06-19                                             4
                      2013-01-01                                             4
                      2014-01-01                                             4
         4        100 2015-04-29                                             4

You can follow the steps by selecting from the CTEs in the last line in the order I listed them (so select * from num_years, then select * from generated_dates, then select * from generated_data, and so on).