I am looking for some help to create an efficient SQL query (SQL Server 2012) that will identify the latest records (date) for all the items in the table (table 2 below) on each of a list of given dates (table 1 below). Here are the tables and sample data set. The tables are rather huge (600 dates, 5K+ data entries each for 6000 items).
table 1: qdate
:
QUERY_DATES
03/03/2015
03/04/2015
04/09/2015
06/12/2015
05/05/2016
table 2: svalue
:
ITEM DATA_ENTRY_DATE VALUE
1 03/02/2014 24
1 03/02/2015 22
1 04/04/2015 100
1 03/09/2016 102
2 03/02/2015 -25
Required SQL Result:
ITEM QUERY_DATES VALUE DATA_ENTRY_DATE
1 03/03/2015 22 03/02/2015
1 03/04/2015 22 03/02/2015
1 04/09/2015 100 04/04/2015
1 06/12/2015 100 04/04/2015
1 05/05/2016 102 03/09/2016
2 03/03/2015 -25 03/02/2015
2 03/04/2015 -25 03/02/2015
2 04/09/2015 -25 03/02/2015
2 06/12/2015 -25 03/02/2015
2 05/05/2016 -25 03/02/2016
Best Answer
I think this is what you want:
Now about efficiency, I doubt there is way to make this blazing fast with a big table. Even with the rather small tables (600 dates, 6K items), you still want a result set of 3.6M rows.
An index on
(item, data_entry_date) INCLUDE (value)
will probably help a lot but still, it's 3 million rows of results. Are you going to display them? Or save them in another table for further analysis?If there is another table with
items
, it would also be probably better to replace thei
derived table with that table.