I am going to assume that there isn't an index on the date columns; otherwise I think the query would have been structured differently, and if there is one, you can probably find a better-performing query than this. The advantage of this query is that it can get all the data in one scan. The disadvantage is that it has to sort the data and join EventEmployee across the entire table, so as always, test against your own situation. This query also assumes that the MAX date is either unique or that equivalent rows would be acceptable.
USE AdventureWorks2012
GO
;
WITH Base AS (
    SELECT
        TransactionHistory.*
        ,ProductVendor.BusinessEntityID
        ,MAX(CASE WHEN TransactionDate < '2008-08-01' THEN TransactionDate END)
            OVER (PARTITION BY ProductVendor.BusinessEntityID) AS PreviousVendorTransaction
        ,COUNT(CASE WHEN TransactionDate >= '2008-08-01' THEN 1 END)
            OVER (PARTITION BY ProductVendor.BusinessEntityID) AS VendorAfterCutoff
    FROM
        Production.TransactionHistory
        -- Doesn't make the most sense, but I need a repeating relation
        INNER JOIN Purchasing.ProductVendor
            ON TransactionHistory.ProductID = ProductVendor.ProductID
),
Filtered AS (
    SELECT
        *
    FROM
        Base
    WHERE
        Base.TransactionDate >= '2008-08-01'
        OR (TransactionDate = PreviousVendorTransaction AND VendorAfterCutoff > 0)
)
SELECT DISTINCT
    TransactionID
    ,ProductID
    ,ReferenceOrderID
    ,ReferenceOrderLineID
    ,TransactionDate
    ,TransactionType
    ,Quantity
    ,ActualCost
    ,ModifiedDate
FROM
    Filtered
Edit:
Hmm, I think I may have to take back my comment on structuring it differently if there are indexes. The other suggestions that I have are probably fairly minor.
- Make sure the query is using the indexes you're expecting it to: the start and end dates to build the temp table, and the end date to drive the previous-event loop.
- If the query to build the temp table is doing a lookup on the clustered index, it may be better to hold off and do that as part of the main query.
- Try using a CTE instead of a temp table. I think a CTE might be more competitive with the way the query is structured below.
- If you are returning a lot of events, it might be better to pull out the event table lookup to the main query to give the optimizer the option of doing a merge join.
- I don't see a way of optimizing the previous event lookup short of an indexed view.
Here's a query that combines a few of those ideas.
SELECT
    e.[EventID]
INTO #EventTemp
FROM
    [Events] AS e
WHERE
    ( e.[EventStart] >= @StartDate AND e.[EventStart] <= @EndDate )
    OR ( e.[EventEnd] >= @StartDate AND e.[EventEnd] <= @EndDate )
;
WITH PrevEvent AS (
    SELECT
        EmpPrevEvent.[EventID]
    FROM
        (
            SELECT DISTINCT
                ee.[EmployeeID]
            FROM
                #EventTemp
                INNER JOIN [EventEmployee] AS ee ON
                    #EventTemp.[EventID] = ee.[EventID]
        ) AS Emp
        CROSS APPLY (
            SELECT TOP 1
                e.[EventID]
            FROM
                [Events] AS e
                INNER JOIN [EventEmployee] AS ee ON
                    e.[EventID] = ee.[EventID]
            WHERE
                ee.[EmployeeID] = Emp.[EmployeeID]
                AND e.[EventEnd] < @StartDate
            ORDER BY
                e.[EventEnd] DESC
        ) AS EmpPrevEvent
)
SELECT
    e.[EventID],
    e.[EventStart],
    e.[EventEnd],
    e.[EventTypeID]
FROM
    [Events] AS e
WHERE
    e.EventID IN (
        SELECT EventID
        FROM #EventTemp
        UNION
        SELECT EventID
        FROM PrevEvent
    )
For the query plan itself, you can force compilation every time using the following option on your query:
OPTION (RECOMPILE);
However, I suspect what is happening is that the second and subsequent executions of the query are pulling the data from memory instead of disk, and of course memory is faster than disk.
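If you want to check that hypothesis, a rough sketch along these lines counts how many of a table's pages are currently in the buffer pool. The table name `dbo.Events` is a placeholder; substitute whichever table you're testing, and note the LOB-allocation join is simplified here:

```sql
-- Approximate count of this table's pages cached in the buffer pool.
-- Run in the database that contains the table; 8 KB per page.
SELECT
    COUNT(*)            AS CachedPages,
    COUNT(*) * 8 / 1024 AS CachedMB
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN sys.allocation_units AS au
    ON bd.allocation_unit_id = au.allocation_unit_id
INNER JOIN sys.partitions AS p
    ON au.container_id = p.hobt_id   -- covers in-row and row-overflow data
WHERE bd.database_id = DB_ID()
  AND p.object_id = OBJECT_ID('dbo.Events');
```

If this number jumps after the first execution and stays high, the later runs are being served from memory.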
If you are trying to test the scenario where this data always comes from disk, there's not really a good way to do that on your production instance without affecting anything else: you can only drop clean buffers for the entire instance at a time, not for a single database, never mind a single table. What you could do instead is set up an instance on the same hardware, with just this table, and run the following before every run of the query:
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
However, this does not make a lot of sense: ideally you will be querying data that is in memory, and you should be optimizing for that best-case scenario. You should have some idea of what the worst case looks like, but again, it should not be the normal situation unless you have a very big database and very little memory (in which case, open the wallet and buy more memory; it's much cheaper than optimizing for the lack of it).
Best Answer
No, there isn't a way to get the "original query text". I put this in quotes because the log records only describe what changes happened in that database. Since SELECT statements, CTEs, and so on do not generate log records, there is no way to recover that information from the log. The log records do describe the changes that happened, from which a query could be generated to apply the same changes again, reverse them, or show what was changed; however, there is no way to find what larger part, if any, a given change played in the batch.
Let's take a quick and simple example:
Suppose we wanted to find all of the accounts using gift cards and give them an extra $1 if they made a purchase over $5 in the last 7 days. The log records would show the $1 increase to any of the accounts that met the above logic, but there is no place for us to see what that predicate was... we'd just see that some accounts (if any) were updated to have 1 added to a column (or whatever your model was).
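For illustration only, a statement of roughly that shape might look like this (all table and column names here are invented for the example):

```sql
-- Hypothetical UPDATE: the transaction log would record only the +1
-- applied to Balance on the affected rows, not the gift-card/purchase
-- predicate that selected them.
UPDATE a
SET    a.Balance = a.Balance + 1
FROM   dbo.Accounts AS a
WHERE  a.UsesGiftCard = 1
  AND  EXISTS (
           SELECT 1
           FROM   dbo.Purchases AS p
           WHERE  p.AccountID = a.AccountID
             AND  p.Amount > 5
             AND  p.PurchaseDate >= DATEADD(DAY, -7, GETDATE())
       );
```

Everything in the WHERE clause is evaluated at execution time and leaves no trace in the log; only the resulting row modifications are logged.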
Thus we could re-create the UPDATE of each individual record, but there is no way to figure out what the original query was; at best, we could look at the entire transaction and construct a logically equivalent batch.
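You can see this limitation for yourself with the undocumented (and unsupported, so treat its output as informational only) fn_dblog function, which exposes the active portion of the transaction log:

```sql
-- Inspect recent log records. Note there is an Operation and an
-- affected allocation unit, but nowhere does the query text appear.
SELECT TOP (100)
    [Current LSN],
    Operation,          -- e.g. LOP_MODIFY_ROW for an UPDATE
    Context,
    AllocUnitName,      -- which table/index was touched
    [Transaction ID]
FROM fn_dblog(NULL, NULL)
WHERE Operation = 'LOP_MODIFY_ROW';
```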
What if you wanted to do this in the future - to know all of the queries that have happened?
Natively, SQL Server offers several ways to log all the queries that run; there are also third-party services and applications that can do this.
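As a sketch of the native route, an Extended Events session along these lines would capture completed batches and RPC calls (the session name, actions, and file target here are examples; tune the events and filters for your workload volume):

```sql
-- Example Extended Events session capturing query text going forward.
CREATE EVENT SESSION [CaptureQueries] ON SERVER
ADD EVENT sqlserver.sql_batch_completed (
    ACTION (sqlserver.client_hostname, sqlserver.username)
),
ADD EVENT sqlserver.rpc_completed (
    ACTION (sqlserver.client_hostname, sqlserver.username)
)
ADD TARGET package0.event_file (
    SET filename = N'CaptureQueries.xel', max_file_size = (256)
)
WITH (STARTUP_STATE = ON);

ALTER EVENT SESSION [CaptureQueries] ON SERVER STATE = START;
```

Be aware that capturing every batch on a busy system generates a lot of data, so scope the session with predicates where you can.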