The percentage costs on an execution plan come from the optimizer's estimates, even when an actual execution plan is produced. The actual execution plan does show the plan that was actually used, and it includes both the estimated and the actual row counts. Discrepancies between the two counts are useful for judging how accurate the estimate was.
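To see both counts for yourself, one option is to request the actual plan from T-SQL; a minimal sketch (the query shown is just a placeholder):

    -- Returns the actual execution plan (as XML) alongside the results;
    -- compare EstimateRows vs. ActualRows on each plan operator.
    SET STATISTICS XML ON;

    SELECT name FROM sys.objects;  -- placeholder: run the query under investigation here

    SET STATISTICS XML OFF;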
Somewhere between the subquery and comparing it to the column in derived, the optimizer wasn't able to correctly estimate how many rows would match. It guessed that there would be 18 rows from derived when there were actually over 220,000. An additional clue is the warning Cardinality Estimate: CONVERT(nvarchar(35),[mssqlsystemresource].[sys].[spt_values].[name],0) on the SELECT node.
If you were to measure the queries' run time another way, such as with STATISTICS TIME, I would expect the results to be much closer, with the second query likely running faster.
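For example, a minimal sketch (run the two queries being compared between the SET statements):

    -- Prints CPU time and elapsed time for each statement to the
    -- Messages tab, independent of the plan's cost percentages.
    SET STATISTICS TIME ON;

    -- run the two queries being compared here

    SET STATISTICS TIME OFF;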
Here's another plan analysis of a somewhat similar situation, viewed in SQL Server Plan Explorer (with a hat tip to Kendra Little for how to fool the optimizer).
The estimated cost shows a 93%/7% split, but looking at the actual CPU, elapsed time, or IO, the difference is not that extreme: IO is about 75%/25% and CPU roughly 60%/40%. (I tried to come up with something more even, but wasn't able to.)
It takes some 7 seconds for 500k rows to come back, plus a good deal of time to render the grid you're likely displaying the results in.
You are waiting 7 seconds because that's how long it takes SQL Server to push 500k rows to your client. Look at the client statistics in SSMS; see Database Engine Query Editor:
Include Client Statistics: Includes a Client Statistics window that contains statistics about the query and about the network packets, and the elapsed time of the query.
    SQL Server Execution Times:
       CPU time = 47 ms,  elapsed time = 769 ms.
Your actual query executes in 47 ms. The elapsed time is much longer (though still under 1 second) because of network waits. You can confirm this with wait-stats analysis; read How to analyse SQL Server performance for details, including how to capture a query's wait stats.
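As a quick check, a minimal sketch (assuming SQL Server 2016 or later, where sys.dm_exec_session_wait_stats is available); time accumulated under ASYNC_NETWORK_IO is time the server spent waiting for the client to consume rows:

    -- Wait stats for the current session; a large ASYNC_NETWORK_IO
    -- total indicates the server is waiting on the client to fetch rows.
    SELECT wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_exec_session_wait_stats
    WHERE session_id = @@SPID
    ORDER BY wait_time_ms DESC;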
Ultimately, the problem is returning 500k rows to the client. There can be no good reason for such an operation; no human user can comprehend half a million rows. Process the data on the back end.
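In other words, ship the aggregate, not the rows. A minimal sketch, with a hypothetical dbo.Sales table standing in for whatever the real workload is:

    -- Hypothetical example: return a handful of summary rows
    -- instead of pushing 500k detail rows over the network.
    SELECT CustomerId,
           COUNT(*)    AS OrderCount,
           SUM(Amount) AS TotalAmount
    FROM dbo.Sales
    GROUP BY CustomerId;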
Half a million rows was used because it is easier to notice speed changes with more data than with less.
Well, in this case you have sent yourself on a snipe hunt. There is no problem, other than one of your own making in marshaling and rendering 500k rows. It is a completely bogus scenario; no app should retrieve 500k rows. And a test of processing 500k rows (e.g. aggregates) should test... the processing, including the aggregates.
Best Answer
Are you allowed to add computed columns to the table or create a materialized view?
You could add a computed column and then index that:
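A minimal sketch; the table name, the id column, and the ntext column are hypothetical, and since nvarchar(max) cannot be an index key column, the computed column is carried as an included column here:

    -- Hypothetical names. The computed column exposes the ntext data
    -- as nvarchar(max); because nvarchar(max) cannot be an index *key*
    -- column, it is added as an included (non-key) column instead, so
    -- the index is still narrower to scan than the base table's LOBs.
    ALTER TABLE dbo.MyTable
        ADD ntext_as_nvarchar_max AS CONVERT(nvarchar(max), ntext_column);

    CREATE NONCLUSTERED INDEX IX_MyTable_ntext_as_nvarchar_max
        ON dbo.MyTable (id)
        INCLUDE (ntext_as_nvarchar_max);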
The issue with this approach is that your queries will have to reference the new computed column (ntext_as_nvarchar_max) and not the original column for the index to be used. Test at SQL-Fiddle.