SQL Server – Querying a table with 200+ million rows by filtering on the clustered index AND an additional column is super slow

execution-plan, query-performance, sql-server, sql-server-2019

We have a table with a clustered index on two columns: the first is NVARCHAR(50) and the second is DATETIME2(3).

There are a few other columns. When we query the table with a WHERE clause on the clustered index columns, the results come back immediately, even when the result set is tens of thousands of rows.

However, if we add one more predicate to the WHERE clause, the query is super slow. After ten seconds I just stop it, because that is not useful to us.

The table has 200+ million rows.
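For context, here is a minimal sketch of the table as described, using the placeholder names from the queries below (the BIT type for BooleanColumn, the index name, and the bracketed table name are assumptions; the real table has a few more columns):

-- Sketch of the table and its two-column clustered index as described above.
CREATE TABLE dbo.[Table]
(
    Col1          NVARCHAR(50) NOT NULL,   -- first clustered key column
    Col2          DATETIME2(3) NOT NULL,   -- second clustered key column
    BooleanColumn BIT          NOT NULL    -- extra column filtered on later
    -- ... a few other columns ...
);

CREATE CLUSTERED INDEX CIX_Table ON dbo.[Table] (Col1, Col2);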

This is the query that is super fast:

SELECT *
FROM Table
WHERE Col1 = 'blah' AND Col2 BETWEEN 'date1' AND 'date2'

This is the query that is super slow:

SELECT *
FROM Table
WHERE Col1 = 'blah' AND Col2 BETWEEN 'date1' AND 'date2' AND BooleanColumn = 1

In my mind, the second query should use the clustered index to seek for the matching rows and then simply filter that result set on the additional column.
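For what it's worth, that seek-plus-residual-filter shape can be requested explicitly with the FORCESEEK table hint. A sketch using the same placeholder names (bracketed because Table is a reserved word); it only constrains the plan shape and is not claimed to fix the slowness described here:

-- Force a seek on an index (here the clustered index on Col1, Col2);
-- BooleanColumn = 1 is then applied as a residual filter on the rows read.
SELECT *
FROM dbo.[Table] WITH (FORCESEEK)
WHERE Col1 = 'blah' AND Col2 BETWEEN 'date1' AND 'date2' AND BooleanColumn = 1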

Is it possible to (somehow) make the second query fast without creating an additional nonclustered index that includes the other columns we need to filter on?

Here is the query plan for the second (slow) query:

https://www.brentozar.com/pastetheplan/?id=BJj28mjvL

After the first run, the query is now fast for different combinations of values in the WHERE clause, which means it's probably not that the results are cached, but that the query plan has been generated and optimized.
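One way to separate plan caching from everything else is to force a fresh compilation and see whether the query stays fast; a sketch with the same placeholder names:

-- OPTION (RECOMPILE) discards any cached plan for this statement and compiles a new one.
-- If the query is still fast, the post-first-run speedup is not plan reuse but something
-- persistent, e.g. statistics that now exist or data already sitting in the buffer pool.
SELECT *
FROM dbo.[Table]
WHERE Col1 = 'blah' AND Col2 BETWEEN 'date1' AND 'date2' AND BooleanColumn = 1
OPTION (RECOMPILE)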

Best Answer

Every time you add a predicate on a column that does not yet have statistics, the optimizer builds (sampled) statistics on that column to estimate selectivity. That is where the time goes: building the statistics. On a large table, that can take a while, even with a low sampling rate.
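One way to confirm this, and to avoid paying the cost at query time, is to look for the auto-created statistics after the slow first run and to create the column statistics yourself in advance. A sketch, with the statistics name and bracketed placeholder table name assumed:

-- Statistics on the table: auto_created = 1 rows (named _WA_Sys_...) were built
-- on the fly by the optimizer; last_updated and rows_sampled show when and how.
SELECT s.name, s.auto_created, sp.last_updated, sp.[rows], sp.rows_sampled
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.Table')

-- Pre-create the column statistics during a quiet period so the first query
-- does not have to (default sampling; add WITH FULLSCAN if you can afford it).
CREATE STATISTICS Stats_BooleanColumn ON dbo.[Table] (BooleanColumn)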