SQL Server Performance – Why Data Retrieval Takes Over 4 Hours for 400K Records

azure-sql-databasequery-performancesql serverssms

I have a table with 15 million rows, it's a parent table with 12 child tables.

Even for a simple count query, it is taking hour's to complete.

select count(1) from table
where col_filter >= 'number'

table contains 40 columns and col_filter is having varchar data type. col_filter is not indexed.

Questions:

  1. What should I check to find the potential issues with my table setup?
  2. I am using Microsoft SQL server management studio 18, is there any tool that I can use to understand and get some recommendations to optimize the performance?
  3. If indexing is the solution, is it possible to calculate how much extra space the index creation will occupy?

Update-1:

This table is part of other data fetch query, which contains, cte, inner joins and then using cte as base table which perform group by operations, the table which I posted in post is the base table inside cte, as it's taking long time even for count, I thought it would be good start to debug. Here is the full query with estimated execution plan and actual execution plan which runs for 18+hours.

Note: Actual execution plan was taken from the running query as shown in this SO answer

wait_info for select count(1) from table where col_filter >= 'number':

(35ms)PAGEIOLATCH_SH:dev-db:1(*)
(26ms)PAGEIOLATCH_SH:dev-db:1(*)
(86ms)PAGEIOLATCH_SH:dev-db:1(*)
(9ms)PAGEIOLATCH_SH:dev-db:1(*)

and it's actual execution plan

Any suggestions would be really appreciable.

Best Answer

As you've already been advised on (between both your Posts), an index on (col_filter) would help the example query you've provided. If you're only running aggregative queries like this, then a nonclustered columnstore index might be best so that you can get columnar compression (which will minimize disk space overhead of the index) and improve performance with batch mode operations.

Outside of that, as Martin pointed out on your other Post, 4 hours is still unusual to scan even the whole table of 1.5 million rows. But since you're on the cloud, there's a number of things that can be bottlenecking your queries. It's hard to say without seeing the actual execution plan. You can also run sp_WhoIsActive in a separate query window, while you're waiting on your query to finish executing, to see what it's waiting on (wait types) and if there's any blocking processes. This would be helpful to know too.

Start with indexing, and see if that makes a significant improvement (i.e. your example query shouldn't take more than a minute - and that's slow to be honest). If it's still problematic, please update your Post with your table's definition and the actual execution plan of the slow query, which you can upload on Paste The Plan and then link in your Post.