Why a COUNT Query Is Faster Than a Result-Set Query

dml, execution-plan, performance, query-performance

Suppose I have two similar queries, with complex joins; query 1 returns rows and columns:

SELECT col1, col2, col3, ...
FROM MyBigTable
   LEFT JOIN AnotherTable1 ON ...
   LEFT JOIN AnotherTable2 ON ...
WHERE someCol1 = val1
  AND someCol2 = val2
  AND ...

and query 2 returns just the count of rows as per query 1:

SELECT COUNT(MyBigTable.PKcol)
FROM MyBigTable
   LEFT JOIN AnotherTable1 ON ...
   LEFT JOIN AnotherTable2 ON ...
WHERE someCol1 = val1
  AND someCol2 = val2
  AND ...

Intuition tells me the SQL engine (of most databases: SQL Server, MySQL, IBM DB2, whatever) will return from the COUNT query faster
than from the result-set query, at least when we are talking about big result sets.
Obviously, transferring lots of rows and columns across a network will take longer than transferring a single scalar count!

My questions are:

  1. Does the DB engine have to do a similar amount of work/effort for both queries?
  2. If so, is the bottleneck (or the delay in receiving the results) simply due to transferring more data across the network?

Best Answer

It depends on the table and index layout and on the database engine you are using, but there are generally two reasons for the COUNT example to be faster:

1. Fewer pages need to be read. After finding which rows need to be returned as a result of the join and filter operations, the query may still need to read data pages to find the values for the extra columns you want to output.
This is not always the case: if the table design and query are such that everything is being table-scanned anyway, the two queries will likely perform equally badly. Also, in databases that support clustered indexes, if the query planner ends up using only the clustering keys, there probably won't be further reads to do. You may also have covering indexes, which have a similar effect on how much work is needed to find all the desired output information.
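To illustrate the covering-index point, a sketch in SQL Server syntax (the index name is hypothetical; the table and columns match the Person examples used below): the filter column is the index key, and the extra output columns are carried in the index leaf via INCLUDE, so the engine never has to visit the table's data pages for them.

```sql
-- Hypothetical covering index for the wider query: seeks on JobTitleId,
-- and LastName/FirstName are stored in the index leaf so no key lookups
-- against the base table are needed to produce them.
CREATE NONCLUSTERED INDEX IX_Person_JobTitleId_Covering
ON Person (JobTitleId)
INCLUDE (LastName, FirstName);
```

With such an index in place, the three-column SELECT could be satisfied from the index alone, and its page-read count would drop toward that of the single-column query.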
Using examples of SQL Server's query plans from one of our databases (other DBs will behave similarly): SELECT JobTitleId FROM Person WHERE JobTitleId = 17 produces

[execution plan: a seek on the simple index only]

showing that it only needed to look at the simple index to find everything it needed. The plan for SELECT JobTitleId, LastName, FirstName FROM Person WHERE JobTitleId = 17 shows the extra lookups needed for the extra output columns:

[execution plan: index seek plus lookups for the extra columns]

The plan for SELECT COUNT(*) FROM Person WHERE JobTitleId = 17 shows that a little extra work needs to be done to actually count the rows, but this is very small compared with having to go away and read extra data pages:

[execution plan: index seek plus a count aggregate]
You can see the number of pages touched in each case with SET STATISTICS IO ON: both the single-column and COUNT examples show "logical reads 2", while the one with extra output columns shows "logical reads 213".
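A sketch of how to reproduce that measurement yourself in SQL Server (the queries mirror the examples above; the actual read counts will depend on your own data and indexes):

```sql
-- Enable per-query I/O statistics for this session; the logical-read
-- counts appear in the Messages output after each statement runs.
SET STATISTICS IO ON;

-- Satisfied entirely from the index on JobTitleId.
SELECT JobTitleId FROM Person WHERE JobTitleId = 17;

-- Needs extra lookups into the data pages for LastName and FirstName.
SELECT JobTitleId, LastName, FirstName FROM Person WHERE JobTitleId = 17;

-- Counting adds only a cheap aggregate on top of the same index read.
SELECT COUNT(*) FROM Person WHERE JobTitleId = 17;

SET STATISTICS IO OFF;
```

Comparing the "logical reads" figures across the three statements shows directly how much of the cost comes from fetching the extra columns rather than from counting.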

2. Less data needs to be transferred and processed. This is often less significant than the above, though not always for large data and/or slow connections. Once the engine has found the data, it needs to send it to the calling application (possibly over a network rather than via a local IPC response), and then the application needs to process and display it. Your COUNT query will only ever result in one row with one value, so this will be quick; the other query could result in a great many rows being transferred to, and processed by, the application.

NOTE: I've used simpler queries than your examples in my explanation, but the considerations are the same for more complex queries with joins, sub-queries, and other extra work, just potentially multiplied by the number of objects in play.