Query_hash definition from sys.dm_exec_query_stats is:
query_hash: Binary hash value calculated on the query and used to identify queries with similar logic. You can use the query hash to determine the aggregate resource usage for queries that differ only by literal values.
But when I search for one particular query_hash (as shown below), I am getting multiple query texts that differ heavily in logic and size_in_bytes
(for plan).
Note: I am using SQL Server 2016
DECLARE @QueryHashTest BINARY(8)
SET @QueryHashTest = CONVERT(BINARY(8), 'Ð…U¹üŒv¿')
SELECT
QCP.objtype
,qStat.query_hash,
CONVERT(VARCHAR(100), qStat.query_Hash) AS VARCHAR_query_hash
,sText.text AS QueryText
,QCP.size_in_bytes
,qStat.creation_time
,qp.query_plan
FROM (
SELECT query_hash,
COUNT(query_hash) AS PlanCount
FROM sys.dm_exec_query_stats
GROUP BY query_hash
) AS MultipleQ
INNER JOIN sys.dm_exec_query_stats qStat ON MultipleQ.query_hash = qStat.query_hash
INNER JOIN sys.dm_exec_cached_plans QCP
ON QCP.plan_handle = qStat.plan_handle
CROSS APPLY sys.dm_exec_sql_text(qStat.sql_handle) AS sText
CROSS APPLY sys.dm_exec_query_plan(qStat.plan_handle) AS qp
WHERE PlanCount > 1
AND QCP.objtype = 'Proc'
AND qStat.query_hash= @QueryHashTest
AND (size_in_bytes >= 2179072 OR size_in_bytes <= 262144)
ORDER BY size_in_bytes DESC
Why are the unrelated queries showing same query_hash? Is it a bug or is there something unexpected going on in the database?
Best Answer
While hash collisions are possible, I suspect what's happening is you're pulling the text for the whole batch, not the particular statement. A procedure with multiple statements can get a cache entry for each statement. And if some statements are identical across procedures, they'll have the same query hash. Let me demonstrate:
Use the below query to see the how the same statement that is in each proc shows up separately.
If you use the query in your question, it fails to pick up the identical statement, since it looks at only the batch text.