This is a simple question that I can't seem to find the answer for.
In terms of performance, If I have a WHERE
clause such as a=0 and b=0 and ... z=0
, Would I gain any performance if I replaced that condition with a+b+...+z=0
?
In other words, is there any performance gain by replacing the following
Select *
From MyTable
Where A=0 and B=0 and C=0 and D=0...
With
Select *
From MyTable
Where A+B+C+D=0...
I know it can depend on indexes, but for this purpose, let's just say no indexes exist. Does the arithmetic operator (+) perform better than an "OR" or "AND" Logical Operator?
I'm under the impression that the addition performs better than multiple conditions with ANDs or ORs.
Test results
On a Table of 4.2 million rows
Returning rows Where A=0 B=0 and C=0 -> 351748 Rows
The addition (A+B+C=0) took 5 seconds while the Logical conditions A=0 and B=0 and C=0 took 11 seconds.
On the other hand
Returning rows Where A<>0 B<>0 or C<>0 -> 3829750 Rows 58 seconds
Returning rows Where F65+ F67+f64<>0 -> 3829750 Rows 57 seconds
For the OR, it seems that there is no significant difference.
I agree with gbn:
If A is -1 and B is 1, A+B=0 but A=0 and B= 0 is false
and with AMtwo:
ABS(A)+ABS(B)+ABS(C)+ABS(D)… Even if you expect only positive values, if the column accepts negative values, you should assume that you might encounter one
The results are very impressive, as I thought, It seems that the addition is much quicker than the logical operators.
A = Float, B = Money and C = Float. The query used is as shown. In my case, all are positive numbers. No indexes. It is just logical in my mind that addition would be quicker than Logical conditions!
Best Answer
In your question, you detail some tests that you've prepared where you "prove" that the addition option is quicker than comparing the discrete columns. I suspect your test methodology may be flawed in several ways, as @gbn and @srutzky have alluded to.
First, you need to ensure you're not testing SQL Server Management Studio (or whatever client you're using). For instance, if you are running a
SELECT *
from a table with 3 million rows, you're mostly testing SSMS's ability to pull rows from SQL Server and render them on-screen. You're far better off to use something likeSELECT COUNT(1)
which negates the need to pull millions of rows across the network, and render them on screen.Second, you need to be aware of SQL Server's data cache. Typically, we test the speed of reading data from storage, and processing that data, from a cold-cache (i.e. SQL Server's buffers are empty). Occasionally, it makes sense to do all your testing with a warm-cache, but you need to approach your testing explicitly with that in mind.
For a cold-cache test, you need to run
CHECKPOINT
andDBCC DROPCLEANBUFFERS
prior to each run of the test.For the test you've asked about in your question, I created the following test-bed:
This returns a count of 260,144,641 on my machine.
To test the "addition" method, I run:
The messages tab shows:
For the "discrete columns" test:
again, from the messages tab:
From the stats above you can see the second variant, with the discrete columns compared to 0, the elapsed time is about 10 seconds shorter, and the CPU time is about 6 times less. The long durations in my tests above are mostly a result of reading a lot of rows from disk. If you drop the number of rows to 3 million, you see the ratios remain about the same but the elapsed times drop noticeably, since the disk I/O has much less of an effect.
With the "Addition" method:
With the "discrete columns" method:
What will make a really really big difference for this test? An appropriate index, such as:
The "addition" method:
The "discrete columns" method:
The execution plan for each query (with the above index in-place) is quite telling.
The "addition" method, which must perform a scan of the entire index:
and the "discrete columns" method, which can seek to the first row of the index where the leading index column,
A
, is zero: