Difference Between IN and NOT IN Performance Wise

oracle-11g, performance, query-performance

I am new to databases. What is the difference between IN and NOT IN in terms of performance?

When I use IN, the query takes less time than the logically equivalent query written with NOT IN.

Best Answer

If there is an index on the column, the IN clause can make better use of it than NOT IN can. You can test this yourself:

CREATE TABLE TEST1
(
  STATUS NUMBER NOT NULL
);

CREATE INDEX IXTEST1 ON TEST1(STATUS);

insert into test1
select MOD(level,10)
from dual
connect by level <= 10000;

select * from TEST1 where status in (0,1,2,3,4);
select * from TEST1 where status not in (5,6,7,8,9);

Running the Oracle SQL Tuning Advisor on both statements gives, for the first:

-------------------------------------------------------------------------------
There are no recommendations to improve the statement.

-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------

1- Original
-----------
Plan hash value: 1479979182


-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |     1 |    13 |     1   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR  |         |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| IXTEST1 |     1 |    13 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1
   2 - SEL$1 / TEST1@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("STATUS"=0 OR "STATUS"=1 OR "STATUS"=2 OR "STATUS"=3 OR 
              "STATUS"=4)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "STATUS"[NUMBER,22]
   2 - "STATUS"[NUMBER,22]

-------------------------------------------------------------------------------

but for the second:

-------------------------------------------------------------------------------
FINDINGS SECTION (5 findings)
-------------------------------------------------------------------------------

1- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
  Predicate "TEST1"."STATUS"<>5 used at line ID 1 of the execution plan is an
  inequality condition on indexed column "STATUS". This inequality condition
  prevents the optimizer from selecting indices  on table "MWARE"."TEST1".

  Recommendation
  --------------
  - Rewrite the predicate into an equivalent form to take advantage of
    indices.

2- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
  Predicate "TEST1"."STATUS"<>6 used at line ID 1 of the execution plan is an
  inequality condition on indexed column "STATUS". This inequality condition
  prevents the optimizer from selecting indices  on table "MWARE"."TEST1".

  Recommendation
  --------------
  - Rewrite the predicate into an equivalent form to take advantage of
    indices.

3- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
  Predicate "TEST1"."STATUS"<>7 used at line ID 1 of the execution plan is an
  inequality condition on indexed column "STATUS". This inequality condition
  prevents the optimizer from selecting indices  on table "MWARE"."TEST1".

  Recommendation
  --------------
  - Rewrite the predicate into an equivalent form to take advantage of
    indices.

4- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
  Predicate "TEST1"."STATUS"<>8 used at line ID 1 of the execution plan is an
  inequality condition on indexed column "STATUS". This inequality condition
  prevents the optimizer from selecting indices  on table "MWARE"."TEST1".

  Recommendation
  --------------
  - Rewrite the predicate into an equivalent form to take advantage of
    indices.

5- Restructure SQL finding (see plan 1 in explain plans section)
----------------------------------------------------------------
  Predicate "TEST1"."STATUS"<>9 used at line ID 1 of the execution plan is an
  inequality condition on indexed column "STATUS". This inequality condition
  prevents the optimizer from selecting indices  on table "MWARE"."TEST1".

  Recommendation
  --------------
  - Rewrite the predicate into an equivalent form to take advantage of
    indices.

-------------------------------------------------------------------------------
EXPLAIN PLANS SECTION
-------------------------------------------------------------------------------

1- Original
-----------
Plan hash value: 648532652


----------------------------------------------------------------------------
| Id  | Operation        | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |         |     1 |    13 |     1   (0)| 00:00:01 |
|*  1 |  INDEX FULL SCAN | IXTEST1 |     1 |    13 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - SEL$1 / TEST1@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("STATUS"<>5 AND "STATUS"<>6 AND "STATUS"<>7 AND 
              "STATUS"<>8 AND "STATUS"<>9)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "STATUS"[NUMBER,22]

-------------------------------------------------------------------------------
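
The two plans can also be inspected without the tuning advisor. The following is a minimal sketch, assuming the TEST1 table and IXTEST1 index created earlier and a session that can write to the standard plan table. EXPLAIN PLAN stores the optimizer's plan and DBMS_XPLAN.DISPLAY prints it. Costs and row estimates will likely differ from the output shown here, depending on statistics and version. Note that the two statements return the same rows only because STATUS is NOT NULL and holds just the values 0 to 9.

```sql
-- Plan for the IN form (the advisor output above shows an index range scan here)
EXPLAIN PLAN FOR
  SELECT * FROM test1 WHERE status IN (0,1,2,3,4);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the NOT IN form (the advisor output above shows an index full scan here)
EXPLAIN PLAN FOR
  SELECT * FROM test1 WHERE status NOT IN (5,6,7,8,9);
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```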

Related Solutions

SQL Server 2005 – INSERT Performance: Temporary Tables vs Table Variables

The obvious difference between the two plans is that the fast one is parallel and the slower one serial.

This is one of the limitations of plans that insert into table variables. As mentioned in the comments (and it seems as though it had the desired effect) you could try doing

INSERT INTO @DATA ( ... ) 
EXEC('SELECT .. FROM ...')

to see if that gets around the limitation.

MySQL – the performance difference between IN (42) and id = 42 in MySQL

Whether the queries are equivalent or not is up to the MySQL Query Optimizer. Why?

Back on Mar 13, 2013, I wrote an answer to this post: Is there an execution difference between a JOIN condition and a WHERE condition?

In that post I described exactly how JOINs are executed. The following is taken from my post which quotes from page 172 of Understanding MySQL Internals:

Determine which keys can be used to retrieve the records from tables, and choose the best one for each table.
For each table, decide whether a table scan is better than reading on a key. If there are a lot of records that match the key value, the advantages of the key are reduced and the table scan becomes faster.
Determine the order in which tables should be joined when more than one table is present in the query.
Rewrite the WHERE clauses to eliminate dead code, reducing the unnecessary computations and changing the constraints wherever possible to open the way for using keys.
Eliminate unused tables from the join.
Determine whether keys can be used for ORDER BY and GROUP BY.
Attempt to simplify subqueries, as well as determine to what extent their results can be cached.
Merge views (expand the view reference as a macro)

On that same page, it says the following:

In MySQL optimizer terminology, every query is a set of joins. The term join is used here more broadly than in SQL commands. A query on only one table is a degenerate join. While we normally do not think of reading records from one table as a join, the same structures and algorithms used with conventional joins work perfectly to resolve the query with only one table.

From the aforementioned information, JOIN behavior will execute the same way regardless of whether a query has multiple tables or even just one table.

YOUR ORIGINAL QUESTION

Under the hood, MySQL will evaluate the two queries the same way. If you want better query performance, you have to take the bull by the horns. You should do all you can to the table so that MySQL join behavior goes as smoothly as possible:

Add the needed indexes
Increase session level buffers (sort_buffer_size, join_buffer_size)
Take advantage of Storage Engine mechanisms for tuning data and indexes
Refactor the query
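
As a rough illustration of the first two items, here is a sketch assuming a hypothetical table my_table whose id column is not yet indexed. The table name, index name, and buffer sizes are placeholders chosen for the example, not recommendations.

```sql
-- Add the needed index (my_table and idx_my_table_id are hypothetical names)
ALTER TABLE my_table ADD INDEX idx_my_table_id (id);

-- Raise session-level buffers; values are in bytes and purely illustrative
SET SESSION sort_buffer_size = 4194304;
SET SESSION join_buffer_size = 4194304;

-- Confirm what the current session is using
SHOW SESSION VARIABLES LIKE '%buffer_size';
```

Session-level settings affect only the current connection, so they are a low-risk way to experiment before changing anything globally.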

If you look at dimitar's answer, it spells out a case where MySQL's join behavior is put to the test. Instead of betting on two horses you own (your queries) to see which runs better, invest time in getting a faster horse if such a horse exists.

From dimitar's post, you have these:

SELECT * FROM table WHERE id IN (42,43,44,45);
SELECT * FROM table WHERE id = 42 or id = 43 or id = 44 or id = 45;

Here is yet another one I suggest for the sake of example

SELECT A.* FROM table A INNER JOIN
(SELECT 42 id UNION SELECT 43 UNION SELECT 44 UNION SELECT 45) B
USING (id);

and another

SELECT * FROM table WHERE id = 42
UNION
SELECT * FROM table WHERE id = 43
UNION
SELECT * FROM table WHERE id = 44
UNION
SELECT * FROM table WHERE id = 45;

I can make up other possibilities, but the main idea here is to try to write good queries the first time. When your amount of data grows, your best queries may suffer due to key distribution and stale index stats which may require optimizing tables or even rewriting queries to suit bigger data.
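
One way to check how the optimizer treats the candidate forms above is to prefix each with EXPLAIN and compare the results. This is a sketch assuming a hypothetical table my_table with an index on id. The original queries use the placeholder name table, which is a reserved word in MySQL and would need backticks if used literally.

```sql
-- my_table is a hypothetical name standing in for the placeholder "table"
EXPLAIN SELECT * FROM my_table WHERE id IN (42,43,44,45);
EXPLAIN SELECT * FROM my_table WHERE id = 42 OR id = 43 OR id = 44 OR id = 45;
```

If the two outputs agree in the type, key, and rows columns, the optimizer is handling both forms alike on that data set. Rerunning the comparison as the table grows helps catch the key-distribution and stale-statistics effects mentioned above.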
