IN vs EXISTS operators. Why you shouldn’t use IN 2

One of the things i always look for when i am tuning queries is the usage of the IN operator. If this operator is not used correctly you can suffer some severe performance issues. In this blog post i try to give some understanding about when to use the IN operator and when to use an EXISTS.

IN

According to Books Online?this operator “Determines whether a specified value matches any value in a subquery or a list.”. Hence the “any” part of that line. So in other words, the IN operator will return TRUE/FALSE/UNKNOWN when a value is IN the result of a (sub)query.

EXISTS

In Books Online we see that the EXISTS operator “Specifies a subquery to test for the existence of rows.”. And also “Returns TRUE if a subquery contains any rows.”. So?existence?of a row is key here.

 

IN vs EXISTS Compilation resources

So, what does this all mean? Well, the IN operator is used to check if a value is available in some subquery, where it will be used on the complete?result-set?of this subquery. If you have a big?result-set?where you are checking for some values you might suffer performance issues. The EXISTS operator on the other hand is very?useful?for checking if a value exists because it uses something we call “short-circuiting”. If a value is found in the subquery it will?immediately stop, because the operator can return TRUE. So on a very large?result-set?you will get your logical answer quicker when using the EXISTS in stead of the IN operator.

 

Let’s look at this with an example.

SELECT  p.FirstName
      , p.LastName
      , e.JobTitle
FROM    Person.Person AS p 
        JOIN HumanResources.Employee AS e
          ON e.BusinessEntityID = p.BusinessEntityID 
        JOIN HumanResources.EmployeeDepartmentHistory AS edh
          ON e.BusinessEntityID = edh.BusinessEntityID 
WHERE   edh.DepartmentID IN (SELECT DepartmentID
                             FROM   HumanResources.Department
                             WHERE  Name LIKE 'P%');

This will produce the following execution plan:

IN operator - Execution Plan

And then the same query, but this time optimized with the EXISTS operator

SELECT  p.FirstName
      , p.LastName
      , e.JobTitle
FROM    Person.Person AS p 
        JOIN HumanResources.Employee AS e
          ON e.BusinessEntityID = p.BusinessEntityID 
WHERE   EXISTS (SELECT *
                FROM    HumanResources.Department AS d
                        JOIN HumanResources.EmployeeDepartmentHistory AS edh
                          ON d.DepartmentID = edh.DepartmentID
                WHERE   e.BusinessEntityID = edh.BusinessEntityID
                AND     d.Name LIKE 'P%');

And this will produce the following execution plan:

EXISTS operator - Execution Plan

That looks almost the same, doesn’t it. Well, ALMOST! When we look at the cost of the several operators we see that most of them are different. When we run both queries together we see that the query with the IN operator costs 66% of the complete execution, where the query with the EXISTS operator takes 34%. That is a significant difference in cost. Lets look at the SELECT part of both queries. The SELECT properties of the query with the IN operator is on the left, the SELECT properties of the query with the EXISTS operator is on the right.

 

IN vs EXISTS - select cost

 

We see several things. CompileCPU and CompileTime for instance is showing that the EXISTS operator use less CPU compile time. The CPU memory is even lower, together with memory grant. So with this information we can say that the query with the EXISTS operator takes less time to compile, so therefor it is faster.

Logical processing

Besides it takes less CPU Compile time when using an EXISTS operator, there is also a logical part that is better when using the EXISTS and IN operators. In the beginning of this post i said something about the possible values that the IN and EXISTS operators will return.?TRUE/FALSE/UNKNOWN. The TRUE/FALSE part is self explaining. Just a normal?Boolean?operator. However, the UNKNOWN part is somewhat different. The UNKNOWN will be used for NULL columns for example. And NULL is equal to nothing, so comparing NULL with NULL becomes not true. Nothing isn’t equal to nothing.

For “normal” usage we will not see any real difference and/or damage done with the IN or EXISTS operator. But when we are using NOT IN or NOT EXISTS, things will get nasty! For example. When using the NOT statement the logical part will be NOT TRUE or NOT UNKNOWN. ?So with a column being queried that contains a NULL value, you will get no results, even when there are columns?which?are NOT TRUE. So NOT IN is a no go here!

And here comes the short-circuiting part of the EXISTS operator as being the solution for this problem. Even if a column contains a NULL value, if ANY column will return TRUE the operator will stop processing since there is an existing record.

Conclusion

Well, if there is any conclusion that you can take from this is to NOT USE THE IN OPERATOR at any time. It uses more compiletime, so more resources and logic wise it isn’t always giving the results that are expected.

 

Happy coding :)

 

 

2 thoughts on “IN vs EXISTS operators. Why you shouldn’t use IN

  1. Reply Phil Nov 4, 2012 16:26

    “Well, if there is any conclusion that you can take from this is to NOT USE THE IN OPERATOR at any time. It uses more compiletime, so more resources and logic wise it isn?t always giving the results that are expected.”

    You have not mentioned queries such as
    select f1, f2
    from table
    where f3 in (‘a’, ‘b’)

    Do you advocate changing this to
    where f3 = ‘a’ or f3 = ‘b’ ?

    If so, why?

    • Reply Mark Kremers Nov 4, 2012 18:50

      Hi Phil,

      Well, for such small queries where you are certain of a small amount it doesn’t really matter. The overhead in compiletime for such structures wouldn’t be that high when you compare it with the EXISTS operator. For readable logic i certainly would go with the in operator, but again, only on small resultsets.

      regards
      Mark

Leave a Reply

  

  

  

* Copy This Password *

* Type Or Paste Password Here *

Switch to our mobile site

%d bloggers like this: