Mysql – what is the difference between first and second execution of a query

MySQL

I notice that when I am executing queries the first time (i.e., just after mysqld_safe --user=mysql &), the queries are slower than when they are executed second time. When the query (source) is:

select 
    s_name, s_address 
from 
    SUPPLIER, NATION 
where 
    s_suppkey in ( 
        select 
            ps_suppkey 
        from 
            PARTSUPP 
        where 
            ps_partkey in (
                select 
                    p_partkey 
                from 
                    PART 
                where 
                    p_name 
                like 
                    'green%'
            ) and 
            ps_availqty > (
                select 
                    0.5 * sum(l_quantity) 
                from 
                    LINEITEM 
                where 
                    l_partkey = ps_partkey and 
                    l_suppkey = ps_suppkey and 
                    l_shipdate >= date '1993-01-01' and 
                    l_shipdate < date '1993-01-01' + interval '1' year
            )
    ) and 
    s_nationkey = n_nationkey and 
    n_name = 'ALGERIA' 
order by 
    s_name;

And I got:

> SELECT .....;
>   41.255s  -------- first time after startup
> SELECT .....; 
>   6.242s   -------- second time after startup
> SELECT .....;
>   6.104s   -------- third time after startup

However, in some other times, the difference does not exist. For example, this query(source):

SELECT 
    s_name, count(*) as numwait 
FROM 
    SUPPLIER, LINEITEM l1, ORDERS, NATION 
WHERE 
    s_suppkey = l1.l_suppkey and 
    o_orderkey = l1.l_orderkey and 
    o_orderstatus = 'F' and 
    l1.l_receiptdate > l1.l_commitdate and 
    exists ( 
        SELECT * 
        FROM LINEITEM l2 
        WHERE 
            l2.l_orderkey = l1.l_orderkey and 
            l2.l_suppkey <> l1.l_suppkey
    ) and 
    not exists (
        SELECT * 
        FROM LINEITEM l3 
        WHERE 
            l3.l_orderkey = l1.l_orderkey and 
            l3.l_suppkey <> l1.l_suppkey and 
            l3.l_receiptdate > l3.l_commitdate
    ) and 
    s_nationkey = n_nationkey and 
    n_name = 'EGYPT' 
GROUP BY 
    s_name
ORDER BY 
    numwait desc, s_name 
LIMIT 100;

I got:

> SELECT .....;
>   3m9.264s -------- first time after startup
> SELECT .....;
>   3m9.377s -------- second time after startup
> SELECT .....;
>   3m7.287s -------- third time after startup
>
> ------ This time, the difference do not exist.

MySQL version is 8.0.22
All configurations are default.

Why it is the case?

Thanks.

Best Answer

As others have mentioned in the comments, likely the performance improvement on subsequent runs that you saw in your first example is due to the data being cached in memory after your first run. There are multiple types of caches in MySQL, so even if you're not using the Query Cache, your MySQL instance uses the Table Cache and the buffer_pool which means the data from your query is actually cached in memory so that subsequent runs of that query don't need to fetch the data from disk again. This is beneficial if the size of the tables in your query are large.

The reason your second query didn't see much of a difference in performance on subsequent runs (though I would assume the raw table data was still cached) was because most of the runtime you're experiencing is probably due to the complexity of the query itself. Some of those complexities are from the EXISTS & NOT EXISTS that can be written in a more relationally efficient way, multiple inequality predicates, and the fact you're doing a COUNT() which may result in scans instead of seek operations, for example. The EXPLAIN ANALYZE / query plan for your second query would probably give more insight on specifically which part of it is the complexity that is causing most of the runtime.