SQL Server – Using a table variable instead of a temp table makes query execution slow

performance, query-performance, sql server, sql-server-2008, temporary-tables

I have a table, AutoData, with historical data about cars. It has a composite clustered key of Cas (DateTime) + GCom (car ID). Each record contains various indicators, such as fuel level, vehicle state, etc.

Intervals between individual records for one car in the AutoData table are irregular: sometimes 120 seconds, sometimes a few seconds, sometimes hours. I need to normalize the records for viewing so that the output shows one record for every 30 seconds.

I have the following script:

DECLARE @GCom int = 2563,
    @Od DateTime2(0) = '20170210', 
    @Do DateTime2(0) = '20170224'    

--Create a table with intervals by 30 seconds
declare @temp Table ([cas] datetime2(0))
INSERT @temp([cas])
SELECT d
FROM
(
  SELECT
      d = DATEADD(SECOND, (rn - 1)*30, @Od)
  FROM 
  (
      SELECT TOP (DATEDIFF(MINUTE, @Od, @Do)*2) 
          rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
      FROM
          sys.all_objects AS s1
      CROSS JOIN
          sys.all_objects AS s2
      ORDER BY
          s1.[object_id]
  ) AS x
) AS y;

--Create temp table
CREATE TABLE #AutoData (
    [Cas] [datetime2](0) NOT NULL PRIMARY KEY,
    [IDProvozniRezim] [tinyint] NOT NULL,
    [IDRidic] [smallint] NULL,
    [Stav] [tinyint] NOT NULL,
    [Klicek] [bit] NOT NULL,
    [Alarm] [bit] NOT NULL,
    [MAlarm] [tinyint] NOT NULL,
    [DAlarm] [bit] NOT NULL,
    [Bypass] [bit] NOT NULL,
    [Lat] [real] NULL,
    [Lon] [real] NULL,
    [ObjemAktualni] [real] NOT NULL,
    [RychlostMaxV1] [real] NOT NULL,
    [RychlostV2] [real] NOT NULL,
    [Otacky] [smallint] NOT NULL,
    [Nadspotreba] [real] NOT NULL,
    [Vzdalenost] [real] NOT NULL,
    [Motor] [smallint] NOT NULL
)

--Populate the temp table selecting only relevant AutoData records
INSERT INTO #AutoData
SELECT [Cas]
      ,[IDProvozniRezim]
      ,[IDRidic]
      ,[Stav]
      ,[Klicek]
      ,[Alarm]
      ,[MAlarm]
      ,[DAlarm]
      ,[Bypass]
      ,[Lat]
      ,[Lon]
      ,[ObjemAktualni]
      ,[RychlostMaxV1]
      ,[RychlostV2]
      ,[Otacky]
      ,[Nadspotreba]
      ,[Vzdalenost]
      ,[Motor]
FROM AutoData a 
WHERE a.GCom = @GCom AND a.cas BETWEEN @Od AND @do

--Select final data
SELECT t.cas, ad.malarm, ad.IDProvoznirezim, ad.Otacky, ad.motor, ad.objemAktualni, ad.Nadspotreba 
FROM @temp t
OUTER APPLY (
SELECT TOP 1 stav, malarm, otacky,motor, objemAktualni, Nadspotreba, IDProvoznirezim  FROM #AutoData a
                     WHERE DATEDIFF(SECOND, a.cas, t.cas)<=CASE WHEN Motor>120 THEN Motor ELSE 120 END 
                     AND DATEDIFF(SECOND,  a.cas, t.cas)>-30 
                     ORDER BY CASE WHEN DATEDIFF(SECOND, a.cas, t.cas)>0 THEN DATEDIFF(SECOND, a.cas, t.cas) ELSE (DATEDIFF(SECOND, a.cas, t.cas)*-1) +120 END
) ad

DROP TABLE #AutoData

At first I tried to write the script with only the one table variable @temp, placing the condition WHERE a.GCom = @GCom AND a.cas BETWEEN @Od AND @do in the final SELECT. The script took 39 seconds to execute.

When I used the #AutoData temp table to preload the data subset, as shown in the script above, execution time dropped to 5 seconds.

Then I tried a table variable @AutoData instead of #AutoData, but it again took much longer: 22 seconds.

The @temp table has 40320 records and the #AutoData table has 1904 records in this example. Surprisingly, just using a #temp table instead of the @temp variable made the execution slow again.

I was surprised to see such differences between using and not using a temp table or table variable. Apparently SQL Server could not optimize the inside of the OUTER APPLY clause by itself.

But why is there such a big difference between table variables and temp tables?
Is there a way to know which one to use, other than just trying it?

Execution plan with temp table #AutoData:

https://www.brentozar.com/pastetheplan/?id=B1y2x2Zcg

Execution plan with variable @AutoData:

https://www.brentozar.com/pastetheplan/?id=r1rAZnbqx

Best Answer

The key is in this part of your question:

@temp table has 40320 records

In the execution plan, hover your mouse over the @temp table's scan. Compare the estimated number of rows versus the actual number of rows. (If you'd like to post the plan at http://PasteThePlan.com, we can give you more specific details. Disclaimer: that's my company's site.)

You're going to see that the estimated number of rows is really low.

SQL Server estimates that 1-3 rows will come back from a table variable (depending on your version of SQL Server, cardinality estimator, trace flags, etc.). This in turn gives you a really bad execution plan, because SQL Server underestimates how much work it will need to do against the other tables, how much memory to set aside, and so on.
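If you would rather see the mismatch in isolation, a small repro along these lines may help. It is only a sketch: the table variable @t and its column n are made-up names, not part of the question's script. SET STATISTICS XML ON returns the actual plan with each statement, so the scan of @t shows estimated and actual rows side by side. On SQL Server 2019 and later at compatibility level 150, table variable deferred compilation can make the two numbers match, so expect the gap mainly on older versions such as 2008.

```sql
-- Hypothetical repro: @t and n are illustrative names, not from the question.
SET STATISTICS XML ON;   -- return the actual execution plan with each statement

DECLARE @t TABLE (n int NOT NULL);

-- Load as many rows as the @temp table in the question (40320).
INSERT @t (n)
SELECT TOP (40320) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2;

-- In the plan for this statement, compare estimated vs. actual rows
-- on the scan of @t.
SELECT COUNT(*) AS row_count
FROM @t;

SET STATISTICS XML OFF;
```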

Here are the two most popular ways to get a more accurate estimate:

Try a temp table instead (and look at estimated vs actual rows in the plan)
Use OPTION (RECOMPILE) on your query - which will get you a much more precise estimate, but with some very big drawbacks around plan cache visibility and CPU usage
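As a rough illustration of the second option, here is the shape of the final SELECT with the hint appended. The tables below are cut-down stand-ins with only a few of the question's columns and a couple of invented sample rows, so treat it as a sketch under those assumptions rather than a drop-in replacement. With the hint, the statement is compiled after @temp has been filled, so the optimizer can use its real row count.

```sql
-- Simplified stand-ins for the question's objects (only the columns used here).
DECLARE @temp TABLE (cas datetime2(0) NOT NULL);

CREATE TABLE #AutoData (
    Cas    datetime2(0) NOT NULL PRIMARY KEY,
    MAlarm tinyint      NOT NULL,
    Motor  smallint     NOT NULL
);

-- Invented sample rows, just so the statement has something to read.
INSERT @temp (cas) VALUES ('2017-02-10T00:00:00'), ('2017-02-10T00:00:30');
INSERT #AutoData (Cas, MAlarm, Motor) VALUES ('2017-02-10T00:00:10', 0, 60);

-- Same shape as the question's final SELECT, plus the statement-level hint.
SELECT t.cas, ad.MAlarm, ad.Motor
FROM @temp AS t
OUTER APPLY (
    SELECT TOP (1) a.MAlarm, a.Motor
    FROM #AutoData AS a
    WHERE DATEDIFF(SECOND, a.Cas, t.cas) <= CASE WHEN a.Motor > 120 THEN a.Motor ELSE 120 END
      AND DATEDIFF(SECOND, a.Cas, t.cas) > -30
    ORDER BY CASE WHEN DATEDIFF(SECOND, a.Cas, t.cas) > 0
                  THEN DATEDIFF(SECOND, a.Cas, t.cas)
                  ELSE (DATEDIFF(SECOND, a.Cas, t.cas) * -1) + 120
             END
) AS ad
OPTION (RECOMPILE);   -- compile this statement at execution time

DROP TABLE #AutoData;
```

The hint gives the optimizer the table variable's row count, but table variables still carry no column statistics, so it is worth checking estimated vs. actual rows in the resulting plan as well.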

To see me doing it live, watch the 1-hour Watch Brent Tune Queries (disclaimer: that's me, linking to a video of me) where I take a Stack Overflow query that uses a table variable, and tune it live in front of an audience at SQL Rally Norway.

YOUR QUERY

SELECT post.postid, post.attach FROM newbb_innopost AS post WHERE post.threadid = 51506;

At first glance, that query should only touch 1.1597% (62510 out of 5390146) of the table. It should be fast given the key distribution of threadid 51506.

REALITY CHECK

No matter which version of MySQL (Oracle, Percona, MariaDB) you use, none of them can escape the one enemy they all have in common: the InnoDB architecture.

InnoDB Architecture

CLUSTERED INDEX

Please keep in mind that each threadid entry has a primary key attached. This means that when you read from the index, it must do a primary key lookup within the ClusteredIndex (internally named gen_clust_index). In the ClusteredIndex, each InnoDB page contains both data and PRIMARY KEY index info. See my post Best of MyISAM and InnoDB for more info.
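To check how the query is actually resolved on your server, EXPLAIN is a reasonable first step. This is just the query from above with EXPLAIN in front; the output depends on your data and indexes, so none is shown here.

```sql
-- Show the chosen index (key), the estimated row count (rows),
-- and whether the index alone covers the query (Extra: "Using index").
EXPLAIN
SELECT post.postid, post.attach
FROM newbb_innopost AS post
WHERE post.threadid = 51506;
```

If Extra does not show "Using index", each matching index entry needs a further lookup in the clustered index to fetch the remaining columns, which is the cost described above.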

REDUNDANT INDEXES

You have a lot of clutter in the table because some indexes have the same leading columns. MySQL and InnoDB have to navigate through the index clutter to get to the needed BTREE nodes. You should reduce that clutter by running the following:

ALTER TABLE newbb_innopost
    DROP INDEX threadid,
    DROP INDEX threadid_2,
    DROP INDEX threadid_visible_dateline,
    ADD INDEX threadid_visible_dateline_index (`threadid`,`visible`,`dateline`,`userid`)
;

Why strip down these indexes?

The first three indexes start with threadid
threadid_2 and threadid_visible_dateline start with the same three columns
threadid_visible_dateline does not need postid since it's the PRIMARY KEY and it's embedded
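Before dropping anything, it may be worth listing each index with its columns in order, so the overlapping leading columns are visible at a glance. A minimal sketch, assuming the table lives in the currently selected database:

```sql
-- One row per index, with its columns in index order.
-- Indexes whose column lists share the same leading columns sort together.
SELECT INDEX_NAME,
       GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS index_columns
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'newbb_innopost'
GROUP BY INDEX_NAME
ORDER BY index_columns;
```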

BUFFER CACHING

The InnoDB Buffer Pool caches data and index pages. MyISAM only caches index pages.

Just in this area alone, MyISAM does not waste time caching data. That's because it's not designed to cache data. InnoDB caches every data page and index page (and its grandmother) it touches. If your InnoDB Buffer Pool is too small, you could be caching pages, invalidating pages, and removing pages all in one query.
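A quick way to get a feel for whether the Buffer Pool is too small is to compare logical read requests with the reads that had to go to disk. This is a rough check, not a full diagnosis:

```sql
-- Configured size of the InnoDB Buffer Pool, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Logical read requests served by InnoDB.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';

-- Reads that could not be satisfied from the Buffer Pool and went to disk.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';
```

If Innodb_buffer_pool_reads grows quickly relative to Innodb_buffer_pool_read_requests while the query runs, pages are probably being read in and evicted repeatedly.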

TABLE LAYOUT

You could shave off some space from each row by reconsidering importthreadid and importpostid. You have them as BIGINTs. Together they take up 16 bytes in the ClusteredIndex per row.

You should run this

SELECT importthreadid,importpostid FROM newbb_innopost PROCEDURE ANALYSE();

This will recommend what data types these columns should be for the given dataset.
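PROCEDURE ANALYSE is not available in MySQL 8.0, so on servers without it a plain range check is one possible substitute. The column aliases below are made up for readability:

```sql
-- Find the actual value range of both columns.
SELECT MIN(importthreadid) AS min_thread,
       MAX(importthreadid) AS max_thread,
       MIN(importpostid)   AS min_post,
       MAX(importpostid)   AS max_post
FROM newbb_innopost;
```

If every value fits within 4294967295 and none is negative, INT UNSIGNED (4 bytes) would be enough; signed INT tops out at 2147483647.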

CONCLUSION

MyISAM has a lot less to contend with than InnoDB, especially in the area of caching.

While you revealed the amount of RAM (32GB) and the version of MySQL (Server version: 10.0.12-MariaDB-1~trusty-wsrep-log mariadb.org binary distribution, wsrep_25.10.r4002), there are still other pieces to this puzzle you have not revealed:

The InnoDB settings
The Number of Cores
Other settings from my.cnf

If you can add these things to the question, I can further elaborate.

UPDATE 2014-08-28 11:27 EDT

You should increase threading

innodb_read_io_threads = 64
innodb_write_io_threads = 16
innodb_log_buffer_size = 256M

I would consider disabling the query cache (See my recent post Why query_cache_type is disabled by default start from MySQL 5.6?)

query_cache_size = 0

I would preserve the Buffer Pool

innodb_buffer_pool_dump_at_shutdown=1
innodb_buffer_pool_load_at_startup=1

Increase purge threads (if you do DML on multiple tables)

innodb_purge_threads = 4
