SQL Server – tempdb on SSD or virtual RAM disk to increase temp-table performance

performance, performance-tuning, sql-server, sql-server-2014, tempdb

Every month we run a large set of temp calculations that take 4 days in total. We can't realistically change that many scripts, so I am looking for a configuration/hardware solution.

Background

It's not an ETL process, because we are combining bits of information from at least 10 applications (so I would call it calculations, or lots of UPDATE statements). This combination of information is only one stage of a bigger process that finally saves the calculated data into a DWH with a normal ETL process (SSIS). If this calculation stage fails (e.g. a server crash), we can easily restart it.

Problem

The MSSQL server running this calculation has only two HDD RAID 1 arrays: one RAID 1 for data, indexes, and tempdb, and one for the log. The calculation produces so much I/O that the whole process takes days. The server has 180 GB of RAM, and in theory all the data and results of these calculations would fit in 5 GB of memory. This suggests we are using the database/resources in the wrong way.

Possible Solution

I think we first have to reduce the I/O, and then handle the I/O that remains necessary. BTW, the server runs SQL Server 2014 Standard Edition.

  1. Move tempdb to a virtual RAM disk (ca. 40 GB). That way we move the main I/O to memory, correct?
  2. Use more temp tables to reduce server logging and recovery time, and take advantage of the new RAM disk (is there really less logging?)
  3. Put the logs, indexes, and important tables on a new SSD RAID 5.
  4. Put all other data on the old HDD drives.
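For reference, relocating tempdb (whether to a RAM disk or an SSD) is done with `ALTER DATABASE ... MODIFY FILE`. In the sketch below, the drive letter `R:` is a placeholder for wherever the target volume is mounted, and the logical file names `tempdev`/`templog` are the installation defaults — check yours first with `sys.master_files`:

```sql
-- Check the current logical names and locations of the tempdb files
SELECT name, physical_name
FROM sys.master_files
WHERE database_id = DB_ID('tempdb');

-- Point tempdb at the new location (R:\ is a placeholder for your RAM disk / SSD).
-- The logical names tempdev/templog are the defaults; adjust if yours differ.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'R:\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'R:\templog.ldf');

-- The change takes effect after a service restart; tempdb is recreated
-- at the new location on startup.
```

Note that if a RAM disk is not recreated before the SQL Server service starts (e.g. after a reboot), the instance will fail to start because tempdb cannot be created.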

Is this a good solution, or would you suggest the standard approach of putting tempdb on the SSDs as well?

Best Answer

Before you commit to a particular course of action, you first need to understand what is causing the poor performance.

Look at wait statistics while the calculation process is underway. This article by Paul Randal is an excellent starting point on how to gather and analyze them.
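As a minimal sketch of that approach, you can query `sys.dm_os_wait_stats` and filter out a few benign background wait types — the exclusion list below is a small, incomplete sample; production scripts (like the one in the article above) filter out many more:

```sql
-- Top waits since the last restart (or since the stats were last cleared),
-- excluding a few benign background wait types.
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms,
    avg_wait_ms = wait_time_ms / NULLIF(waiting_tasks_count, 0)
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    'SLEEP_TASK', 'LAZYWRITER_SLEEP', 'BROKER_TASK_STOP',
    'XE_TIMER_EVENT', 'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH',
    'LOGMGR_QUEUE', 'DIRTY_PAGE_POLL'
)
ORDER BY wait_time_ms DESC;
```

To attribute waits specifically to the calculation window, snapshot this view before and after the run and diff the counters, the same technique the file-stats script uses.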

Use the system DMV sys.dm_io_virtual_file_stats to understand the I/O requirements of your process. I use this code, which shows the activity over a 10-minute period:

IF (COALESCE(OBJECT_ID('tempdb..#vfs_stats'), 0) = 0)
CREATE TABLE #vfs_stats
(
    run_num INT NOT NULL
    , database_id INT NOT NULL
    , file_id INT NOT NULL
    , sample_ms BIGINT NOT NULL
    , num_of_reads BIGINT NOT NULL
    , num_of_bytes_read BIGINT NOT NULL
    , io_stall_read_ms BIGINT NOT NULL
    , num_of_writes BIGINT NOT NULL
    , num_of_bytes_written BIGINT NOT NULL
    , io_stall_write_ms BIGINT NOT NULL
    , io_stall BIGINT NOT NULL
    , size_on_disk_bytes BIGINT NOT NULL
);

TRUNCATE TABLE #vfs_stats;

INSERT INTO #vfs_stats (run_num
    , database_id
    , file_id
    , sample_ms
    , num_of_reads
    , num_of_bytes_read
    , io_stall_read_ms
    , num_of_writes
    , num_of_bytes_written
    , io_stall_write_ms
    , io_stall
    , size_on_disk_bytes
)
SELECT 1
    , vfs.database_id
    , vfs.file_id
    , vfs.sample_ms
    , vfs.num_of_reads
    , vfs.num_of_bytes_read
    , vfs.io_stall_read_ms
    , vfs.num_of_writes
    , vfs.num_of_bytes_written
    , vfs.io_stall_write_ms
    , vfs.io_stall
    , vfs.size_on_disk_bytes
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs;

WAITFOR DELAY '00:10:00';

INSERT INTO #vfs_stats (run_num
    , database_id
    , file_id
    , sample_ms
    , num_of_reads
    , num_of_bytes_read
    , io_stall_read_ms
    , num_of_writes
    , num_of_bytes_written
    , io_stall_write_ms
    , io_stall
    , size_on_disk_bytes
)
SELECT 2
    , vfs.database_id
    , vfs.file_id
    , vfs.sample_ms
    , vfs.num_of_reads
    , vfs.num_of_bytes_read
    , vfs.io_stall_read_ms
    , vfs.num_of_writes
    , vfs.num_of_bytes_written
    , vfs.io_stall_write_ms
    , vfs.io_stall
    , vfs.size_on_disk_bytes
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs;

SELECT 
    DatabaseName = d.name
    , FileName = mf.name
    , SampleTime_ms =   s2.sample_ms            - s1.sample_ms
    , Reads =           s2.num_of_reads         - s1.num_of_reads
    , BytesRead =       s2.num_of_bytes_read    - s1.num_of_bytes_read
    , IOStallRead =     s2.io_stall_read_ms     - s1.io_stall_read_ms
    , Writes =          s2.num_of_writes        - s1.num_of_writes
    , BytesWritten =    s2.num_of_bytes_written - s1.num_of_bytes_written
    , IOStallWrite =    s2.io_stall_write_ms    - s1.io_stall_write_ms
    , IOStall =         s2.io_stall             - s1.io_stall
    , GrowthOnDisk =    s2.size_on_disk_bytes   - s1.size_on_disk_bytes
FROM #vfs_stats s1
    INNER JOIN #vfs_stats s2 ON s1.run_num = (s2.run_num - 1)
        AND s1.database_id = s2.database_id
        AND s1.file_id = s2.file_id
    INNER JOIN sys.databases d ON s1.database_id = d.database_id
    INNER JOIN sys.master_files mf ON s1.database_id = mf.database_id 
        AND s1.file_id = mf.file_id
ORDER BY d.name
    , mf.name;

You'll want to build a solid performance profile of the process to use as a baseline. Once you make changes, you can compare the new performance profile against the baseline to understand what impact each change had.

Without understanding the performance profile, you will be shooting in the dark by throwing hardware at the problem. However, that can be a reasonable strategy if you know you have underperforming hardware, such as data files on old 7,200 rpm spinning rust, or old Xeon processors. As a totally off-the-cuff recommendation, you could try putting all the databases (tempdb and the user database) onto a nice fast SSD, preferably a PCIe SSD, and making sure you have enough processor speed and memory bandwidth.