Please look at the following view of the InnoDB/XtraDB Architecture, and note what the Percona Server documentation says about it:
Performance Impact of the Doublewrite Buffer
In usual workloads the performance impact is low, 5% or so. As a
consequence, you should always enable the doublewrite buffer, because
the strong guarantee against data corruption is worth the small
performance drop.
But if you experience a heavy workload, especially if your data does
not fit in the buffer pool, the writes to the doublewrite buffer will
compete with the random reads for disk access. In this case you can
see a sharp performance drop compared to the same workload without
the doublewrite buffer; a 30% performance degradation is not uncommon.
Another case in which you can see a big performance impact is when the
doublewrite buffer is full. New writes must then wait until entries in
the doublewrite buffer are freed.
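If you suspect either situation, the doublewrite status counters give a quick read on how hard the mechanism is working. A minimal check (these are standard InnoDB status variables):

    SHOW GLOBAL STATUS LIKE 'Innodb_dblwr%';
    -- Innodb_dblwr_pages_written / Innodb_dblwr_writes is the average number
    -- of pages per doublewrite batch; a ratio approaching the 64-page batch
    -- maximum suggests the buffer is flushing large batches back to back.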
How to Choose a Good Location for the Doublewrite Buffer
Basically, if you want to improve I/O activity, you will put the
doublewrite buffer on a different disk. But is it better on an SSD or
on a more traditional HDD? First, you should note that pages are written
to the doublewrite buffer in a circular fashion and are only read on
recovery, so the doublewrite buffer performs mostly sequential writes
and a few sequential reads. Second, HDDs are very good at sequential
writes when a write cache is enabled, which is not the case for SSDs.
Therefore you should choose a fast HDD if you want to see performance
benefits from this option. For instance, you could place the redo logs
(also written sequentially) and the doublewrite buffer on the
same disk.
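For Percona Server / XtraDB specifically, the location is controlled by the innodb_doublewrite_file startup option. A minimal my.cnf sketch; the path here is only an example, so point it at whatever dedicated fast HDD you choose:

    [mysqld]
    # Example path only; put the file on the fast HDD (e.g. next to the redo logs)
    innodb_doublewrite_file = /mnt/hdd1/ib_doublewrite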
What can we learn from this?
- The doublewrite buffer is a circular buffer; when it is full, new writes must wait for entries to be freed.
- The fact that the doublewrite buffer can get full, and makes writes wait rather than expanding, shows that unbounded growth is never an issue for it.
CAVEAT
Looking at the InnoDB/XtraDB Architecture, it says that the Undo Logs can grow without limit. That is the source of uncontrolled growth in the system tablespace.
Your Actual Question
How big can innodb_doublewrite_file grow? How do you calculate the optimal disk size if it would be used only for this file?
Nothing in the documentation states a way to set a size or describes its limits. It only recommends placing the doublewrite file on a fast HDD.
CONJECTURE (OK I am guessing at this point)
If there is no limit on the doublewrite buffer size, then the maximum size would be the number of 16 KB blocks that make up all the data and index pages. You would run this query:
    SELECT ibbytes InnoDBSpace,
           ibbytes/POWER(1024,1) InnoDBSpaceKB,
           ibbytes/POWER(1024,2) InnoDBSpaceMB,
           ibbytes/POWER(1024,3) InnoDBSpaceGB,
           ibbytes/POWER(2,14)   InnoDBBlocks  -- POWER(2,14) = 16384, the default InnoDB page size
    FROM (SELECT SUM(data_length+index_length) ibbytes
          FROM information_schema.tables WHERE engine = 'InnoDB') A;
These numbers assume that every block of data and index for InnoDB has a place in the doublewrite buffer to go to.
A more realistic number may be this:
    SELECT ibbytes InnoDBSpace,
           ibbytes/POWER(1024,1) InnoDBSpaceKB,
           ibbytes/POWER(1024,2) InnoDBSpaceMB,
           ibbytes/POWER(1024,3) InnoDBSpaceGB,
           ibbytes/POWER(2,14)   InnoDBBlocks
    FROM (SELECT (variable_value * POWER(2,14)) ibbytes
          FROM information_schema.global_status
          WHERE variable_name = 'Innodb_buffer_pool_pages_data') A;
These numbers go by the actual working set in the InnoDB Buffer Pool.
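Note that both queries hard-code POWER(2,14) = 16384, the default InnoDB page size. If you run a non-default page size (possible in MySQL/Percona Server 5.6 and later), check it first and substitute accordingly:

    SELECT @@innodb_page_size;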
I hope this little explanation helped you!!!
File writes are distributed across files in the same filegroup proportionally, based on the current size of each file in the filegroup. This is referred to as the "proportional fill algorithm"; see http://sqlserver-performance-tuning.net/?p=2552 for some interesting details around that.
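As a side note, this is why the usual advice is to keep the files in a filegroup the same size, so proportional fill spreads writes evenly. A sketch of evening them out; the logical file names are examples, so check sys.master_files for yours:

    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 8192MB);
    ALTER DATABASE tempdb MODIFY FILE (NAME = temp2,   SIZE = 8192MB);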
tempdb can have only a single filegroup. If you attempt to create a filegroup in tempdb, you get the following:
Msg 1826, Level 16, State 1, Line 1
User-defined filegroups are not allowed on "tempdb".
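For example, a statement like this is what triggers that error (FG1 is just a placeholder name):

    ALTER DATABASE tempdb ADD FILEGROUP FG1;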
If you have trace flag 1117 turned on, all files in a filegroup autogrow simultaneously; this applies to each file that is not currently at its maximum size and for which space exists on the disk.
Does your instance have trace flag 1117 turned on? You'd probably want it off in this particular instance even though "best practices" often indicate having this turned on. There is an item on Microsoft Connect asking to have a setting like this that can be enabled/disabled on a per-database basis, here: https://connect.microsoft.com/SQLServer/feedback/details/781198/trace-flag-1117-autogrowth-of-data-files-is-instance-wide-would-like-a-flag-for-just-tempdb
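If you're not sure, you can check the flag's global status like so:

    DBCC TRACESTATUS (1117, -1);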
Assuming the SSDs are devoted to tempdb, I agree with your assertion and suggest making the tempdb files on the SSDs as big as possible (not 100% of the drive; perhaps leave 10% free). Make the tempdb files on the HDDs as small as possible, say 1MB, with autogrowth enabled and the maximum file size as big as you need. Monitor for tempdb file growth on the HDDs, and make the case for getting bigger SSDs if you think the company would benefit from it.
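As a sketch of that setup (the logical file name and the limits here are illustrative, not prescriptive):

    ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev_hdd, SIZE = 1MB, FILEGROWTH = 256MB, MAXSIZE = 100GB);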
According to the documentation, SQL Server files can be created on raw partitions (partitions that haven't been formatted) by using just the drive letter of the partition in the ALTER DATABASE ... ADD FILE syntax. This apparently removes the need to grow or shrink the file, since it inherently uses the entire raw partition as needed. I'm uncertain whether this would help your situation or not; I just thought I'd throw it out there as an interesting factoid. See: "If the file is on a raw partition, os_file_name must specify only the drive letter of an existing raw partition. Only one file can be put on each raw partition."
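Going by that documentation, adding a raw-partition file would look something like this (E: being a placeholder drive letter for an existing raw partition, and the logical name being illustrative):

    ALTER DATABASE tempdb ADD FILE (NAME = tempdev_raw, FILENAME = 'E:');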
Best Answer
Could you not simply run sp_spaceused (or a cut-down version, as you probably don't need both result sets) on tempdb regularly and log the results? Obviously this will likely result in an underestimate, because it is unlikely that peak space use will fall exactly on one of your samples, so you will need to inflate the maximum figures. Perhaps you could use this method to decide a good starting point for the "shrink and watch if it grows back a bit" experiment that you describe.
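If capturing sp_spaceused's two result sets into a table proves awkward (INSERT ... EXEC needs a single result-set shape), the same numbers for tempdb can be sampled from the sys.dm_db_file_space_usage DMV instead. A minimal sketch of the logging idea, with illustrative table and column names:

    USE tempdb;
    CREATE TABLE dbo.tempdb_space_log
    (
        sample_time    datetime2     NOT NULL DEFAULT SYSDATETIME(),
        allocated_mb   decimal(18,2) NOT NULL,
        unallocated_mb decimal(18,2) NOT NULL
    );

    -- Run this on a schedule (e.g. from a SQL Agent job); 128 8 KB pages = 1 MB.
    INSERT INTO dbo.tempdb_space_log (allocated_mb, unallocated_mb)
    SELECT SUM(allocated_extent_page_count)   / 128.0,
           SUM(unallocated_extent_page_count) / 128.0
    FROM   tempdb.sys.dm_db_file_space_usage;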