SQL Server – Fast reads without paying the price on writes

Tags: index, performance, sql-server

We have a large append-only table containing financial transactions. On average, 1,000 transactions are inserted per minute. Since there are now more and more use cases where we actually want to read, search, and aggregate these transactions, fast reads would be very nice.

We want to guarantee very fast writes, and adding indexes to cover some of the reads would slow down writes.

The good news is that we can afford stale data. I was considering copying the data to a read-optimized table (with indexes) every n minutes, which also allows for a periodic bulk copy (thus limiting the number of operations?).

I'm looking for an opinion on whether this is a valid strategy. If you consider it a decent strategy, do you have pointers on implementation options? If not, what are the alternatives?
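To make the strategy concrete, here is a minimal sketch of the periodic copy, assuming hypothetical names (`dbo.Transactions` as the append-only source, `dbo.Transactions_Read` as the indexed copy) and a monotonically increasing `TransactionId` as the high-water mark:

```sql
-- Hypothetical sketch of the "copy every n minutes" idea.
-- dbo.Transactions, dbo.Transactions_Read, and TransactionId are assumed names.
DECLARE @LastCopiedId bigint =
    (SELECT ISNULL(MAX(TransactionId), 0) FROM dbo.Transactions_Read);

-- TABLOCK lets the insert qualify for minimal logging under the right
-- recovery model, which is the point of batching the copy.
INSERT INTO dbo.Transactions_Read WITH (TABLOCK)
SELECT *
FROM dbo.Transactions
WHERE TransactionId > @LastCopiedId;
```

A SQL Server Agent job running this on an n-minute schedule would bound the staleness of the read table.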

Best Answer

For SQL Server 2014 and later, my recommendation is rather radical: switch to a clustered columnstore index. 1,000 records/min is well within the range of columnstore bulk-load capabilities, even on modest hardware. See Clustered Columnstore Index: Data Load Optimizations – Minimal Logging and SQL Server clustered columnstore Tuple Mover. The query performance of a clustered columnstore is quite staggering due to the inherent advantages of columnar storage and batch-mode processing, and even more so on time series (which your data probably is), due to very likely segment elimination.
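Converting the table is a one-statement operation; a sketch, again assuming a hypothetical `dbo.Transactions` table:

```sql
-- Sketch: replace the table's rowstore storage with a clustered columnstore.
-- dbo.Transactions is an assumed name. If the table already has a rowstore
-- clustered index, add WITH (DROP_EXISTING = ON) to swap it in one step.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Transactions
ON dbo.Transactions;
```

Because segment elimination works off the min/max values stored per segment, loading the data in transaction-date order makes range predicates on the date skip whole segments.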

On SQL Server 2016 there are some specific improvements targeting your case, see Real-Time Operational Analytics Using In-Memory Technology and Speeding up Business Analytics Using In-Memory Technology.
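On 2016, one way to get that real-time operational analytics pattern is to keep the rowstore table for fast inserts and add an updatable nonclustered columnstore index for the analytical queries. A sketch, with assumed table and column names:

```sql
-- Sketch (SQL Server 2016+): analytics index on top of the OLTP rowstore.
-- Table and column names are assumptions. COMPRESSION_DELAY keeps freshly
-- inserted rows in the delta store for ~60 minutes before compression,
-- reducing interference with the hot insert path.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Transactions
ON dbo.Transactions (TransactionDate, AccountId, Amount)
WITH (COMPRESSION_DELAY = 60);
```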

For SQL Server 2012 and earlier my recommendation is to upgrade to 2014 or 2016.

In any case, I would shy away from Transactional Replication.

Also consider that log shipping or Always On readable secondaries can only offload the query processing, not the schema requirements (i.e., indexes). Any index required by queries on the replica would have to be created on the original database, and that price is paid at write time.

Of course, I assume you have done due diligence and your writes are already optimized, i.e., all the advice in The Data Loading Performance Guide is applied and your upload is bulk and minimally logged.
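For reference, a minimally logged load typically looks something like the following sketch (file path and table name are placeholders; the database must be in the SIMPLE or BULK_LOGGED recovery model for minimal logging):

```sql
-- Sketch: bulk, minimally logged ingest. TABLOCK is one of the
-- prerequisites for minimal logging described in the guide.
BULK INSERT dbo.Transactions
FROM 'C:\feeds\transactions.csv'
WITH (TABLOCK,
      FIELDTERMINATOR = ',',
      ROWTERMINATOR   = '\n');
```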