SQL Server – Fast reads without paying the price on writes

Tags: index, performance, sql-server

We have a large append-only table containing financial transactions. On average, 1,000 transactions are inserted per minute. Since there are now more and more use cases where we actually want to read, search, and aggregate these transactions, fast reads would be very nice.

We want to guarantee very fast writes, and adding indexes to cover some of the reads would slow down writes.

The good news is that we can afford stale data. I was considering copying the data to a read-optimized table (with indexes) every n minutes, which also allows for a periodic bulk copy (thus limiting the number of operations?).

I'm looking for an opinion on whether this is a valid strategy. If you consider it a decent strategy, do you have pointers on implementation options? If not, what are the alternatives?
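To make the strategy concrete, here is a minimal sketch of the periodic copy, assuming hypothetical names (`dbo.Transactions` as the append-only source, `dbo.Transactions_Read` as the indexed copy) and a monotonically increasing `TransactionId` as the high-water mark:

```sql
-- Hypothetical sketch of the "copy every n minutes" idea.
-- dbo.Transactions, dbo.Transactions_Read, and TransactionId are assumed names.
DECLARE @LastCopiedId bigint =
    (SELECT ISNULL(MAX(TransactionId), 0) FROM dbo.Transactions_Read);

-- TABLOCK lets the insert qualify for minimal logging under the right
-- recovery model, which is the point of batching the copy.
INSERT INTO dbo.Transactions_Read WITH (TABLOCK)
SELECT *
FROM dbo.Transactions
WHERE TransactionId > @LastCopiedId;
```

A SQL Server Agent job running this on an n-minute schedule would bound the staleness of the read table.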

Best Answer

For SQL Server 2014 and later, my recommendation is rather radical: switch to a clustered columnstore index. 1,000 records/min is well within the range of columnstore bulk-load capabilities, even on modest hardware. See Clustered Columnstore Index: Data Load Optimizations – Minimal Logging and SQL Server clustered columnstore Tuple Mover. The query performance of a clustered columnstore is quite staggering due to the inherent advantages of columnar storage and batch-mode processing, and even more so on time series (which your data probably is), due to very likely segment elimination.
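Converting the table is a one-statement operation; a sketch, again assuming a hypothetical `dbo.Transactions` table:

```sql
-- Sketch: replace the table's rowstore storage with a clustered columnstore.
-- dbo.Transactions is an assumed name. If the table already has a rowstore
-- clustered index, add WITH (DROP_EXISTING = ON) to swap it in one step.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_Transactions
ON dbo.Transactions;
```

Because segment elimination works off the min/max values stored per segment, loading the data in transaction-date order makes range predicates on the date skip whole segments.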

On SQL Server 2016 there are some specific improvements targeting your case, see Real-Time Operational Analytics Using In-Memory Technology and Speeding up Business Analytics Using In-Memory Technology.
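On 2016, one way to get that real-time operational analytics pattern is to keep the rowstore table for fast inserts and add an updatable nonclustered columnstore index for the analytical queries. A sketch, with assumed table and column names:

```sql
-- Sketch (SQL Server 2016+): analytics index on top of the OLTP rowstore.
-- Table and column names are assumptions. COMPRESSION_DELAY keeps freshly
-- inserted rows in the delta store for ~60 minutes before compression,
-- reducing interference with the hot insert path.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Transactions
ON dbo.Transactions (TransactionDate, AccountId, Amount)
WITH (COMPRESSION_DELAY = 60);
```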

For SQL Server 2012 and earlier my recommendation is to upgrade to 2014 or 2016.

In any case, I would shy away from Transactional Replication.

Also consider that log shipping or Always On readable secondaries can only offload the query processing, not the schema requirements (i.e., indexes). Any index required by queries on the replica would have to be created on the original database, and that price is paid at write time.

Of course, I assume you have done due diligence and your writes are already optimized, i.e., all the advice in The Data Loading Performance Guide is applied and your upload is bulk and minimally logged.
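For reference, a minimally logged load typically looks something like the following sketch (file path and table name are placeholders; the database must be in the SIMPLE or BULK_LOGGED recovery model for minimal logging):

```sql
-- Sketch: bulk, minimally logged ingest. TABLOCK is one of the
-- prerequisites for minimal logging described in the guide.
BULK INSERT dbo.Transactions
FROM 'C:\feeds\transactions.csv'
WITH (TABLOCK,
      FIELDTERMINATOR = ',',
      ROWTERMINATOR   = '\n');
```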