SQL Server – Best approach to have a live copy of a table in the same database

sql-server, sql-server-2008-r2

I have a table where 15,000 to 20,000 rows are being inserted every hour.

The table schema is something like:

ColumnName   DataType
ID           BIGINT IDENTITY(1,1)
Column1      INT
Column2      INT
Column3      DATETIME DEFAULT GETDATE()
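
In T-SQL that would be something like this (a sketch; the table name dbo.Readings is a placeholder, and the NOT NULL constraints are my assumption):

    -- Placeholder DDL for the table described above.
    CREATE TABLE dbo.Readings
    (
        ID      BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        Column1 INT NOT NULL,
        Column2 INT NOT NULL,
        Column3 DATETIME NOT NULL DEFAULT (GETDATE())
    );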

No row is ever updated or deleted. As you can imagine, the table grows huge in a very short time.

The table is queried from time to time as well: SELECT statements filtering on the DATETIME column, and sometimes on the INT columns too.

To my knowledge, I have two options:

  1. Index the columns (the DATETIME column and some INT columns) to get better performance on SELECT statements, at the cost of a performance hit on my INSERTs (a sketch of such indexes follows this list);

OR

  2. Keep indexing to a minimum (the clustered primary key index and a DATETIME index), keeping the inserts very fast, at the cost of a performance hit on SELECTs.
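
For example, option 1 could be something like this (index names are placeholders; the INCLUDE list assumes the SELECTs only need these columns):

    -- Supports range filters on the DATETIME column; covers the INT
    -- columns so those SELECTs avoid key lookups.
    CREATE NONCLUSTERED INDEX IX_Readings_Column3
        ON dbo.Readings (Column3)
        INCLUDE (Column1, Column2);

    -- Supports queries that filter on an INT column first.
    CREATE NONCLUSTERED INDEX IX_Readings_Column1
        ON dbo.Readings (Column1, Column3);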

Another option that was suggested to me was to create another table, populate it from the current table, and index the living hell out of it to support all the possible SELECT queries, then read data from this duplicated table.

The original table stays as it is, with minimal indexing, for quick inserts.

Since I am duplicating the data, I know this option violates the basic rules of normalization, but it sounds like a good option for keeping both inserts and reads as fast as possible.

The problem is how to maintain this near-real-time copy of the table inside the same database.

I do not want to use any AFTER INSERT triggers, as they would end up firing 15,000 to 20,000 times an hour.
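
One alternative I can think of is a SQL Server Agent job that runs every minute or so and copies new rows in batches, using the ever-increasing IDENTITY column as a watermark (a sketch; dbo.Readings_Report is a placeholder for the duplicated table, assumed to have the same columns but no IDENTITY property):

    -- Copy everything inserted since the last run into the indexed copy.
    DECLARE @LastID BIGINT;

    SELECT @LastID = COALESCE(MAX(ID), 0)
    FROM dbo.Readings_Report;

    INSERT INTO dbo.Readings_Report (ID, Column1, Column2, Column3)
    SELECT ID, Column1, Column2, Column3
    FROM dbo.Readings
    WHERE ID > @LastID;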

What other options do I have to keep this near-real-time copy of the table in the same database? Or maybe another approach altogether; any suggestions or pointers in the right direction are much appreciated.

Best Answer

20,000/60 = 333 records per minute is not a rate where you need to worry about inserts, even if they arrive in bursts. Of course it depends on your hardware, but since your table grows rather huge, you need something big anyway. So I would definitely create one or two indexes (you can omit the clustered one) to speed up your queries.

At a higher insert rate you would probably go with partitioning and split PAGELATCH contention between partitions, or use a variation of a GUID clustered primary key. There is a lot of argument on this point, and you may want to check Thomas Kejser's blog post on the subject.
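
A rough sketch of date-based partitioning (all names and boundary values here are only examples; the table's clustered index would have to be created on the partition scheme):

    -- Monthly partitions on the DATETIME column; add boundaries over time.
    CREATE PARTITION FUNCTION pfReadingsByMonth (DATETIME)
        AS RANGE RIGHT FOR VALUES ('20130101', '20130201', '20130301');

    CREATE PARTITION SCHEME psReadingsByMonth
        AS PARTITION pfReadingsByMonth ALL TO ([PRIMARY]);

    -- e.g. CREATE TABLE / CREATE CLUSTERED INDEX ... ON psReadingsByMonth (Column3);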

Also, there is no shame in going below 3NF; people create highly denormalized data marts just for reporting purposes. Since you have Enterprise Edition, you could use CDC (Change Data Capture) and an SSIS package to update the reporting table whenever you need fresh data.
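
Enabling CDC on the table takes two calls (table name as in the placeholder DDL above; CDC requires the SQL Server Agent to be running):

    -- Enable Change Data Capture for the database, then for the table.
    EXEC sys.sp_cdc_enable_db;

    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'Readings',  -- placeholder name
        @role_name     = NULL;         -- no gating role

    -- The SSIS package can then read inserts from the generated change table:
    -- SELECT * FROM cdc.dbo_Readings_CT WHERE __$operation = 2;  -- 2 = insert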