T-SQL – Table architecture for many inserts per second in SQL Server 2019

availability-groups, filegroups, partitioning, t-sql

We are developing a project which will process around 51,000,000 records daily (around 500-600 inserts per second). We chose an Always On solution which will contain:

  1. One Primary Server
  2. Three replicas which have to work in synchronous mode

The main table of this project will contain 45 columns with several data types. As I wrote above, we will have to insert around 500-600 rows per second into this table (independently; that is, a procedure insertMainTable which inserts one row into this table will be executed 500-600 times per second).

To achieve this throughput we decided to create this table as a partitioned table. Our idea is to create 32 partitions based on a smallint column in this table. For this solution we plan to create 16 filegroups with 32 files (so one filegroup will contain 2 files). This way every partition will be in a different file, and every 2 files will be on a different filegroup. I tested this solution and posted the results below:
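The layout described above can be sketched in T-SQL roughly as follows. All object and filegroup names (pfMainTable, psMainTable, FG01-FG16) are hypothetical; the filegroups must already exist, and note that SQL Server places partitions on filegroups, not on individual files within them:

```sql
-- Hypothetical names throughout; FG01..FG16 are assumed to already exist.
-- 31 boundary values produce 32 partitions on the smallint key.
CREATE PARTITION FUNCTION pfMainTable (smallint)
AS RANGE RIGHT FOR VALUES
    ( 1,  2,  3,  4,  5,  6,  7,  8,
      9, 10, 11, 12, 13, 14, 15, 16,
     17, 18, 19, 20, 21, 22, 23, 24,
     25, 26, 27, 28, 29, 30, 31);

-- Map the 32 partitions onto 16 filegroups, two partitions per filegroup.
CREATE PARTITION SCHEME psMainTable
AS PARTITION pfMainTable
TO (FG01, FG01, FG02, FG02, FG03, FG03, FG04, FG04,
    FG05, FG05, FG06, FG06, FG07, FG07, FG08, FG08,
    FG09, FG09, FG10, FG10, FG11, FG11, FG12, FG12,
    FG13, FG13, FG14, FG14, FG15, FG15, FG16, FG16);

-- Place the table on the scheme via its smallint partitioning column.
CREATE TABLE dbo.MainTable
(
    PartitionKey smallint NOT NULL,
    Id           bigint IDENTITY(1,1) NOT NULL
    -- ... remaining columns ...
) ON psMainTable (PartitionKey);
```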

https://stackoverflow.com/questions/60339635/sql-server-2019-always-on-performance

My question is:

Do you think this kind of solution is sufficient, or too complicated compared to the goals? What would you recommend?

Best Answer

I have some experience with high throughput tables, partitioning and Availability Groups, so I feel confident enough to post this as an answer, although the best I can really do is raise points for you to consider.

First - your 3 synchronous replicas are going to be the slowest link in your chain. Every commit is going to have to wait for all three to confirm before the client receives the commit acknowledgment. I don't know what else to tell you here except: make sure you are on a fast network and fast disks.
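For reference, the commit behavior is controlled per replica; a sketch with hypothetical availability group and server names showing how one replica could be taken out of the synchronous-commit wait:

```sql
-- Hypothetical AG and replica names. Each SYNCHRONOUS_COMMIT replica
-- must harden the log before the primary acknowledges the commit;
-- switching a replica to ASYNCHRONOUS_COMMIT removes it from that wait
-- (at the cost of possible data loss on failover to that replica).
ALTER AVAILABILITY GROUP MainAG
MODIFY REPLICA ON N'ReplicaServer3'
WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);
```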

Second - I don't think you need the multiple files to make your solution fast, and they may actually slow things down. If you move items between partitions that are in the same filegroup, that's fairly fast and metadata-only. However, if each partition has its own file/filegroup and you move things around, the system has to pick that data up and physically move it instead of just updating metadata.
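For example, partition switching - the metadata-only way to move a partition's rows into another table - requires the source partition and the target table to be on the same filegroup. A sketch with hypothetical table names:

```sql
-- Hypothetical tables. dbo.MainTable_Staging must have an identical
-- structure and, crucially, must reside on the same filegroup as
-- partition 5 of dbo.MainTable for the switch to succeed as a
-- metadata-only operation.
ALTER TABLE dbo.MainTable
    SWITCH PARTITION 5 TO dbo.MainTable_Staging;
```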

Third - What is the goal/point of your partitioning? You should be able to run many hundreds of inserts per second against a single, unpartitioned table. And SQL Server 2019 has improvements for highly concurrent inserts that avoid the last-page hot spot issue. Have you tested it without all the partitioning overhead?
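The SQL Server 2019 improvement alluded to here is the OPTIMIZE_FOR_SEQUENTIAL_KEY index option, which reduces last-page insert contention on an ever-increasing key. A minimal sketch with hypothetical names:

```sql
-- Hypothetical table. OPTIMIZE_FOR_SEQUENTIAL_KEY (new in SQL Server 2019)
-- reduces last-page ("hot spot") contention under highly concurrent
-- inserts into an ever-increasing clustered key.
CREATE TABLE dbo.MainTableUnpartitioned
(
    Id        bigint IDENTITY(1,1) NOT NULL,
    CreatedAt datetime2 NOT NULL DEFAULT SYSUTCDATETIME(),
    Payload   varchar(200) NULL,
    CONSTRAINT PK_MainTableUnpartitioned
        PRIMARY KEY CLUSTERED (Id)
        WITH (OPTIMIZE_FOR_SEQUENTIAL_KEY = ON)
);
```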

Are you going to be updating this data? Are you going to be purging the data later? If you are purging the data later then http://kejser.org/table-pattern-rotating-log-ring-buffer/ may be of use to you. I've implemented that pattern in my production environment to great success.

What are the datatypes for this 45-column table? If you are using VARCHAR(MAX), XML or other large datatypes then there may not be much that can be done without re-architecting the table.

Partitioning isn't really great at speeding up table access. Its main benefit is allowing rapid maintenance activities, such as quickly loading new data or discarding old data.
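For example, discarding old data can be done per partition almost instantly on SQL Server 2016 and later (table name and partition numbers below are hypothetical):

```sql
-- Removes all rows in partitions 1 through 4 as a minimally logged,
-- near-instant operation; no row-by-row DELETE required.
TRUNCATE TABLE dbo.MainTable WITH (PARTITIONS (1 TO 4));
```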