Reducing Sysallocunits Contention in SQL Server

sql-server    sql-server-2012

I have a database which is being inserted into by a large number of independent sessions. Each session writes to a different table, so you would naively expect that they would all proceed in parallel. However, by inspecting sys.dm_os_wait_stats and sys.dm_os_latch_stats, I found that I was actually experiencing a high number of PAGELATCH_* waits on pages that DBCC PAGE reported as type 11, i.e. PFS pages.
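
One way to tie the PAGELATCH waits back to specific pages is sketched below; it uses sys.dm_os_waiting_tasks alongside the aggregate DMVs above (the literal page values are just examples):

-- Sessions currently waiting on page latches, and the page each one is waiting on.
-- resource_description is formatted as dbid:fileid:pageid, e.g. 15:1:476.
SELECT session_id, wait_type, wait_duration_ms, resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE 'PAGELATCH%';

-- Inspect one of the hot pages (trace flag 3604 sends DBCC output to the client).
DBCC TRACEON (3604);
DBCC PAGE (15, 1, 476, 0);   -- database id, file id, page id, print option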

So basically the whole process is being slowed down because each thread is contending with the others to allocate new pages.

The standard advice in this situation is to:

  • split the database into multiple, equally sized data files, so that allocations (and their PFS pages) are spread across files, and
  • enable trace flag 1118 so that new allocations use uniform rather than mixed extents.

I did both of these things and it did indeed reduce contention on PFS pages, but I'm now seeing a fair bit of PAGELATCH_EX contention on pages that DBCC PAGE tells me are data pages belonging to object ID 7, i.e. sysallocunits.

There isn't much information available online about the purpose of the sysallocunits system table, and I couldn't find anything about contention on it. It seems like I've just managed to replace my allocation bottleneck with a bottleneck when mutating this table!

I suppose I might be able to reduce contention by putting each table in its own database (which would then not share a sysallocunits table), but this would be a fairly major architectural change that I'd prefer to avoid.

Is there any way that I can reduce sysallocunits contention without creating extra databases?

Edit: as requested, this is the output of DBCC PAGE on one of the pages showing contention:

PAGE: (1:476)


BUFFER:


BUF @0x0000006143763A80

bpage = 0x0000004064D20000          bhash = 0x0000006301FCBAC0          bpageno = (1:476)
bdbid = 15                          breferences = 2047                  bcputicks = 48351
bsampleCount = 90                   bUse1 = 32392                       bstat = 0x10b
blog = 0x7a7a7a7a                   bnext = 0x0000000000000000          

PAGE HEADER:


Page @0x0000004064D20000

m_pageId = (1:476)                  m_headerVersion = 1                 m_type = 1
m_typeFlagBits = 0x0                m_level = 0                         m_flagBits = 0x0
m_objId (AllocUnitId.idObj) = 7     m_indexId (AllocUnitId.idInd) = 0   Metadata: AllocUnitId = 458752
Metadata: PartitionId = 458752      Metadata: IndexId = 1               Metadata: ObjectId = 7
m_prevPage = (1:475)                m_nextPage = (0:0)                  pminlen = 69
m_slotCnt = 34                      m_freeCnt = 5546                    m_freeData = 3089
m_reservedCnt = 0                   m_lsn = (18391:146685:11)           m_xactReserved = 0
m_xdesId = (0:588003648)            m_ghostRecCnt = 0                   m_tornBits = 208117691
DB Frag ID = 1                      

Allocation Status

GAM (1:2) = ALLOCATED               SGAM (1:3) = NOT ALLOCATED          PFS (1:1) = 0x40 ALLOCATED   0_PCT_FULL
DIFF (1:6) = CHANGED                ML (1:7) = NOT MIN_LOGGED           

Edit 2: a bit more detail on the actual insert process. Basically:

  • There are about 200 tables I'm ultimately trying to insert into (the tables are called things like dbo.LastTradePrice, dbo.Volume, etc). These tables all have clustered indexes on a (SecurityID, Date, FromSourceID) triple. The tables can have hundreds of millions of rows (see the sketch after this list).
  • I have 200 simultaneous connections to the SQL Server, each of which bulk inserts into a different staging table (stage.LastTradePrice, stage.Volume etc). These tables all have clustered indexes on an auto-incrementing primary key ID. Each staging table can hold up to 300,000 rows.
  • Then I have up to 32 connections which attempt to concurrently merge each staging table into the corresponding normal table. For each table, this merge process involves sorting the staging table (which I guess will spill to tempdb) and storing a subset of the sorted result in a table variable (another tempdb hit?). The data in the table variable is then used to update the main tables, a process which involves both INSERT and UPDATE operations.
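
For reference, a minimal sketch of the table shapes described above (the key column names match those mentioned; the data types and the Value payload column are assumptions):

-- Final table: clustered on the (SecurityID, Date, FromSourceID) triple.
CREATE TABLE dbo.LastTradePrice
(
    SecurityID   int           NOT NULL,
    Date         date          NOT NULL,
    FromSourceID int           NOT NULL,
    Value        decimal(18,6) NULL,      -- illustrative payload column
    CONSTRAINT PK_LastTradePrice
        PRIMARY KEY CLUSTERED (SecurityID, Date, FromSourceID)
);

-- Staging table: clustered on an auto-incrementing surrogate key.
CREATE TABLE stage.LastTradePrice
(
    ID           bigint IDENTITY(1,1) NOT NULL,
    SecurityID   int           NOT NULL,
    Date         date          NOT NULL,
    FromSourceID int           NOT NULL,
    Value        decimal(18,6) NULL,
    CONSTRAINT PK_stage_LastTradePrice
        PRIMARY KEY CLUSTERED (ID)
);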

Best Answer

With the workload description you have provided, you are putting a large amount of pressure on the allocation units, especially since you are trying to maintain a sorted load (with the IDENTITY column and clustered index). Keeping track of every allocation puts a lot of pressure on the allocation metadata, unless you are running minimally logged (bulk) loads of course.

Here are the potential solutions:

  • Load into a heap. Rebuild the index (if you need it) after the load is done. This also allows concurrent BULK INSERT into a single table (with TABLOCK). If you are on 2012, prefer a SEQUENCE with a cache over IDENTITY (IDENTITY generation can itself become a bottleneck)
  • Use trace flag 610 to enable minimally logged loads into clustered indexes (if you MUST use clustered indexes)
  • Put each table in a separate filegroup (this spreads out the allocations). Do this only as a last resort
  • If you plan to merge the staging tables into the final tables, why not merge directly without going through the intermediate step of table variables?
  • If you use a clustered index, can you make it sorted on the same keys that the merge happens on (removing the later sort in tempdb)?
  • When loading into a sorted table (as above), use the ORDER hint if the input is already sorted. This reduces contention on the allocation maps (and also allows minimally logged loads)
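
To make a few of these concrete, here is a rough T-SQL sketch of the heap load, the ORDER hint, and the direct merge. Table, column, and file names are illustrative (taken from the question where possible; the Value column and file paths are stand-ins):

-- 1) Bulk load into a heap with TABLOCK: the load can be minimally logged and
--    several BULK INSERT streams can target the same heap concurrently.
--    (Assumes stage.LastTradePrice is created as a heap for the load.)
BULK INSERT stage.LastTradePrice
FROM 'D:\feeds\LastTradePrice.dat'                 -- illustrative path
WITH (TABLOCK, BATCHSIZE = 300000);

-- 2) Build the clustered index only after the load has finished, using the
--    same keys the merge will join on.
CREATE CLUSTERED INDEX CIX_stage_LastTradePrice
    ON stage.LastTradePrice (SecurityID, Date, FromSourceID);

-- 3) If the input file is already sorted on the target clustered key, say so
--    with the ORDER hint; combined with TABLOCK (and trace flag 610 for a
--    clustered target) the engine can skip the sort and keep the load
--    minimally logged.
BULK INSERT dbo.LastTradePrice
FROM 'D:\feeds\LastTradePrice_sorted.dat'
WITH (TABLOCK, ORDER (SecurityID, Date, FromSourceID));

-- 4) On 2012, a SEQUENCE with a generous cache avoids the IDENTITY hot spot
--    if you still need a surrogate key somewhere.
CREATE SEQUENCE dbo.LoadRowSeq AS bigint START WITH 1 CACHE 10000;

-- 5) Merge the staging table straight into the final table, joining on the
--    clustered key, instead of staging a sorted subset in a table variable.
MERGE dbo.LastTradePrice AS tgt
USING stage.LastTradePrice AS src
    ON  tgt.SecurityID   = src.SecurityID
    AND tgt.Date         = src.Date
    AND tgt.FromSourceID = src.FromSourceID
WHEN MATCHED THEN
    UPDATE SET tgt.Value = src.Value
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SecurityID, Date, FromSourceID, Value)
    VALUES (src.SecurityID, src.Date, src.FromSourceID, src.Value);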