SQL Server – Storage Size of a Bigint Table

database-internalssql serversql-server-2008-r2storage

I am working with a table that has this format :

CREATE TABLE dbo.ID_STORE
( WORKING_ID bigint PRIMARY KEY CLUSTERED )

The table stores around 2 millions rows, but the stored ids are not consecutive, MAX(WORKING_ID)-MIN(WORKING_ID) is around 24 millions .

When I look at used space, I find around 57 Megabytes, when I expected slightly above 2 10^6 x 8 = 16 Megabytes. Can anyone explain the difference?

EDIT : these figures were obtained from the very first import into said table. This table is also TRUNCATEd before it is filled.

Best Answer

Each row has a minimum of 7 bytes of overhead when stored in FixedVar format (the default). There will also be a (typically relatively small) number of pages used for the upper levels of the clustered index. Optimally stored, and disregarding the upper index levels, 2 million rows would require just over:

(7 + 8 bytes) * 2,000,000 = 28.61MB.

More importantly, pages are likely to have split (unless the data was loaded in clustered key order), so the current pages will probably be less than 100% full. When a page is split, to accommodate new rows in key order, roughly 50% of the existing rows are moved to a new page, lowering the average density. Also, any deleted rows will only result in space being reclaimed if the whole page from which rows were removed becomes empty. In addition, each 8KB data page has a 96-byte header, and 2 bytes per row on the page for the row offset array.

The following example loads 2,000,000 rows with approximately the same distribution as your data, compacted as much as reasonably possible:

CREATE TABLE dbo.ID_STORE
( 
    WORKING_ID bigint NOT NULL PRIMARY KEY CLUSTERED
);

WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12
FROM Nums
WHERE Nums.n <= 2000000;

The output of:

EXECUTE sys.sp_spaceused 
    @objname = N'dbo.ID_STORE',
    @updateusage = 'true';

...shows 33,800 KB space reserved for this object.

Truncated and loaded so page splitting occurs:

TRUNCATE TABLE dbo.ID_STORE;

WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12
FROM Nums
WHERE Nums.n <= 1000000;

WITH
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  L5   AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L5)
INSERT dbo.ID_STORE WITH (TABLOCKX)
SELECT Nums.n * 12 + 1
FROM Nums
WHERE Nums.n <= 1000000;

The output of:

EXECUTE sys.sp_spaceused 
    @objname = N'dbo.ID_STORE',
    @updateusage = 'true';

...now shows 50,632 KB KB space reserved.

Rebuilding the clustered index:

ALTER INDEX ALL 
ON dbo.ID_STORE
REBUILD 
WITH 
(
    MAXDOP = 1, 
    SORT_IN_TEMPDB = ON
);

...reduced the space reserved to 33,800 KB again.

Depending on the version and edition of SQL Server you have, this table can be stored even more compactly using row or page compression, clustered columnstore, or clustered columnstore archival storage. In the latter case (SQL Server 2014 Enterprise required), the 2,000,000 rows reserve just 2,960 KB of space.