Sql-server – Microsoft SQL database table size

primary-keysize;sql server

I have come across a sizing issue with my database tables becoming too large. Some digging has shown that it is how MS SQL stores the information.

I created two test tables and filled them with data:

CREATE TABLE INVOICEDOCUMENTSTEST1 (
  INVOICEINDEXNUMBER numeric(10, 0) NOT NULL DEFAULT (0),
  CONSTRAINT PK_INVOICEDOCUMENTSTEST1 PRIMARY KEY CLUSTERED (INVOICEINDEXNUMBER))

DECLARE @i INT;
SELECT @i = 1;
SET NOCOUNT ON
WHILE @i <= 235605
BEGIN
  INSERT INTO INVOICEDOCUMENTSTEST1(INVOICEINDEXNUMBER) VALUES (@i);
  SELECT @i = @i + 1;
END;

CREATE TABLE INVOICEDOCUMENTSTEST2 (
  INVOICEINDEXNUMBER numeric(10, 0) NOT NULL DEFAULT (0),
  [INVOICETAXTYPE] [varchar](15) NULL,
  [INVOICETYPE] [varchar](20) NULL,
  [INVOICESTATUS] [varchar](25) NULL DEFAULT ('Entering'),
  [INVOICEDATE] [datetime] NULL,
  [INVOICETIME] [datetime] NULL,
  [CREATEDATE] [datetime] NULL,
  [CREATETIME] [datetime] NULL,
  [CREATEBY] [varchar](50) NULL,
  [POSTDATE] [datetime] NULL,
  [POSTTIME] [datetime] NULL,
  [POSTEDBY] [varchar](50) NULL,
  [JOBCODE] [varchar](30) NULL,
  [LINKCODEFORWARD] [varchar](30) NULL,
  [CUSTACCOUNTCODE] [varchar](50) NULL,
  [CUSTCOMPANYNAME] [varchar](50) NULL,
  [CUSTDELIVERYADDRESS1] [varchar](50) NULL,
  [CUSTDELIVERYADDRESS2] [varchar](50) NULL,
  [CUSTDELIVERYSUBURB] [varchar](50) NULL,
  [CUSTDELIVERYSTATE] [varchar](50) NULL,
  [CUSTDELIVERYPOSTCODE] [varchar](10) NULL,
  [CUSTDELIVERYCOUNTRY] [varchar](50) NULL,
  [CUSTORDERNUMBER] [varchar](20) NULL,
  [CUSTCONTACT] [varchar](20) NULL,
  [CUSTDEFAULTPRICECODE] [varchar](10) NULL DEFAULT ('Price01'),
  [CUSTCASHSALEACCOUNTONLY] [varchar](5) NULL,
  [CUSTANALYSISCODE] [varchar](50) NULL,
  [SALESPERSON] [varchar](50) NULL,
  [SALESPERSONANALYSISCODE] [varchar](50) NULL,
  [EXTAXINVOICE] [varchar](5) NULL DEFAULT ('FALSE'),
  [EXTAXNUMBER] [varchar](20) NULL,
  [COMMENTS1] [varchar](100) NULL,
  [COMMENTS2] [varchar](100) NULL,
  [COMMENTS3] [varchar](100) NULL,
  [COMMENTS4] [varchar](100) NULL,
  [COMMENTS5] [varchar](100) NULL,
  [CUSTINVOICEADDRESS1] [varchar](50) NULL,
  [CUSTINVOICEADDRESS2] [varchar](50) NULL,
  [CUSTINVOICESUBURB] [varchar](50) NULL,
  [CUSTINVOICESTATE] [varchar](50) NULL,
  [CUSTINVOICEPOSTCODE] [varchar](10) NULL,
  [CUSTINVOICECOUNTRY] [varchar](50) NULL,
  [COURIERCOMPANY] [varchar](50) NULL,
  [COURIERCONSIGNMENTNOTE] [varchar](50) NULL,
  [FREIGHTCOMPANY] [varchar](50) NULL,
  [PAYMENTTERMS] [varchar](50) NULL,
  [PAYMENTREQUIREDBY] [datetime] NULL,
  [REQUIREDBY] [datetime] NULL,
  [REQUIREDAFTER] [datetime] NULL,
  [SHIPPINGDESCRIPTION1] [varchar](50) NULL,
  [SHIPPINGDESCRIPTION2] [varchar](50) NULL,
  [SHIPPINGCARTONQUANTITY] [numeric](10, 0) NULL DEFAULT (0),
  [SHIPPINGDATE] [datetime] NULL,
  [SHIPPINGDGDECLARATION] [varchar](10) NULL DEFAULT ('Does not'),
  [SHIPPINGTIME] [datetime] NULL,
  [SHIPPINGUNITS] [varchar](20) NULL DEFAULT ('Cartons'),
  [NOTES] [ntext] NULL,
  [CUSTCUSTOMERSLEDGERCODE] [varchar](50) NULL,
  [CUSTACCRUALSLEDGERCODE] [varchar](50) NULL,
  [CUSTGLOBALDISCOUNT] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSUBTOTAL] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSUBTOTALCALCORD] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSUBTOTALCALCDEL] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALDISCOUNT] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALDISCOUNTCALCORD] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALDISCOUNTCALCDEL] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALFREIGHT] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALTAX] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALTAXCALCORD] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALTAXCALCDEL] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALROUNDINGAMOUNT] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALTOTAL] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALTOTALCALCORD] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALTOTALCALCDEL] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALPAID] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALCHANGE] [numeric](14, 4) NULL DEFAULT (0),
  [SUPPLYLOCATION] [varchar](10) NULL,
  [HASBEENPRINTED] [varchar](5) NULL DEFAULT ('FALSE'),
  [PRINTTHISTIME] [varchar](5) NULL DEFAULT ('FALSE'),
  [TRANSACTIONMODE] [varchar](10) NULL DEFAULT ('Online'),
  [PREVTOTALFREIGHT] [numeric](14, 4) NULL DEFAULT (0),
  [LINEROUND] [varchar](5) NULL DEFAULT ('TRUE'),
  [SOURCEREFERENCE] [varchar](30) NULL,
  [LINECOUNT] [numeric](10, 0) NULL DEFAULT (0),
  [SHIPPINGINSTRUCTIONS] [varchar](50) NULL,
  [SHIPPINGPACKEDBY] [varchar](50) NULL,
  [INVENTORYFREIGHTANALCODE] [varchar](50) NULL,
  [LINKCODEBACKWARD] [varchar](30) NULL,
  [PREVTOTALCHANGE] [numeric](14, 4) NULL DEFAULT (0),
  [USEGST] [varchar](5) NULL DEFAULT ('FALSE'),
  [TOTALFREIGHTGST] [numeric](14, 4) NULL DEFAULT (0),
  [PREVTOTALFREIGHTGST] [numeric](14, 4) NULL DEFAULT (0),
  [GSTINVOICETYPE] [varchar](20) NULL DEFAULT ('Tax Invoice'),
  [CUSTPHONENUMBER] [varchar](25) NULL,
  [CUSTFAXNUMBER] [varchar](25) NULL,
  [LASTEDITDATE] [datetime] NULL,
  [LASTEDITTIME] [datetime] NULL,
  [LASTEDITBY] [varchar](50) NULL,
  [CUSTGSTSALESTYPE] [varchar](10) NULL DEFAULT ('Including'),
  [GSTPRICING] [varchar](10) NULL,
  [GSTTRANSFER] [numeric](10, 0) NULL DEFAULT (0),
  [ORIGINALINVOICENUMBER] [varchar](20) NULL,
  [ORIGINALDOCUMENTTYPE] [varchar](30) NULL,
  [CUSTEMAILADDRESS] [varchar](200) NULL,
  [CUSTMOBILENUMBER] [varchar](25) NULL,
  [TOTALSURCHARGE] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSURCHARGECALCDEL] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSURCHARGECALCORD] [numeric](14, 4) NULL DEFAULT (0),
  [CUSTFREIGHTCHARGE] [numeric](14, 4) NULL DEFAULT (0),
  [CUSTFREIGHTFREETHRESHOLD] [numeric](14, 4) NULL DEFAULT (0),
  [CUSTSURCHARGECHARGE] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSURCHARGEGST] [numeric](14, 4) NULL DEFAULT (0),
  [PREVTOTALSURCHARGE] [numeric](14, 4) NULL DEFAULT (0),
  [PREVTOTALSURCHARGEGST] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSURCHARGEGSTCALCORD] [numeric](14, 4) NULL DEFAULT (0),
  [TOTALSURCHARGEGSTCALCDEL] [numeric](14, 4) NULL DEFAULT (0),
  [INVENTORYSURCHARGEANALCODE] [varchar](50) NULL,
  [INVOICEPRIORITY] [varchar](10) NULL DEFAULT ('Normal'),
  [SHIPPINGPACKERHOLDINGLOC] [varchar](250) NULL,
  [CUSTOMERID] [numeric](10, 0) NOT NULL DEFAULT ((0)),
  [DELIVERYINSTRUCTIONS] [ntext] NULL,
  [PAYMENTNOTES] [ntext] NULL,
  [SHIPPINGNOTES] [ntext] NULL,
  [FREIGHTANDSURCHARGEOLDPOST] [varchar](5) NULL DEFAULT ('FALSE'),
  [TOTALFREIGHTTHRESHOLDREMOVED] [numeric](14, 4) NULL DEFAULT ((0)),
  [CUSTGSTLEDGERCODE] [varchar](50) NULL,
  [SALESPERSONNAME] [varchar](200) NULL,
  [SHIPPINGHASBEENPICKED] [varchar](5) NULL DEFAULT ('FALSE'),
  [SHIPPINGHASBEENPACKED] [varchar](5) NULL DEFAULT ('FALSE'),
  [SHIPPINGHASBEENDELIVERED] [varchar](5) NULL DEFAULT ('FALSE'),
  [SHIPPINGPICKEDBY] [varchar](250) NULL,
  [SHIPPINGPICKDATE] [datetime] NULL,
  [SHIPPINGPICKTIME] [datetime] NULL,
  [SHIPPINGPACKDATE] [datetime] NULL,
  [SHIPPINGPACKTIME] [datetime] NULL,
  [SHIPPINGDELIVEREDBY] [varchar](250) NULL,
  [SHIPPINGDELIVERYDATE] [datetime] NULL,
  [SHIPPINGDELIVERYTIME] [datetime] NULL,
  [INVOICEPRIORITYVALUE] [numeric](10, 0) NULL DEFAULT ((3)),
  [SHIPPINGPACKER] [varchar](250) NULL DEFAULT ('ANYONE'),
  [SHIPPINGPICKER] [varchar](250) NULL DEFAULT ('ANYONE'),
  [SHIPPINGDELIVERYRUN] [varchar](250) NULL DEFAULT ('ANYONE'),
  [CUSTALLOWBACKORDERS] [varchar](5) NULL DEFAULT ('TRUE'),
  [QUOTECOMMENT] [varchar](100) NULL,
  [QUOTESTATUS] [varchar](20) NULL DEFAULT ('Open'),
  [QUOTESTATUSCOMMENT] [varchar](100) NULL,
  [CUSTALLOWSHORTSUPPLY] [varchar](10) NULL DEFAULT ('ALL'),
  [PICKINGINPROGRESSBY] [varchar](100) NULL,
  [PACKINGINPROGRESSBY] [varchar](100) NULL,
  [MANFIESTED] [varchar](5) NULL DEFAULT ('FALSE'),
  CONSTRAINT PK_INVOICEDOCUMENTSTEST2 PRIMARY KEY CLUSTERED (INVOICEINDEXNUMBER))

DECLARE @i INT;
SELECT @i = 1;
SET NOCOUNT ON
WHILE @i <= 235605
BEGIN
  INSERT INTO INVOICEDOCUMENTSTEST2(INVOICEINDEXNUMBER) VALUES (@i);
  SELECT @i = @i + 1;
END;

The first table is a mini version of the second, with only the primary index created.

I have inserted 235605 rows as that's what I have in the live database.

I have a query to get the table and index size:

SELECT
  T.NAME AS TABLENAME,
  '' AS INDEXNAME,
  P.ROWS AS ROWCOUNTS,
  SUM(A.TOTAL_PAGES) * 8 AS TOTALSPACEKB,
  SUM(A.USED_PAGES) * 8 AS USEDSPACEKB,
  (SUM(A.TOTAL_PAGES) - SUM(A.USED_PAGES)) * 8 AS UNUSEDSPACEKB
FROM
  SYS.TABLES T
INNER JOIN
  SYS.INDEXES I ON T.OBJECT_ID = I.OBJECT_ID
INNER JOIN
  SYS.PARTITIONS P ON I.OBJECT_ID = P.OBJECT_ID AND I.INDEX_ID = P.INDEX_ID
INNER JOIN
  SYS.ALLOCATION_UNITS A ON P.PARTITION_ID = A.CONTAINER_ID
LEFT OUTER JOIN
  SYS.SCHEMAS S ON T.SCHEMA_ID = S.SCHEMA_ID
WHERE
  T.NAME NOT LIKE 'dt%'
  AND T.IS_MS_SHIPPED = 0
  AND I.OBJECT_ID > 255
GROUP BY
  T.NAME, S.NAME, P.ROWS

UNION ALL

SELECT
  OBJECT_NAME(I.OBJECT_ID) AS TABLENAME,
  I.NAME AS INDEXNAME,
  0 AS ROWCOUNTS,
  8 * SUM(A.USED_PAGES) AS TOTALSPACEKB,
  8 * SUM(A.USED_PAGES) AS USEDSPACEKB,
  0 AS UNUSEDSPACEKB
FROM
  SYS.INDEXES AS I JOIN
  SYS.PARTITIONS AS P ON P.OBJECT_ID = I.OBJECT_ID AND P.INDEX_ID = I.INDEX_ID JOIN
  SYS.ALLOCATION_UNITS AS A ON A.CONTAINER_ID = P.PARTITION_ID
WHERE
  OBJECT_NAME(I.OBJECT_ID) NOT LIKE 'sys%%'
  AND OBJECT_NAME(I.OBJECT_ID) NOT LIKE 'dt%%'
  AND OBJECT_NAME(I.OBJECT_ID) NOT LIKE 'filestream%%'
  AND OBJECT_NAME(I.OBJECT_ID) NOT LIKE 'filetable%%'
  AND OBJECT_NAME(I.OBJECT_ID) NOT LIKE 'plan_%%'
  AND OBJECT_NAME(I.OBJECT_ID) NOT LIKE 'queue_%%'
  AND OBJECT_NAME(I.OBJECT_ID) NOT LIKE 'sqlagent_%%'
GROUP BY
  I.OBJECT_ID,
  I.INDEX_ID,
  I.NAME

ORDER BY 1, 2

When I run the size query across these two tables, I get the following:

TableName             indexname                 RowCounts   TotalSpaceKB
INVOICEDOCUMENTSTEST1                             235605         7560
INVOICEDOCUMENTSTEST1 PK_INVOICEDOCUMENTSTEST1         0         7504
INVOICEDOCUMENTSTEST2                             235605       210584
INVOICEDOCUMENTSTEST2 PK_INVOICEDOCUMENTSTEST2         0       210496

This shows that the row data itself only takes up a fraction of the data. The index on only that field takes approx 7.5mb. However with those extra fields on the table, it expands the primary key out to 210mb, even though all those fields are blank and the same field is the only one in the primary key.

This appears to be an impact of how MS SQL want to access the data. If the primary key on Table 2 is created as 'nonclustered' then the index is smaller but there is a 'hidden' index created that is the same size as a clustered index anyway.

I have found that with the size of some of my tables, that even though the primary key is on a BCD file, the indexes are increased the database to 2-3x the size of the data.

In one case it's pushing the database over the 10gb Express limit so it's causing a financial problem as well.

Is there any other way to structure the table so that you don't give up so much space to the index?

Best Answer

SQL is pre-allocating the space for the fixed-length datatypes in your table (datetime 8b per field per row, numeric(10/14, x) 9b per field per row), and possibly 'stubs' for the variable length datatypes. It doesn't matter if they're null or not. This is by design so that the rows are inserted and updated efficiently. The varchar fields with defaults will also have space allocated to them appropriate to the size of the default. It also means your table is going to get a LOT bigger once you start populating the currently-null varchar fields.

A clustered index and the 'hidden' index you refer to (also known as a heap) aren't actually indexes as such; they're actually the table itself, and so will take up the space required to store the table data.

To try and compact your data, you can try to use some smaller datatypes in places. For example, date and time in place of datetime where appropriate, smaller precision numeric fields (numeric(9, x) and lower is 5 bytes), using integer types tinyint/smallint/int instead of numeric(10,0) for whole numbers, and using a bit field to represent true/false instead of varchar(5) (I count 11 true/false fields, which can be covered by 2b worth of bit fields instead of the 66-77b they currently occupy).

Beyond that, there appears to be a lot of normalisation that can be done on the table which could reduce the table size; e.g. can you combine invoice date & invoice time into a single datetime or datetime2 field? Address, staff/user details, comments and many others can be pushed out into lookup tables and a foreign key added to this table. There also seems to be a lot of calculated values which may be better created as calculated columns (or the logic moved into the app if there is one) rather than stored as an actual field.

Finally, you can look into sparse columns, where null values do not take up any space. However looking at the table I don't think they'll help much, as it looks like only a low number of columns will actually contain null data in normal use.