SQL Server – Calculating the total row size of a table

size; sql-server; t-sql

I have a table 'Connections' with the following schema in SQL Server:

[ID] [int] IDENTITY(1,1) NOT NULL,
[User] [nvarchar](20) NOT NULL,
[UserID] [int] NULL,
[True] [bit] NOT NULL,
[Comment] [nvarchar](max) NOT NULL,
[Created] [datetime] NOT NULL,
[Flag] [bit] NULL,
[Destination] [varchar](20) NULL

and it has a clustered primary key index on the ID column.

I need to get the size generated by the entries in this table per month. I have searched for an existing stored procedure, function or DMV that could help with this, but I only found how to get the size of the whole table, not per row. I also need the total size of the rows per month, so I cannot simply take the total/minimum/maximum row size over the whole table (as suggested in other Stack Exchange posts).

My attempt is as follows:

USE DB1;
SELECT DATEPART(year,created),
       DATEPART(month,created),
       (count (*))*(4+2*ISNULL((max(len([User]))),2)+4+1+2*ISNULL((max(len([Comment]))),2)+8+1+ISNULL(max(len([Destination])),2)) 'BytesPerMonth'
FROM Connections
GROUP BY DATEPART(year,created),DATEPART(month,created)

In the above I multiplied the number of rows by the byte size of a row, assuming the following per-type sizes (a quick sanity check of these follows the list):

int - 4 bytes
nvarchar - 2 bytes per character ([nvarchar](max) also takes 2 bytes per character, the same as [nvarchar](40), correct?)
bit - 1 byte
datetime - 8 bytes
varchar - 1 byte per character
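
As a sanity check of these assumed per-type sizes (the literal values below are arbitrary and not taken from the table), DATALENGTH reports the number of bytes a value occupies:

SELECT DATALENGTH(CAST(1 AS int))                 AS IntBytes,         -- 4
       DATALENGTH(CAST(N'abcd' AS nvarchar(20)))  AS NvarcharBytes,    -- 2 per character = 8
       DATALENGTH(CAST(N'abcd' AS nvarchar(max))) AS NvarcharMaxBytes, -- also 2 per character = 8
       DATALENGTH(CAST('abcd' AS varchar(20)))    AS VarcharBytes,     -- 1 per character = 4
       DATALENGTH(CAST(1 AS bit))                 AS BitBytes,         -- reported as 1
       DATALENGTH(CAST(GETDATE() AS datetime))    AS DatetimeBytes;    -- 8

(On disk, up to eight bit columns actually share a single byte, even though DATALENGTH reports 1 for a bit value.)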

However, this only provides an estimate, because it takes the maximum length of each variable column and multiplies that maximum by the number of rows, which yields a much larger value than the actual size of the rows. Is there a way to get the actual size per row in this context?

Furthermore, I am aware of the row header, which adds another 4 bytes per row (I have not included it yet, since my result was already inflated by taking the maximum of the variable columns). I have also read that I should account for 3 bytes for the NULL bitmap and 8 bytes for the variable-length columns in my schema: are these counted per row or per column? And how can I calculate the size of the index?

Best Answer

You can use the DATALENGTH function to get the number of bytes stored in each column of a row, sum these up to get the total for the row, and then aggregate per month to get the SUM, AVG, MIN, MAX and so forth. Note that DATALENGTH returns NULL for a NULL value, so each call is wrapped in ISNULL(..., 0) below to keep rows containing NULL columns in the aggregates.

SELECT DATEPART(year,created) AS [Year],
       DATEPART(month,created) AS [Month],
       COUNT(*) AS Rows,
       -- ISNULL(DATALENGTH(...), 0) counts NULL columns as 0 bytes; otherwise rows containing a NULL would be skipped by the aggregates
       SUM(ISNULL(DATALENGTH([User]), 0) + ISNULL(DATALENGTH([UserID]), 0) + ISNULL(DATALENGTH([True]), 0) + ISNULL(DATALENGTH([Comment]), 0) + ISNULL(DATALENGTH([Created]), 0) + ISNULL(DATALENGTH([Flag]), 0) + ISNULL(DATALENGTH([Destination]), 0)) AS TotalRowSize,
       AVG(ISNULL(DATALENGTH([User]), 0) + ISNULL(DATALENGTH([UserID]), 0) + ISNULL(DATALENGTH([True]), 0) + ISNULL(DATALENGTH([Comment]), 0) + ISNULL(DATALENGTH([Created]), 0) + ISNULL(DATALENGTH([Flag]), 0) + ISNULL(DATALENGTH([Destination]), 0)) AS AvgRowSize,
       MIN(ISNULL(DATALENGTH([User]), 0) + ISNULL(DATALENGTH([UserID]), 0) + ISNULL(DATALENGTH([True]), 0) + ISNULL(DATALENGTH([Comment]), 0) + ISNULL(DATALENGTH([Created]), 0) + ISNULL(DATALENGTH([Flag]), 0) + ISNULL(DATALENGTH([Destination]), 0)) AS MinRowSize,
       MAX(ISNULL(DATALENGTH([User]), 0) + ISNULL(DATALENGTH([UserID]), 0) + ISNULL(DATALENGTH([True]), 0) + ISNULL(DATALENGTH([Comment]), 0) + ISNULL(DATALENGTH([Created]), 0) + ISNULL(DATALENGTH([Flag]), 0) + ISNULL(DATALENGTH([Destination]), 0)) AS MaxRowSize
FROM Connections
GROUP BY DATEPART(year,created),DATEPART(month,created)
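
A variant of the same query (not part of the original answer, just a sketch against the same table) computes the per-row byte count once in a CROSS APPLY, so the long expression is not repeated in every aggregate; NULL columns are again counted as 0 bytes:

SELECT DATEPART(year, c.created)  AS [Year],
       DATEPART(month, c.created) AS [Month],
       COUNT(*)                   AS [Rows],
       SUM(x.RowBytes)            AS TotalRowSize,
       AVG(x.RowBytes)            AS AvgRowSize,
       MIN(x.RowBytes)            AS MinRowSize,
       MAX(x.RowBytes)            AS MaxRowSize
FROM Connections AS c
-- x.RowBytes is the data size of one row in bytes
CROSS APPLY (SELECT ISNULL(DATALENGTH(c.[User]), 0)
                  + ISNULL(DATALENGTH(c.[UserID]), 0)
                  + ISNULL(DATALENGTH(c.[True]), 0)
                  + ISNULL(DATALENGTH(c.[Comment]), 0)
                  + ISNULL(DATALENGTH(c.[Created]), 0)
                  + ISNULL(DATALENGTH(c.[Flag]), 0)
                  + ISNULL(DATALENGTH(c.[Destination]), 0)) AS x (RowBytes)
GROUP BY DATEPART(year, c.created), DATEPART(month, c.created);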

The advantage of DATALENGTH is that it returns the actual bytes stored in variable-length columns, so there is no need to guess or estimate. See this db<>fiddle for a working example.
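
The DATALENGTH figures cover only the column data itself. If you also want the actual on-disk footprint, which already includes the 4-byte row header, the NULL bitmap, the variable-length offset array and the index pages, a sketch against the sys.dm_db_partition_stats DMV (assuming the table lives in the dbo schema) looks like this:

SELECT i.name                 AS IndexName,   -- the clustered primary key appears here
       ps.row_count           AS [RowCount],
       ps.used_page_count * 8 AS UsedKB       -- pages are 8 KB each
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id
   AND i.index_id  = ps.index_id
WHERE ps.object_id = OBJECT_ID('dbo.Connections');

This cannot be broken down per month, but it shows how far the per-column estimate is from the real storage used, and it also gives the size of each index on the table.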