Sql-server – Diskspace usage of DB

disk-spacesql-server-2008

I have a production database that's consuming a significant amount more disk space than it should be.

Using sp_spaceused (only the interesting tables), I get.

Table Name    Reserved (KBs)    Data (KBs)    Index Size (KBs)
----------    --------------    ----------    ----------------
E              149600             142992         1136
H              184968             12104        162272
M               94104             29512         62480
T               25616             13544          9464
MH              16072             10640          5208
U             5399312           2984184       2414376
Total         5869672           3192976       2654936

However, the on disk size is a whopping 16 GB database and 6 GB log file.

The culprit is mainly Table U. Most of the data is very static with just lots of UPDATE to existing rows. Table U is the ony with loads of data being added constantly, and being archived weekly.

Table U's definition is,

[ID]      INT                         NOT NULL,
[RD]      DATETIME                    NOT NULL,
[UN]      CHAR (7)                    NOT NULL,
[A]       BIT                         NOT NULL,
[C]       BIT                         NOT NULL,
[S]       DATETIME                    NOT NULL,
[rowguid] UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL

It receives about 5 million rows per week, and archives another 5 million rows per week.

It has not been partitioned / placed into separate file groups.

The system has been arround for approximatley 6 months, and I don't believe our DBA team have ever shrunk it / defragged it!

Questions

Why is SQL using so much more disk space than sp_spaceused says?
Is the churn responsible for most of this disk space issue?
How should I use (SQL commands please) partitions and file groups to better optimise this churn?

Best Answer

Assuming that you did your homework correctly and these tables listed are indeed the way space is used (ie. there is no other table consuming the space) then what you have now is state and you ask about history. The state of the database now is a 16GB file with ~6Gb accounted for and 10Gb unexplained. How did it end up in this state? we cannot answer. There are a number of avenues that could lead up to this:

the database could had been created with a 16Gb data file size.
there could had been one or more big tables once that now are dropped
some operation needed temporary space and could had grown the database to 16Gb size and leave it so

You could consult the history captured in the default administrative trace and in the ERRORLOG and see if some of the info captured can shed light on the mystery.

If I'd venture a guess I would say you have a maintenance task of some sort, perhaps a table rebuild, that runs periodically. Rebuild operations require a great amount of data space as all the data exist for a while duplicated (old index being rebuild and new index being created) and, when the default OFFLINE option is used, they require a similarly great amount of log space as the entire operation is a single transaction. A log of 6Gb, data of approximately 6Gb size and a DB with +10Gb space would point toward that ~6Gb data being rebuild, maybe once manually, or maybe periodically by an index rebuild maintenance task.

Now about your real question: is there a better way to do this? Not an easy one. Such pattern of high churn with large range deletes is indeed very unhealthy. If you can prove that it causes real problems (sorry, but a 16Gb data file is not really a problem, and fragmentation is more often than not a red herring) then there are solutions, the most effective being using a partitioning sliding window for efficient deallocation via switch-out+truncate (The link contains the exact how-to). This is a heavy impact change that would ripple everywhere, including the application code, so do consider all the aspects. See How To Decide if You Should Use Table Partitioning.

Related Solutions

Sql-server – Can’t create indexes on really large table!

One thing you can try is to temporarily set the recovery model to Simple.

Alternatively, you can tell the create index command to use tempdb. Since tempdb is typically in the Simple recovery mode.

CREATE INDEX ...
WITH (SORT_IN_TEMPDB = ON);

SQL Server Space Usage – Analyze sys.allocation_units and sp_spaceused

Even though you fixed the immediate rounding issue, the overall algorithm to get per-object / index stats is incorrect. It does not properly handle LOB and row-overflow data. It also excludes: Indexed Views, FullText indexes, XML indexes, and a few other cases. Hence, you might not be seeing all of your data.

The following is an adaptation of the code I posted to an answer on StackOverflow ( sp_spaceused - How to measure the size in GB in all the tables in SQL ) that handles all of the cases that sp_spaceused handles. That S.O. question was only concerned with per-object stats, not per index, so I have adjusted the code to handle things at the index level.

;WITH agg AS
(   -- Get info for Tables, Indexed Views, etc
    SELECT  ps.[object_id] AS [ObjectID],
            ps.index_id AS [IndexID],
            NULL AS [ParentIndexID],
            NULL AS [PassThroughIndexName],
            NULL AS [PassThroughIndexType],
            SUM(ps.in_row_data_page_count) AS [InRowDataPageCount],
            SUM(ps.used_page_count) AS [UsedPageCount],
            SUM(ps.reserved_page_count) AS [ReservedPageCount],
            SUM(ps.row_count) AS [RowCount],
            SUM(ps.lob_used_page_count + ps.row_overflow_used_page_count)
                    AS [LobAndRowOverflowUsedPageCount]
    FROM    sys.dm_db_partition_stats ps
    GROUP BY    ps.[object_id],
                ps.[index_id]
    UNION ALL
    -- Get info for FullText indexes, XML indexes, Spatial indexes, etc
    SELECT  sit.[parent_id] AS [ObjectID],
            sit.[object_id] AS [IndexID],
            sit.[parent_minor_id] AS [ParentIndexID],
            sit.[name] AS [PassThroughIndexName],
            sit.[internal_type_desc] AS [PassThroughIndexType],
            0 AS [InRowDataPageCount],
            SUM(ps.used_page_count) AS [UsedPageCount],
            SUM(ps.reserved_page_count) AS [ReservedPageCount],
            0 AS [RowCount],
            0 AS [LobAndRowOverflowUsedPageCount]
    FROM    sys.dm_db_partition_stats ps
    INNER JOIN  sys.internal_tables sit
            ON  sit.[object_id] = ps.[object_id]
    WHERE   sit.internal_type IN
               (202, 204, 207, 211, 212, 213, 214, 215, 216, 221, 222, 236)
    GROUP BY    sit.[parent_id],
                sit.[object_id],
                sit.[parent_minor_id],
                sit.[name],
                sit.[internal_type_desc]
), spaceused AS
(
SELECT  agg.[ObjectID],
        agg.[IndexID],
        agg.[ParentIndexID],
        agg.[PassThroughIndexName],
        agg.[PassThroughIndexType],
        OBJECT_SCHEMA_NAME(agg.[ObjectID]) AS [SchemaName],
        OBJECT_NAME(agg.[ObjectID]) AS [TableName],
        SUM(CASE
                WHEN (agg.IndexID < 2) THEN agg.[RowCount]
                ELSE 0
            END) AS [Rows],
        SUM(agg.ReservedPageCount) * 8 AS [ReservedKB],
        SUM(agg.LobAndRowOverflowUsedPageCount +
            CASE
                WHEN (agg.IndexID < 2) THEN (agg.InRowDataPageCount)
                ELSE 0
            END) * 8 AS [DataKB],
        SUM(agg.UsedPageCount - agg.LobAndRowOverflowUsedPageCount -
            CASE
                WHEN (agg.IndexID < 2) THEN agg.InRowDataPageCount
                ELSE 0
            END) * 8 AS [IndexKB],
        SUM(agg.ReservedPageCount - agg.UsedPageCount) * 8 AS [UnusedKB],
        SUM(agg.UsedPageCount) * 8 AS [UsedKB]
FROM    agg
GROUP BY    agg.[ObjectID],
            agg.[IndexID],
            agg.[ParentIndexID],
            agg.[PassThroughIndexName],
            agg.[PassThroughIndexType],
            OBJECT_SCHEMA_NAME(agg.[ObjectID]),
            OBJECT_NAME(agg.[ObjectID])
)
SELECT sp.SchemaName,
       sp.TableName,
       sp.IndexID,
       CASE
         WHEN (sp.IndexID > 0) THEN COALESCE(si.[name], sp.[PassThroughIndexName])
         ELSE N'<Heap>'
       END AS [IndexName],
       sp.[PassThroughIndexName] AS [InternalTableName],
       sp.[Rows],
       sp.ReservedKB,
       (sp.ReservedKB / 1024.0 / 1024.0) AS [ReservedGB],
       sp.DataKB,
       (sp.DataKB / 1024.0 / 1024.0) AS [DataGB],
       sp.IndexKB,
       (sp.IndexKB / 1024.0 / 1024.0) AS [IndexGB],
       sp.UsedKB AS [UsedKB],
       (sp.UsedKB / 1024.0 / 1024.0) AS [UsedGB],
       sp.UnusedKB,
       (sp.UnusedKB / 1024.0 / 1024.0) AS [UnusedGB],
       so.[type_desc] AS [ObjectType],
       COALESCE(si.type_desc, sp.[PassThroughIndexType]) AS [IndexPrimaryType],
       sp.[PassThroughIndexType] AS [IndexSecondaryType],
       SCHEMA_ID(sp.[SchemaName]) AS [SchemaID],
       sp.ObjectID
       --,sp.ParentIndexID
FROM   spaceused sp
INNER JOIN sys.all_objects so -- in case "WHERE so.is_ms_shipped = 0" is removed
        ON so.[object_id] = sp.ObjectID
LEFT JOIN  sys.indexes si
       ON  si.[object_id] = sp.ObjectID
      AND  (si.[index_id] = sp.IndexID
         OR si.[index_id] = sp.[ParentIndexID])
WHERE so.is_ms_shipped = 0
--so.[name] LIKE N''  -- optional name filter
--ORDER BY ????

Best Answer

Related Solutions

Sql-server – Can’t create indexes on really large table!

SQL Server Space Usage – Analyze sys.allocation_units and sp_spaceused

Related Question