Sql-server – Two databases of same size, backups sizes are very different

backupsql server

This is not a big problem for me, but still, I would like to understand what is happening here. I have two databases on my SQL Server 2008 (10.00.1600) – they are restored from the same backup (and after that some optimization work was done by our C# program – the process was the same for both DBs, + some small amount inserts/deletes was executed afterwards).

But when I take a backup of those databases the output file size differs:

Database1:    706 MB
Database2:   1690 MB

The query used to take both backups (script generated by SSMS):

BACKUP DATABASE [DatabaseX] TO DISK = N'D:\databaseX.bak' 
    WITH NOFORMAT, NOINIT, NAME = N'DatabaseX-Full', SKIP, NOREWIND, 
    NOUNLOAD, NO_COMPRESSION,  STATS = 10

Here are file stats of databases:

Database1: 
    Data allocated:       1950 MB
    Logs allocated:       5121 MB
    Output backup size:    706 MB
Database2:
    Data allocated:       1704 MB
    Logs allocated:         50 MB (this was truncated in the process of my investigation :))
    Output backup size:   1690 MB (NB! this is not true anymore, see update 3 below)

As you can see database 2 has 200 MB less data and still produces a much bigger backup file! Can anyone tell me why is this happening? Please do tell if some additional info is needed.

UPDATE 1

I got the data/log allocated sizes by executing the following query:

SELECT  d.name,
        mf.physical_name,
        round(mf.size * 8 / 1024, 0) Size_MBs
FROM    sys.master_files mf
INNER JOIN sys.databases d ON d.database_id = mf.database_id
WHERE   d.database_id > 4 -- Skip system databases
ORDER BY d.name

Ldf and mdf sizes were there (each DB has only two files – one mdf and one ldf). And if it matters both databases' recovery model is FULL.

UPDATE 2

Just to clarify – the operation after restoring databases involves all kind of stuff (adding/removing table fields, rebuilding indexes, adding/removing FKs and inserts/deletes). The final structure (tables, indexes and fks) are the same. The amount of data differs slightly.

UPDATE 3

Here the results of the query posted by gbn:

            Database 2          Database 1 
            Reserv  Used        Reserv  Used
------------------------------------------------------
table1      475,88  230,98      240,94  240,51
table2      229,73  230,27      188,74  188,44
table3      162,78  163,18      69,06   68,35
table4      69,12   69,06       68,87   68,61
table5      68,75   68,50       66,83   66,37
table6      67,69   66,93       21,08   20,55
table7      19,02   11,05       15,28   14,88
table8      17,81   17,52       4,98    4,55
table9      8,06    3,91        2,52    2,30
table10     4,23    3,90        1,13    1,09
table11     3,40    3,09        0,97    0,93
table12     1,44    0,54        0,90    0,67
table13     1,20    1,04        0,77    0,59
table14     0,98    0,94        0,64    0,52
table15     0,65    0,63        0,39    0,29
table16     0,58    0,58        0,11    0,06
table17     0,12    0,06        0,08    0,08
table18     0,08    0,08        0,05    0,05
table19     0,06    0,06        0,05    0,05
table20     0,06    0,06        0,05    0,05
table21     0,05    0,05        0,05    0,05
table22     0,05    0,05        0,03    0,03
table23     0,05    0,05        0,03    0,03
table24     0,03    0,03        0,03    0,03
table25     0,03    0,03        0,03    0,03
table26     0,03    0,03        0,03    0,03
table27     0,03    0,03        0,03    0,03
table28     0,03    0,03        0,02    0,02
table29     0,02    0,02        0,00    0,00
------------------------------------------------------
TOTAL       1131,95 872,71      683,69  679,16

UPDATE 4

After rebuilding all indexes on Database2 its output backup file size reduced to 1165 MB.

Best Answer

As noted in comments, the tables do not have clustered indexes therefore you can not logically defragment them (rebuild indexes) completely. That is, the indexes are defragged but not the actual data.

Also, DELETEs from heaps (a table without a clustered index) do not deallocate space. See http://msdn.microsoft.com/en-us/library/ms189245(v=sql.105).aspx for more

Finally, not having clustered indexes is generally a bad thing except for staging tables that are truncated. However, GUIDs are not good clustering keys

Solutions?