SQL Server – ETL process causes database to rapidly grow – SHRINKFILE reduces size from 140 GB to 2 GB – keeps growing back

Tags: database-size, shrink, sql-server, sql-server-2012, truncate

I have a database (SQL Server 2012) that is used mainly for ETL of data. Every hour, SSIS packages run that truncate/delete and reload full datasets. The database keeps growing even though the data is wiped out and replaced.

I ran DBCC SHRINKFILE and the database went from 140 GB down to 2 GB. As the days pass, the database grows by 2+ GB a day.

I am trying to figure out how to prevent the database from growing so rapidly.

Questions

  • Should I run the truncate command or the delete command? I tried to use truncate, but I was having issues getting a lock on the tables, so I switched to a delete.
  • Is it better if I completely drop the table and recreate the object each time instead? Would this help with the database growth?
  • I have read that "shrinking" is not a good thing, but sometimes it is necessary. Would setting up a SHRINKFILE command to run weekly during off hours solve my problem?

Best Answer


First things first: are you using the full recovery model without scheduling log backups? When are your log backups scheduled?

If yes, either change your recovery model to simple (`ALTER DATABASE ... SET RECOVERY SIMPLE`) or start taking log backups (more frequently). Consider the RPO of your database in this decision.
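Either option is a one-line change. A sketch, assuming a database named `ETL_DB` and a backup path that you would replace with your own:

```sql
-- Option 1: switch to the simple recovery model. Log space is reused
-- after each checkpoint, but you lose point-in-time restore.
ALTER DATABASE ETL_DB SET RECOVERY SIMPLE;

-- Option 2: stay in full recovery and take frequent log backups so the
-- log can be truncated and reused. Path and file name are placeholders.
BACKUP LOG ETL_DB TO DISK = N'X:\Backups\ETL_DB_log.trn';
```

Option 2 would normally run on a schedule (e.g. an Agent job every 15 minutes) sized to your RPO.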

If not, then the delete is probably increasing your database's log file size, as it is a fully logged operation.

The truncate table is a metadata-only operation and should be near-instantaneous even for huge tables (as long as it is not being blocked).

MS Docs on TRUNCATE TABLE

Removes all rows from a table or specified partitions of a table, without logging the individual row deletions. TRUNCATE TABLE is similar to the DELETE statement with no WHERE clause; however, TRUNCATE TABLE is faster and uses fewer system and transaction log resources.

&

When the DELETE statement is executed using a row lock, each row in the table is locked for deletion. TRUNCATE TABLE always locks the table (including a schema (SCH-M) lock) and page but not each row.

When running TRUNCATE TABLE, the table will be locked, including with a schema modification (SCH-M) lock. Nonetheless, you should use truncate if you are deleting all the data in the table.
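One way to keep a blocked truncate from stalling the load window indefinitely is to cap how long it waits for the SCH-M lock. A sketch, with a hypothetical table name `dbo.StagingData`:

```sql
-- Fail fast instead of queueing behind readers: give up after 5 seconds.
SET LOCK_TIMEOUT 5000;

BEGIN TRY
    TRUNCATE TABLE dbo.StagingData;  -- metadata-only, but needs SCH-M
END TRY
BEGIN CATCH
    -- Error 1222 = "Lock request time out period exceeded".
    IF ERROR_NUMBER() = 1222
        PRINT 'Truncate blocked by a reader; retry on the next run.';
    ELSE
        THROW;
END CATCH;
```

This makes the blocking visible and bounded rather than letting the package hang behind a long Power BI refresh.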

Why and what is locking the table when running the truncate?

EDIT

Recovery model = simple. The truncate gets blocked mainly by users that have Power BI data refreshes set up, or users running SQL queries against it (via Power BI, Excel, etc.). Also, I have a small window of time to get the refresh completed, so any locks were causing too much of a delay.

If you want to keep the delete, you should look into deleting in batches. This way, the log will not grow as much when using the simple recovery model.
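A minimal sketch of a batched delete, again assuming a hypothetical table `dbo.StagingData` (tune the batch size to your row width and window):

```sql
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (10000) FROM dbo.StagingData;
    SET @rows = @@ROWCOUNT;

    -- Under the simple recovery model, each committed batch lets log
    -- space be reused at the next checkpoint instead of forcing growth.
    CHECKPOINT;
END;
```

Each iteration is its own transaction, so the log only needs room for one batch at a time rather than for the entire table's worth of deletes.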

If you don't want to do that, you will have to live with the increased size of the database log file. If the log grows back every week, shrinking it serves no purpose.

The log file increases some, but not like the .mdf file, which grows to 140 GB+ and shrinks down to 2 GB. Is the log file the same as the .ldf file?

Correct, the log file corresponds to the .ldf file. If the .mdf file is the one growing, the next step is to look at the data and data types.

If the data is deleted and reinserted every load, the size should stay roughly the same. If it does not, adding the data types, table definitions, and row counts to the question would be helpful.
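As a starting point, a diagnostic sketch that compares reserved vs. used space per table, to see where the .mdf growth is actually going (object names come from your own database):

```sql
-- Pages are 8 KB; reserved minus used hints at internal free space
-- left behind by repeated delete/reload cycles.
SELECT
    s.name AS schema_name,
    t.name AS table_name,
    SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb,
    SUM(ps.used_page_count)     * 8 / 1024 AS used_mb,
    -- Count rows only from the heap/clustered index (index_id 0 or 1),
    -- otherwise nonclustered indexes inflate the total.
    SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END)
        AS row_cnt
FROM sys.dm_db_partition_stats AS ps
JOIN sys.tables  AS t ON t.object_id = ps.object_id
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
GROUP BY s.name, t.name
ORDER BY reserved_mb DESC;
```

Running this right after a load, and again just before the next one, shows which tables are holding the space the shrink keeps reclaiming.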