Sql-server – Delete millions of rows disabling Indexes

sql serversql server 2014

I have a table with millions of rows. I'm deleting where year(data) in (2013,2014,2015). It's taking more than 4hrs. I'm trying now to delete only year 2013 but it's still very very slow (every year has like more than 600.000 rows and it has a varbinary column to store .PDF files).

I know that indexes can decrease performance on deletes. It will be a good Idea to Disable both indexes in that table, and then re-enable them?

enter image description here

Can it be faster if I create a view, and delete from the view?


Edit:

I made this query to delete it by top 1:

delete top (1) from MyTable where year(data) in ( 2013)--,2014,2015)
while @@rowcount > 0
    begin
    delete top (1) from MyTable where year(data) in ( 2013)--,2014,2015)
    end

Best Answer

Depending on how interconnected this table is with others in your database, and assuming that the table only has data for 2013-present, the fastest solution might be:

  1. Script out your original table, and create a new empty table with the same structure.
  2. INSERT the data for 2016 and 2017 into the new table.
  3. DROP the original table.
  4. Rename the new table to the original table's name (see the docs for ALTER TABLE).

Deletes are generally enough slower than inserts that it's probably faster to copy out 25-30% of the records in the table than to delete 70-75% of them. However, of course, you need to have sufficient disk space to hold the duplicates of the data to be kept to be able to use this solution (as noted by Toby in the comments).

If you do this, you'll want to be absolutely certain that the new table ends up exactly like the original, including any indexes, triggers, etc. You might want to truncate the original, rename it, and keep it around for a while instead of deleting it, just to be sure there's nothing you've missed. Also, outside of any clustered index, you may want to add clustered indexes and triggers after you've inserted the 2016 and 2017 data. If triggers are involved at all, make sure that whatever you do leaves the rest of your data in a valid state.

If other tables reference your table in foreign key relationships (as suggested by Joe Obbish in the comments), then this becomes somewhat more complicated. I would recommend scripting out all the foreign keys that point to this table, removing them, and then recreating them after the new table has been renamed. See this link to an article by Aaron Bertrand for help with this.