SQL Server – Massive MSreplcommands Table Cleanup Job Not Deleting Commands

replicationsql server

I have a runaway distribution database that is 640gb in size. I have narrowed it down to the msreplcommands table which is holding 2 billion rows.

I have checked all publications and found 2 of them having immediate sync set to 1. I changed this to 0. Reran the cleanup job and it deleted about 240K rows only. Ran this:

SELECT *  from MSrepl_transactions (nolock) where entry_time < getdate() -5;

Found entries as old as 30 days.

The problem is I am running out of space very fast and all delete attempts lock the msreplcommands table and thus lock replication. We are a 24/7 shop and I have no server downtime possible. Is there another way to delete this amount of data without locking replication?

Hi T.H, Thanks for the suggestion. I found that link earlier and decided to try it. I've done some investigating and found very old entries in the MSrepl_transactions table going back 32 days using the following query:

SELECT distinct cast(entry_time as date) as entry_time, publisher_database_id, xact_seqno, count(publisher_database_id) FROM [distribution].[dbo].MSrepl_transactions group by entry_time, publisher_database_id, xact_seqno order by entry_time

Best Answer

The process at https://blogs.msdn.microsoft.com/chrissk/2009/12/29/how-to-resolve-when-distribution-database-is-growing-huge-25gig/ can help with this.

To prevent it happening in future you may need to look at the configuration of the Distribution clean up: distribution job.