You will want to load your data into a new table in small batches, then drop the existing table. I put together a quick example using the Sales.Currency table in AdventureWorks; something similar should work for you as well.
First, create your new table, complete with the new datatype you want to use:
CREATE TABLE [Sales].[Currency_New](
[CurrencyCode] [nchar](4) NOT NULL,
[Name] [varchar](128) NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_Currency_New_CurrencyCode] PRIMARY KEY CLUSTERED
(
[CurrencyCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
Then, insert your records and define your batch size. I am using 10 here, but you will likely want something larger, say 10,000 rows at a time. For 30MM rows I'd even suggest a 100,000-row batch size; that's the limit I've typically used with larger tables:
DECLARE @RowsInserted INT, @InsertVolume INT
SET @RowsInserted = 1
SET @InsertVolume = 10 --Set to # of rows per batch
WHILE @RowsInserted > 0
BEGIN
INSERT INTO [Sales].[Currency_New] ([CurrencyCode]
,[Name]
,[ModifiedDate])
SELECT TOP (@InsertVolume)
SC.[CurrencyCode]
,SC.[Name]
,SC.[ModifiedDate]
FROM [Sales].[Currency] AS SC
LEFT JOIN [Sales].[Currency_New] AS SCN
ON SC.[CurrencyCode] = SCN.[CurrencyCode]
WHERE SCN.[CurrencyCode] IS NULL --only pick up rows not already copied
SET @RowsInserted = @@ROWCOUNT
END
I usually do a sanity check and verify the rowcounts are the same before cleaning up:
SELECT COUNT(*) FROM [Sales].[Currency]
SELECT COUNT(*) FROM [Sales].[Currency_New]
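If you want to go beyond a rowcount comparison (this isn't part of the original example, and it assumes the datatype change still lets the values compare equal), an EXCEPT in each direction should come back empty:
--Both of these should return zero rows if the data matches
SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency]
EXCEPT
SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency_New];

SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency_New]
EXCEPT
SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency];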
Once you are confident you have migrated your data, you can drop the original table:
DROP TABLE [Sales].[Currency]
Last step, rename the new table, so that users don't have to change any code:
EXEC sp_rename 'Sales.Currency_New', 'Currency';
GO
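You may also want to rename the primary key constraint so it follows the original naming convention (a quick sketch, using the constraint name from the CREATE TABLE above):
EXEC sp_rename 'Sales.PK_Currency_New_CurrencyCode', 'PK_Currency_CurrencyCode';
GO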
I don't know how long this will take. I'd suggest you try doing this when you have a clear maintenance window and users aren't connected.
HTH
Seems like an appropriate way to do it.
Create a logging table:
CREATE TABLE dbo.LogSpace
(
dt DATETIME NOT NULL DEFAULT SYSDATETIME(),
dbname SYSNAME,
log_size_mb DECIMAL(22,7),
space_used_percent DECIMAL(8,5),
[status] BIT
);
Do this before and after your load:
INSERT dbo.LogSpace(dbname, log_size_mb, space_used_percent, [status])
EXEC sp_executesql N'DBCC SQLPERF(LogSpace) WITH NO_INFOMSGS;';
Optionally, remove any rows not related to this specific database:
DELETE dbo.LogSpace WHERE dbname <> N'yourdb';
Then you can compare the before and after size/space used for any given date, or for all dates you have collected.
;WITH x AS
(
SELECT dbname, dt,
duration = DATEDIFF(SECOND, LAG(dt) OVER
(PARTITION BY dbname ORDER BY dt), dt),
[current] = space_used_percent,
previous = LAG(space_used_percent) OVER
(PARTITION BY dbname ORDER BY dt),
rn = ROW_NUMBER() OVER
(PARTITION BY dbname ORDER BY dt),
log_size_mb
FROM dbo.LogSpace
)
SELECT * FROM x WHERE rn % 2 = 0;
Keep in mind that checkpoints that happen during your process can allow log space to be re-used; in some recent performance testing I saw space_used_percent actually go down after certain operations. So you may want to take the maximum observed over a few days rather than relying only on where things ended up after the load (and maybe sample more often - in which case you want a slightly different query, one that doesn't assume pairs of consecutive rows are tied to any specific activity).
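For example, something like this (not part of the query above) just reports the peak observed usage per database across everything you've collected:
SELECT dbname,
    peak_space_used_percent = MAX(space_used_percent),
    peak_log_size_mb = MAX(log_size_mb)
FROM dbo.LogSpace
GROUP BY dbname;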
Also make sure that the autogrow settings for the log file are reasonable - you don't want 1MB or 10%, but you don't want 10GB, either. Since an autogrow event for a log file (a) makes all transactions wait and (b) does not benefit from instant file initialization, you want a good balance between how many times the log file has to grow during an abnormal operation like your data cleanup and how long each individual growth event takes. If the growth happened recently enough, you can review the events in the default trace to see how long they took.
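To see the current growth settings, and to switch to a fixed increment instead of a percentage (a sketch only - the database and logical file names here are placeholders; check sys.database_files for yours):
--Current autogrow settings for the log file (run in the target database)
SELECT name, size, growth, is_percent_growth
FROM sys.database_files
WHERE type_desc = 'LOG';

--Example: a fixed 512MB growth increment
ALTER DATABASE [yourdb]
MODIFY FILE (NAME = N'yourdb_log', FILEGROWTH = 512MB);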
Best Answer
Maybe you could have a look at minimal logging in SQL Server. This requires changing the database recovery model to SIMPLE or BULK_LOGGED, which means you cannot restore/recover the database using point-in-time recovery (I have never used Azure, so I don't know whether changing the recovery model is even possible there).
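If that trade-off is acceptable, a minimal sketch (the database name is a placeholder, and in BULK_LOGGED the batched INSERT...SELECT may still need hints like TABLOCK to actually be minimally logged):
ALTER DATABASE [yourdb] SET RECOVERY BULK_LOGGED;
--...run the batched load here...
ALTER DATABASE [yourdb] SET RECOVERY FULL;
--Take a log backup afterwards so point-in-time recovery is possible again going forward
BACKUP LOG [yourdb] TO DISK = N'yourdb_log.trn';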