You will want to load your data into a new table in small batches, then drop the existing table. I put together a quick example using the Sales.Currency table in AdventureWorks; something similar should work for you as well.
First, create your new table, complete with the new datatype you want to use:
CREATE TABLE [Sales].[Currency_New](
[CurrencyCode] [nchar](4) NOT NULL,
[Name] [varchar](128) NOT NULL,
[ModifiedDate] [datetime] NOT NULL,
CONSTRAINT [PK_Currency_New_CurrencyCode] PRIMARY KEY CLUSTERED
(
[CurrencyCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
Then, insert your records and define your batch size. I am using 10 here, but you will likely want something larger, say 10,000 rows at a time. For 30MM rows I'd even suggest a 100,000-row batch size; that's the limit I've typically used with larger tables:
DECLARE @RowsInserted INT, @InsertVolume INT
SET @RowsInserted = 1
SET @InsertVolume = 10 --Set to # of rows per batch
WHILE @RowsInserted > 0
BEGIN
INSERT INTO [Sales].[Currency_New] ([CurrencyCode]
,[Name]
,[ModifiedDate])
SELECT TOP (@InsertVolume)
SC.[CurrencyCode]
,SC.[Name]
,SC.[ModifiedDate]
FROM [Sales].[Currency] AS SC
LEFT JOIN [Sales].[Currency_New] AS SCN
ON SC.[CurrencyCode] = SCN.[CurrencyCode]
WHERE SCN.[CurrencyCode] IS NULL --only pick up rows not already copied
SET @RowsInserted = @@ROWCOUNT
END
I usually do a sanity check and verify the rowcounts are the same before cleaning up:
SELECT COUNT(*) FROM [Sales].[Currency]
SELECT COUNT(*) FROM [Sales].[Currency_New]
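If you want to go beyond a rowcount comparison (this isn't part of the original example, and it assumes the datatype change still lets the values compare equal), an EXCEPT in each direction should come back empty:
--Both of these should return zero rows if the data matches
SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency]
EXCEPT
SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency_New];

SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency_New]
EXCEPT
SELECT [CurrencyCode], [Name], [ModifiedDate] FROM [Sales].[Currency];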
Once you are confident you have migrated your data, you can drop the original table:
DROP TABLE [Sales].[Currency]
Last step, rename the new table, so that users don't have to change any code:
EXEC sp_rename 'Sales.Currency_New', 'Currency';
GO
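You may also want to rename the primary key constraint so it follows the original naming convention (a quick sketch, using the constraint name from the CREATE TABLE above):
EXEC sp_rename 'Sales.PK_Currency_New_CurrencyCode', 'PK_Currency_CurrencyCode';
GO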
I don't know how long this will take. I'd suggest you try doing this when you have a clear maintenance window and users aren't connected.
HTH
Seems like an appropriate way to do it.
Create a logging table:
CREATE TABLE dbo.LogSpace
(
dt DATETIME NOT NULL DEFAULT SYSDATETIME(),
dbname SYSNAME,
log_size_mb DECIMAL(22,7),
space_used_percent DECIMAL(8,5),
[status] BIT
);
Do this before and after your load:
INSERT dbo.LogSpace(dbname, log_size_mb, space_used_percent, [status])
EXEC sp_executesql N'DBCC SQLPERF(LogSpace) WITH NO_INFOMSGS;';
Optionally, remove any rows not related to this specific database:
DELETE dbo.LogSpace WHERE dbname <> N'yourdb';
Then you can compare the before and after size/space used for any given date, or for all dates you have collected.
;WITH x AS
(
SELECT dbname, dt,
duration = DATEDIFF(SECOND, LAG(dt) OVER
(PARTITION BY dbname ORDER BY dt), dt),
[current] = space_used_percent,
previous = LAG(space_used_percent) OVER
(PARTITION BY dbname ORDER BY dt),
rn = ROW_NUMBER() OVER
(PARTITION BY dbname ORDER BY dt),
log_size_mb
FROM dbo.LogSpace
)
SELECT * FROM x WHERE rn % 2 = 0;
Keep in mind that checkpoints that happen during your process can allow log space to be re-used; in some recent performance testing I saw space_used_percent actually go down after certain operations. So you may want to take the maximum observed over a few days rather than relying only on where things ended up after the load (and maybe sample more often - in which case you want a slightly different query, one that doesn't assume pairs of consecutive rows are tied to any specific activity).
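For example, something like this (not part of the query above) just reports the peak observed usage per database across everything you've collected:
SELECT dbname,
    peak_space_used_percent = MAX(space_used_percent),
    peak_log_size_mb = MAX(log_size_mb)
FROM dbo.LogSpace
GROUP BY dbname;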
Also make sure that the autogrow settings for the log file are reasonable - you don't want 1MB or 10%, but you don't want 10GB, either. Since an autogrow event for a log file (a) makes all transactions wait and (b) does not benefit from instant file initialization, you want a good balance between how many times the log file has to grow during an abnormal operation like your data cleanup and how long each individual growth event takes. If the growth happened recently enough, you can review the events in the default trace to see how long they took.
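To see the current growth settings, and to switch to a fixed increment instead of a percentage (a sketch only - the database and logical file names here are placeholders; check sys.database_files for yours):
--Current autogrow settings for the log file (run in the target database)
SELECT name, size, growth, is_percent_growth
FROM sys.database_files
WHERE type_desc = 'LOG';

--Example: a fixed 512MB growth increment
ALTER DATABASE [yourdb]
MODIFY FILE (NAME = N'yourdb_log', FILEGROWTH = 512MB);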
Best Answer
Maybe you could have a look at minimal logging in SQL Server. This requires changing the database recovery model to SIMPLE or BULK_LOGGED, which means you cannot restore/recover the database using point-in-time recovery (I have never used Azure, so I don't know whether changing the recovery model is even possible there).
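If that trade-off is acceptable, a minimal sketch (the database name is a placeholder, and in BULK_LOGGED the batched INSERT...SELECT may still need hints like TABLOCK to actually be minimally logged):
ALTER DATABASE [yourdb] SET RECOVERY BULK_LOGGED;
--...run the batched load here...
ALTER DATABASE [yourdb] SET RECOVERY FULL;
--Take a log backup afterwards so point-in-time recovery is possible again going forward
BACKUP LOG [yourdb] TO DISK = N'yourdb_log.trn';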