Sql-server – Excessive replication snapshot size causing failure

replicationsnapshotsql server

I am searching for an explanation of a replication snapshot being in excess of 5 times what we're predicting

Following a replication failure we are attempting to reinitialise subscriptions with a new snapshot. The size of the database is 135GB, one table within the database is 60GB – we have excluded this article from the publication. This meant we had an expectation of snapshot size being ~75GB.

We've had several attempts at creating the snapshot and these have failed due to lack of disk space. Last night the snapshot filled a disk with 340GB of free space.

I 'd welcome any explanation for this huge size of snapshot.

In my examination of possible causes I note that the snapshot format is set to "Character – Required if a publisher or subscriber is not running SQL Server". This despite the fact that at both ends of this transnational replication is native SQL Server. Is there a different in size between formats?

Thanks in advance.

Native mode file Size:105 MB
C:>bcp IVM_ArchiveTest.dbo.Event out D:\NOBACKUP\UseOnce\EventNative.dat -T -n 
218977 rows copied. Network packet size (bytes): 4096 Clock Time (ms.) 
Total : 7878 Average : (27796.01 rows per sec.) 

Character mode file Size:66 MB 
C:\>bcp IVM_ArchiveTest.dbo.Event out D:\NOBACKUP\UseOnce\EventChar.dat -T -c 
218977 rows copied. Network packet size (bytes): 4096 Clock Time (ms.) 
Total : 1654 Average : (132392.38 rows per sec.) 

Best Answer

Note that Character representation of any data will almost always exceed the original size. Depending on type, it can exceed by a large margin. Eg. an int column takes 4 bytes, but the character representation of 1000000 takes ~16 bytes as Unicode, including delimiters. That is a x4 increase right there. Dates, floats, numerics all will usually increase, and by even bigger factors than x4.

The native format alternative is the bcp utility native format (since snapshots are applied using bcp). This format is only slightly longer than the native SQL storage when comparing individual values, but it can result in actually smaller files than SQL Server required storage because the bcp file is dense: there is not page header overhead, nor partially filled pages (fragmentation, fill-factor). See Use Native Format to Import or Export Data.

Additionally you should also compress the snapshots. See Enhance General Replication Performance:

  • Use a native mode snapshot unless a character mode snapshot is required. Use the default native mode snapshot for all Subscribers except non-SQL Server Subscribers and Subscribers running SQL Server Compact, which require a character mode snapshot.
  • Use a single snapshot folder for a publication. When specifying the publication properties related to snapshot location, you can choose to generate snapshot files to the default snapshot folder, to an alternate snapshot folder, or to both. Generating snapshot files in both locations requires additional disk space and more processing when the Snapshot Agent runs.
  • Place the snapshot folder on a drive local to the Distributor that is not used to store database or log files. The Snapshot Agent performs a sequential write of data to the snapshot folder. Placing the snapshot folder on a separate drive from any database or log files reduces contention among the disks and helps the snapshot process complete faster.
  • When you create the subscription database at the Subscriber, consider specifying a recovery model of simple or bulk-logged. This allows minimal logging of the bulk inserts performed during the application of the snapshot at the Subscriber. After the snapshot has been applied to the subscription database, you can change to a different recovery model if necessary (replicated databases can use any of the recovery models). For more information about selecting a recovery model, see Restore and Recovery Overview (SQL Server).
  • Consider using the alternate snapshot folder and compressed snapshots on removable media for low-bandwidth networks. Compressing snapshot files in the alternate snapshot folder can reduce snapshot disk storage requirements and make it easier to transfer snapshot files on removable media. Compressed snapshots can, in some cases, improve the performance of transferring snapshot files across the network. However, compressing the snapshot requires additional processing by the Snapshot Agent when generating the snapshot files, and by the Distribution Agent or Merge Agent when applying the snapshot files. This may slow down snapshot generation and increase the time it takes to apply a snapshot in some cases. Additionally, compressed snapshots cannot be resumed if a network failure occurs; therefore they are not suitable for unreliable networks. Consider these tradeoffs carefully when using compressed snapshots across a network. For more information, see Alternate Snapshot Folder Locations and Compressed Snapshots.