I am searching for an explanation of a replication snapshot being in excess of 5 times what we're predicting
Following a replication failure we are attempting to reinitialise subscriptions with a new snapshot. The size of the database is 135GB, one table within the database is 60GB – we have excluded this article from the publication. This meant we had an expectation of snapshot size being ~75GB.
We've had several attempts at creating the snapshot and these have failed due to lack of disk space. Last night the snapshot filled a disk with 340GB of free space.
I 'd welcome any explanation for this huge size of snapshot.
In my examination of possible causes I note that the snapshot format is set to "Character – Required if a publisher or subscriber is not running SQL Server". This despite the fact that at both ends of this transnational replication is native SQL Server. Is there a different in size between formats?
Thanks in advance.
Native mode file Size:105 MB
C:>bcp IVM_ArchiveTest.dbo.Event out D:\NOBACKUP\UseOnce\EventNative.dat -T -n
218977 rows copied. Network packet size (bytes): 4096 Clock Time (ms.)
Total : 7878 Average : (27796.01 rows per sec.)
Character mode file Size:66 MB
C:\>bcp IVM_ArchiveTest.dbo.Event out D:\NOBACKUP\UseOnce\EventChar.dat -T -c
218977 rows copied. Network packet size (bytes): 4096 Clock Time (ms.)
Total : 1654 Average : (132392.38 rows per sec.)
Best Answer
Note that Character representation of any data will almost always exceed the original size. Depending on type, it can exceed by a large margin. Eg. an int column takes 4 bytes, but the character representation of
1000000
takes ~16 bytes as Unicode, including delimiters. That is a x4 increase right there. Dates, floats, numerics all will usually increase, and by even bigger factors than x4.The native format alternative is the bcp utility native format (since snapshots are applied using bcp). This format is only slightly longer than the native SQL storage when comparing individual values, but it can result in actually smaller files than SQL Server required storage because the bcp file is dense: there is not page header overhead, nor partially filled pages (fragmentation, fill-factor). See Use Native Format to Import or Export Data.
Additionally you should also compress the snapshots. See Enhance General Replication Performance: