SQL Server – Difference Between LOB and Binary Data

sql server

What is the difference when speaking about 'LOB' and 'binary' data?

Is it the same, in terms of binary data being stored in a separate filegroup, or are there differences?

Best Answer

According to Technet: Data Types (Transact-SQL)

In SQL Server, based on their storage characteristics, some data types are designated as belonging to the following groups:

  • Large value data types: varchar(max), nvarchar(max), and varbinary(max)
  • Large object data types: text, ntext, image, varchar(max), nvarchar(max), varbinary(max), and xml

So these types are defined as LOB data types. When you asked about binary data I suppose you are talking about the data of binary data type.

Binary data type can be one of two kinds: fixed-length and variable length. Fixed-length binary data type or variable-length varbinary data type with the maximum number of characters instead of MAX specifier is always stored in row in the first case and (may be) off-row in the second and is NOT LOB data type.

On the other hand, according to this classification (from the same Technet article):

enter image description here

varchar(max), nvarchar(max), text, ntext data types are CHARACTER strings and not BINARY strings.

So LOB data types include CHARACTER strings that are not binary strings.


Conclusion


BINARY data <> LOB data

And the only case when it's the same is when your data is defined as VARBINARY(MAX)




Back to your question from the comment on how LOB data can be stored.

If the value is one of the old legacy types (text, ntext, image) then it is stored off-row by default.

Off-row means that there is a pointer from the data record which points to the location of the record storing the actual LOB value or the start of the tree that stores the LOB value:

enter image description here

This image is from Pro SQL Server Internals 1st Edition by Dmitri Korotkevitch book.

But you can force the value to go in-row (obviously with the limit) using text in row option.

Starting with SQL server 2005 new LOB data types were introduced, these new types: varchar(max), nvarchar(max), varbinary(max) are stored in-row by default (of course with the limit of 8Kb)

............................................................

To complete a picture.

There is another kind of data that is stored off-row, that is row-overflow data.

Thanks to this feature, starting with 2005 it's possible not only to DEFINE (as it was in 2000) some variable-length columns with the sum of sizes that exceeds 8060 bytes, but also to INSERT data that will exceed this limit, some of these values will be pushed off-row (i.e. the value will be stored on off-row page and IN ROW there will be only a pointer to the row stored on the row-overflow page):

A table can contain a maximum of 8,060 bytes per row. In SQL Server 2008, this restriction is relaxed for tables that contain varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns. The length of each one of these columns must still fall within the limit of 8,000 bytes; however, their combined widths can exceed the 8,060-byte limit. This applies to varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns when they are created and modified, and also to when data is updated or inserted.

This restriction does not apply to varchar(max), nvarchar(max), varbinary(max), text, image, or xml columns

(From Row-Overflow Data Exceeding 8 KB)

The max specifier expands the storage capabilities of the varchar, nvarchar, and varbinary data types. varchar(max), nvarchar(max), and varbinary(max) are collectively called large-value data types. You can use the large-value data types to store up to 2^31-1 bytes of data.


Back to the origin of this your question, how LOB data can be moved.

When we are talking about moving data to another filegroup, and about difficulties we have with LOB data, we refer to LOB data and NOT to binary data that is stored in binary columns (in-row data).

From What about moving LOB data? by Kimberly L. Tripp

In versions prior to SQL Server 2012, an index that includes LOB columns cannot be built or rebuilt using online operations. However, even with OFFLINE index builds or rebuilds, existing LOB data is not moved. The one exception is when you change from a partitioned table to a non-partitioned table or vice versa. In honesty, the limitations and requirements around LOB data are a bit strange. When you create a table you can specify TEXTIMAGE_ON. This allows the LOB data for that table to reside on a filegroup other than where the data resides. However, this option (TEXTIMAGE_ON) is NOT available for the CREATE INDEX statement.

What if you want to move the LOBData to another filegroup? The TEXTIMAGE_ON option is not allowed during a CREATE INDEX statement. So, if you want to move the LOB data to another filegroup – which is different than the data, then “game over” there is NO way to do this without exporting/importing or using INSERT/SELECT or SELECT INTO. Of these, the best option is INSERT/SELECT as you’ll need to first define the table (and you can specific exactly where you want both the data and LOB data will reside) before you copy it over. But, this is single, large transaction and while you’re migrating there will have to be some downtime to switch everything (by dropping the original table and renaming the newly created table). And, I haven’t even started to mention what happens if this table is being referenced by others. That further complicates this process.


And to your last question:

Is it the same, in terms of binary data being stored in a seperate filegroup

Separate filegroup is not a requirement for storing LOB data. But it's possible to store LOB data separately using TEXTIMAGE_ON option:

TEXTIMAGE_ON { filegroup| "default" } Indicates that the text, ntext, image, xml, varchar(max), nvarchar(max), varbinary(max), and CLR user-defined type columns (including geometry and geography) are stored on the specified filegroup.

Varchar(max), nvarchar(max), varbinary(max), xml and large UDT values are stored directly in the data row, up to a limit of 8000 bytes and as long as the value can fit the record. If the value does not fit in the record, a pointer is sorted in-row and the rest is stored out of row in the LOB storage space. 0 is the default value. TEXTIMAGE_ON only changes the location of the "LOB storage space", it does not affect when data is stored in-row. Use large value types out of row option of sp_tableoption to store the entire LOB value out of the row.

And the data of fixed-length binary data type just cannot be stored in a separate filegroup. It's impossible because all the data is stored in-row, and it's syntactically impossible as

TEXTIMAGE_ON is not allowed if there are no large value columns in the table

More on TEXTIMAGE_ON here: CREATE TABLE (Transact-SQL)