SQL Server VARCHAR Column Width

sql-server, sql-server-2008, varchar

Searching around the web, I have found conflicting advice on whether there is a performance impact when specifying overly-wide VARCHAR columns, e.g. VARCHAR(255) when VARCHAR(30) will probably do.

I consistently see agreement that there's a performance hit if the entire row exceeds 8060 bytes. Other than that, I see disagreement.

Is the claim true that "the default is SET ANSI_PADDING ON = potential for lots of trailing spaces"? As long as the total row width is less than 8060 bytes, are there any real performance concerns in over-sizing VARCHAR columns?
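
Regarding the ANSI_PADDING claim: with ANSI_PADDING ON (the default), trailing spaces supplied in the data are stored as-is, but a varchar value is never padded out to its declared width. A minimal sketch, using an invented demo table:

    -- ANSI_PADDING must be ON at CREATE TABLE time to apply to the column.
    SET ANSI_PADDING ON;

    CREATE TABLE dbo.PaddingDemo (val varchar(255));

    -- Three trailing spaces are preserved: not trimmed, not padded to 255.
    INSERT INTO dbo.PaddingDemo (val) VALUES ('abc   ');

    -- DATALENGTH counts stored bytes: returns 6, not 255.
    SELECT val, DATALENGTH(val) AS bytes_stored FROM dbo.PaddingDemo;

(With ANSI_PADDING OFF, those trailing spaces would be trimmed on insert.)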

Evidence that column width matters


The same goes for CHAR and VARCHAR data types. Don’t specify more characters in character columns than you need.

http://www.sql-server-performance.com/2007/datatypes/


  • Length is a constraint on the data (like CHECK, FK, NULL etc)
  • Performance when the row exceeds 8060 bytes
  • Cannot have a unique constraint or index on the column (index key width must be under 900 bytes; see the sketch after this list)
  • The default is SET ANSI_PADDING ON = potential for lots of trailing spaces

What are the consequences of setting varchar(8000)?
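
The index-key point is easy to demonstrate: SQL Server will create an index over an over-wide varchar key, but warns that some values cannot be indexed. A minimal sketch with invented names; on SQL Server 2008 the index key limit is 900 bytes:

    CREATE TABLE dbo.WideKeyDemo (val varchar(1000) NOT NULL);

    -- Succeeds, but raises a warning that for some combination of large
    -- values the insert/update operation will fail (900-byte key limit).
    CREATE UNIQUE INDEX IX_WideKey ON dbo.WideKeyDemo (val);

    -- Fails at runtime: the index entry would be 901 bytes.
    INSERT INTO dbo.WideKeyDemo (val) VALUES (REPLICATE('x', 901));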


Evidence that column width DOES NOT matter


If you're talking about varchar and nvarchar then no, there is no penalty for allowing a higher field length.

https://stackoverflow.com/questions/7025996/overstating-field-size-in-database-design


The varchar datatype, by contrast, consumes only the amount of actual space used plus 2 bytes for overhead

http://sqlfool.com/content/PerformanceConsiderationsOfDataTypes.pdf


Best Answer

The question might be better stated as:

"What is the advantage of over-specifying the maximum length of a variable-length column?"

In general, there is little advantage, and several disadvantages as the various linked answers point out. Aside from any other concerns, consider that SQL Server is not open-source: there are many 'magic numbers' and heuristics applied based on the information provided to the system. Without source code access, we could never be entirely sure what the impact of this practice might be.

In some cases, where the average length of a column is significantly higher than the 50% of the declared maximum assumed by SQL Server when calculating sort/hash memory grants, you may see a performance improvement by over-specifying the maximum length. This is a dubious workaround, and should probably only be applied via an explicit CAST or CONVERT (with comments!) rather than by changing the base column definition. Sometimes it will be preferable to rewrite the query to sort keys instead of whole rows anyway.
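
To make the memory-grant point concrete: a sort is sized assuming each variable-length value is half its declared maximum, so a column whose values run near the declared width can spill to tempdb. A hedged sketch of the CAST workaround, with invented table and column names:

    -- Suppose dbo.Notes.Body is varchar(100) but averages ~95 characters.
    -- The grant for this sort assumes ~50 bytes per value and may spill:
    SELECT Body FROM dbo.Notes ORDER BY Body;

    -- Widening the value in the query raises the estimate to ~100 bytes
    -- without touching the base column. Comment the intent so the cast
    -- is not 'tidied away' later:
    SELECT CAST(Body AS varchar(200)) AS Body   -- deliberate: avg length ~95
    FROM dbo.Notes
    ORDER BY CAST(Body AS varchar(200));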

Where the maximum row size might exceed the in-row limit (even if no rows actually do), deleting rows can cause unexpected page-splitting if a trigger is present. Updates may also lead to fragmentation via the same mechanism.
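
A quick way to find tables whose declared maximum row size already exceeds the 8060-byte in-row limit, even if no stored row does, is to sum the declared column lengths from the catalog views. A rough sketch: max_length is in bytes and is -1 for MAX types, and record overhead is ignored, so treat the output as approximate:

    SELECT t.name AS table_name,
           SUM(CASE WHEN c.max_length = -1 THEN 16   -- MAX types: off-row pointer
                    ELSE c.max_length END) AS declared_bytes
    FROM sys.tables AS t
    JOIN sys.columns AS c ON c.object_id = t.object_id
    GROUP BY t.name
    HAVING SUM(CASE WHEN c.max_length = -1 THEN 16
                    ELSE c.max_length END) > 8060;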

SQL Server generally does a good job when the metadata it is given is accurate. Compromising this principle for 'convenience' seems unwise to me. A sensible approach is to choose a maximum length that is reasonable for the actual data to be stored, allowing for any foreseeable changes.
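
When choosing that maximum, the existing data is the obvious guide. A simple profiling sketch, again with invented names:

    -- Measure real lengths before settling on (or tightening) a width,
    -- then add sensible headroom for foreseeable growth.
    SELECT MIN(LEN(Body)) AS min_len,
           AVG(LEN(Body)) AS avg_len,
           MAX(LEN(Body)) AS max_len
    FROM dbo.Notes;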