Recently I started looking at SQL Server ANSI_Padding
setting. After reading BOL and some online articles, I find myself with more questions and confusion than I started with.
I understand that I would be better off sticking to ANSI_Padding On
, and I understand the different behaviors with NULL
and NOT NULL
columns when the setting is changed to ON
or OFF
.
But I cannot grasp this concept. What purpose does padding serve? Doesn't appending zeros to a binary number (or spaces to a character string) change the value?
Binary number 0x0000ee is equal to 0x00ee (at least when it is converted to decimal) But how is 0x00ee00
the same as 0x00ee?
A string 'a ' (1 space) is not the same as 'a' (no space), and yet they both become 'a ' (3 spaces) in a char(3)
column. How is that even acceptable?
Same confusion also exists when values are trimmed.
BOL does seem to mention that ANSI_Padding
is used to control how values are stored. So my initial guess was the actual values are not changed. But BOL does not elaborate on this point, nor do any of the articles I have found so far.
Thanks for your help.
Best Answer
First - here's a recent article that shows how things work pretty well.
There's also this article from Microsoft, explaining how SQL Server handles comparisons when strings have trailing spaces.
Why have padding? For
NOT NULL
columns where all the data will be very close to the same length, padding everything to (say) 10 characters may result in less space consumed than adding the overhead of making the column variable length. I think that making the columnNULL
able requires the same overhead as making it variable length, so including the padding becomes optional.When you mention that BOL is talking about how the values are stored - yes, this does change the value. If you store
'a'
(no space) and'a '
(one space) in achar(3) NOT NULL
column, they'll both be retrieved as'a
' (two spaces). IN achar(3) NULL
column, they'll be retrieved as'a'
or'a '
, depending on the ANSI_PADDING setting when the table (or column) was created.Why does the loss or retention of trailing spaces seem so unimportant?
Basically, when comparing two strings, SQL pads the shorter string out with spaces until they're the same length. So:
returns
TRUE
.Also -
binary()
andvarbinary()
should maybe not be thought of as being simply binary representations of numbers (which, of course, they are), but as binary strings. In comparisons of binary values as binaries, SQL does not pay attention to trailing zeroes.An example: 0X00EE00 is the decimal number 60928. 0X00EE is the decimal number 238.
returns
FALSE
(60928 <> 238). However:returns
TRUE
; as noted by Max Vernon, the trailing zeroes are acting like a NULL terminator in a string, and are not included in the comparison.Interestingly:
returns
TRUE
, so by default this is comparing the values as binary strings, not binary numbers.Per the docs above, a
LIKE
comparison is the only way to differentiate between string values based on trailing zeroes.returns
FALSE
.SO, in summary, unless you really work at it, SQL Server ignores trailing spaces regardless of the ANSI_PADDING settings, which is why trimming them or not is treated so cavalierly.
And, by the way, here's a quick example of the impact of
ANSI_PADDING
Query:
Results:
NOTE: As Max Vernon pointed out in a comment on his answer, my notes above are all about how SQL Server handles things. Equality of values in whatever application may be retrieving data from SQL Server may very well behave entirely different.