SQL Server – Why Object Names Can’t Start with a Number

identifiersql server

For example, if I'm creating view with a name '4aii', why does SQL Server care that it starts with a 4? I could call the table Fouraii or IVaii.

Additionally, what does [] do behind the scenes to allow for any string to be used as a name?

A string's a string, amirite?

Best Answer

A string's a string, amirite?

Yes and No: a string is a string, but object / item names are not strings. So while that statement is true, it is also not relevant to the behavior that you are seeing.

Ignoring the conceptual reasoning for the specific rules, the technical answer to "why one works and not the other" is that SQL Server follows (with minimal customization), the Unicode Standard's guidelines for identifiers. The Unicode documentation can be found here:

Unicode® Standard Annex #31: UNICODE IDENTIFIER AND PATTERN SYNTAX

Identifiers that are not enclosed in either [...] or "..." are "regular" identifiers, while those that are enclosed are "delimited" identifiers. Regular identifiers are names that are valid in all contexts (i.e. these are the rules for naming things in this language, software, etc). Delimited identifiers are everything else: names that are not valid and should not work, however, they are given an exemption if you wrap them in either of those delimiters. Most identifiers can be delimited; it is only GOTO labels and variables (including table variables) / parameters that cannot be delimited. The distinction seems to be that identifiers that exist purely for use in the T-SQL language (i.e. not a name that will ever be stored in a data file or log file as meta-data) cannot be delimited (much as you would expect in any language).

Now, the SQL Server documentation isn't exactly complete / correct, but it is correct about the classification of what is a valid "identifier" character (both starting and continuing) coming from the Unicode 3.2. If you want the actual list of rules for both regular and delimited identifiers, I have them documented here:

Completely Complete List of Rules for T-SQL Identifiers

To see the research that proves the relationship between the Unicode 3.2 categorizations and what SQL Server accepts for regular identifiers, please visit:

Addressing concerns noted in comments on this answer:

Yes, even allowing for non-delimited identifiers to start with _, #, and @ is accounted for in the Unicode spec. Section 1.2 addresses customizations to the base rules, and even provides four example customizations: _, #, @, and $. Those 4 four "potential" customizations are the exact same 4 that SQL Server uses. Hence, SQL Server allowing @Variable and #TempTable does not point away from this Unicode document as being the source of the rules.
As noted above, the SQL Server documentation does state that the categorizations used are from version 3.2 of the Unicode Character Database, and they are currently on version 10. You cannot use the current definitions of Ident_*, as found on the Unicode website, as indicating valid / invalid characters. Characters are added to Ident_Start and Ident_Continue in each new version of the Unicode Standard. The only way to see the correct set of characters matching these properties is to download Unicode Version 3.2.
Both of the above two points are dealt with in the two blog posts noted directly above (named "The Uni-Code: The Search for the True List of Valid Characters for T-SQL Regular Identifiers"). Please read those two posts before dismissing this answer as incorrect. There is a lot of nuance behind what is actually going on here which I address in those two posts, showing step-by-step how to match up the list of valid characters.

ALSO, with regards to the question as stated in the title, it depends on how loosely you define "number". Meaning, if you follow along with the research steps as shown in the two posts noted directly above, such that you have created a table to hold the Unicode Character Database v3.2 and a few additional properties, you can get a list of 52 non-letters (mostly "numbers") that are valid characters for starting an identifier via the following query:

SELECT ucd.*
FROM   [v3-2].UnicodeCharacterDatabase ucd
WHERE  ucd.[IDStart] = 1
AND    ucd.[GeneralCategory] NOT LIKE 'L%';

Choosing a few of those characters to test, we can see that they do indeed work:

USE [tempdb];
CREATE TABLE dbo.Ⅳaii ([Col1] INT); -- ROMAN NUMERAL FOUR (U+2163)

CREATE TABLE dbo.ↂaii ([Col1] INT); -- ROMAN NUMERAL TEN THOUSAND (U+2182)

CREATE TABLE dbo.〤aii ([Col1] INT); -- HANGZHOU NUMERAL FOUR (U+3024)

And, just to show that they are "numbers" in more than just their names, the following query proves that they are assigned a numeric value (as shown in the NumericValue column of the [v3-2].UnicodeCharacterDatabase table:

SELECT 1 WHERE N'〤' LIKE N'[3-5]'; -- HANGZHOU NUMERAL FOUR (U+3024)
-- 1

However, they are not numbers that can be used in numeric operations:

SELECT 〤 + 0;
/*
Msg 207, Level 16, State 1, Line 23
Invalid column name '〤'.
*/

Regarding the issue of parsing and needing to be able to determine if 3e2 is a number or identifier: while this is a consideration, and possibly why numbers are excluded from the "Ident_start" Unicode general category, it is not a universal, and not necessarily why SQL Server excludes them. Three points to consider:

While 3e2 by itself is ambiguous, if it were qualified with at least a schema name then it would not be: dbo.3e2
The name 4aii is not really ambiguous at all. Internal parsing would be able to identify that easily enough as not being a potential number
MySQL / MariaDB do not have this restriction. They allow for non-delimited identifiers such as 4aii and 3e, but not 3e2 or 300. I was able to successfully execute the following in MySQL:
```
create table 4aii (3e int);
```

So again, the reason that you cannot do this in SQL Server is because SQL Server adheres to the Unicode Standard's recommendation for identifiers. Why those characters were chosen by the Unicode Consortium is not specifically stated, but seems to be at least "best practice". Still, as proven with MySQL, it is possible to parse identifiers that start with a number.

Related Solutions

Sql-server – View Disc Usage

Your thinking is correct. Take the below example on a non-indexed view:

To create the test view:

use AdventureWorks2012;
go

create view dbo.DepartmentView
as

    select
        Name,
        GroupName
    from HumanResources.Department;

go

Now if you were to look at the execution plan of a SELECT on the view:

select *
from dbo.DepartmentView;

You can see below that the index of the underlying table is referenced (HumanResources.Department):

enter image description here

Likewise, when you update the view, you will see similar behavior:

update dbo.DepartmentView
set GroupName = 'QA'
where GroupName = 'Quality Assurance';

enter image description here

You can also write a few queries against the system catalog views to show that the only thing that is persisted is the definition of the view:

-- see the object in sys.objects
select *
from sys.objects
where object_id = object_id('dbo.DepartmentView');

-- ensure there are no indexes related to this view object
select *
from sys.indexes
where object_id = object_id('dbo.DepartmentView');

-- the definition is persisted
select
    object_name(object_id) as object_name,
    definition
from sys.sql_modules
where object_id = object_id('dbo.DepartmentView');

SQL Server – How sp_msforeachdb Works Behind the Scenes

First of all don't use sp_msforeachdb it has several known issues. You are better off using Aaron Bertrand's version here and here.

However it uses a cursor, dynamic SQL and a replace. You can actually look at the code by using sp_helptext.

EXEC sp_helptext sp_msforeachdb

If you use this code it will fix some of your issues.

EXEC SP_msforeachdb 'SELECT ''?'' AS Database
                 FROM [?].sys.objects
                 WHERE name like ''%aetna%''

The brackets will the specific problem you mentioned. You will however run into issues if you have a database with [ or ] in it.

Best Answer

Related Solutions

Sql-server – View Disc Usage

SQL Server – How sp_msforeachdb Works Behind the Scenes

Related Question