Sql-server – the logic behind ISNUMERIC for certain special characters

sql serversql-server-2012

The ISNUMERIC function has some unexpected behavior. The MSDN documentation says:

ISNUMERIC returns 1 when the input expression evaluates to a valid numeric data type; otherwise it returns 0. Valid numeric data types include the following: int, bigint, smallint, tinyint, decimal, numeric, money, smallmoney, float, real.

And it also has a footnote:

ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($). For a complete list of currency symbols, see money and smallmoney (Transact-SQL).

Okay, so +, -, and listed currency symbols are expected to be considered numeric. So far so good.

Now for the odd part. First up, some of the currency symbols from linked article are not numeric, including:

  • Euro-Currency sign, hex 20A0:
  • Naira sign, hex 20A6:
  • Rial sign, hex FDFC:

This is weird, and I can't seem to find out why? Is this version or environment dependent?

However, things get weirder. Here are a few others I can't explain:

  • / is not numeric, but \ is (Huh?!)
  • REPLICATE(N'9', 308) is numeric, but REPLICATE(N'9', 309) is not

The first and most basic question is: what explains the above cases? More importantly though: what is the logic behind ISNUMERIC, so I could explain / predict all cases myself?

Here's a good way to reproduce things:

DECLARE @tbl TABLE(txt NVARCHAR(1000));

INSERT INTO @tbl (txt) 
VALUES (N''), (N' '), (N'€'), (N'$'), (N'$$'), 
       (NCHAR(8356)), (NCHAR(8352)), (NCHAR(8358)), (NCHAR(65020)), 
       (N'+'), (N'-'), (N'/'), (N'\'), (N'_'), (N'e'), (N'1e'), (N'e1'), (N'1e1'), 
       (N'1'), (N'-1'), (N'+1'), (N'1+1'), (N'⒈'), (N'?'), (N'¹'), (N'①'), (N'½'), 
       (N'?'), (REPLICATE(N'9', 307)), (REPLICATE(N'9', 308)), (REPLICATE(N'9', 309)), 
       (REPLICATE(N'9', 310));

SELECT  UNICODE(LEFT(txt, 1)) AS FirstCharAsInt,
        LEN(txt) AS TxtLength,
        txt AS Txt,
        ISNUMERIC(txt) AS [ISNUMERIC]
FROM    @tbl;

When I run this on my local Sql Server 2012 box I get the following results:

FirstCharAsInt   TxtLength   Txt        ISNUMERIC
---------------  ----------  ---------  ----------
NULL             0                      0
32               0                      0
8364             1           €          1
36               1           $          1
36               2           $$         0
8356             1           ₤          1
8352             1           ₠          0  --??
8358             1           ₦          0  --??
65020            1           ﷼‎          0  --??
43               1           +          1
45               1           -          1
47               1           /          0
92               1           \          1  --??
95               1           _          0
101              1           e          0
49               2           1e         0
101              2           e1         0
49               3           1e1        1
49               1           1          1
45               2           -1         1
43               2           +1         1
49               3           1+1        0
9352             1           ⒈         0
55356            2           ?          0
185              1           ¹          0
9312             1           ①          0
189              1           ½          0
55356            2           ?         0
57               307        /*...*/     1
57               308        /*...*/     1  --??
57               309        /*...*/     0  --??
57               310        /*...*/     0

Best Answer

The detailed behaviours of ISNUMERIC are not documented, and probably not fully known to anyone without source code access. That said, it may be that interpretation depends on the Unicode categorization (numeric or not). Equally, the weird cases you mention may be bugs that are preserved for backwards compatibility. Yes I know that sounds crazy, but it does happen.

As you are using SQL Server 2012, there is no need to use ISNUMERIC. Use TRY_CONVERT or the synonymous TRY_CAST instead to check if a string is convertible to a given type. Where they provide adequate functionality, these are preferable to TRY_PARSE, because the latter involves more expensive processing via CLR integration.