Sql-server – Speed of Searching Text vs INT/BIGINT – SQL Server 2012

sql serversql-server-2012

I am building an application that will need to handle extremely high amounts of traffic. The traffic will consist of incoming requests for information from a SQL Server 2012 database. When a request is made, a unique KEY is passed that will be used to identify the account that the information belongs to. This KEY is an alphanumeric string and the table that contains the KEYs can contain millions of rows of data. The length and data type of the KEY can change based on situation.

Example Request

KEY=a123f-5n12d1km9-3cmc32mc32-32cm0429c-243g4453-g43f43
TYPE=123

--The system will look up data for type '123' in the database for account 'a123f-5n12d1km9-3cmc32mc32-32cm0429c-243g4453-g43f43'

The main focus will be on the speed of the query, so I am not concerned too much with database size or memory utilization. I know that searching for integers is a lot more efficient than searching text, so I came up with a way to easily convert the strings into integers. The goal is to make an efficient way to take a string, convert it to an integer, and then perform a quick search to return the results. The problem is that the integers generated by the strings will either take up two BIGINT columns or 4 INT columns.

My question is would it be more efficient to search and compare 4 INT columns or 2 BIGINT columns? Or would searching for the text be faster?

BIGINT Sample Query

SELECT (FIELDS) 
FROM KEYS (JOIN .......) 
WHERE C1 = xxxxxxxxxxxxxxxxxxxx and  C2 = xxxxxxxxxxxxxxxxxxxx 

INT Sample Query

SELECT (FIELDS) 
FROM KEYS (JOIN .......)  
WHERE C1 = xxxxxxxxxx and  C2 = xxxxxxxxxx and  C3 = xxxxxxxxxx and  C4 = xxxxxxxxxx 

Text Sample Query

SELECT (FIELDS) 
FROM KEYS (JOIN ........) 
WHERE KEY = 'a123f-5n12d1km9-3cmc32mc32-32cm0429c-243g4453-g43f43'

(NOTE: All columns are either type INT or BIGINT respectively)

Best Answer

As other suggested, sometimes the best choice is to test it out yourself. I hate answering my own question, so if someone can provide a better answer then I might chose it if it better explains my results.

It turns out there is not UBIGINT64 in SQL, so I created the table as I described as follows and used decimal instead of BigInt:

CREATE TABLE [dbo].[KEYS](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [KEY] [nvarchar](max) NOT NULL,
    [BI1] [decimal](28, 0) NOT NULL,
    [BI2] [decimal](28, 0) NOT NULL,
    [i1] [int] NOT NULL,
    [i2] [int] NOT NULL,
    [i3] [int] NOT NULL,
    [i4] [int] NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

I then wrote a script to populate the table with approximately 2.25 million rows of data and setup the following loop to see how long each one would take.

declare @x integer
set @x = 0
while @x < 100
begin
    --QUERIES WERE INSERTED HERE
    set @x = @x + 1
end

I then plugged the following three queries into the loop and recorded the results for 50 loops and 100 loops.

select * from IntTypes where i1 = 64003101 and i2 = 64003107 and i3 = 64003144 and i4 = 640031031


50 Loops: 9 seconds
100 Loops: 18 seconds

select * from IntTypes where BI1 = 15284527576400310788 and BI2 = 16825458760271544958


50 Loops: 11 seconds
100 Loops: 23 seconds

select * from IntTypes where [KEY] = 'a123f-5n12d1km9-3cmc32mc32-32cm0429c-243g4453-g43f43'


50 Loops: 54 seconds
100 Loops: 1 minute 48 seconds

Perhaps the results would be different if I did use a BIGINT instead of a DECIMAL, but comparing 4 integer columns seems to outperform comparing the two decimal columns. Both of those completely destroyed the text search results timing, but overall it looks like I will be using integer columns.

So to sum it up and answer my own question (sort of), it is faster to search 4 integer columns than searching two decimal columns.