Sql-server – Retrieving data speed tweaks SQL Server 2005

performancequery-performancesql serversql-server-2005

I have a database in Microsoft SQL Server 2005.

I have table with 3 columns, namely HASHKEY (BIGINT), NOTE_ID (INT) and TIME_OFFSET (INT).

Columns HASHKEY has 19 digit BIGINT values, which can DUPLICATE. But values will be always of 19 digits.

I have application mainly depends on data in this table. Application retrieves data from this table with query like:

select HASHKEY, NOTE_ID, TIME_OFFSET 
from TABLE_NAME 
where HASHKEY in (<list of around 30000 hashkeys>)

But this query takes around 2 minutes to retrieve data. This is my problem. The application is real time processing application, and need to retrieve data in about 5 seconds. How can I tweak things in server, so time to retrieve data can be decreased?

I have indexed the table by HASHKEY column when I created table, but still retrieving data is taking much time.

Is there any setting which I can do in database to so time can be decreased? I will welcome any type of solution. But I need to solve this. I am not very expert in this.

Also HASHKEY is just random values of 19 digits, no relation with other values.

Result of following query is,

select count(HASHKEY) from TABLE_NAME
go
select count(distinct(HASHKEY)) from TABLE_NAME

Result:

225899932
189200251

Time taken: 2 minutes, 1 second

EDIT

This is script to create table:

USE [fp]
GO
/****** Object:  Table [dbo].[fp_core]    Script Date: 09/13/2013 22:40:43 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[fp_core](
    [hashkey] [bigint] NOT NULL,
    [note_id] [int] NOT NULL,
    [timeoffset] [int] NOT NULL
) ON [PRIMARY]

I will ask random set of around 30000 hashkeys for data, no any ordering.

When I was having around 20000000 rows in table, query was taking less than 2 seconds, but now retrieval time is increasing.

EDIT

And here is script to create index, only one index in this table.

USE [fp]
GO
/****** Object:  Index [IX_fp_core]    Script Date: 09/13/2013 23:04:24 ******/
CREATE NONCLUSTERED INDEX [IX_fp_core] ON [dbo].[fp_core] 
(
    [hashkey] ASC
)WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

Please help me.

Best Answer

Your table is a "heap," which means that it doesn't have a clustered index. The short version of what that means is that your data isn't laid out on the disk in a logical way. A heap really isn't an optimal structure for an actively updated table. Here's an excellent video on heaps.

So, my suggestions--please test them before deploying them in production--are:

BEGIN TRANSACTION

USE fp; ALTER TABLE fp_core ADD ID bigint NOT NULL IDENTITY (1, 1)
GO

USE fp; CREATE CLUSTERED INDEX IX_fp ON fp_core (ID) WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

COMMIT

This will take some time to complete AND will be a blocking query, so do it during downtime.

This will give you a clustered index and lay out your table on a completely random sequential number, unfortunately, but since you don't have any non-unique values that might be the best course of action.

I would then include the columns note_id and timeoffset in your nonclustered index on HASHKEY.

BEGIN TRANSACTION

USE [fp]
GO

IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[fp_core]') AND name = N'IX_Agency_TIMESTAMP')
DROP INDEX [IX_fp_core] ON [dbo].[fp_core] WITH ( ONLINE = OFF )
GO

USE [fp]
GO

CREATE NONCLUSTERED INDEX [IX_fp_core] ON [dbo].[fp_core] 
(
    [hashkey] ASC
)
INCLUDE ( [note_id],[timeoffset])
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
GO

COMMIT

(Or you could do it through the GUI. If you right-click on the index and choose properties, there's an included columns pane.)

This will hopefully have the effect of replacing your heap scans with nonclustered index seeks, but since your query has a "where in [30,000 items], it might do a index scan instead. Either way, it won't be rooting around in the heap, which I expect will be a good thing.