SQL Server – Query Performance Tuning on a huge table

index-tuning, performance, query-performance, sql-server-2012

I have the following table, Transactions. It looks like this:

CREATE TABLE [dbo].[Transactions](
    [TransactionID] [bigint] IDENTITY(1,10) NOT NULL,
    [PlayerID] [bigint] NOT NULL,
    [BalanceTransactionTypeID] [int] NOT NULL,
    [BalanceTransactionSubTypeID] [int] NULL,
    [PointDelta] [money] NULL,
    [TransactionDate] [datetime] NOT NULL,

The table contains the following constraints and indexes.

CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
(
    [TransactionID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
    ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_Transactions_PlayerID] ON [dbo].[Transactions]
(
    [PlayerID] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
    DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]

CREATE NONCLUSTERED INDEX [IX_Transactions_TransactionDate] ON [dbo].[Transactions]
(
    [TransactionDate] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
    DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY]

I am writing a query whose goal is to return the top 50 point earners, then use SSIS to write the results into a separate table that will be looked at daily. The job is scheduled to run once a day.

The main issue is that the table contains over 3 billion rows and grows by about 300K rows per day. I know the ultimate solution is to partition it, since it's never used in game and is purely historical, and we have plans to roll it up as well. For right now, however, I need to make this query run in a reasonable time.

I have tried using a CTE to get the player IDs and the sum of PointDelta and then joining that result to another table by ID to get the player details, and I have tried a straight-up query with the join included; every version takes over 25 minutes to run. What can I do to tune this and make the query runnable in a normal amount of time? If there is nothing, let me know, but I am really hoping there is some way to optimize this for the time being. As always, thanks in advance for all help.
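
For reference, the CTE version has roughly the following shape. The [dbo].[Players] table and its PlayerName column are placeholders for the player-details table, which I haven't shown here:

;WITH TopEarners AS
(
    -- Aggregate every row in Transactions by player
    SELECT TOP (50)
           t.PlayerID,
           SUM(t.PointDelta) AS TotalPoints
    FROM   dbo.Transactions AS t
    GROUP BY t.PlayerID
    ORDER BY TotalPoints DESC
)
SELECT p.PlayerID,
       p.PlayerName,        -- placeholder column from the player-details table
       te.TotalPoints
FROM   TopEarners AS te
JOIN   dbo.Players AS p     -- placeholder name for the player-details table
  ON   p.PlayerID = te.PlayerID;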

Best Answer

We have plans to roll it up as well.

Cancel your other plans and do this now. Stop trying to optimise the evaluation of 3 billion rows and instead evaluate rollup + 300k rows. From your outline of the situation, anything else is a waste of time and someone's money.

A parody definition of insanity is doing the same thing over and over and expecting different results. Reading 3 billion rows that aren't changing fits that definition. Stop.
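
For illustration only, here is a minimal sketch of the rollup idea. The summary table dbo.PlayerPointTotals and the watermark table dbo.RollupWatermark are hypothetical names, not anything from the question:

-- One-off backfill: aggregate the existing rows into a small summary table
CREATE TABLE dbo.PlayerPointTotals
(
    PlayerID    bigint NOT NULL PRIMARY KEY,
    TotalPoints money  NOT NULL
);

INSERT dbo.PlayerPointTotals (PlayerID, TotalPoints)
SELECT PlayerID, ISNULL(SUM(PointDelta), 0)
FROM   dbo.Transactions
GROUP BY PlayerID;

-- Daily job: fold only the ~300K new rows into the summary, using the
-- highest TransactionID already processed as a high-water mark
-- (dbo.RollupWatermark is a hypothetical single-row bookkeeping table)
DECLARE @LastProcessedID bigint = (SELECT MaxTransactionID FROM dbo.RollupWatermark);

MERGE dbo.PlayerPointTotals AS tgt
USING
(
    SELECT PlayerID, ISNULL(SUM(PointDelta), 0) AS NewPoints
    FROM   dbo.Transactions
    WHERE  TransactionID > @LastProcessedID
    GROUP BY PlayerID
) AS src
  ON tgt.PlayerID = src.PlayerID
WHEN MATCHED THEN
    UPDATE SET tgt.TotalPoints = tgt.TotalPoints + src.NewPoints
WHEN NOT MATCHED THEN
    INSERT (PlayerID, TotalPoints) VALUES (src.PlayerID, src.NewPoints);
-- (update dbo.RollupWatermark to the new maximum TransactionID afterwards, not shown)

-- The daily top-50 report then reads only the small summary table
SELECT TOP (50) PlayerID, TotalPoints
FROM   dbo.PlayerPointTotals
ORDER BY TotalPoints DESC;

With something like this in place, the daily SSIS job touches only the new rows and the small rollup table instead of re-reading the full 3-billion-row table every day.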