Sql-server – Query on 250 million records

performancesql server

I have a table with below structure

CREATE TABLE [dbo].[ColumnValues](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [ColumnId] [int] NOT NULL,
    [Value] [nvarchar](max) NULL,
    [OutputArea] [nvarchar](max) NULL,
 CONSTRAINT [PK_dbo.ColumnValues] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

ALTER TABLE [dbo].[ColumnValues]  WITH NOCHECK ADD  CONSTRAINT [FK_dbo.ColumnValues_dbo.CensusColumns_ColumnId] FOREIGN KEY([ColumnId])
REFERENCES [dbo].[CensusColumns] ([Id])
ON DELETE CASCADE
GO

ALTER TABLE [dbo].[ColumnValues] CHECK CONSTRAINT [FK_dbo.ColumnValues_dbo.CensusColumns_ColumnId]
GO

It contains almost 250 Million records. I need to filter records based on list of OutputArea codes and ColumnIds and the query which I used is like this:

SELECT 
[Extent1].[Id] AS [Id], 
[Extent1].[OutputArea] AS [OutputArea], 
[Extent1].[ColumnId] AS [ColumnId], 
[Extent1].[Value] AS [Value]
FROM [dbo].[ColumnValues] AS [Extent1]
WHERE ([Extent1].[OutputArea] IN (N'E00139237', N'E00139238', N'E00139239', N'E00139255', N'E00139257', N'E00139258', N'E00139259', N'E00139260', N'E00139261', N'E00139262', N'E00139263', N'E00139264', N'E00139265', N'E00139266', N'E00139267', N'E00139268', N'E00139689', N'E00139690', N'E00139691', N'E00139694', N'E00139814', N'E00139815', N'E00139816', N'E00139817', N'E00139818', N'E00139835', N'E00139837', N'E00139838', N'E00139847', N'E00139848', N'E00139849', N'E00139850', N'E00139855', N'E00139856', N'E00139857', N'E00139858', N'E00139939', N'E00139940', N'E00139941', N'E00139980', N'E00139981', N'E00139982', N'E00140008', N'E00140010', N'E00140011', N'E00140012', N'E00140013', N'E00140014', N'E00140015', N'E00140016', N'E00140017', N'E00140018', N'E00140019', N'E00140020', N'E00140021', N'E00140022', N'E00140023')) AND ([Extent1].[OutputArea] IS NOT NULL)
AND ([Extent1].[ColumnId] IN (298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309))

When I have small list of entries like less than 50 members for input the query works fine, not with great performance though. But when I run a query on an instance of SQL Server 2008 R2 which is hosted on a powerful server which has 128 GB memory it mostly fails to execute and I get below error:

The query processor ran out of internal resources and could not
produce a query plan. This is a rare event and only expected for
extremely complex queries or queries that reference a very large
number of tables or partitions. Please simplify the query. If you
believe you have received this message in error, contact Customer
Support Services for more information.

Is it very complex query? because I expect this to execute fast and without problem due to powerful resources on the server.
How do I fix the problem and increase the performance?

Best Answer

Likely, the issue is caused by the very long IN lists. Instead of using the IN list, create a temporary table with the values you want to filter on. Then rewrite the query as a join:

CREATE TABLE #FilterOutputArea (...) 
INSERT #FilterOutputArea VALUES ('E00139237')
... etc...

CREATE TABLE #FilterColumnID ()
INSERT #ColumnID VALUES (298)
... etc...

SELECT 
[Extent1].[Id] AS [Id], 
[Extent1].[OutputArea] AS [OutputArea], 
[Extent1].[ColumnId] AS [ColumnId], 
[Extent1].[Value] AS [Value]
FROM [dbo].[ColumnValues] AS [Extent1]
JOIN #FilterOutputArea ON ...
JOIN #ColumnID ON ...