Sql-server – How to eliminate tables in a partitioned view

partitioningperformancequery-performancesql server

I can't get my query that joins a regular table to a partitioned view to eliminate tables that don't meet the predicates on the partitioning column. In particular, I'm interested in the case where I LEFT OUTER JOIN to the partitioned view and my predicate covers a range of values. When I modify the query to use an INNER JOIN or restrict the predicate to a single value, the tables are correctly eliminated. Here's a script to demonstrate the issue. I tested this in SQL Server 2016 SP1.

--Create 2 tables for prices; 1 for 2017 and 1 for 2018
CREATE TABLE Price_2017
(
    PriceDate DATE NOT NULL,
    PriceValue FLOAT NOT NULL
)
GO

ALTER TABLE Price_2017 ADD CONSTRAINT PK_Price_2017 PRIMARY KEY(PriceDate);
GO

ALTER TABLE Price_2017 WITH CHECK ADD CONSTRAINT CK_Price_2017
CHECK (PriceDate >= '2017-01-01' AND PriceDate <= '2017-12-31');
GO

ALTER TABLE Price_2017 CHECK CONSTRAINT CK_Price_2017;
GO

CREATE TABLE Price_2018
(
    PriceDate DATE NOT NULL,
    PriceValue FLOAT NOT NULL
)
GO

ALTER TABLE Price_2018 ADD CONSTRAINT PK_Price_2018 PRIMARY KEY(PriceDate);
GO

ALTER TABLE Price_2018 WITH CHECK ADD CONSTRAINT CK_Price_2018
CHECK (PriceDate >= '2018-01-01' AND PriceDate <= '2018-12-31');
GO

ALTER TABLE Price_2018 CHECK CONSTRAINT CK_Price_2018;
GO

--Create a partitioned view for all dates
CREATE VIEW Price_All AS
SELECT p.PriceDate, p.PriceValue
 FROM dbo.Price_2017 p
UNION ALL 
SELECT p.PriceDate, p.PriceValue
 FROM dbo.Price_2018 p;

--Create some prices
INSERT INTO Price_2017 (PriceDate, PriceValue) VALUES('2017-01-01',1);
INSERT INTO Price_2017 (PriceDate, PriceValue) VALUES('2017-01-02',2);

INSERT INTO Price_2018 (PriceDate, PriceValue) VALUES('2018-01-01',10);
INSERT INTO Price_2018 (PriceDate, PriceValue) VALUES('2018-01-02',20);

--Create another table that we will relate to prices
CREATE TABLE Purchase
(
    PurchaseDate DATE NOT NULL,
    Quantity INT NOT NULL
)
GO

ALTER TABLE Purchase ADD CONSTRAINT PK_Purchase PRIMARY KEY(PurchaseDate);
GO

--Put some stuff in the other table
INSERT INTO Purchase (PurchaseDate, Quantity) VALUES ('2017-01-01', 1);
INSERT INTO Purchase (PurchaseDate, Quantity) VALUES ('2017-01-02', 2);
INSERT INTO Purchase (PurchaseDate, Quantity) VALUES ('2017-01-03', 3);
INSERT INTO Purchase (PurchaseDate, Quantity) VALUES ('2018-01-01', 4);
INSERT INTO Purchase (PurchaseDate, Quantity) VALUES ('2018-01-02', 5);
INSERT INTO Purchase (PurchaseDate, Quantity) VALUES ('2018-01-03', 6);

--Test Queries

--These are all good; the execution plan includes only necessary tables
SELECT * FROM Price_All WHERE PriceDate = '2017-01-01';
SELECT * FROM Price_All WHERE PriceDate = '2018-01-01';
SELECT * FROM Price_All WHERE PriceDate BETWEEN '2017-01-01' AND '2017-01-02';
SELECT * FROM Price_All WHERE PriceDate BETWEEN '2018-01-01' AND '2018-01-02';

--Good; doesn't seek in the 2018 table
--https://www.brentozar.com/pastetheplan/?id=BJ8RBqlYG
SELECT pu.PurchaseDate, pu.Quantity, pr.PriceValue
FROM Purchase pu
LEFT JOIN Price_All pr
ON pr.PriceDate = pu.PurchaseDate
WHERE pu.PurchaseDate = '2017-01-01';

--Good; doesn't seek in the 2018 table
--https://www.brentozar.com/pastetheplan/?id=SkOEUqgKG
SELECT pu.PurchaseDate, pu.Quantity, pr.PriceValue
FROM Purchase pu
INNER JOIN Price_All pr --notice, inner join
ON pr.PriceDate = pu.PurchaseDate
WHERE pu.PurchaseDate BETWEEN '2017-01-01' AND '2017-01-02'; --notice, range of dates

--Bad; seeks in both price tables
--https://www.brentozar.com/pastetheplan/?id=BkU8UcxFM
SELECT pu.PurchaseDate, pu.Quantity, pr.PriceValue
FROM Purchase pu
LEFT OUTER JOIN Price_All pr --notice, left join
ON pr.PriceDate = pu.PurchaseDate
WHERE pu.PurchaseDate BETWEEN '2017-01-01' AND '2017-01-02'; --notice, range of dates

I included links to the execution plans for the last three queries. Only the last query has the problem (it performs a seek in the 2017 and 2018 tables when it should only seek in 2017). In my production databases, this results in very poor performance and is blocking my ability to roll out date specific tables to deal with the maintenance overhead of 10+ billion rows that are currently stored in a single table. How can I make SQLServer use only the 2017 table in the last query?

Best Answer

SQL Server is indeed eliminating the 2018 partition even though it shows up in the plan.

The startup expression filter in the plan eliminates the unneeded 2018 table at execution time. After running the query with actual execution plan turned on, hover your mouse over the Price_2018 table's seek:

Clustered Index Seek

Note that "Number of Executions" is 0. That means SQL Server eliminated the seek at runtime.

You can also run the query with STATISTICS IO ON to see that only the 2017 table is touched:

SET STATISTICS IO ON;
GO
SELECT pu.PurchaseDate, pu.Quantity, pr.PriceValue
FROM Purchase pu
LEFT OUTER JOIN Price_All pr --notice, left join
ON pr.PriceDate = pu.PurchaseDate
WHERE pu.PurchaseDate BETWEEN '2017-01-01' AND '2017-01-02'; --notice, range of dates

The Statistics IO output shows:

(2 rows affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Price_2017'. Scan count 0, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Purchase'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The Price_2018 table isn't in that list - meaning it didn't get touched.