Sql-server – Tuning a query with temp table join

performancequery-performancesql server

I have a query as follows:

SELECT @tenor_from =CONVERT(DATETIME,MIN(spc.maturity_date),103)  
FROM source_price_curve spc  
INNER JOIN #source_price_curve_list spcl
    ON spc.source_curve_def_id  = spcl.price_curve_id  
WHERE spc.as_of_date >= @as_of_date_from

It takes almost 12 seconds to run.Removing the temp table #source_price_curve_list from join gives the result in less than 1 second.

The source_price_curve table has 130 million records. The temp table #source_price_curve_list has one record in output(as given below).It may contain more data in future

select * from #source_price_curve_list
 rowID  price_curve_id
  1          1

Why does the inner join to a one record temp table make the query take so much longer time? I need to reduce the query run time to less than 1 sec.

The execution plan of the query with the join is provided in the below link:

https://www.brentozar.com/pastetheplan/?id=BksaaWjSb

The query without the join to temp table runs in less than 1 sec.The execution plan link is as follows:

https://www.brentozar.com/pastetheplan/?id=BJ2_3boSZ

The table source_price_curve is created as follows:

CREATE TABLE [dbo].[source_price_curve](
    [source_curve_def_id] [int] NOT NULL,
    [as_of_date] [datetime] NOT NULL,
    [Assessment_curve_type_value_id] [int] NOT NULL,
    [curve_source_value_id] [int] NOT NULL,
    [maturity_date] [datetime] NOT NULL,
    [curve_value] [float] NOT NULL,
    [create_user] [varchar](50) NULL,
    [create_ts] [datetime] NULL,
    [update_user] [varchar](50) NULL,
    [update_ts] [datetime] NULL,
    [bid_value] [float] NULL,
    [ask_value] [float] NULL,
    [is_dst] [int] NOT NULL,
 CONSTRAINT [IX_unique_source_curve_def_id_index] UNIQUE NONCLUSTERED 
(
    [as_of_date] ASC,
    [source_curve_def_id] ASC,
    [maturity_date] ASC,
    [is_dst] ASC,
    [curve_source_value_id] ASC,
    [Assessment_curve_type_value_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

ALTER TABLE [dbo].[source_price_curve]  WITH NOCHECK ADD  CONSTRAINT [FK_source_curve_def_id] FOREIGN KEY([source_curve_def_id])
REFERENCES [dbo].[source_price_curve_def] ([source_curve_def_id])
ON DELETE CASCADE
GO

ALTER TABLE [dbo].[source_price_curve] CHECK CONSTRAINT [FK_source_curve_def_id]
GO

The following indexes available in the table source_price_curve (many were created long back and may not be used)

CREATE CLUSTERED INDEX [source_curve_def_id_index] ON [dbo].[source_price_curve]
(
    [as_of_date] ASC,
    [source_curve_def_id] ASC,
    [maturity_date] ASC,
    [is_dst] ASC,
    [curve_source_value_id] ASC
)
CREATE NONCLUSTERED INDEX [source_price_curve_123] ON [dbo].[source_price_curve]
(
    [source_curve_def_id] ASC,
    [as_of_date] ASC
)
INCLUDE (   [maturity_date]) 
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_as_of_date_curve_source_value_id] 
ON [dbo].[source_price_curve]
(
    [as_of_date] ASC,
    [curve_source_value_id] ASC
)
INCLUDE (   [curve_value],
    [maturity_date],
    [source_curve_def_id])
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_as_of_date_curve_source_value_id_Assessment_curve_type_value_id] 
ON [dbo].[source_price_curve]
(
    [as_of_date] ASC,
    [curve_source_value_id] ASC,
    [Assessment_curve_type_value_id] ASC
)
INCLUDE (   [curve_value],
    [maturity_date],
    [source_curve_def_id]) 
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_maturity_date] ON [dbo].[source_price_curve]
(
    [maturity_date] ASC
)
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_source_curve_def_id_curve_source_value_id]
 ON [dbo].[source_price_curve]
(
    [source_curve_def_id] ASC,
    [curve_source_value_id] ASC
)
INCLUDE (   [as_of_date],
    [Assessment_curve_type_value_id],
    [curve_value],
    [is_dst],
    [maturity_date]) 
CREATE NONCLUSTERED INDEX [IX_PT_source_price_curve_source_curve_def_id111] 
ON [dbo].[source_price_curve]
(
    [source_curve_def_id] ASC,
    [Assessment_curve_type_value_id] ASC,
    [curve_source_value_id] ASC,
    [maturity_date] ASC,
    [as_of_date] ASC
)

I hope all these explanations will help to analyze! Thanks in advance.

Best Answer

Why does the inner join to a one record temp table make the query take so much longer time?

Without the join, the optimizer is smart enough to work out that it can find the minimum value by reading one row from the end of the index.

Unfortunately, it is not currently equipped to apply the same sort of logic when the query is more complicated (with a join or grouping clause, for example). To work around this limitation, you can rewrite the query to compute local minimums per row in the temporary table, then find the global minimum.

Perhaps the easiest way to express this in T-SQL is to use the APPLY operator:

SELECT
    -- Global minimum
    @tenor_from = MIN(MinMaturityPerCurveID.maturity_date)
FROM #source_price_curve_list AS SPCL
CROSS APPLY
(
    -- Minimum maturity_date per price_curve_id
    SELECT TOP (1) 
        SPC.maturity_date
    FROM  dbo.source_price_curve AS SPC
    WHERE
        SPC.source_curve_def_id = SPCL.price_curve_id
         and as_of_date >= @as_of_date_from 
    ORDER BY
        SPC.maturity_date ASC
) AS MinMaturityPerCurveID;

Good performance relies on there being many rows per price_curve_id. You may need an index of the form:

CREATE NONCLUSTERED INDEX
    [IX dbo.source_price_curve source_curve_def_id, maturity_date, as_of_date]
ON dbo.source_price_curve 
(
    source_curve_def_id,
    maturity_date,
    as_of_date
);

Related Solutions

Sql-server – deteriorating stored procedure running times

What is up with FROM part JOIN model ON 1=1? This the same as FROM part, model, which is a cartesian join and will result in a very large number of rows. Is that join supposed to be like that?

You will likely help us help you if you provide details about the tables involved. Please "script" the definition of the tables, along with any indexes defined on those tables.

This sounds like a classic case of parameter sniffing resulting in good plan/bad plan choices for various scenarios in your data.

You may be able to get more reliable performance by making SQL Server cache different plans for different scenarios by using sp_executesql, as in the following example:

CREATE PROCEDURE [dbo].[create_grid_materials2] 
(
    @partlistid bigint
    , @pid bigint
    , @masterid bigint
)
AS
BEGIN
    begin
        DECLARE @cmd NVARCHAR(MAX);

        SET @cmd = '   
        INSERT INTO material (partid, personid, modelID)
        SELECT 
            partid = part.id
            , personid = @pid
            , modelid = model.id  
        FROM part
            INNER JOIN model ON 1=1
        WHERE (
            model.masterid = ' + CONVERT(NVARCHAR(50), @masterid) + ' 
                AND model.modelSetID IS NULL
                AND part.partlistid = ' + CONVERT(NVARCHAR(50), @partlistid) + '
                AND (
                    part.partType = 100 
                    or part.partType=120 
                    or part.partType = 130
                )
            )
            AND NOT EXISTS (
                SELECT 1 
                FROM material AS a1 
                WHERE a1.partid = part.id 
                    AND a1.personid=@pid 
                    AND a1.modelid=model.id
                )';
        DECLARE @Params VARCHAR(200);
        SET @Params = '@pid INT';
        EXEC sys.sp_executesql @cmd
            , @Params
            , @pid = @pid;
    end
End

The above code will cause a new plan to be generated for each combination of @partlistid, and @masterid.

The presumption here is some combinations of those two variables lead to a very small number of rows, whereas some combinations lead to a very large number of rows.

Forcing a plan for each combination allows SQL Server to generate more efficient plans for each. I've explicitly not included @pid since you probably want to try it with a fairly small number of combinations first; adding a third variable to the mix will make for an exponentially larger number of possible plans.

MySQL query taking too long

To do the filtering sooner, change the subquery to

( SELECT email,
         COUNT(*) AS email_ct
      FROM oc_order
      GROUP BY  email
      HAVING COUNT(*) > 3 ) AS xxx

Then remove xxx.`Orders Count` >3 as being redundant.

To make the subquery run faster, oc_orders needs INDEX(email)

In oc_order_total, change KEY order_id (order_id) to KEY order_id (order_id, code)

With email_ct, you can add this to the main part of the query:

email_ct as 'Orders Count',

Best Answer

Related Solutions

Sql-server – deteriorating stored procedure running times

MySQL query taking too long

Related Question