Sql-server – Why does this query take so long to execute

azure-sql-databaseperformancequery-performancesql server

I'm currently using Microsoft SQL Azure (RTM) - 12.0.2000.8 as the database. The database currently has 10 DTUs.

The idea here is that I want to make the query based on the view. This view contains of simple SELECT() syntax with 5 table joins. This view give output roughly about 253K rows in about 6 seconds.

CREATE VIEW [dbo].[TopAdsDisplaySumaryView]
AS
SELECT  
        client.Id AS ClientId,      -- (PK, int, not null)
        client.PartnerId,           -- (FK, int, not null)
        adsPict.Id AS AdsPictureId, -- (PK, int, not null)
        adsPict.ImageName,          -- (nvarchar(max), null)
        displayAds.DisplayTo,       -- (datetime, not null)
        displayAds.DisplayFrom      -- (datetime, not null)
FROM      
        dbo.Machines AS machine 
        INNER JOIN dbo.MachineGroups AS machineGroups ON machineGroups.Id = machine.MachineGroupId 
        INNER JOIN dbo.Clients AS client ON client.Id = machineGroups.ClientId 
        INNER JOIN dbo.DisplyadAds AS displayAds ON displayAds.MachineId = machine.Id 
        INNER JOIN dbo.AdsPictures AS adsPict ON adsPict.Id = displayAds.AdsPictureId

enter image description here

Notice that I'm only taking the column from tables clients, displayAds, and adsPictures, but I have to join tables machines and machineGroups in order to get data from clients. displayAds table has the most data.

From here I want to generate the top 5 data with the most data count. This is the query:

SELECT TOP 5
    adsSum.PartnerId as PartnerId,
    adsSum.ClientId as ClientId,
    adsSum.AdsPictureId as AdsPictureId,
    adsSum.ImageName as ImageName,
    count(*) as TotalDisplay
FROM [dbo].[TopAdsDisplaySumaryView] adsSum
GROUP BY adsSum.PartnerId, adsSum.ClientId, adsSum.AdsPictureId, adsSum.ImageName
ORDER BY TotalDisplay DESC

Here's the actual execution plan for this query: https://pastebin.com/CcR9cC5e

When this query is executed, it takes about 1 minute 15 seconds. I'm trying to find out why this query takes a lot of time to execute, while executing the view only takes 6 seconds.

Here's my attempts so far:

  1. Created an index for displayAds table. This gives me ~1 second improvement, no big change.

    CREATE NONCLUSTERED INDEX 
    [MachineIdIndexIncludeAdsPictureIdDisplayFromDisplayTo] ON [dbo].[DisplyadAds]
    (
        [MachineId] ASC
    )
    INCLUDE 
    (   
        [AdsPictureId],
        [DisplayFrom],
        [DisplayTo]
    )
    GO
    
  2. Created an index for adsPictures table. Nothing changed.

    CREATE NONCLUSTERED INDEX [IX_AdsPictures_IncludeImageName] ON [dbo].[AdsPictures]
    (
       [Id] ASC
    )
    INCLUDE ( [ImageName])
    GO
    
  3. Tried to remove column adsPict.ImageName in the query, the execution time was greatly improved to only 3 seconds (~72 sec. improvement!). I suspect that column is the problem, but I need it to show.

The maximum record length of ImageName is 36. Mostly around 11~20. There are 55 rows in AdsPictures table.

Any idea how to resolve this problem? Any help would be greatly appreciated.

Best Answer

Here's what I see in the XML for your actual plan:

<WaitStats>
<Wait WaitType="SOS_SCHEDULER_YIELD" WaitTimeMs="66883" WaitCount="1649" />
<Wait WaitType="RESOURCE_GOVERNOR_IDLE" WaitTimeMs="56086" WaitCount="3980" />
</WaitStats>
<QueryTimeStats CpuTime="6665" ElapsedTime="73522" />

You can also see this information in the root node of the plan in SSMS if you don't like looking at XML:

enter image description here

You can read about the SOS_SCHEDULER_YIELD wait here. A good layman's explanation is that your query was only able to execute on a CPU for 6639 ms. It spent 66883 ms on a queue waiting to get on a CPU. Based on your description of the problem ("executing the view only takes 6 seconds"), you have a server performance issue instead of a query performance issue. Of course, there may be ways to make the query use even less CPU, but I would approach this problem by verifying that the server has the correct number of CPU cores/DTUs/whatever and by looking at the queries that use the most CPU on the server. Your query could run almost 12X faster if there was less load on the server.