Sql-server – Finite State Machines in SQL

enterprise-editionperformancequery-performancesql serversql-server-2012

I'd like some input on an issue I'm having. We have a section of code that we repeat throughout our stored procedures, and every time it takes quite some time to process, and when combined the number of reads runs into the hundreds of millions on a set of hundreds of thousands of Items. Basically we have Items, and Items can have up to 12 machines, each with their own state.

These are the (simplified) table structures:

CREATE TABLE dbo.ItemMachineState
(
    [itemID] [int],
    [machineID] [int],
    [stateID] [int]
)

CREATE TABLE dbo.Transition
(
    [machineID] [int] NOT NULL,
    [eventID] [int] NOT NULL,
    [stateID] [int] NOT NULL,
    [nextStateID] [int] NOT NULL
)

What happens is, during processing we create a #temp table which we work against, and it eventually has an eventID per Item. That temp table is then joined back to ItemState and Transition, like so:

UPDATE  dbo.ItemState
SET     stateID = tr.nextStateID
FROM    #temp t  
JOIN    dbo.ItemMachineState ist ON ist.itemID = t.itemID
JOIN    Transition tr ON tr.stateID = ist.stateID AND
                         tr.machineID = ist.machineID AND
                         tr.eventID = t.eventID

So the eventID, which we calculate, determines what will happen to a given Item's Machines, depending on what State they're each in. The issue is that an event can manipulate zero or more machine states in one movement, if that event is relevant to that particular combo of state and machine.

Here's an example of one of these state shifts:

ItemID 3468489 first looks like this in ItemMachineState…

itemID      machineID   stateID
----------- ----------- -----------
3468489     12          4
3468489     14          113
3468489     15          157
3468489     16          165
3468489     18          169
3468489     19          165
3468489     20          157
3468489     21          165
3468489     23          173
3468489     24          173
3468489     26          9
3468489     36          9

We do some work, and eventually have a #temp table that has an ItemID and an EventID…

itemID      eventID
----------- -----------
3468489     64

Then we join both of these tables to Transition, which looks like this for this particular eventID:

machineID   eventID     stateID     nextStateID
----------- ----------- ----------- -----------
13          64          73          79
13          64          74          79
13          64          75          79
13          64          76          79
13          64          77          79
13          64          78          79
13          64          187         79
13          64          188         79
13          64          189         79
13          64          190         79
13          64          191         79
36          64          9           79
36          64          194         79
36          64          196         79
36          64          208         79
36          64          210         79
36          64          213         79
36          64          218         79
46          64          73          79
47          64          73          79
70          64          73          79
70          64          75          79
70          64          76          79
70          64          77          79
70          64          78          79

Putting it all together:

SELECT  t.itemID, t.eventID, ist.machineID, ist.stateID, tr.nextStateID
FROM    #temp t  
JOIN    dbo.ItemMachineState ist ON ist.itemID = t.itemID
JOIN    Transition tr ON tr.stateID = ist.stateID AND
                         tr.machineID = ist.machineID AND
                         tr.eventID = t.eventID

itemID      eventID     machineID   stateID     nextStateID
----------- ----------- ----------- ----------- -----------
3468489     64          36          9           79

So in this particular example, this Event was only relevant to one Machine for this Item. It's stateID is updated from 9 to 79 on machineID 36, and everything else stays the same for this Item.

I'd like suggestions on how to approach this differently. We can't move away from the table structure, but we can alter how we go about setting stateID to nextStateID during transitions/events. As you can see above this works by elimination; we need the current state and machine to figure out what the next state is, for that machine, for that event. In some cases this won't update anything, other times it will update multiple machines in one go, and we like that ability. I don't think the leanest solution to this problem will be found by simply changing indexes or adding query hints, and that we need a new approach that limits the number of reads and processing time, but gives us the same functionality.

I wanted to avoid bringing indexes and such into this discussion because I'd have to then use real examples, which pollute the essence of what I'm trying to ask here, I changed the name of columns and tables to simplify my question. In any case, here you go:

Query plan
http://pastebin.com/xhPa4t8d, create and index scripts
http://pastebin.com/sp70QuEJ

Note that in the query plan we force an INNER LOOP JOIN. If left to a simple JOIN, the query takes exponentially longer to process.

Using @wBob UNIQUE CLUSTERED index, before:

and after:

Using OPTION (MERGE JOIN, HASH JOIN) resulted in this execution plan and results:

Will update with other information shortly

Best Answer

I would consider not updating all the rows at once and instead running through batches of machines so the volume of records decrease per update. You can keep the same code, just batch it.

Related Solutions

Sql-server – Getting each status change in a table

This type of requirement comes under the banner of "gaps and islands". A popular approach is

WITH T
     AS (SELECT *,
                DENSE_RANK() OVER (PARTITION BY ItemId ORDER BY DateOfChange) - 
                DENSE_RANK() OVER (PARTITION BY ItemId, Status ORDER BY DateOfChange) AS Grp
         FROM   ItemTable)
SELECT ItemId,
       Status,
       MIN(DateOfChange) AS Start,
       MAX(DateOfChange) AS Finish
FROM   T
GROUP  BY ItemId,
          Status,
          Grp
ORDER  BY Start

Sql-server – Full outer join problems

The problem is in your WHERE clause. This piece of code in your WHERE clause for example.

and osnArtikel.SpletObjava='D'

is going to restrict your output whenever osnArtikel.SpletObjava is NULL. Including whenever there wasn't a match on it. (your OUTER results)

One method to fix it is to move your restrictions into the ON clause. Not generally something you want to do since the ON clause is a JOIN but with OUTER JOINs it's not unreasonable.

select stranka.SifraStranke,
--  stranka.DomacaStran,
--  strFaktura.SifraGrucePopusti,
--  popust.OdstotekPopusta,
--  popust.SifraGruceArtikla,
--  artProdaja.SifraArtikla,
    osnArtikel.SifraArtikla,
    cene.ProdajnaCena * ((100-popust.OdstotekPopusta)/100)

from STRStrankeSplosno stranka      -- customer basic data table
left join 
    STRStrankeFaktura strFaktura    -- customer financial data table
    on stranka.SifraStranke=strFaktura.SifraStranke
    and strFaktura.StevilkaSifranta=2
    and strFaktura.SifraGrucePopusti<>''
left join
    ARTPopustiStranke popust        -- customer discount data table for each discount group
    on strFaktura.SifraGrucePopusti=popust.SifraGruceStranke
    and popust.StevilkaSifranta=2
left join
    ARTArtikliProdaja artProdaja    -- item discount group data table
    on popust.SifraGruceArtikla=artProdaja.SifraGrucePopust
    and artProdaja.StevilkaSifranta=2
full outer join
    ARTArtikliOsnovniP osnArtikel   -- item basic data table
    on osnArtikel.SifraArtikla=artProdaja.SifraArtikla
    and osnArtikel.SpletObjava='D'

cross apply (
    select top 1 cene.ProdajnaCena
    from ARTArtikliCeneCP cene      -- item prices data table
    where cene.SifraArtikla=osnArtikel.SifraArtikla
    and cene.SifraCenovnegaPodrocja='MC'
    order by cene.DatumZacetkaVeljavnosti desc
    ) cene
where stranka.DomacaStran<>''
and stranka.StevilkaSifranta=2
order by stranka.SifraStranke, osnArtikel.SifraArtikla

Another method is to include checks for the column being NULL in your WHERE clause. This may or may not be the way you want to go because if the column you are checking is also nullable you could end up with some unintended results. The best method is to check if a non-null column (primary key column for example) is nullable.

select stranka.SifraStranke,
--  stranka.DomacaStran,
--  strFaktura.SifraGrucePopusti,
--  popust.OdstotekPopusta,
--  popust.SifraGruceArtikla,
--  artProdaja.SifraArtikla,
    osnArtikel.SifraArtikla,
    cene.ProdajnaCena * ((100-popust.OdstotekPopusta)/100)

from STRStrankeSplosno stranka      -- customer basic data table
left join 
    STRStrankeFaktura strFaktura    -- customer financial data table
    on stranka.SifraStranke=strFaktura.SifraStranke
left join
    ARTPopustiStranke popust        -- customer discount data table for each discount group
    on strFaktura.SifraGrucePopusti=popust.SifraGruceStranke
left join
    ARTArtikliProdaja artProdaja    -- item discount group data table
    on popust.SifraGruceArtikla=artProdaja.SifraGrucePopust
full outer join
    ARTArtikliOsnovniP osnArtikel   -- item basic data table
    on osnArtikel.SifraArtikla=artProdaja.SifraArtikla

cross apply (
    select top 1 cene.ProdajnaCena
    from ARTArtikliCeneCP cene      -- item prices data table
    where cene.SifraArtikla=osnArtikel.SifraArtikla
    and cene.SifraCenovnegaPodrocja='MC'
    order by cene.DatumZacetkaVeljavnosti desc
    ) cene
where stranka.DomacaStran<>''
and stranka.StevilkaSifranta=2
and (osnArtikel.PrimaryKeyColumn IS NULL 
        OR osnArtikel.SpletObjava='D' )
and (strFaktura.PrimaryKeyColumn IS NULL
    OR (strFaktura.StevilkaSifranta=2
        and strFaktura.SifraGrucePopusti<>'')
    )
and (popust.PrimaryKeyColumn IS NULL 
    OR popust.StevilkaSifranta=2)
and (artProdaja.PrimaryKeyColumn IS NULL 
    OR artProdaja.StevilkaSifranta=2)
order by stranka.SifraStranke, osnArtikel.SifraArtikla

Best Answer

Related Solutions

Sql-server – Getting each status change in a table

Sql-server – Full outer join problems

Related Question