Sql-server – Loading a FACT table per day

etl, performance, sql server, sql-server-2008

I have a database with 13 billion rows, and each day adds around 20-30 million rows. On top of this I have one cube; one of its dimensions is DateTime, which goes down to milliseconds. To load the fact table I use the following query as a T-SQL task within SSIS:

INSERT INTO [FACT].[DataMine]
SELECT MONTH(RDM.[DATE]) AS PartitionID,
       RDM.DateTime_Key,
       Price,
       Amount
FROM   [RAW].[DataMine] RDM
INNER JOIN [DIM].[DateTime] DDT
ON RDM.DateTime_Key = DDT.DateTime_Key
WHERE RDM.[DATE] BETWEEN '2011-11-28' AND '2011-11-28' AND
      RDM.DateTime_Key NOT IN (SELECT DISTINCT DM.DateTime_Key
                           FROM [FACT].[DataMine] DM
                           INNER JOIN [DIM].[DateTime] DT
                           ON DM.DateTime_Key = DT.DateTime_Key
                           WHERE DT.[DATE] BETWEEN '2011-11-28' AND '2011-11-28')

PartitionID is used because I partition the FACT table by month. I have to be able to run the load over a given date range without worrying about duplicate rows, so the query first checks whether the rows are already loaded.

Performance is normally not bad: I need around 7-8 minutes for one day of data. But suddenly the load time shoots up and takes more than 1 hour for one day of data. What puzzles me is that the load time doesn't go up gradually. Looking at the SQL Server, I see that it is busy in tempdb and I see quite some disk I/O (even though the SQL Server has around 140 GB of RAM still free to grab).

Indexes are all up to date, there is no fragmentation, and statistics are also looking good.

What am I missing to understand where this sudden performance drop comes from?

Machine is:
(SQL 2008 R2 64bit / 8 cores / 192 GB RAM / SAN Disks / 10GbE)

Best Answer

The execution plan is likely to be changing.

Grab a copy of a fast plan and a slow plan and compare.

By using a plan guide you may be able to force the query to use the same plan every time (after testing, of course).
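
One possible way to do this on SQL Server 2008 and later is sketched below. It pulls the cached plan for the load statement out of the plan cache and then freezes it with sp_create_plan_guide_from_handle. The LIKE filter and the plan guide name PG_FactDataMine_Load are assumptions made for illustration, and the cached plan only shows estimates, so treat this as a starting point rather than a finished procedure.

```sql
-- 1. Find the cached plan(s) for the load statement.
--    Save query_plan as a .sqlplan file once on a fast day and once on a slow day, then compare.
SELECT qs.plan_handle,
       qs.statement_start_offset,
       qs.execution_count,
       qs.last_execution_time,
       qs.total_elapsed_time,
       st.text,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE N'%FACT%DataMine%'                 -- assumed filter, adjust to your statement
  AND st.text NOT LIKE N'%dm_exec_query_stats%'       -- skip this diagnostic query itself
ORDER BY qs.last_execution_time DESC;

-- 2. While the FAST plan is still in cache, pin it with a plan guide.
DECLARE @plan_handle varbinary(64), @offset int;

SELECT TOP (1)
       @plan_handle = qs.plan_handle,
       @offset      = qs.statement_start_offset
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'%FACT%DataMine%'
  AND st.text NOT LIKE N'%dm_exec_query_stats%'
ORDER BY qs.last_execution_time DESC;

EXEC sp_create_plan_guide_from_handle
     @name = N'PG_FactDataMine_Load',                 -- hypothetical guide name
     @plan_handle = @plan_handle,
     @statement_start_offset = @offset;
```

A plan guide only matches if the statement text stays identical, so a load that embeds a different date literal on every run may need to be parameterised first.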

Related Solutions

Sql-server – Optimising join on large table

Your ix_hugetable looks quite useless because:

- it is the clustered index (PK)
- the INCLUDE makes no difference because a clustered index INCLUDEs all non-key columns (non-key values at lowest leaf = INCLUDEd = what a clustered index is)

In addition:

- added or fk should be first
- ID is first = not much use

Try changing the clustered key to (added, fk, id) and drop ix_hugetable. You've already tried (fk, added, id). If nothing else, you'll save a lot of disk space and index maintenance

Another option might be to try the FORCE ORDER hint with the table order both ways and no JOIN/INDEX hints. I try not to use JOIN/INDEX hints personally because you remove options for the optimiser. Many years ago I was told (at a seminar with a SQL guru) that the FORCE ORDER hint can help when you have a huge table JOIN small table: YMMV 7 years later...
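
As a rough illustration of that hint, the sketch below joins a small table to the huge one and tells the optimiser to keep the join order exactly as written. The table dbo.smalltable, the join column, and the date filter are assumptions, since the original query is not shown here.

```sql
-- Hypothetical query shape: dbo.smalltable, its fk column and the date literal are assumed.
SELECT s.fk,
       COUNT(h.value) AS value_count
FROM dbo.smalltable AS s           -- small table written first
INNER JOIN dbo.hugetable AS h      -- huge table written second
    ON h.fk = s.fk
WHERE h.added >= '20110101'
GROUP BY s.fk
OPTION (FORCE ORDER);              -- joins are performed in the order given in FROM
```

To test the other direction, swap the two tables in the FROM clause and compare the plans and timings.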

Oh, and let us know where the DBA lives so we can arrange for some percussion adjustment

Edit, after 02 Jun update

The 4th column is not part of the non-clustered index so it uses the clustered index.

Try changing the NC index to INCLUDE the value column so it doesn't have to fetch the value column from the clustered index.

create nonclustered index ix_hugetable on dbo.hugetable (
    fk asc, added asc
) include(value)

Note: If value is not nullable then COUNT(value) is semantically the same as COUNT(*). But SUM needs the actual value, not just its existence.

As an example, if you change COUNT(value) to COUNT(DISTINCT value) without changing the index it should break the query again because it has to process value as a value, not as existence.

The query needs 3 columns: added, fk, value. The first 2 are filtered/joined so are key columns. value is just used so can be included. Classic use of a covering index.

Sql-server – Query Performance depending on parameter 21seconds vs. > 14hours

I don't know about SQL Server, but in an Oracle database I would suggest making a trace of the SQL execution, one that includes all waits and events that cause the query to spend time. This shows the exact circumstances in which the SQL is executing, which might be very different from those in the environment where you did the explain plan. SQL Server without doubt has a similar feature to show the real execution, including waits.
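
On the SQL Server side, one approximate counterpart is to query sys.dm_exec_requests from a second session while the slow statement is running. This is only a minimal sketch: it shows the current wait of each active request, not a full wait history like an Oracle trace would.

```sql
-- Run from a second session while the slow statement is executing.
SELECT r.session_id,
       r.status,
       r.wait_type,            -- what the request is waiting on right now
       r.wait_time,            -- duration of the current wait, in milliseconds
       r.last_wait_type,
       r.cpu_time,
       r.reads,
       r.writes,
       r.logical_reads,
       st.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
WHERE r.session_id <> @@SPID;  -- exclude this monitoring session
```

Sampling this a few times during a fast run and a slow run can show whether the time goes to I/O, memory grants, or something else.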

In Oracle we have SQL plan stability. Maybe SQL Server has something similar? In that case, try to use that.
