Sql-server – Optimising join on large table

performance, sql-server-2008

I am trying to coax some more performance out of a query that is accessing a table with ~250-million records. From my reading of the actual (not estimated) execution plan, the first bottleneck is a query that looks like this:

select
    b.stuff,
    a.added,
    a.value
from
    dbo.hugetable a
    inner join
    #smalltable b on a.fk = b.pk
where
    a.added between @start and @end;

See further down for the definitions of the tables & indexes involved.

The execution plan indicates that a nested loop is being used on #smalltable, and that the index scan over hugetable is being executed 480 times (once for each row in #smalltable). This seems backwards to me, so I've tried to force a merge join to be used instead:

select
    b.stuff,
    a.added,
    a.value
from
    dbo.hugetable a with(index = ix_hugetable)
    inner merge join
    #smalltable b with(index(1)) on a.fk = b.pk
where
    a.added between @start and @end;

The index in question (see below for full definition) covers columns fk (the join predicate), added (used in the where clause) & id (useless) in ascending order, and includes value.

When I do this, however, the query blows out from 2 1/2 minutes to over 9. I would have hoped that the hints would force a more efficient join that only does a single pass over each table, but clearly not.
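A hash join may also be worth trying before the merge join: unlike a merge join, it does not require both inputs to be sorted on the join key, which matters when the join columns are uniqueidentifiers. A sketch, assuming the same tables and variables as above:

```sql
-- Force only the join type and leave the optimiser free to pick access paths;
-- a hash join builds a hash table on the small input and probes with the large one,
-- avoiding the sort a merge join would otherwise need.
select
    b.stuff,
    a.added,
    a.value
from
    dbo.hugetable a
    inner hash join
    #smalltable b on a.fk = b.pk
where
    a.added between @start and @end;
```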

Any guidance is welcome. Additional information provided if required.

Update (2011/06/02)

Having reorganised the indexing on the table, I have made significant performance inroads; however, I have hit a new obstacle when it comes to summarising the data in the huge table. The outcome is a summarisation by month, which currently looks like the following:

select
    b.stuff,
    datediff(month, 0, a.added),
    count(a.value),
    sum(case when a.value > 0 then 1 else 0 end) -- this triples the running time!
from
    dbo.hugetable a
    inner join
    #smalltable b on a.fk = b.pk
group by
    b.stuff,
    datediff(month, 0, a.added);
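As an aside, `datediff(month, 0, added)` yields an integer count of months since 1900-01-01. If the month is wanted as an actual date, the usual idiom wraps it back in a `dateadd`; a sketch against the same schema:

```sql
-- dateadd(month, datediff(month, 0, added), 0) rounds 'added' down to the
-- first day of its month, grouping on a date rather than an integer.
select
    b.stuff,
    dateadd(month, datediff(month, 0, a.added), 0) as month_start,
    count(a.value),
    sum(case when a.value > 0 then 1 else 0 end)
from
    dbo.hugetable a
    inner join
    #smalltable b on a.fk = b.pk
group by
    b.stuff,
    dateadd(month, datediff(month, 0, a.added), 0);
```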

At present, hugetable has a clustered index pk_hugetable (added, fk) (the primary key), and a non-clustered index going the other way ix_hugetable (fk, added).

Without the 4th column above, the optimiser uses a nested loop join as before, using #smalltable as the outer input, and a non-clustered index seek as the inner loop (executing 480 times again). What concerns me is the disparity between the estimated rows (12,958.4) and actual rows (74,668,468). The relative cost of these seeks is 45%. Running time is however under a minute.

With the 4th column, the running time spikes to 4 minutes. It seeks on the clustered index this time (2 executions) for the same relative cost (45%), aggregates via a hash match (30%), then does a hash join on #smalltable (0%).

I'm unsure as to my next course of action. My concern is that neither the date range search nor the join predicate is guaranteed or even all that likely to drastically reduce the result set. The date range in most cases will only trim maybe 10-15% of records, and the inner join on fk may filter out maybe 20-30%.
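Given the gap between estimated (12,958) and actual (74.6 million) rows, it may also be worth ruling out stale or sampled statistics before anything else; cardinality estimates that far off usually produce a bad plan regardless of indexing. A sketch (a full scan on a 250M-row table will take a while):

```sql
-- Rebuild all statistics on the big table from a full scan rather than a sample,
-- so the optimiser's row estimates for the date range and join are accurate.
update statistics dbo.hugetable with fullscan;
```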


As requested by Will A, the results of sp_spaceused:

name      | rows      | reserved    | data        | index_size  | unused
hugetable | 261774373 | 93552920 KB | 18373816 KB | 75167432 KB | 11672 KB

#smalltable is defined as:

create table #endpoints (
    pk uniqueidentifier primary key clustered,
    stuff varchar(6) null
);

While dbo.hugetable is defined as:

create table dbo.hugetable (
    id uniqueidentifier not null,
    fk uniqueidentifier not null,
    added datetime not null,
    value decimal(13, 3) not null,

    constraint pk_hugetable primary key clustered (
        fk asc,
        added asc,
        id asc
    )
    with (
        pad_index = off, statistics_norecompute = off,
        ignore_dup_key = off, allow_row_locks = on,
        allow_page_locks = on
    )
    on [primary]
)
on [primary];

With the following index defined:

create nonclustered index ix_hugetable on dbo.hugetable (
    fk asc, added asc, id asc
) include(value) with (
    pad_index = off, statistics_norecompute = off,
    sort_in_tempdb = off, ignore_dup_key = off,
    drop_existing = off, online = off,
    allow_row_locks = on, allow_page_locks = on
)
on [primary];

The id field is redundant, an artefact from a previous DBA who insisted that all tables everywhere should have a GUID, no exceptions.

Best Answer

Your ix_hugetable looks quite useless because:

  • its key (fk, added, id) is identical to the clustered index (PK)
  • the INCLUDE makes no difference, because a clustered index already includes all non-key columns (non-key values at the leaf level = INCLUDEd = what a clustered index is)

In addition:

  • either added or fk should be the leading column
  • id as a key column adds nothing useful

Try changing the clustered key to (added, fk, id) and dropping ix_hugetable. You've already tried (fk, added, id). If nothing else, you'll save a lot of disk space and index maintenance.
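A sketch of that change, assuming the pk_hugetable/ix_hugetable names from the question. Note this is a hypothetical maintenance script: changing the clustered key rewrites the entire 250M-row table, so it needs a maintenance window and plenty of log space.

```sql
-- Drop the redundant NC index first, so it isn't rebuilt when the
-- clustered index is dropped and its row locators change.
drop index ix_hugetable on dbo.hugetable;

-- Recreate the clustered PK with 'added' leading, so the date-range
-- predicate can drive a single contiguous range seek.
alter table dbo.hugetable drop constraint pk_hugetable;

alter table dbo.hugetable add constraint pk_hugetable
    primary key clustered (added asc, fk asc, id asc);
```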

Another option might be to try the FORCE ORDER hint, with the table order both ways and no JOIN/INDEX hints. I try not to use JOIN/INDEX hints personally, because they remove options from the optimiser. Many years ago I was told (at a seminar with a SQL guru) that the FORCE ORDER hint can help when you have a huge table JOIN small table; YMMV seven years later...
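For illustration, FORCE ORDER pins only the join order to the order the tables appear in the FROM clause, while leaving join type and index choice to the optimiser. A sketch with #smalltable listed first:

```sql
-- FORCE ORDER: join in FROM-clause order (#smalltable driving hugetable here);
-- swap the two tables in the FROM clause to test the other order.
select
    b.stuff,
    a.added,
    a.value
from
    #smalltable b
    inner join
    dbo.hugetable a on a.fk = b.pk
where
    a.added between @start and @end
option (force order);
```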

Oh, and let us know where the DBA lives so we can arrange for some percussion adjustment

Edit, after 02 Jun update

The 4th column is not part of the non-clustered index, so the optimiser falls back to the clustered index.

Try changing the NC index to INCLUDE the value column, so the query doesn't have to go to the clustered index for value:

create nonclustered index ix_hugetable on dbo.hugetable (
    fk asc, added asc
) include(value)

Note: if value is not nullable, then COUNT(value) is semantically the same as COUNT(*). But SUM needs the actual value, not mere existence.

As an example, if you change COUNT(value) to COUNT(DISTINCT value) without changing the index, performance should collapse again, because the query has to process value as a value, not as mere existence.

The query needs three columns: added, fk, value. The first two are filtered/joined on, so they are key columns; value is only output, so it can be INCLUDEd. A classic use of a covering index.
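On a table this size, rebuilding the existing index in place avoids a window with no index at all. A sketch using options available in SQL Server 2008 (SORT_IN_TEMPDB assumes tempdb has room for the sort):

```sql
-- DROP_EXISTING replaces ix_hugetable in a single operation;
-- SORT_IN_TEMPDB moves the index sort out of the user database.
create nonclustered index ix_hugetable on dbo.hugetable (
    fk asc, added asc
) include(value)
with (drop_existing = on, sort_in_tempdb = on)
on [primary];
```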