Sql-server – How to make a union view execute more efficiently

indexsql serverunionview

I have a large table (tens to hundreds of millions of records) that we have split for performance reasons into active and archive tables, using a direct field mapping, and running an archive process every night.

In several places in our code we need to run queries that combine the active and archive tables, almost invariably filtered by one or more fields (which we've obviously put indexes on in both tables). For convenience it would make sense to have a view like this:

create view vMyTable_Combined as
select * from MyTable_Active
union all
select * from MyTable_Archive

But if I run a query like

select * from vMyTable_Combined where IndexedField = @val

it's going to do the union on everything from Active and Store before filtering by @val, which is going to kill performance.

Is there any clever way of making the two sub-queries of the union view each filter by @val before they create the union?

Or maybe there's some other approach you would suggest that achieves what I'm going for, i.e. an easy and efficient way of getting the union record set, filtered by the indexed field?

EDIT: here's the execution plan (and you get to see the real table names here):

execution plan

Oddly enough, the active table is actually using the correct index (plus a RID lookup?) but the archive table is doing a table scan!

Best Answer

The comments on the question show that the issue is that the test database the OP was using to develop the query had radically different data characteristics than the production database. It had much fewer rows and the field being used for filtering wasn't selective enough.

When the number of distinct values in a column is too small the index may not be sufficiently selective. In this case a sequential table scan is cheaper than an index seek/row lookup operation. Typically a table scan makes exensive use of sequential I/O, which is much faster than random access reads.

Often, if a query would return more than just a few percent of rows it will be cheaper just to do a table scan than an index seek/row lookup or similar operation that makes heavy use of random I/O.