Sql-server – SQL Server 2012 – Frequent SELECT and UPDATE on bit column. Index doesn’t help

indexperformancesql serversql-server-2012

I have a SQL Server table with about 2 million rows in, this table stores short textual documents. Here's my schema:

CREATE TABLE Documents (
    documentId bigint IDENTITY NOT NULL,
    content nvarchar(MAX) NOT NULL,
    isIndexed bit NOT NULL
)

I have a separate Lucene-based indexer process that does this:

SELECT TOP 1 documentId, content FROM Documents WHERE isIndexed = 0

It then performs the indexing operation before executing this:

UPDATE Documents SET isIndexed = 1 WHERE documentId = @documentId

Initially this worked fine. When every document in the table was not indexed (i.e. all isIndexed values were 0) then each retrieval took about 5ms.

However as more and more documents were indexed then the retrieval time slowly increased. Currently it's about 150ms – a 30 time speed reduction. I note that the UPDATE statement always seems to run in under 2ms, so I know that's not a problem.

From the start the table has always had a non-clustered index on the isIndexed column, but the Actual Execution Plan shows the SQL Server uses an Index Scan (rather than Seek).

What can I do to speed up the system?

I am aware that the "isIndexed" column existing is bad in itself, but due to how the indexer works it can't request documents by documentId directly. For this and other reasons I can't accept any answers that don't solve the immediate problem at hand.

Best Answer

An index on the bit column isn't going to help at all because of selectivity. You should consider a filtered index:

CREATE INDEX unindexed ON dbo.Documents(documentId) WHERE IsIndexed = 0;

You should also include an ORDER BY documentId in your SELECT.

Though you should test this in a staging environment because it may help your SELECT but may offset that with a more expensive UPDATE.