Sql-server – Varbinary startswith query using index

indexsql servervarbinary

I've got this tree-like key (think object OIDs–it's very similar) to store and index in a table, for which queries want to select subtrees thereof. The most straightforward way to do this on a btree is to use the tree path as the single key and descend the key with a starts-with operation to find the row blocks.

So the obvious representation is VARBINARY(n)*. I cannot think of a way to express starts with in a way that SQL Server can figure out to use the index thereof.

The next-most obvious is to pack the data into some VARCHAR with a binary collation and use a like operation; however I've discovered that like @n + '%' will never use an index, but only like 'literal%. Adding the WITH INDEX hint still scans the index in its entirety because SQL server just doesn't understand. I really don't want to think about how to escape arguments in SQL so I can build an EXEC. That's just madness and a security disaster waiting to happen.

Expressing this in ordinary recursive tables and querying with recursive SQL is out of the question. The recursive query will power-sink the database server.

I actually have the data in a long string of long variables right now and can select my serialization form. As I said, the natural form is VARBINARY and it really would look like 0x000000100004A010000000D0000A000. The most common query is of the form "give me everything starting with 0x000000100004A01" or in natural model, the first n values from the string of long values. I could write them in the form a.b.c.d.e.f... but the individual numbers are long and come from clicking on things on the GUI.

Roughly speaking I'm looking at a plausible form of

CREATE TABLE Record (
   RecordId BIGINT NOT NULL IDENTITY(1,1),
   RecordedDate DATETIME NOT NULL,
   RecordClass CHAR(1) NOT NULL,
   UserId INT NOT NULL,
   ObjectId VARBINARY(384) NOT NULL,
   RecordValue NVARCHAR(100) NULL,
   OwnerId BIGINT NULL, -- Joins to another table
   SubOwnerId BIGINT NULL, -- and yet another table
   PRIMARY KEY (RecordId)
)
CREATE INDEX name ON Record(RecordedDate);
CREATE INDEX name ON Record(OwnerId);
CREATE INDEX name ON Record(SubOwnerId);
CREATE INDEX name ON Record(ObjectId);

What is the best way to do a starts-with index?

*I've computed the largest possible n and it's less than 400

Best Answer

So as a matter of fact, starts with is not so hard to write as it would first seem. Given the initial key of 0x000000100004, obviously equals won't work and there is no starts with operation, but we can write ObjectId >= 0x000000100004 AND ObjectId < 0x000000100005.

By forcing the string of long values to VARBINARY conversion to be big-endian the natural form actually works correctly. We want >= (value0 & value1 & ... & lastvalue) AND < (value0 & value1 & ... & (lastvalue + 1)). As far as I know, the conversion to VARBINARY has to be written in application code, which is fine for me.