Sql-server – Clustered Index and Primary Key on table with unique identifier and date

clustered-indexsql serversql-server-2005

I have table with Orders with unique incrementing ID and (order creation) Date. This table is quite large and wide (5mn rows, 50 columns, 10 columns of which are FK IDs). Example:

CREATE TABLE Orders As (
    ID int UNIQUE AUTOINCREMENT (1,1),
    DATE datetime,  --This is SQL Server 2005, so no [date] data type
    OrderStatus char (2),
    ClientID int
    ...
)

I get this table as copy of yesterday's production table. Table (as whole DB) is used for reporting purposes and therefore read-only.

Part 1

I have 2 very common use cases:

  1. 80% (or more) of queries have DATE column in WHERE clause, as users want data for specific business date.

Here I want to create clustered index on DATE.

  1. 40%-60% of queries use OrderID to JOIN Orders table to other tables with information about Order details (Product, Supplpier, Payments, Reservations, etc).

Here I want to create clustered index on OrderID.

Part 2

Can I have one CI for both cases?
Index (ID, DATE) will not work with WHERE clause.
Index (DATE,ID) will not work for joins on ID only or ID in WHERE

Catch. We know that DATE is incremental as well as ID. ID with higher value cannot appear yesterday under any circumstances as it is AUTOINCREMENT.

Question. Is there a way to tell SQL Server that CI (DATE, ID) will have ID sequentially ordered for all dates?

My only solution at the moment is crating non clustered covering index (ID, DATE), but it's suboptimal.

I've searched for some time, but could not find anything. If there is a solutions for later versions of SQL Server I would be interested in it too.

Update

I know clustered index basics. Please note, database is put in read-only state for users.

Logically you can just ignore Date column in (Date, ID) index without any harm at all. Possibly this is a very specific use case that is not yet covered by SQL Server functionality.

Best Answer

The main characteristic of the clustered index is that the data is all in that order. That allows SQL to "read ahead". So for example if you created your CI on the Date column and ran queries pulling summary info for a week or a month then SQL could pull the data more quickly.

If on the other hand you are having to do seeks (a single Id for example) then the CI is no different than a covering NCI. Note the covering part of that.

Here is something you could try. Place your CI on the Date, OrderId combination. Do this because you are more likely to pull range data on a date than an Id. Also because you said that 80% of your queries use the date. Some portion of these will presumably also use the OrderId.

Then see what columns the queries that just use OrderId use and add those columns to an OrderId NCI using the INCLUDE clause. And just to save time if the answer is "they are using all of the columns on these queries" then a) you need to look at those queries and see if they really need all that data and b) yes you could INCLUDE all of the columns in the table.

I realize that would be a HUGE INCLUDE. In fact it would double the size of your table and you'll want to carefully test that you aren't going to have a serious negative impact on write operations. However for reads it should work just fine.