Sql-server – Is the concept of a clustered index in a DB design sensical when using SSDs

clustered-indexsql server

When designing a SQL server data schema and the subsequent queries, sprocs, views, etc. does the notion of a clustered index and order of data on disk make any sense to consider for DB designs made explicitly to be deployed on SSD platforms?

http://msdn.microsoft.com/en-us/library/aa933131(v=sql.80).aspx
"A clustered index determines the physical order of data in a table."

On a physical disk platform, the design to consider them makes sense to me as a physical scan of the data to retrieve "sequential" rows can be more performant than a seek through the table.

On an SSD platform, all data read access uses an identical seek. There is no concept of "physical order" and data reads are not "sequential" in the sense that bits are stored on the same piece of silicon.

So, in the process of designining an application database is the clustered index consideration relevant to this platform?

My initial thought is that it is not because the idea of "ordered data" doesn't apply to SSDs storage and seek/retreival optimization.

EDIT: I know the SQL Server will create one, I'm just philosophizing about whether it makes sense to think about it during design/optimization.

Best Answer

Ask yourself another question: If the entire database is in memory and I never have to touch the disk, do I want to store my data in an ordered B-tree or do I want to store my data in an unordered heap?

The answer to this question will depend on your access pattern. On most cases your access requires single row look-up (ie. seeks) and range scans. These access patterns require a B-Tree, otherwise they are inefficient. Some other access patterns, common in DW and OLAP, are always doing aggregates over the entire table end-to-end always and they do no benefit from range scans. As you drill further other requirements come to light, like the speed of insert and allocation into a heap vs. B-Tree may play a role for huge ETL transfer jobs. But most times the answer really boils down to one question: do you seek or range-scan? The overwhelming number of times the answer is YES. And therefore the overwhelming number of times the design requires a clustered index.

In other words: just because is cheap to read it from disk in random order does not imply that you can trash your TLBs and L2 lines in a 64Gb RAM scan bonanza...