I am a database designer in a data warehouse environment. I am used to dealing with tables of at most 1 million rows and am now faced with tables of more than half a billion rows. Are there any significant differences with the tools in the "efficiency toolbox"? Can I trust my previous knowledge of indexes, partitions, and the like, or are some of these familiar tools more of a hindrance than a help with data this large? Any other tips for dealing with such tables?
(Already found a great post on Updating 700 million rows to same value)
Best Answer
The fundamentals of indexing etc. all work in exactly the same way, so strictly speaking the only difference is the cost of getting this wrong!
That said, here's a (not necessarily complete) list of things worth bearing in mind:
- Consider lowering estimate_percent when gathering stats so that less of the table is sampled (a sketch of this follows the list).
- If you're not already familiar with reading and understanding execution plans, I'd spend some time learning them (see the example after this list): you're bound to run into performance issues at some point, so knowing how to diagnose the problem correctly will become more important, as it's harder to add new indexes or make schema changes when your row counts are this large.
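As a rough illustration of the estimate_percent point, a stats-gathering call might look like the following. This is a minimal sketch: the schema name (DWH), table name (SALES_FACT), and the 5% sample figure are placeholder assumptions, not values from the question.

```sql
-- Sketch only: DWH, SALES_FACT and the 5% figure are hypothetical examples.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'DWH',
    tabname          => 'SALES_FACT',
    estimate_percent => 5,      -- sample roughly 5% of the rows instead of the whole table
    cascade          => TRUE    -- also gather stats on the table's indexes
  );
END;
/
```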
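And for getting started with execution plans, one basic pattern is EXPLAIN PLAN followed by DBMS_XPLAN.DISPLAY. Again a sketch: the table and predicate below are made up for illustration.

```sql
-- Produce an estimated plan for a query against a (hypothetical) large fact table
EXPLAIN PLAN FOR
  SELECT COUNT(*)
  FROM   sales_fact
  WHERE  sale_date >= DATE '2024-01-01';

-- Display the plan that was just generated
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```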