I am building a workstation with the purpose of analyzing multiple terabytes of data. Hundreds of tables, hundreds of columns, various data types throughout. This will include slicing-and-dicing the data into various aggregations and writing it back to summary tables (similar to building a cube) as well as processing large queries with lots of joins.
Most of the specs for database servers are (obviously) designed with lots of users in mind hitting the server. In this case, I'm the only user, so most of the times there will be only one or two big queries going on at once. Likewise, I may be taking a complex query and then rendering the output of that query (say 500GB+) to another table somewhere.
I want these reads/writes to happen as fast as possible but I also don't want to invest in hardware that is never going to be utilized. What recommendations does anyone have as far as CPU (clock-speed vs core count), memory size, and disk speed?
One setup I had considered was the following:
- CPU: i7-6950X
- RAM: 128GB (is this enough?)
- SSD: Multiple Samsung 960 PRO 2TBs (NVMe sticks)
No one database should exceed 2TB so the various DBs might be spread across multiple of the NVMe drives. Not sure if I should be going Xeon instead of i7 and, if so, should favor higher clock speed or higher core count. Lastly, wondering if 128GB RAM is enough but I was hoping the blazing fast disks would help make up for that.
Any and all versions of SQL Server are in play (including Enterprise Edition), meaning I can use whatever version best fits the performance needs.
Any advice is appreciated!
Best Answer
Favor the fastest processors you can get, over pure core count. Get as much memory as you can afford up to the maximum size of data you expect to process in one chunk. The NVMe SSDs you have specced are exactly what I would recommend.
Without more precise requirements, it is difficult to recommend specifics.
Take a look at this answer for details about clock speed vs core count. Which edition of SQL Server are you going to use?