Sql-server – Best table schema to store daily hits for many items

In my Sql Server 2005 database, I'm considering two ways of creating a table to track daily views on 100,000 items:

Option 1) Each day would insert a new row for each item. So the table columns would look like this:

| ItemID | Date | #ofHits |

That would mean the table grows by 100,000 rows per day.

Option 2) Each row would contain an entire month's hits per item. So the table columns would look like this:

| ItemID | Year | Month | Day01Hits | Day02Hits | Day03Hits | ... | Day31Hits |

That would mean the table grows by 100,000 rows per month.

Option 3) Perhaps you have a better idea altogether?

Keep in mind performance, such as when doing a sum of all hits between two dates (i.e. total hits for July 5th-Aug12th). Also, I'd want to keep at least 1 year of historical data.

On a related note, we're simply parsing the IIS logs nightly to get the hit data.

In summary:

Option 1 would probably be the easiest way to query and sum data across a date range. The downside is it would add 100,000 rows every day (or a million rows everyday if we had a million items).

Option 2 would create much less rows, but probably wouldn't be efficient to query and sum data across a data range.

Is there a better way to store daily hits for many items?

Update:
Unless someone has a better design, I'm considering this:
Use option 1, but create a new table every month (this will prevent the table from growing too large). So my table names would contain the year and month, like "Stats2012_01", "Stats2012_02", etc.
Then, to query across dates (such as total hits from July 5th-Aug12th) it'd be:

SELECT ItemID, SUM(Hits)  
FROM (  
    SELECT ItemID, SUM(Hits) AS Hits FROM Stats2012_06 WHERE Day>=5 GROUP BY ItemID  
        UNION ALL  
    SELECT ItemID, SUM(Hits) AS Hits FROM Stats2012_07 GROUP BY ItemID  
        UNION ALL  
    SELECT ItemID, SUM(Hits) AS Hits FROM Stats2012_08 WHERE Day<=12 GROUP BY ItemID  
) s  
GROUP BY ItemID

Update 2:
I just discovered partitioning, thanks to the suggested answer from MZDBA. I'll continue to look into partitioning, which may lead to combining Option 1 with partitioning.

Best Answer

I would want to attempt to insert a row for every hit and use SQL partitioning to store the table in managable chronological slices (by week or by month). This way, by keeping the original timestamp for each hit, you'll be able to aggregate the data using different methods as needed (by year one day, by hour the next, etc). For doing the aggregation, I would build a SSAS cube or Excel PivotTable, or T-SQL grouping functions. By using paritioning, you can store data for different time periods into individual filegroups spread across different disks to maintain scalability. When it comes to archival, you can perform what's called a sliding window to detach and later drop old unneeded time periods of data without expensive DELETE operations.

Best Answer

Related Solutions

Sql-server – Performing aggregations and calculations: SQL (CTE) vs MDX

Sql-server – How to store many smallish time-series in a relational DB

Related Question