I would link at all/most levels. This denormalized star means that, yes, the data is redundant, but it typically makes reporting and analysis a lot easier. Note that this is very different from OLTP normalization, and you don't typically have to worry about redundant data getting out of sync, because in a DW scenario the data never changes: new facts get added, and dimensions get expired and new ones created.
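A minimal sketch of what that kind of denormalized star might look like, using SQLite in Python for brevity. The FACT_Folder/DIM_* names echo the queries further down; the Server level and the specific columns are my own illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical star: the fact row carries a surrogate key for EVERY level
# (farm, server, folder) plus the snapshot date -- redundant on purpose,
# so each level can be queried without traversing the others.
cur.executescript("""
CREATE TABLE DIM_Farm   (DimID INTEGER PRIMARY KEY, Farm_Name TEXT);
CREATE TABLE DIM_Server (DimID INTEGER PRIMARY KEY, Server_Name TEXT);
CREATE TABLE DIM_Folder (DimID INTEGER PRIMARY KEY, FolderPath TEXT);
CREATE TABLE DIM_Date   (DateID INTEGER PRIMARY KEY, Date TEXT);

CREATE TABLE FACT_Folder (
    SnapshotDateID INTEGER REFERENCES DIM_Date(DateID),
    FarmDimID      INTEGER REFERENCES DIM_Farm(DimID),
    ServerDimID    INTEGER REFERENCES DIM_Server(DimID),
    FolderDimID    INTEGER REFERENCES DIM_Folder(DimID),
    bytes_on_disk  INTEGER   -- the additive measure
);
""")
```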
I don't see a Dim_Folder. I would expect the actual path of the folder to be an attribute of Dim_Folder. Only the numeric quantity and any degenerate dimensions (http://en.wikipedia.org/wiki/Degenerate_dimension) would be in the fact table. I wouldn't treat the folder path as a degenerate dimension, because it keeps coming back in each snapshot (and a folder isn't a transaction).
So you could do something like this:
SELECT AVG(bytes_on_disk)
FROM FACT_Folder
INNER JOIN DIM_Folder
ON FACT_Folder.FolderDimID = DIM_Folder.DimID
INNER JOIN DIM_Date
ON FACT_Folder.SnapshotDateID = DIM_Date.DateID
WHERE DIM_Date.Date BETWEEN '20120101' AND '20121231'
AND DIM_Folder.FolderPath = '/usr/bin/'
See how the DIM_Folder lookup narrows the query down to a small set of dimension ids; we're assuming some kind of index on snapshot date and then folder dim id (or vice versa).
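The kind of composite index I have in mind could be sketched like this (again via SQLite in Python; the index name and column order are illustrative assumptions, and which column should lead depends on your predicates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""CREATE TABLE FACT_Folder (
    SnapshotDateID INTEGER, FolderDimID INTEGER, bytes_on_disk INTEGER)""")

# Composite index leading on snapshot date, then folder dim id
# (or vice versa, depending on which predicate is more selective).
cur.execute("""CREATE INDEX IX_FACT_Folder_Date_Folder
               ON FACT_Folder (SnapshotDateID, FolderDimID)""")
```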
See how you also now don't need to join on folder at all if you just want the data at a higher level. Since you usually know all this at ETL time, the motivation is different from OLTP systems, where you want everything to move together when something is changed (leg bone connected to the thigh bone, etc.). In a DW scenario, you really don't want anything to move.
So, bam! - total Farm usage analysis:
SELECT DIM_Farm.Farm_Name, SUM(bytes_on_disk)
FROM FACT_Folder
INNER JOIN DIM_Farm
ON FACT_Folder.FarmDimID = DIM_Farm.DimID
INNER JOIN DIM_Date
ON FACT_Folder.SnapshotDateID = DIM_Date.DateID
WHERE DIM_Date.Date BETWEEN '20120101' AND '20121231'
GROUP BY DIM_Farm.Farm_Name
Remember stars are really simple for analysis. You NEVER need to worry about inadvertent cross joins in a single non-snowflaked star. When linking different stars, you DO have to watch out. So queries in MOST cases are MUCH simpler in star-schemas. No network traversal and worrying about many-many relationships like in a normalized model.
You could treat your reference tables as slowly changing dimensions and perform type 2 maintenance, i.e. add a new reference row when any reference data changes. Over time you may choose to horizontally partition your reference tables into "active" and "archive" parts, depending on the data churn you experience. This can be achieved using SQL Server's built-in partitioning functionality or a roll-your-own approach with two separate tables. Your needs at the time will dictate which.
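Type 2 maintenance boils down to "expire the current row, insert a replacement". A runnable sketch using SQLite in Python (the REF_Status table and its columns are hypothetical, but the ValidFrom/ValidTo/IsCurrent pattern is the standard type 2 shape):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical reference table with type 2 columns: every change expires
# the current row and inserts a new one, so history is preserved.
cur.execute("""
CREATE TABLE REF_Status (
    SurrogateID INTEGER PRIMARY KEY AUTOINCREMENT,
    StatusCode  TEXT,
    StatusName  TEXT,
    ValidFrom   TEXT,
    ValidTo     TEXT,      -- NULL means "still current"
    IsCurrent   INTEGER
)""")

def type2_update(code, new_name, as_of):
    """Expire the current row for `code` and insert a replacement row."""
    cur.execute("""UPDATE REF_Status
                   SET ValidTo = ?, IsCurrent = 0
                   WHERE StatusCode = ? AND IsCurrent = 1""", (as_of, code))
    cur.execute("""INSERT INTO REF_Status
                   (StatusCode, StatusName, ValidFrom, ValidTo, IsCurrent)
                   VALUES (?, ?, ?, NULL, 1)""", (code, new_name, as_of))

cur.execute("""INSERT INTO REF_Status
               (StatusCode, StatusName, ValidFrom, ValidTo, IsCurrent)
               VALUES ('A', 'Active', '2012-01-01', NULL, 1)""")
type2_update('A', 'Active (renamed)', '2012-06-01')

rows = cur.execute("""SELECT StatusName, IsCurrent FROM REF_Status
                      ORDER BY SurrogateID""").fetchall()
# rows -> [('Active', 0), ('Active (renamed)', 1)]
```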
There is no good reason why your active OLTP table and your logging table should look the same. They perform different roles and have different read and write requirements. If the logging table needs to be wide and sparse then that's what it needs to be. Create the objects to solve the problem you have.
And one last suggestion, which is definitely from the "clutching at straws" bucket: define TASK_DONE (id int, reference_values xml). Extract your pertinent values at that point in time and save them away. This allows for changes in the reference data schema without having to bring existing log records up to the new schema. That will make historical searching more complicated, of course.
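A sketch of that "snapshot the values as XML" idea, using SQLite and Python's standard library (the TASK_DONE columns follow the definition above; the element names and helper function are my own illustrative choices):

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE TASK_DONE (id INTEGER PRIMARY KEY, reference_values TEXT)")

def log_task_done(task_id, ref_values):
    """Snapshot the pertinent reference values as XML at completion time,
    so later schema changes to the reference tables don't force a rewrite
    of existing log records."""
    root = ET.Element("refs")
    for name, value in ref_values.items():
        ET.SubElement(root, name).text = str(value)
    cur.execute("INSERT INTO TASK_DONE (id, reference_values) VALUES (?, ?)",
                (task_id, ET.tostring(root, encoding="unicode")))

log_task_done(1, {"status": "Closed", "priority": "High"})
xml_blob = cur.execute(
    "SELECT reference_values FROM TASK_DONE WHERE id = 1").fetchone()[0]
# xml_blob -> '<refs><status>Closed</status><priority>High</priority></refs>'
```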
Best Answer
I would approach it with this structure:
This way you can have a more normalized structure, non-repeating data and simple tables.
In this method you would also be using WHERE PlaylistID = [Bob's playlist id] instead.
With this structure's simplicity, you won't have to worry about any size issues for a long time. This structure will quickly handle millions of records. If you get bigger than that, you can look into partitioning; that's not necessary for now.
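The structure itself isn't reproduced here, but a typical normalized layout for this kind of problem would be a playlist table plus a junction table, queried by PlaylistID as above. This is my guess at the intent; all table and column names are illustrative (sketched with SQLite in Python):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Illustrative normalized layout: one row per playlist, and one row per
# (playlist, song) pair in a junction table -- no repeating groups.
cur.executescript("""
CREATE TABLE Playlist (
    PlaylistID INTEGER PRIMARY KEY,
    UserID     INTEGER,
    Name       TEXT
);
CREATE TABLE PlaylistSong (
    PlaylistID INTEGER REFERENCES Playlist(PlaylistID),
    SongID     INTEGER,
    Position   INTEGER,
    PRIMARY KEY (PlaylistID, SongID)
);
""")

cur.execute("INSERT INTO Playlist VALUES (1, 42, 'Road trip')")
cur.executemany("INSERT INTO PlaylistSong VALUES (?, ?, ?)",
                [(1, 101, 1), (1, 102, 2)])

# Fetch one user's playlist by its id -- a narrow, indexable lookup
songs = cur.execute(
    "SELECT SongID FROM PlaylistSong WHERE PlaylistID = ? ORDER BY Position",
    (1,)).fetchall()
# songs -> [(101,), (102,)]
```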