I had a similar scenario and used the following DAX query.
First, to keep it simple, I defined a measure so I don't have to repeat the formula inside the query. Then I used GENERATE to iterate the TOPN expression over each month:
DEFINE
    MEASURE TableInTabular[NameOfTheMeasure] =
        COUNTAX ( FILTER ( 'Stats', ISBLANK ( [DeactDate] ) = TRUE ), 1 )
EVALUATE
ADDCOLUMNS (
    FILTER (
        GENERATE (
            VALUES ( DatesTableName[Month] ),
            TOPN ( 10, VALUES ( TableInTabular[ParentID] ), TableInTabular[NameOfTheMeasure], 0 )
        ),
        TableInTabular[NameOfTheMeasure] > 0
    ),
    "ActiveCount (or however you want to call this column)",
    TableInTabular[NameOfTheMeasure]
)
ORDER BY
    DatesTableName[Month] ASC,
    TableInTabular[NameOfTheMeasure] DESC
With the above you should get the top 10 ParentID values and the measure for each month. Just replace "TableInTabular" with the name of the tabular table that holds your data, and "DatesTableName" with the name of your dates table.
Please let me know if I misunderstood your question; I hope this helps.
Processing a cube largely consists of three steps:
- Getting the data
- Building indexes
- Calculating aggregations
Steps 2 and 3 are the least expensive (during processing), in my opinion, so let's start with those.
Building indexes does little more than calculating bitmap indexes for your attribute relationships. Depending on how many of those you have designed, this could take longer or shorter, but it usually shouldn't take half an hour. A bitmap index amounts to saying: "hey, if this item group is filtered on, I already know which items are in it, so I can add those up instead of doing a NONEMPTYCROSSJOIN."
Calculating aggregations is the process in which subtotals are calculated for every level where you defined them. If you haven't used any usage-based optimization or attribute hierarchies, and haven't defined any aggregations yourself, those are only calculated at the leaf level of your dimensions' hierarchies (i.e., the lowest level of every attribute). An aggregation is basically: "hey, if I need the sales for this item group, I have precalculated it and don't have to add up these items."
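In relational terms, an aggregation behaves roughly like a precomputed GROUP BY. A sketch of the idea, using a made-up FactSales table (the real engine stores this internally, not as a SQL table):

```sql
-- Hypothetical schema: FactSales(ItemGroup, SalesAmount).
-- During processing, an aggregation is materialized once, like this:
SELECT ItemGroup,
       SUM(SalesAmount) AS TotalSales
INTO   agg_SalesByItemGroup
FROM   FactSales
GROUP  BY ItemGroup;

-- At query time, "sales for this item group" is answered from the small
-- summary instead of re-summing the individual fact rows:
SELECT TotalSales
FROM   agg_SalesByItemGroup
WHERE  ItemGroup = 'Beverages';
```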
A large part of processing time is spent fetching data. If you trace your queries while processing, you will notice that for every single dimension attribute a SELECT DISTINCT attributename FROM dimension_table is executed. If that dimension table is a view, and the view's query is slower after adding a WHERE, dimension processing can become slower by roughly time_the_view_is_slower * number_of_attributes.
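As a sketch (view and attribute names are made up), the processing trace looks like one such query per attribute, and if the dimension is a view with an added filter, every one of them pays that cost:

```sql
-- One query like this is issued for every attribute in the dimension:
SELECT DISTINCT ItemGroup FROM dbo.vw_DimItem;   -- attribute 1
SELECT DISTINCT ItemColor FROM dbo.vw_DimItem;   -- attribute 2
SELECT DISTINCT ItemSize  FROM dbo.vw_DimItem;   -- attribute 3
-- ...and so on, once per attribute.

-- If the view carries an extra, unindexed filter, e.g.
--   CREATE VIEW dbo.vw_DimItem AS
--   SELECT ... FROM dbo.DimItem WHERE ValidToDate IS NULL;
-- each SELECT DISTINCT gets slower, so total dimension processing grows
-- by roughly (extra time per query) * (number of attributes).
```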
How many columns are in the SELECT list of your view is largely irrelevant; the number of attributes in your dimension is what matters.
If you have a distinct count measure anywhere, your extra WHERE could have introduced a sort operation too, because of a new execution plan.
So in your case, my guess would be that your dimension views have gotten slower because the valid date isn't indexed, or the queries aren't selective enough, and the added processing time of the source queries outweighs the reduced time spent building indexes and aggregations.
You most likely have improved your cube browsing and MDX performance in the process, since fewer cells get calculated with less data in the cube, but processing time may simply have grown because the source queries got slower, and that slowdown gets multiplied by the number of attributes.
Then again, I'm not sure what the problem is with longer processing times if you leave everything at the default. The performance of the SSAS solution should matter more than the processing time. If you run into issues because of the processing time, that may be a separate problem that needs fixing in another way.
So in the end, if you're worried about processing time, either load less data into your source database (which, depending on your setup, could also reduce ETL loading time) or tune all those distinct queries by indexing your view accordingly.
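For the indexing option, a minimal sketch, assuming the view filters on a ValidToDate column (all table, column, and index names here are made up): index the filtered column so each SELECT DISTINCT can seek instead of scan.

```sql
-- Assumed: the dimension view filters on ValidToDate.
-- A covering index lets the per-attribute queries seek instead of scan:
CREATE NONCLUSTERED INDEX IX_DimItem_ValidToDate
    ON dbo.DimItem (ValidToDate)
    INCLUDE (ItemGroup, ItemColor, ItemSize);  -- the attribute columns read

-- Or, if the filter is very selective, a filtered index per attribute:
CREATE NONCLUSTERED INDEX IX_DimItem_Active_ItemGroup
    ON dbo.DimItem (ItemGroup)
    WHERE ValidToDate IS NULL;
```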
If you want to know exactly what is slowing down your processing, you could trace all the queries before and after the change and compare the execution times to see which queries are killing you.
In SSAS 2014 Tabular, partitions inside a table process serially. In SSAS 2016 Tabular, partitions inside a table can process in parallel. So on your version, partitioning won't speed up processing if you process the whole table.
However, partitioning is great for incremental processing. If you partition by year and only 2017 rows changed, you can process just the 2017 partition. And to archive off data older than 10 years, you can simply drop the oldest partition.
I don't know where the 2-partition rule you mentioned came from, but I don't believe it makes sense. If you have a reference for that assertion, with some context, feel free to share it.