Wrong aggregation of Measures in Analysis Service in combination with UNION and DESCENDANTS

mdxssas

Update: When I tried to provide a simple example which reproduces the issue I found the leveller in my database which introduces this issue. Therefore I did an update to this question in order to reflect this new finding. Below you will find scripts to set up a database that reproduces this issue.

I am encountering a strange behaviour of Analysis Services in server version 2008 as well as 2012.

In my AS database I have defined a cube with a base measure like this:

AggregateFunction: Min;
Name: Existing Data

As you can see, the Aggregate Function is Min.

The fact table containing the corresponding existing_data column contains 0 or 1 values only. The goal is to return zero as soon as the data selection contains a category that is linked to a zero value. This matrix indicates areas within our real facts stored in another fact table that have been / have not been gathered data for, so we can differentiate whether a result is zero because in the selected areas there just have not been any cases or because in the selected areas there have not been any cases gathered yet.

When I now run the following query, I am getting 1's for each selected category:

WITH
   SET [adhoc] AS 'UNION(
     DESCENDANTS([Gebiete].[Hierarchie].[Bezirk].[010],[Gebiete].[Hierarchie].[Landkreis]),
     DESCENDANTS([Gebiete].[Hierarchie].[Bundesland].[10],[Gebiete].[Hierarchie].[Landkreis])
   )'
SELECT [adhoc] ON 0
FROM [Testdb]
WHERE ([Measures].[Existing Data])

Result

01001    01002    10041    10042
1        1        1        1

when I now change this query to the following I would expect one cell containing 1 as the result, since all values aggregated by the Min()function are 1 (compare to previous result):

WITH
    SET [adhoc] AS 'UNION(
      DESCENDANTS([Gebiete].[Hierarchie].[Bezirk].[010],[Gebiete].[Hierarchie].[Landkreis]),
      DESCENDANTS([Gebiete].[Hierarchie].[Bundesland].[10],[Gebiete].[Hierarchie].[Landkreis])
    )'
    MEMBER [Gebiete].[Hierarchie].[adhoc] AS 'Aggregate([adhoc])'
SELECT {[Gebiete].[Hierarchie].[adhoc]} ON 0
FROM [Testdb]
WHERE ([Measures].[Existing Data]);

But instead I am getting

adhoc
0

When I change the Aggregate function to explicitly calling Min in the [Gebiete].[Hierarchie].[adhoc] member I am getting the desired result of

adhoc
1

query:

WITH
    SET [adhoc] AS 'UNION(
      DESCENDANTS([Gebiete].[Hierarchie].[Bezirk].[010],[Gebiete].[Hierarchie].[Landkreis]),
      DESCENDANTS([Gebiete].[Hierarchie].[Bundesland].[10],[Gebiete].[Hierarchie].[Landkreis])
    )'
    MEMBER [Gebiete].[Hierarchie].[adhoc] AS 'Min([adhoc])'
SELECT {[Gebiete].[Hierarchie].[adhoc]} ON 0
FROM [Testdb]
WHERE ([Measures].[Existing Data]);

I tracked this down to be related to somehow to how the fact data is linked to the Gebiete-dimension and how this dimension is populated. This dimension has three layers Landkreis (lowest level), Bezirk, and Bundesland, the fact table however is linked only with the middle level (Bezirk). See this diagram of all related tables.

OLAP source DB tables

When I populate all three levels of this dimension having multiple nodes below the 01 branch the issue occurs.

Gebiete dimension populated with items that cause the issue

Do I instead populate the dimension with having only one node below the 01 branch and one or multiple nodes below the 10 branch the above queries return the expected result.

Gebiete dimension populated with items that cause the issue

So I think it is a bug possibly with the combination of UNION, DESCENDANTS and Aggregate and with how the dimension table is populated. Can you please confirm, whether I am right and this really is a bug or whether my MDX query is wrong?

Below you find all information to setup a database that reproduces the problem:

-- Fact
if exists (select * from dbo.sysobjects where id = object_id(N'dbo.fact_falldaten_existing_data') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table dbo.fact_falldaten_existing_data;

-- Area
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[sf_gebiet_landkreis]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[sf_gebiet_landkreis];
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[sf_gebiet_bezirk]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[sf_gebiet_bezirk];
if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[sf_gebiet_bundesland]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[sf_gebiet_bundesland];

GO

create table dbo.sf_gebiet_bundesland(
    id numeric(7, 0) primary key,
    shortname varchar(255) null,
    longname varchar(255) null,
    description varchar(255) null,
    geo_id numeric(10, 0) null,
    gkz varchar(10) null,
    parent_id numeric(7, 0)
);

create table dbo.sf_gebiet_bezirk(
    id numeric(7, 0) primary key,
    shortname varchar(255) null,
    longname varchar(255) null,
    description varchar(255) null,
    geo_id numeric(10, 0) null,
    gkz varchar(10) null,
    parent_id numeric(7, 0) references sf_gebiet_bundesland(id)
);

create table dbo.sf_gebiet_landkreis(
    id numeric(7, 0) primary key,
    shortname varchar(255) null,
    longname varchar(255) null,
    description varchar(255) null,
    geo_id numeric(10, 0) null,
    gkz varchar(10) null,
    parent_id numeric(7, 0) references sf_gebiet_bezirk(id)
);

GO

INSERT dbo.sf_gebiet_bundesland (id, shortname, longname, description, geo_id, parent_id) VALUES (2034, '01', 'Schleswig-Holstein', '', 16, 2032);
INSERT dbo.sf_gebiet_bundesland (id, shortname, longname, description, geo_id, parent_id) VALUES (2043, '10', 'Saarland', '', 9, 2032);

INSERT dbo.sf_gebiet_bezirk (id, shortname, longname, description, geo_id, parent_id) VALUES (2051, '010', 'Schleswig-Holstein alle Bezirke', '', 441, 2034);
INSERT dbo.sf_gebiet_bezirk (id, shortname, longname, description, geo_id, parent_id) VALUES (2079, '100', 'Saarland alle Bezirke', '', 446, 2043);

INSERT dbo.sf_gebiet_landkreis (id, shortname, longname, description, geo_id, parent_id) VALUES (2539, '01001', 'Flensburg', '', 881, 2051);
INSERT dbo.sf_gebiet_landkreis (id, shortname, longname, description, geo_id, parent_id) VALUES (2540, '01002', 'Kiel', '', 882, 2051);

INSERT dbo.sf_gebiet_landkreis (id, shortname, longname, description, geo_id, parent_id) VALUES (2858, '10041', 'Stadtverband Saarbrücken', '', 1200, 2079);
INSERT dbo.sf_gebiet_landkreis (id, shortname, longname, description, geo_id, parent_id) VALUES (2859, '10042', 'Merzig-Wadern', '', 1201, 2079);

-- Fact Data
CREATE table dbo.fact_falldaten_existing_data(
    bezirk          numeric(7)  references sf_gebiet_bezirk(id),
    existing_data   numeric(1)
);

GO

INSERT INTO fact_falldaten_existing_data (bezirk,existing_data) VALUES(2079,1);
INSERT INTO fact_falldaten_existing_data (bezirk,existing_data) VALUES(2051,1);

-- Comment this out to solve the issue
INSERT dbo.sf_gebiet_bezirk (id, shortname, longname, description, geo_id, parent_id) VALUES (2951, '01', 'Schleswig-Holstein unbekannter Bezirk', '', 3006, 2034);
INSERT dbo.sf_gebiet_landkreis (id, shortname, longname, description, geo_id, parent_id) VALUES (2502, '01', 'Schleswig-Holstein unbekannter Landkreis', '', 3000, 2951);
INSERT INTO fact_falldaten_existing_data (bezirk,existing_data) VALUES(2951,0);

--INSERT dbo.sf_gebiet_bezirk (id, shortname, longname, description, geo_id, parent_id) VALUES (2956, '10', 'Saarland unbekannter Bezirk', '', 3011, 2043);
--INSERT dbo.sf_gebiet_landkreis (id, shortname, longname, description, geo_id, parent_id) VALUES (2528, '10', 'Saarland unbekannter Landkreis', '', 3024, 2956);
--INSERT INTO fact_falldaten_existing_data (bezirk,existing_data) VALUES(2956,1);

Find here the SQL Server Data Tools project where you have to change the source database and the deployment target appropriately: Download of SSDT project

Best Answer

I tried to duplicate what you have using a multidimensional Contoso cube with a measure that is Min Inventory Day in Stock and didn't encounter your issue. I think it has to do with the current context when you aggregate [adhoc]. I don't have enough info to duplicate exactly. But this phrase caught my eye when looking at the specifications for MIN():

If a numeric expression is not specified, the specified set is evaluated in the current context of the members of the set and then returns the minimum value from that evaluation.

First, a few thoughts:

  • As a note, you don't need single quotes around your calculated members in your query. As long as your calculation doesn't throw an error, it doesn't hurt for them to be there, but they aren't needed after SSAS 2005 or so and can suppress messages in some cases.
  • Calculated members are evaluated at run time so you can't take advantage of any caching. It's better to move the work into a scoped script and keep the calculated measure simple. Or move them to into the axes if possible so they can be cached with the query as it is run multiple times. If you have a large dataset or very critical data that must be returned quickly you might want to change your use of calculated members.
  • Your UNION with DESCENDANTS is fine. I was able to do something similar and it produced the correct results. You can check this by using the set instead of the aggregated member and confirming that all the members are present as you expected. I'm a little puzzled why you have it, though. Correct me if I'm wrong, but I thought Wesser Ems is part of Lower Saxony (I'm not completely knowledgeable on German geography, though). If Wesser Ems is part of Lower Saxony, you could have just done the descendents of Lower Saxony and not done the union. That is neither here nor there since there is nothing wrong with your union. I just like to try to simplify queries as much as possible.
  • SSAS is really forgiving. It will try to interpret what you mean when you don't specify a fully qualified member. For instance, you use [Has Data] rather than [Measures].[HasData], and your set and aggregated member are both [adhoc]. If there is also a field called [HasData] somewhere, it can get confused as to what you meant.
  • SSAS ignores nulls when calculating the min
  • You already have a measure that performs the MIN() ([Measures].[Has Data]), so using AGGREGATE() will continue to do the min, if that is what is in the current context.
  • Unlike the WHERE clause in SQL, the WHERE clause of an MDX SELECT statement never directly filters what is returned on the Rows axis of a query. To filter what appears on the Rows or Columns axis of a query, you can use a variety of MDX functions, for example FILTER, NONEMPTY and TOPCOUNT. For instance, if I don't have any inventory in 2006, 2006 is still returned as a member in the following query:
WITH SET [adhoc] AS 
UNION(DESCENDANTS([Geography].[Geography Hierarchy].[Region Country Name].[Germany], 
[Geography].[Geography Hierarchy].[City Name]), 
DESCENDANTS([Geography].[Geography Hierarchy].[State Province Name].[Lower Saxony], 
[Geography].[Geography Hierarchy].[City]))
MEMBER [Geography].[Geography Hierarchy].[adhoc] as MIN([adhoc])   
SELECT {[Date].[Calendar YQMD].[Calendar Year].members } on 0 
from [Operation]
where ([Geography].[Geography Hierarchy].[adhoc],[Measures].[Inventory Min Day In Stock])

In my dataset, my fact table only has a value of 5 for the data that makes up [Measures].[Inventory Min Day In Stock]. This is similar to you having all 1s.

This query returns the same expected answer (5) for me whether I use MIN([adhoc]) or AGGREGATE([ad hoc])

WITH 
SET [adhoc1] AS 
UNION(DESCENDANTS([Geography].[Geography Hierarchy].[Region Country Name].[Germany], [Geography].[Geography Hierarchy].[City Name]), 
DESCENDANTS([Geography].[Geography Hierarchy].[State Province Name].[Lower Saxony], [Geography].[Geography Hierarchy].[City]))
MEMBER [Geography].[Geography Hierarchy].[adhoc] as Min([adhoc1])
SELECT {[Date].[Calendar YQMD].[Calendar Year].[Year 2008] } on 0
from [Operation]
where ([Geography].[Geography Hierarchy].[adhoc],[Measures].[Inventory Min Day In Stock])

My best advice is to try rewriting it a couple of different ways. If the calculated measures aren't too slow, you can use them to make sure you have the correct context.

WITH 
MEMBER [Geography].[Geography Hierarchy].[adhoc] as Aggregate(UNION(DESCENDANTS([Areas].[Hierarchy].[Region].[Weser Ems],[Areas].[Hierarchy].[City]),DESCENDANTS([Areas].[Hierarchy].[State].[Lower Saxony],[Areas].[Hierarchy].[City])))
MEMBER [Measures].[Test] as Min([Geography].[Geography Hierarchy].[adhoc],[Measures].[HasData])
SELECT {[Time].[Hierarchy].[Year].[2008] } on 0
from [Incidence]
where ([Measures].[Test])