SQL Server 2008 R2 – How to Get Median Value Without Pivot

compatibility-level, sql server, sql-server-2008-r2

Create Table Script

create table temp 
(
    id int identity(1,1),
    a decimal(6,2),
    b decimal(6,2),
    c decimal(6,2),
    d decimal(6,2),
    e decimal(6,2),
    f decimal(6,2),
    g decimal(6,2),
    h decimal(6,2),
    i decimal(6,2),
    j decimal(6,2),
    k decimal(6,2),
    l decimal(6,2),
    m decimal(6,2),
    n decimal(6,2),
    o decimal(6,2),
    p decimal(6,2),
    q decimal(6,2),
    r decimal(6,2),
    s decimal(6,2),
    t decimal(6,2),
    u decimal(6,2)
)

Insert Script

insert into temp
    (a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u)
values
    (1,5,6,7,8,2,6,3,4,5,2,1,6,5,7,8,2,7,6,2,8)

insert into temp
    (a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u)
values
    (1,5,6,7,8,2,2,3,2,4,2,1,4,5,9,8,2,7,6,2,8)

Expected Result

Median
======
first row  - 5.00
second row - 4.00

Non-working Solutions

I tried the query below. It works in SQL Server 2014 but fails in SQL Server 2008 R2.

select id, avg(val)
from ( 
    select id, val
         , count(*) over (partition by id) as c
         , row_number() over (partition by id order by val) as rn
    from temp unpivot (
             val for col in (a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u)
         ) as x 
) as y
where rn IN ((c + 1)/2, (c + 2)/2) 
group by id;

The query runs correctly on SQL Server 2014, but on SQL Server 2008 R2 it fails with this error:

Incorrect syntax near the keyword 'for'

The cause is probably that my database's compatibility level is 80. Changing the compatibility level would affect my application, so that is not an option.

I've also tried this query:

select id,
       (
           select cast(avg(TotAvg) as decimal(6,2))
           from (
               values
                   (convert(decimal(6,2), a)), (convert(decimal(6,2), b)), (convert(decimal(6,2), c)),
                   (convert(decimal(6,2), d)), (convert(decimal(6,2), e)), (convert(decimal(6,2), f)),
                   (convert(decimal(6,2), g)), (convert(decimal(6,2), h)), (convert(decimal(6,2), i)),
                   (convert(decimal(6,2), j)), (convert(decimal(6,2), k)), (convert(decimal(6,2), l)),
                   (convert(decimal(6,2), m)), (convert(decimal(6,2), n)), (convert(decimal(6,2), o)),
                   (convert(decimal(6,2), p)), (convert(decimal(6,2), q)), (convert(decimal(6,2), r)),
                   (convert(decimal(6,2), s)), (convert(decimal(6,2), t)), (convert(decimal(6,2), u))
           ) as Totalavg (TotAvg)
       ) as Median
from temp

Obviously it calculates the average, but I need the median.

Best Answer

PIVOT and UNPIVOT are indeed not supported under compatibility level 80.

However, you can unpivot rows using a nested VALUES constructor. The resulting query in my case looks slightly unwieldy because of the double nesting, but it works in SQL Server 2008 with any supported compatibility level:

SELECT
  id,
  median =
  (
    SELECT
      AVG(val)
    FROM
      (
        SELECT
          c   = COUNT(*) OVER (),
          rn  = ROW_NUMBER() OVER (ORDER BY v.val ASC),
          val = v.val
        FROM
          (
            VALUES
            (t.a), (t.b), (t.c), (t.d), (t.e), (t.f), (t.g),
            (t.h), (t.i), (t.j), (t.k), (t.l), (t.m), (t.n),
            (t.o), (t.p), (t.q), (t.r), (t.s), (t.t), (t.u)
          ) AS v (val)
        WHERE
          v.val IS NOT NULL
      ) AS derived
    WHERE
      rn IN ((c + 1) / 2, (c + 2) / 2)
  )
FROM
  temp AS t
;

The v.val IS NOT NULL filtering is there to imitate the UNPIVOT behaviour more closely, because UNPIVOT automatically filters out NULL values.
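To see why the filter matters, here is a minimal sketch that counts the rows of a small VALUES list with and without it. The three literal values are made up for illustration and do not come from the question's table.

```sql
-- Hypothetical three-value list with one NULL in it.
SELECT
  with_filter =
  (
    SELECT COUNT(*)
    FROM (VALUES (1.00), (NULL), (3.00)) AS v (val)
    WHERE v.val IS NOT NULL   -- NULL row is dropped, as UNPIVOT would do
  ),
  without_filter =
  (
    SELECT COUNT(*)
    FROM (VALUES (1.00), (NULL), (3.00)) AS v (val)   -- NULL row is kept
  );
-- with_filter = 2, without_filter = 3
```

Without the filter, the NULL row would be included in the count and would receive a row number, so the middle position would be calculated over the wrong set of rows.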

The extra level of nesting is necessary because window functions such as COUNT(*) OVER () and ROW_NUMBER() cannot be referenced in the WHERE clause of the same SELECT that computes them.

So the innermost SELECT (SELECT ... FROM (VALUES ...)) unpivots the row and provides the row count and row numbers, while the middle level keeps only the middle row or rows and averages them to get the median.
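The rn IN ((c + 1) / 2, (c + 2) / 2) filter relies on integer division. As a small illustration (the counts 4, 5, 20 and 21 are arbitrary examples, and the column names lower_rn and upper_rn are made up here), this query shows which row numbers the filter keeps for a given count:

```sql
-- Illustration only: middle row numbers selected for a given row count c.
SELECT
  c,
  lower_rn = (c + 1) / 2,   -- integer division, the fraction is discarded
  upper_rn = (c + 2) / 2
FROM
  (VALUES (4), (5), (20), (21)) AS counts (c);
-- c = 4  -> 2 and 3    (even count: the two middle rows are averaged)
-- c = 5  -> 3 and 3    (odd count: a single middle row)
-- c = 20 -> 10 and 11
-- c = 21 -> 11 and 11  (the case in the question, 21 non-NULL columns)
```

For an odd count both expressions give the same row number, so AVG is taken over one value; for an even count they give the two neighbouring middle rows.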

It is possible to reduce nesting with the help of CROSS APPLY and grouping in the main query, like this:

SELECT
  id,
  median = AVG(x.val)
FROM
  temp AS t
  CROSS APPLY
  (
    SELECT
      c   = COUNT(*) OVER (),
      rn  = ROW_NUMBER() OVER (ORDER BY v.val ASC),
      val = v.val
    FROM
      (
        VALUES
        (t.a), (t.b), (t.c), (t.d), (t.e), (t.f), (t.g),
        (t.h), (t.i), (t.j), (t.k), (t.l), (t.m), (t.n),
        (t.o), (t.p), (t.q), (t.r), (t.s), (t.t), (t.u)
      ) AS v (val)
    WHERE
      v.val IS NOT NULL
  ) AS x
WHERE
  x.rn IN ((x.c + 1) / 2, (x.c + 2) / 2)
GROUP BY
  t.id
;

For a live demonstration of both methods, please follow this dbfiddle.uk link.

Related Solutions

SQL Server – How to restore the sql_variant_property of baseType back to a table variable in SQL Server

There are enough problems with what you are trying that I don't think this particular approach is going to work (although I think the goal is interesting):

returns @result table (reportID bigint) - this is basically schema-bound. You can't alter the schema of the returned table. I see that later you attempt to "ALTER" this variable, and that's just not going to happen.

Even if you could, this part:

select @alterTableStr = 'alter table @result add '+ convert(varchar(2000),@dataLabel)+' '+ convert(varchar(2000),@dataType) ;

exec sp_executesql @alterTableStr;

isn't going to affect the @result in the outer part. Dynamic SQL batches have their own scope and can't reach the variables of the calling code this way.

And then it really comes down to the error you are currently getting, which is that sp_executesql can't be run inside a function anyway.
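The scoping point can be sketched outside of any function. The table variable below mirrors the @result declaration from the question; the TRY...CATCH wrapper and the column aliases are only assumptions about how you might want to observe the failure.

```sql
-- Table variable declared in the calling batch.
DECLARE @result TABLE (reportID bigint);

BEGIN TRY
  -- The dynamic batch is compiled in its own scope, where @result does not exist,
  -- so this is expected to fail with a "Must declare the table variable" error.
  EXEC sp_executesql N'SELECT reportID FROM @result;';
END TRY
BEGIN CATCH
  -- Show whatever error the dynamic batch raised.
  SELECT
    ERROR_NUMBER()  AS error_number,
    ERROR_MESSAGE() AS error_message;
END CATCH;
```

The same applies to an ALTER TABLE statement built as a string: it would have to find @result inside the dynamic batch, and it cannot.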

Anyway, to help you make some progress, may I suggest you read the following two articles, which are tangentially related (and which I have used recently), about parsing CSV and JSON data. Both use the author's hierarchy idea (an unpivoted name/value structure) and repivoting, and you might find some useful techniques there:

http://www.simple-talk.com/sql/t-sql-programming/the-tsql-of-csv-comma-delimited-of-errors/

http://www.simple-talk.com/sql/t-sql-programming/consuming-json-strings-in-sql-server/

If you can use the CLR, this article might also be useful:

http://www.sqlservercentral.com/articles/.Net/94922/

SQL Server 2008 R2 – CASE Statement Returning Different Results in COALESCE

I don't think your code (the first query) is actually working. For instance, it returns the @Period4 value as Lst Yr when it should be Last Calendar Year. This happens because of your use of COALESCE and the way you assign values to your variables. A SELECT @variable = value FROM table should return only one row; otherwise the assignment is repeated for every row of the table and the final value depends on whichever row happens to be processed last, which is non-deterministic. The second query compares @fpID with the FinancialPeriodID of an arbitrary row, so it doesn't match and assigns '' to your variable. What you should be doing instead is:

SELECT @Period1 = MAX(CASE WHEN FinancialPeriodID = @fp1ID THEN FinancialPeriod ELSE '' END),
       @Period2 = MAX(CASE WHEN FinancialPeriodID = @fp2ID THEN FinancialPeriod ELSE '' END), 
       @Period3 = MAX(CASE WHEN FinancialPeriodID = @fp3ID THEN FinancialPeriod ELSE '' END), 
       @Period4 = MAX(CASE WHEN FinancialPeriodID = @fp4ID THEN FinancialPeriod ELSE '' END), 
       @Period5 = MAX(CASE WHEN FinancialPeriodID = @fp5ID THEN FinancialPeriod ELSE '' END), 
       @Period6 = MAX(CASE WHEN FinancialPeriodID = @fp6ID THEN FinancialPeriod ELSE '' END), 
       @Period7 = MAX(CASE WHEN FinancialPeriodID = @fp7ID THEN FinancialPeriod ELSE '' END)
FROM #FinancialPeriod

SELECT @Period1 = COALESCE(@Period1,MAX(CASE WHEN FinancialPeriodID = @fp1ID THEN FinancialPeriod END),''),
       @Period2 = COALESCE(@Period2,MAX(CASE WHEN FinancialPeriodID = @fp2ID THEN FinancialPeriod END),''), 
       @Period3 = COALESCE(@Period3,MAX(CASE WHEN FinancialPeriodID = @fp3ID THEN FinancialPeriod END),''), 
       @Period4 = COALESCE(@Period4,MAX(CASE WHEN FinancialPeriodID = @fp4ID THEN FinancialPeriod END),''), 
       @Period5 = COALESCE(@Period5,MAX(CASE WHEN FinancialPeriodID = @fp5ID THEN FinancialPeriod END),''), 
       @Period6 = COALESCE(@Period6,MAX(CASE WHEN FinancialPeriodID = @fp6ID THEN FinancialPeriod END),''), 
       @Period7 = COALESCE(@Period7,MAX(CASE WHEN FinancialPeriodID = @fp7ID THEN FinancialPeriod END),'')
FROM #FinancialPeriod

This way you assign the value only when FinancialPeriodID matches @fpID.
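The difference between the two kinds of assignment can be sketched on a tiny example. Everything here is hypothetical: the table variable @FinancialPeriod stands in for #FinancialPeriod, the two sample rows are made up, and @Plain and @Agg are names introduced only for this illustration.

```sql
-- Stand-in for #FinancialPeriod with two made-up rows.
DECLARE @FinancialPeriod TABLE (FinancialPeriodID int, FinancialPeriod varchar(50));
INSERT INTO @FinancialPeriod (FinancialPeriodID, FinancialPeriod) VALUES (1, 'This Year');
INSERT INTO @FinancialPeriod (FinancialPeriodID, FinancialPeriod) VALUES (2, 'Last Calendar Year');

DECLARE @fpID int, @Plain varchar(50), @Agg varchar(50);
SET @fpID = 1;

-- Per-row assignment: the variable is overwritten for every row returned,
-- so the final value depends on which row is processed last.
SELECT @Plain = CASE WHEN FinancialPeriodID = @fpID THEN FinancialPeriod ELSE '' END
FROM @FinancialPeriod;

-- Aggregated assignment: MAX collapses the rows into one, so the
-- matching row's value is assigned regardless of processing order.
SELECT @Agg = MAX(CASE WHEN FinancialPeriodID = @fpID THEN FinancialPeriod ELSE '' END)
FROM @FinancialPeriod;

SELECT @Plain AS plain_assignment, @Agg AS aggregated_assignment;
-- aggregated_assignment is 'This Year'; plain_assignment is not guaranteed.
```

MAX works here because any non-empty period name sorts after the empty string, so the single matching row always wins.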