Sql-server – Displaying Parent Child Information, With Certain Parent Columns Only Shown Once

sql servert-sql

For a TSQL report, I have a one-to-many parent-child relationship.

I want to report both parent and child relationships in the same report, like so, with the parent columns on the left and child columns on the right:

ParentName | ParentNum | ParentIncome | ChildName | ChildNum | ChildAllowance
John       | 1         | 50000        | Johnny    | 1        | 5
Jane       | 2         | 55000        | Jackie    | 2        | 10
Jane       | 2         |              | Billy     | 3        | 5
Jane       | 2         |              | Sally     | 4        | 5
Jackie     | 3         | 90000        | Monique   | 5        | 0

I want to avoid duplicating certain columns, like ParentIncome, because users dump the report output into Microsoft Excel and I want to minimize the chances of them summing columns incorrectly.

Therefore, for each parent, I want to only display ParentIncome once (ParentNum and ParentName OK to repeat.). For rows other than the first row, display a blank cell (i.e., no NULL values).

Currently, I produce this report using a wrapper around a subquery. The wrapper looks like this for all the columns:

CASE WHEN DATA.ROW_NUMBER = 1 
     THEN DATA.Field1
     ELSE '' END AS 'AColumn'

The subquery orders things using OVER and PARTITION BY and ORDER BY, so when ROW_NUMBER = 1, it's always a new parent.

It seems like this is something others have run into, though…in TSQL, can I get the same results in a simpler way, without the extra outer wrapper query to format things?

Best Answer

A general policy is to let the reporting layer handle things like only printing ParentIncome once. However, since you are delivering a spreadsheet that will be used by others in who knows what manner, then I suppose you are stuck.

Because of the knowledge required you will need to develop some extra information (MIN, MAX, first, last, etc.) that is not known by a single row. There are dodges different from ROW_NUMBER() OVER (PARTITION...), but there will still be an extra step.

See the following:

CREATE TABLE #parent
(ParentNum INT,
 ParentName VARCHAR(20),
 ParentIncome INT);

CREATE TABLE #child
(ChildNum INT,
 ChildParentNum INT,
 ChildName VARCHAR(20),
 ChildAllowance INT);

INSERT INTO #parent VALUES(10,'John',50000);
INSERT INTO #parent VALUES(20,'Jane',55000);
INSERT INTO #parent VALUES(30,'Jackie',90000);

INSERT INTO #child VALUES(1,10,'Johnny',5)
INSERT INTO #child VALUES(2,20,'Jackie',10)
INSERT INTO #child VALUES(3,20,'Billy',5)
INSERT INTO #child VALUES(4,20,'Sally',5)
INSERT INTO #child VALUES(5,30,'Monique',0)

-- Basic approach you may be using

See this example SQL Fiddle #1

SELECT pc.ParentName, pc.ParentNum,
    CASE WHEN pc.RowNum = 1 THEN CAST(pc.ParentIncome AS VARCHAR(10)) ELSE '' END as ParentIncome,
    pc.ChildName, pc.ChildNum, pc.ChildAllowance
FROM (SELECT p.ParentNum, p.ParentName, p.ParentIncome, 
         c.ChildNum, c.ChildParentNum, c.ChildName, c.ChildAllowance,
         ROW_NUMBER() OVER (PARTITION BY ParentNum ORDER BY ParentNum) AS RowNum
       FROM #parent p JOIN #child c ON p.ParentNum = c.ChildParentNum) AS pc
ORDER BY pc.ParentNum, pc.ChildNum 

    -- An alternative, but still using a subselect for one element

See this example: SQL Fiddle #2

SELECT  p.ParentName, p.ParentNum, 
        CASE WHEN c.ChildNum = mc.MinChild THEN CAST (ParentIncome AS VARCHAR(10)) ELSE '' END AS ParentIncome,
        c.ChildName, c.ChildNum, c.ChildAllowance
       FROM #parent p 
          JOIN #child as c 
             ON p.ParentNum = c.ChildParentNum
          -- This subselect gives the MIN (or First) ChildNum per Parent
          JOIN (SELECT ChildParentNum, MIN(ChildNum) AS MinChild
                  FROM #child
                  GROUP BY ChildParentNum) AS mc 
             ON mc.ChildParentNum = c.ChildParentNum 
ORDER BY p.ParentNum, c.ChildNum 

drop table #parent
drop table #child

Notice that I cast the ParentIncome as a VARCHAR(10) so that the datatype would be of the same type as the empty income of ''. I originally used a NULL instead of a blank, but that might give you EXCEL problems.

Is it worth doing things this way? It is up to you, but it primarily depends on what you like best. The ROW_NUMBER() is a more powerful operator than MIN() and gives you more options, but in this case it appears that a MIN() will work for you.