Db2 – Syntax problem in union query

db2union

I hope someone can tell me what I'm doing wrong in this query. I'm unable to find the cause of the error.

I have 2 tables with 2 columns each, see the CREATE statements here:

CREATE TABLE TableA(datum DECIMAL(8,0), colA INTEGER);
CREATE TABLE TableB(datum DECIMAL(8,0), colB INTEGER);

INSERT INTO TableA values('20160104', '1');
INSERT INTO TableA values('20160101', '2');
INSERT INTO TableA values('20160102', '3');
INSERT INTO TableA values('20160105', '2');
INSERT INTO TableA values('20160102', '6');
INSERT INTO TableA values('20160105', '4');

INSERT INTO TableB values('20160107', '5');
INSERT INTO TableB values('20160103', '8');
INSERT INTO TableB values('20160107', '2');
INSERT INTO TableB values('20160101', '1');
INSERT INTO TableB values('20160101', '4');
INSERT INTO TableB values('20160105', '3');

What I'm trying to achieve is to get a result with all the dates and the sum of the columns for that date.
The query that I use is the following:

select datum, sum(resColA), sum(resColB)
from 
    select datum, sum(colA) as resColA, 0 as resColB
    from TableA
    group by datum
union 
    select datum, 0 as resColA, sum(colB) as resColB
    from TableB
    group by datum
group by datum
order by datum

It keeps telling me:

[SQL0104] Token ( is invalid. Valid tokens: FOR USE SKIP WAIT WITH FETCH ORDER UNION EXCEPT OPTIMIZE.

Both legs of the union work fine as separate SELECT statements, but when I put them together, the whole thing doesn't work any more.

Does anyone have an idea what I'm doing wrong? I can't figure it out.

Best Answer

You want to group by again the result of the union. For that you need to enclose the union in parentheses and add an alias ("name" it), i.e. make it a derived table:

select datum, sum(resColA) as resColA, sum(resColB) as resColB
from 
(   select datum, sum(colA) as resColA, 0 as resColB
    from TableA
    group by datum
  union 
    select datum, 0 as resColA, sum(colB) as resColB
    from TableB
    group by datum
) as drv              -- an alias for the derived table
group by datum
order by datum ;

or make it a CTE (common table expression):

with
  cte as
    (   select datum, sum(colA) as resColA, 0 as resColB
        from TableA
        group by datum
      union 
        select datum, 0 as resColA, sum(colB) as resColB
        from TableB
        group by datum
    )
select datum, sum(resColA) as resColA, sum(resColB) as resColB
from cte
group by datum
order by datum ;

There is practically little difference between the above 2 options and I think no difference in efficiency in DB2. Pick whatever feels more readable to you. Alternatively, you could first union the data from the 2 tables and then group by. This will probably result in different execution plans, so test which is more efficient:

select datum, sum(resColA) as resColA, sum(resColB) as resColB
from 
(   select datum, colA as resColA, 0 as resColB
    from TableA
  union 
    select datum, 0 as resColA, colB as resColB
    from TableB
) as drv 
group by datum
order by datum ;