Postgresql – Query with two LEFT JOIN’s is not working as expected

aggregatejoin;postgresql

I am using PostgreSQL as my database and I need to write the query that returns:

  • a user_id from 1st table
  • sum of every row's amount from 2nd table (where user_id is the same as in the 1st table)
  • sum of every row's amount from 3nd table (where user_id is the same as in the 1st table and the type is defined).

So I wrote this query with two LEFT JOINs, but it doesn't work as expected:

SELECT aff.user_id, COALESCE(SUM(t.amount), 0) as total_bet, COALESCE(SUM(d.amount), 0) as total_deposit
FROM affiliate_guests aff
LEFT JOIN transfers t ON aff.user_id = t.user_id
LEFT JOIN deposits d ON aff.user_id = d.user_id AND d.type = 'deposit'
WHERE aff.code = 'code'
GROUP BY aff.user_id;

This query returns the wrong numbers for total_bet and total_deposit columns.

If I omit the second LEFT JOIN, like this:

SELECT aff.user_id, COALESCE(SUM(t.amount), 0) as total_bet
FROM affiliate_guests aff
LEFT JOIN transfers t ON aff.user_id = t.user_id
WHERE aff.code = 'code'
GROUP BY aff.user_id;

everything works fine.

What can be the problem and how can I solve it?

Best Answer

Let's assume your tables are defined:

CREATE TABLE affiliate_guests
(
    user_id serial not null primary key,
    code text
    -- etc.
) ;

CREATE TABLE transfers
(
    transfer_id serial not null primary key,
    user_id integer not null references affiliate_guests(user_id),
    amount numeric not null
    -- etc.
) ;

CREATE TABLE deposits
(
    deposit_id serial not null primary key,
    user_id integer not null references affiliate_guests(user_id),
    amount numeric not null,
    type text
    -- etc.
) ;

... and we populate them with some sample data:

INSERT INTO 
    affiliate_guests (user_id, code) 
VALUES
    (1, 'code') ;
INSERT INTO 
    transfers (user_id, amount) 
VALUES
    (1, 1000.00),
    (1, 2000.00),
    (1, 1234.00) ;

INSERT INTO deposits (user_id, amount, type) 
VALUES
    (1, 1000.00, 'deposit'),
    (1, 2345.00, 'ignoreit'),
    (1, 3456.00, 'ignoreit as well'),
    (1, 9999.00, 'deposit') ;

In order to understand why your query is not working as you expect, check first the result of executing your same query, but without GROUP BY, and changing your aggregates by just values (I've also added some extra ids for ease of interpretation):

SELECT 
    aff.user_id, transfer_id, t.amount AS t_amount, deposit_id, d.amount AS d_amount
FROM 
    affiliate_guests aff
    LEFT JOIN transfers t ON aff.user_id = t.user_id
    LEFT JOIN deposits d ON aff.user_id = d.user_id AND d.type = 'deposit'
WHERE
    aff.code = 'code' ;

This is what you'll get:

|---------+-------------+----------+------------+----------|
| user_id | transfer_id | t_amount | deposit_id | d_amount |
|---------+-------------+----------+------------+----------|
|    1    |      1      | 1000.00  |     1      | 1000.00  |
|---------+-------------+----------+------------+----------|
|    1    |      1      | 1000.00  |     4      | 9999.00  |
|---------+-------------+----------+------------+----------|
|    1    |      2      | 2000.00  |     1      | 1000.00  |
|---------+-------------+----------+------------+----------|
|    1    |      2      | 2000.00  |     4      | 9999.00  |
|---------+-------------+----------+------------+----------|
|    1    |      3      | 1234.00  |     1      | 1000.00  |
|---------+-------------+----------+------------+----------|
|    1    |      3      | 1234.00  |     4      | 9999.00  |
|---------+-------------+----------+------------+----------|

As there are not any relationship between your tables t and d, what you actually get is the cartesian product of both of them. You clearly see it because you have all possible combinations of transfer_id and deposit_id.

You need to make this by using subquerys, instead:

SELECT 
    aff.user_id, 
    coalesce( (SELECT sum(amount) 
                 FROM transfers t 
                WHERE t.user_id = aff.user_id), 0) AS total_bet,
    coalesce( (SELECT sum(amount) 
                 FROM deposits d 
                WHERE d.user_id = aff.user_id AND d.type='deposit'), 0) AS total_deposit
FROM 
    affiliate_guests aff
WHERE
    aff.code = 'code' ;

That will get you what you expect:

|---------+-----------+-----------|
| user_id | total_bet | total_bet |
|---------+-----------+-----------|
|    1    |  4234.00  | 10999.00  |
|---------+-----------+-----------|

See Identifying and Eliminating the Dreaded Cartesian Product