Postgresql – Select on multiple tables with group by

aggregategroup byjoin;postgresqlwindow functions

Using Postgres 9.3, I have problem with a select over few tables.
I have simcards stock and I would like to get something like this:

| network | supplier | total topups (EUR) | 1st topup qty|   
| three   | three    | 500                | 34           |  
| three   | core     | 300                | 4            |  
| o2      | o2       | 100                | 3            |  

| network | supplier | total topups (EUR) | 2nd topup qty|   
| three   | three    | 200                | 4            |  

| network | supplier | total topups (EUR) | 3rd topup qty|   
| three   | three    | 200                | 4            |

my table schemas are:

simcard(id, iccid(unique), network_id, supplier_id)
topup(id, date, amount, simcard_id)
network(id, name)
supplier(id, name)

at the moment I am able to select all first topups, but I don't know how to display them in a format as I would like (network| supplied by| topups sum| 1st topup qty)
Then I have no idea how to select 2nd, 3rd topup for every simcard. I tried with limit 1 offset 1 or offset 2 but it didn't work.

select 
    distinct topup.iccid    
    , s.created_date
    , n.name as "network"
    , p.name as "supplier"
from
(
select 
    s.iccid as "iccid"
    , t.date

from 
    simcard_topup as t
    join simcard_simcard as s
        on t.iccid_id = s.id
--where 
--  CAST(t.date AS DATE) = date '2015-03-13' - integer '14'

group by
    s.iccid
    , t.date
order by
    t.date asc
) as topup
join simcard_simcard as s
    on topup.iccid = s.iccid
join simcard_network as n
    on s.network_id = n.id
join simcard_supplier as p
    on s.supplier_id = p.id

Best Answer

SELECT st.topup_nr, n.name AS network, p.name AS supplier
     , st.topup_sum, st.topup_qty
FROM  (
   SELECT t.topup_nr, s.network_id, s.supplier_id
        , sum(t.amount) AS topup_sum
        , count(*) AS topup_qty
   FROM  (
      SELECT simcard_id, amount
           , row_number() OVER (PARTITION BY simcard_id, ORDER BY date) AS topup_nr
      FROM   simcard_topup
      ) t
   JOIN   simcard_simcard s ON s.id = t.simcard_id  -- or so I assume
   GROUP  BY 1,2,3
   ) st
JOIN   simcard_network  n ON n.id = st.network_id
JOIN   simcard_supplier p ON p.id = st.supplier_id
ORDER  BY st.topup_nr, st.topup_qty DESC NULLS LAST
                     , st.topup_sum DESC NULLS LAST;

Notes

The core feature is the window function row_number() in the inner subquery to number topups per simcard. This makes aggregating for 1st, 2nd etc. topup in the outer query simple.
For lack of information, assuming to join simcard_simcard to simcard_topup on s.id = t.simcard_id, though you also have on t.iccid_id = s.id in your code, which seems backwards ...
Don't use a basic type name like date as column name.
Why do you have CAST(t.date AS DATE) in your code? A date should be a date, so no need to cast it to date. Also, it seems we only need the date for ranking topups.
Counting per (network_id, supplier_id) and sorting the result by ... topup_qty DESC, topup_sum DESC, so the combination with the most topups comes first. NULLS LAST is just to be safe against NULL values.

Related Solutions

Sql-server – How to SELECT multiple COUNT(*) columns with each GROUP BY having a different WHERE clause

This will give you the result you wanted but I'm not sure if it's the most flexible piece of code.

SELECT p.PersonId,
    SUM(CASE 
          WHEN a.ActionTime >= '2014-01-15 00:00:00' 
            AND a.ActionTime < '2014-02-15 00:00:00'
          THEN 1 
          ELSE 0 
        END) AS GroupA_Actions_Made,
    SUM(CASE 
          WHEN a.ActionTime >= '2014-02-15 00:00:00' 
            AND a.ActionTime < '2014-03-15 00:00:00'
          THEN 1 
          ELSE 0 
        END) AS GroupB_Actions_Made
FROM
    Person p
JOIN
    Action a on p.PersonId = a.PersonId
GROUP BY p.PersonId

Postgresql – Select rows within group before condition is met

A single window function in a subquery should do the trick:

SELECT *
FROM  (
   SELECT *, count(*) FILTER (WHERE height = 0)
                      OVER (PARTITION BY traj_id ORDER BY timestamp) AS ct0
   FROM   trajectory_data
   ) sub
WHERE  ct0 = 0;

This counts occurrences of 0 along the timeline per traj_id, and only keeps rows before the first one pops up. Voilá.

db<>fiddle here

Best Answer

Notes

Related Solutions

Sql-server – How to SELECT multiple COUNT(*) columns with each GROUP BY having a different WHERE clause

Postgresql – Select rows within group before condition is met

Related Question