Postgresql – How to rank an aggregation in each newly formed group

aggregatejoin;postgresqlsqlitesubquery

The problem below that I would like to solve sounds quite simple but it is quite difficult for me to arrive at the solution with SQL.

Let's say I have a simplified ERD that looks as follows:

enter image description here

This is a database of a hypothetical movie streaming website that keeps the data about its customers, invoice, movies and their genres. Each customer has a unique CustomerID, each purchase has a unique InvoiceID, each movie has a unique MovieID, each genre has a unique GenreID. One MovieID is only associated with one GenreID. One customer can purchase many movies, so one Invoice can contains many MovieIDs.

I would like to show the top 2 most popular genres (in terms of total count of InvoiceID) in each of the following continents/regions:
Africa (Egypt, South Africa), Asia (China, Japan), Europe (France, Germany).

An example of the Customer and Genre tables are as follows:

enter image description here

The resulting table that I would like to have looks more or less like this (Region means Continent):

enter image description here

How exactly the resulting table look like is not important, I would just want the top 2 genres grouped by each region.

Here are some of my effort in producing the query:

WITH t1 AS (
  SELECT c.Country,
  CASE
    WHEN (
      (c.Country = "China" OR c.Country = "Japan") THEN "Asia"
    )
    WHEN (
      (c.Country = "Egypt" OR c.Country = "South Africa") THEN "Africa"
    )
    ELSE "Europe"
  END AS region
FROM Customer c
GROUP BY 1
)


SELECT DISTINCT region
FROM t1;

I could only produce the new region groups but got stuck at how to rank the genres grouped by the regions. I have a difficulty in showing the ranking of genres (based on total count of InvoiceID) in each region. In other words, in each of the regions, how many customers purchased each genre. I have just started learning SQL that includes different kinds of JOINS, SUBQUERY, AGGREGATIONS, GROUP BY, ORDER BY and Common Table Expression but somehow I cannot produce the result I want. Could somebody please give some light on this? Thank you in advance for the help.

Best Answer

Use grouping to get the count per country, and a window function to get the rank:

WITH t1 AS (
  SELECT CASE
         WHEN Country IN ('France', 'Germany')     THEN 'Europe'
         WHEN Country IN ('China', 'Japan')        THEN 'Asia'
         WHEN Country IN ('Egypt', 'South Africa') THEN 'Africa'
         END AS Region,
         GenreName,
         COUNT(*) AS Views
  FROM Customer
  JOIN Invoice USING (CustomerID)
  JOIN Movie USING (MovieID)
  JOIN Genre USING (GenreID)
  GROUP BY Region, GenreName
),
t2 AS (
  SELECT Region,
         GenreName,
         Views,
         row_number() OVER (PARTITION BY Region ORDER BY Views DESC) AS Rank
  FROM t1
)
SELECT Region, GenreName
FROM t2
WHERE Rank <= 2
ORDER BY Region, Rank;