The problem below that I would like to solve sounds quite simple but it is quite difficult for me to arrive at the solution with SQL.
Let's say I have a simplified ERD that looks as follows:
This is a database of a hypothetical movie streaming website that keeps the data about its customers, invoice, movies and their genres. Each customer has a unique CustomerID, each purchase has a unique InvoiceID, each movie has a unique MovieID, each genre has a unique GenreID. One MovieID is only associated with one GenreID. One customer can purchase many movies, so one Invoice can contains many MovieIDs.
I would like to show the top 2 most popular genres (in terms of total count of InvoiceID) in each of the following continents/regions:
Africa (Egypt, South Africa), Asia (China, Japan), Europe (France, Germany).
An example of the Customer and Genre tables are as follows:
The resulting table that I would like to have looks more or less like this (Region means Continent):
How exactly the resulting table look like is not important, I would just want the top 2 genres grouped by each region.
Here are some of my effort in producing the query:
WITH t1 AS (
SELECT c.Country,
CASE
WHEN (
(c.Country = "China" OR c.Country = "Japan") THEN "Asia"
)
WHEN (
(c.Country = "Egypt" OR c.Country = "South Africa") THEN "Africa"
)
ELSE "Europe"
END AS region
FROM Customer c
GROUP BY 1
)
SELECT DISTINCT region
FROM t1;
I could only produce the new region groups but got stuck at how to rank the genres grouped by the regions. I have a difficulty in showing the ranking of genres (based on total count of InvoiceID) in each region. In other words, in each of the regions, how many customers purchased each genre. I have just started learning SQL that includes different kinds of JOINS, SUBQUERY, AGGREGATIONS, GROUP BY, ORDER BY and Common Table Expression but somehow I cannot produce the result I want. Could somebody please give some light on this? Thank you in advance for the help.
Best Answer
Use grouping to get the count per country, and a window function to get the rank: