MySQL – SUM n rows of each group with ORDER BY and LIMIT where the LIMIT is based on another table

database-design, greatest-n-per-group, group by, MySQL, mysql-5.6

I have a requirement to build a leaderboard of users of a fantasy football game.
The simplified database of the game is as follows:

  • users have squad_players in squads for each match (related by transfer_period)
  • players have match_points for each match
  • squad_players have a position and a priority. The priority is the order of substitution if a squad player does not appear in a match
  • squads have formations, which determine the maximum number of players to select for each position, ordered by priority
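
To make the relationships above concrete, here is a rough sketch of how such a schema might look. Only the table names come from the description; every column name and key is an assumption made for illustration. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it can be run without a server.

```python
import sqlite3

# Every column name below is a guess; only the table names come from the question.
SCHEMA = """
CREATE TABLE users   (id INTEGER PRIMARY KEY);
CREATE TABLE players (id INTEGER PRIMARY KEY);
CREATE TABLE matches (id INTEGER PRIMARY KEY, transfer_period INTEGER NOT NULL);

-- One row per (match, player) actually played; absent players have no row.
CREATE TABLE match_points (
    match_id  INTEGER NOT NULL REFERENCES matches(id),
    player_id INTEGER NOT NULL REFERENCES players(id),
    points    INTEGER NOT NULL,
    PRIMARY KEY (match_id, player_id)
);

-- One row per (formation, position): how many players of that position count.
CREATE TABLE formations (
    formation_id INTEGER NOT NULL,
    position     TEXT    NOT NULL,
    max_players  INTEGER NOT NULL,
    PRIMARY KEY (formation_id, position)
);

CREATE TABLE squads (
    id              INTEGER PRIMARY KEY,
    user_id         INTEGER NOT NULL REFERENCES users(id),
    formation_id    INTEGER NOT NULL,
    transfer_period INTEGER NOT NULL
);

-- priority orders the players within a squad and position (1 = first choice).
CREATE TABLE squad_players (
    squad_id  INTEGER NOT NULL REFERENCES squads(id),
    player_id INTEGER NOT NULL REFERENCES players(id),
    position  TEXT    NOT NULL,
    priority  INTEGER NOT NULL,
    PRIMARY KEY (squad_id, player_id)
);
"""

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL 5.6 here
conn.executescript(SCHEMA)
```

The real schema may well differ, for example in how formations store their per-position limits.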

The database is MySQL 5.6 and the maximum number of users is 10K.

I am able to inner join squad_players, squads, matches and match_points (in that order) to get the points of each squad player who played (players who didn't play have no match_points).

I am struggling to SUM the points of the first x squad_players (by priority) of each squad, where x comes from formations and depends on the position of the squad_players.

I have tried adapting various "top n rows per group" solutions that use correlated subqueries, but without success.

What should my approach be here?


My expected output is something like this:

|---------|-----------|------------|----------|
| User ID |  SQUAD ID | POSITION   |  POINTS  |  
|---------|-----------|------------|----------|
|   12    |    34     | defender   |    12    |
|---------|-----------|------------|----------|
|   12    |    34     | forward    |    2     |
|---------|-----------|------------|----------|
|  117    |    31     | midfielder |    5     |
|---------|-----------|------------|----------|

Notes:

  • I must add that the outputs shown here may not reflect the problem statement.

  • If you look into the Sample Dataset Link, you will find a view
    named normalized_squad_player_points that is the denormalized table I want to work with to create different stats.

The final output for the leaderboard will be something like this:

|---------------------|------------------|
|      User ID        |       Points     |
|---------------------|------------------|
|          12         |         34       |
|---------------------|------------------|
|          56         |         32       |
|---------------------|------------------|
|          117        |         31       |
|---------------------|------------------|

Best Answer

The solution I ended up implementing was to break the query up into simpler ones, based on the different combinations of positions and formations. I also needed to add several tables to store intermediate results. I ran all the queries (around 42) in a single transaction.
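
As an illustration only, here is a minimal sketch of that staged idea. It is not the actual set of 42 queries. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the SQL avoids window functions, which MySQL 5.6 lacks. The table squad_player_points stands in for the normalized_squad_player_points view with assumed columns; formation_limits, position_totals and all sample rows are hypothetical. The sketch also assumes priority is unique within a squad and position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL 5.6 here
conn.executescript("""
    -- Stand-in for the normalized_squad_player_points view; columns are assumed.
    CREATE TABLE squad_player_points (
        user_id INTEGER, squad_id INTEGER, position TEXT,
        priority INTEGER, points INTEGER
    );
    -- Hypothetical: how many players of each position a squad's formation counts.
    CREATE TABLE formation_limits (
        squad_id INTEGER, position TEXT, max_players INTEGER
    );
    -- Hypothetical intermediate table: one subtotal per squad and position.
    CREATE TABLE position_totals (
        user_id INTEGER, squad_id INTEGER, position TEXT, points INTEGER
    );
""")

# Made-up sample rows; only players who actually played have a row.
conn.executemany("INSERT INTO squad_player_points VALUES (?, ?, ?, ?, ?)", [
    (12, 34, "defender", 1, 6),
    (12, 34, "defender", 2, 4),
    (12, 34, "defender", 3, 9),     # third defender, beyond the limit of 2
    (12, 34, "forward", 1, 2),
    (117, 31, "midfielder", 2, 5),  # priority 1 did not play, so no row
    (117, 31, "midfielder", 3, 1),
])
conn.executemany("INSERT INTO formation_limits VALUES (?, ?, ?)", [
    (34, "defender", 2), (34, "forward", 1), (31, "midfielder", 1),
])

# One small query per position. A row is kept when fewer than max_players
# rows of the same squad and position have a lower priority number.
STAGE_SQL = """
    INSERT INTO position_totals (user_id, squad_id, position, points)
    SELECT p.user_id, p.squad_id, p.position, SUM(p.points)
    FROM squad_player_points AS p
    JOIN formation_limits AS f
      ON f.squad_id = p.squad_id AND f.position = p.position
    WHERE p.position = ?
      AND (SELECT COUNT(*)
           FROM squad_player_points AS q
           WHERE q.squad_id = p.squad_id
             AND q.position = p.position
             AND q.priority < p.priority) < f.max_players
    GROUP BY p.user_id, p.squad_id, p.position
"""

with conn:  # one transaction: commits on success, rolls back on error
    conn.execute("DELETE FROM position_totals")
    for position in ("defender", "midfielder", "forward"):
        conn.execute(STAGE_SQL, (position,))

# The leaderboard is then a plain aggregate over the intermediate table.
leaderboard = conn.execute("""
    SELECT user_id, SUM(points) AS total_points
    FROM position_totals
    GROUP BY user_id
    ORDER BY total_points DESC
""").fetchall()
print(leaderboard)  # [(12, 12), (117, 5)]
```

Each stage stays small enough to inspect on its own, and the intermediate table can be queried directly for per-position stats.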

This worked out pretty well and completed in a reasonable time without overloading the server.

I guess trying to do it in one giant query was a bad decision from the beginning. Breaking it up into smaller queries made the process maintainable and flexible.