Mysql – Get names of games that have the highest number of guilds

MySQL

I've a database which contains tables:

  • Quest:

    PK  Name
    PK  qName
        goal
    
  • Guild:

    PK  Name
    PK  gName
        level
    
  • Game:

    PK  Name
        MaxPlayer
        levels
        uName
    
  • GuildQuest:

    PK  Name
    PK  gName
    PK  qName
    

I need to retrieve the names of the games that contain the highest number of guilds.

Guilds are assigned by the game.

My attempted query:

SELECT Name, Count(*) guildName
FROM Game game, Guild guild 
WHERE game.Name = Guild.gName

and somehow use the MAX function.

I'm kind of confused how to perform this, any help is much appreciated.

Best Answer

One obvious solution is to get all the counts, sort them in the descending order and retrieve just the topmost row:

SELECT
  gName,
  COUNT(Name) AS GuildCount
FROM
  Guild
GROUP BY
  gName
ORDER BY
  GuildCount DESC
LIMIT
  0, 1
;

That would work only in cases where only one game can have the maximum number, but you probably cannot guarantee that. So the above solution is not very good, because in cases where two or more games have the same highest number you say you want to get all those names.

So, you could try this approach instead:

SELECT
  gName,
  COUNT(Name) AS GuildCount
FROM
  Guild
GROUP BY
  gName
HAVING
  COUNT(Name) =
  (
    SELECT
      COUNT(Name)
    FROM
      Guild
    GROUP BY
      gName
    ORDER BY
      GuildCount DESC
    LIMIT
      0, 1
  )
;

The subquery will get just the number, the highest guild count, and the outer query will get the names of those games that have the matching number of guilds.

The problem with the above is obviously that you have to scan the table twice, each time performing the same grouping, and the grouping itself, moreover, has to be repeated in the code as well. Question is, how to avoid that?

Now, where other SQL platforms offer powerful instruments like ranking functions, MySQL is trying to make up by allowing you to use variables to achieve better performance. In this case, too, it is possible to use variables to get the results in one go and avoid repetition of logic. Here is one way:

SELECT
  gName,
  GuildCount
FROM
  (
    SELECT
      IF(@LastCount <> 0 AND @LastCount <> grp.GuildCount, @flag := 0, @flag) AS flag,
      grp.gName,
      @LastCount := grp.GuildCount AS GuildCount
    FROM
      (
        SELECT
          @LastCount := 0,
          @flag := 1
      ) AS init,
      (
        SELECT
          gName,
          COUNT(Name) AS GuildCount
        FROM
          Guild
        GROUP BY
          gName
        ORDER BY
          GuildCount DESC
      ) AS grp
  ) AS derived
WHERE
  flag = 1
;

The init derived table is just for variable initialisation, and grp provides the initial set of aggregated values sorted in the descending order to use our variables on.

The main part of the logic is located at the second nesting level where the variables are actually used. The @flag variable is used to generated the flag column for further filtering, marking rows that need to be included in the output with 1 and the rest with 0. It is initially set to 1 because the first row obviously need to be included. It is reset to 0 as soon as the number less than the maximum is encountered. To help with that, the other variable, @LastCount, is used, which stores the previous row's count value.

The outermost query merely filters the nested set on the flag column.

This solution can be tested at Rextester.