PostgreSQL – How to Extend Single Week’s Win/Loss Ratio Calculation to Whole Season

aggregatepostgresqlwindow functions

I'm quite inexperienced at SQL, and I'm trying to improve a query in a small project.

The database schema looks somewhat like this. (I've reduced to relevant information, if there seems to be something missing please point it out):

It's a database of a fantasy sports league, and we want to calculate an 'all-play record'.

My understanding is that the 'all-play record' is generated by taking every team in a given league's scores for a given week and assuming they all play each other.

I've written a query that does this for a given week:

WITH weekly_points_table (team_id, max_points)
AS
(
  SELECT teams.id, SUM(player_scores.points) as max_points
  FROM player_scores
  INNER JOIN weeks ON weeks.id = player_scores.week_id
  INNER JOIN teams ON teams.id = player_scores.team_id
  WHERE (team_id IN (?) AND weeks.number = ? and weeks.year = ?)
  GROUP BY teams.id
)
SELECT *
FROM weekly_points_table
WHERE max_points < (SELECT max_points
  FROM weekly_points_table
  WHERE team_id = ?
)

The list of team_ids, the given week number/year, and the final team_id are passed in.

Wins are calculated with the above query, losses are calculated by simply flipping the sign, and ties are just (count of all teams – 1) – (wins + losses) for a given week.

But now I'm at a loss at how to extend this to a given season.

A season is a collection of weeks where the year is identical, and week numbers go from 1 to 16.

Right now we're literally just taking the single-week query and running it in a for-loop. (Before that, we were using nested for loops to calculate the wins/losses/ties, and you can imagine how slow that can be.)

Pretty much, how do I take the calculated wins/losses for a given week and add them up for all 16 weeks?

Best Answer

Assuming a standard many-to-many implementation ...

SELECT week_id
     , count(*) FILTER (WHERE week_points > team_points) AS losses
     , count(*) FILTER (WHERE week_points < team_points) AS wins
     , count(*) FILTER (WHERE week_points = team_points) AS draws
--     , count(*) AS total -- redundant check
FROM (
   SELECT week_id, team_id, sum(points) AS week_points
        , first_value(sum(points)) OVER (PARTITION BY week_id
                                         ORDER BY team_id <> 7) AS team_points
   FROM  (SELECT id AS week_id FROM weeks WHERE year = ?) w  -- your year
   CROSS  JOIN unnest(?::int[]) t(team_id)  -- your teams, see below!
   JOIN   player_scores p USING (week_id, team_id)
   WHERE  EXISTS (  -- only weeks where the team actually played
      SELECT 1 FROM player_scores
      WHERE week_id = p.week_id
      AND   team_id = ?  -- your team_id
      )
   GROUP  BY 1, 2
   ) sub
WHERE  team_id <> ?  -- your team_id again
GROUP  BY 1;

Returns something like:

week_id | losses | wins | draws
--------+--------+------+------
1       | 3      | 0    | 0
2       | 5      | 7    | 0
3       | 2      | 2    | 1

The key feature is the window function first_value() to get the weekly score for the team in the same query. Aggregate functions can be nested in window functions. Consider the sequence of events:

Best way to get result count before LIMIT was applied in PHP/PostgreSQL

The aggregate FILTER clause requires Postgres 9.4+. There are other ways. Details in this closely related answer on SO:

How can I simplify this game statistics query?

Assuming you provide your list of teams as array literal in the form '{1,2,3}'. There are other ways ...

Related Solutions

Calculating percentage of a row over total sum

You're looking for the analytical function ratio_to_report

select 
  agent,
  round(ratio_to_report(commission) over ()*100) "% Comm."
from  
  commissions;

Sql-server – Performing aggregations and calculations: SQL (CTE) vs MDX

Row-level YTD calculations on a database table are not very useful unless you have a unit of analysis for your YTD figures that can be processed additively (i.e. you're not going to find your reports trying to aggregate two YTD figures). This essentially limits their usefulness to summary tables.

This type of aggregate table is of limited use with a cube, as the cube gives you the aggregation for free. If you have a time dimension on the cube you can make calculated YTD measures that encapsulate the calculations. with an expression such as Aggregate (YTD([Date].[Date].[CurrentMember]), [Measures].[SomeMetric]). You don't even have to specify the date.

The calculations you describe can all be done relatively easily in MDX with hierarchy functions such as PrevMember and ParallelPeriod, and this is more flexible and much less effort to implement than working off an aggregate table generated in the underlying database. You're better off doing al the YTD and period-period comparisons in the cube.

Best Answer

Related Solutions

Calculating percentage of a row over total sum

Sql-server – Performing aggregations and calculations: SQL (CTE) vs MDX

Related Question