SQL Server – Get Streak Count and Type from Win-Loss-Tie Data

sql serversql-server-2012t-sql

I made a SQL Fiddle for this question if that makes things easier for anyone.

I have a fantasy sports database of sorts and what I'm trying to figure out is how to come up with "current streak" data (like 'W2' if the team has won their last 2 matchups, or 'L1' if they lost their last matchup after winning the previous matchup – or 'T1' if they tied their most recent matchup).

Here is my basic schema:

CREATE TABLE FantasyTeams (
  team_id BIGINT NOT NULL
)

CREATE TABLE FantasyMatches(
    match_id BIGINT NOT NULL,
    home_fantasy_team_id BIGINT NOT NULL,
    away_fantasy_team_id BIGINT NOT NULL,
    fantasy_season_id BIGINT NOT NULL,
    fantasy_league_id BIGINT NOT NULL,
    fantasy_week_id BIGINT NOT NULL,
    winning_team_id BIGINT NULL
)

A value of NULL in the winning_team_id column indicates a tie for that match.

Here's a sample DML statement with some sample data for 6 teams and 3 weeks worth of matchups:

INSERT INTO FantasyTeams
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5
UNION
SELECT 6

INSERT INTO FantasyMatches
SELECT 1, 2, 1, 2, 4, 44, 2
UNION
SELECT 2, 5, 4, 2, 4, 44, 5
UNION
SELECT 3, 6, 3, 2, 4, 44, 3
UNION
SELECT 4, 2, 4, 2, 4, 45, 2
UNION
SELECT 5, 3, 1, 2, 4, 45, 3
UNION
SELECT 6, 6, 5, 2, 4, 45, 6
UNION
SELECT 7, 2, 6, 2, 4, 46, 2
UNION
SELECT 8, 3, 5, 2, 4, 46, 3
UNION
SELECT 9, 4, 1, 2, 4, 46, NULL

GO

Here is an example of the desired output (based on the DML above) that I'm having trouble even beginning to figure out how to derive:

| TEAM_ID | STEAK_TYPE | STREAK_COUNT |
|---------|------------|--------------|
|       1 |          T |            1 |
|       2 |          W |            3 |
|       3 |          W |            3 |
|       4 |          T |            1 |
|       5 |          L |            2 |
|       6 |          L |            1 |

I've tried various methods using subqueries and CTE's but I can't put it together. I'd like to avoid using a cursor as I could have a large dataset to run this against in the future. I feel like there might be a way involving table variables that join this data to itself somehow but I'm still working on it.

Additional Info: There could be a varying number of teams (any even number between 6 and 10) and the total matchups will increase by 1 for each team every week. Any ideas on how I should do this?

Best Answer

Since you are on SQL Server 2012 you can use a couple of the new windowing functions.

with C1 as
(
  select T.team_id,
         case
           when M.winning_team_id is null then 'T'
           when M.winning_team_id = T.team_id then 'W'
           else 'L'
         end as streak_type,
         M.match_id
  from FantasyMatches as M
    cross apply (values(M.home_fantasy_team_id),
                       (M.away_fantasy_team_id)) as T(team_id)
), C2 as
(
  select C1.team_id,
         C1.streak_type,
         C1.match_id,
         lag(C1.streak_type, 1, C1.streak_type) 
           over(partition by C1.team_id 
                order by C1.match_id desc) as lag_streak_type
  from C1
), C3 as
(
  select C2.team_id,
         C2.streak_type,
         sum(case when C2.lag_streak_type = C2.streak_type then 0 else 1 end) 
           over(partition by C2.team_id 
                order by C2.match_id desc rows unbounded preceding) as streak_sum
  from C2
)
select C3.team_id,
       C3.streak_type,
       count(*) as streak_count
from C3
where C3.streak_sum = 0
group by C3.team_id,
         C3.streak_type
order by C3.team_id;

SQL Fiddle

C1 calculates the streak_type for each team and match.

C2 finds the previous streak_type ordered by match_id desc.

C3 generates a running sum streak_sum ordered by match_id desc keeping a 0 a long as the streak_type is the same as the last value.

Main query sums up the streaks where streak_sum is 0.