Postgresql – What would be best data model for finding difference within n, n-1 and n-2 elements

database-designpostgresql

This is from betting domain which has something that is called a long list: a list of a "home team win/draw/away team win" markets for 13 games.
A punter can select any combination of the possible outcomes which are encoded in a following way:

1 - home team wins
2 - draw
4 - away team wins
3 - home team wins or draw
5 - home team wins or away team wins
6 - draw or away team wins
7 - home team wins or draw or away team wins

Meaning a [3, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7] represents a selection where punter put following bets:

  • "home team wins or draw" in a first game
  • "home team wins" in games 2-12
  • "home team wins or draw or away team wins" in 13th game

After games are finished there will be another 13 elements list representing winning outcomes, for example: [4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] means that in a first game away team won and in all other games home team won.

Let's take this bet [3, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7] and this query as an example [4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1].

Next we have a following correspondence between selected outcomes and actual results

1 - 1, 3, 5, 7
2 - 2, 3, 6, 7
4 - 4, 5, 6, 7

Meaning for an actual result of 13 games [4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] following bets considered to be a winning bets:

[4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
[4, 3, 1, 5, 1, 7, 1, 1, 1, 1, 1, 1, 1] 
[5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
[6, 1, 7, 1, 1, 5, 1, 1, 3, 3, 3, 3, 1] 
[7, 1, 5, 5, 5, 1, 1, 1, 1, 7, 7, 7, 1]

The question is what is the best way to model this in database for following use cases:

  • find bets where all 13 game results were guessed correctly
  • find bets where 12 game results were guessed correctly
  • find bets where 11 game results were guessed correctly

Examples of queries for following use-cases will be highly appreciated ;))

Best Answer

Possible realization:

id SERIAL PRIMARY KEY,
punter_id INT,
bet_number INT,
guess_number TINYINT, -- from 1 to 13
guess TINYINT, -- from 0 (or 1) to 7
UNIQUE (punter_id, bet_number, guess_number)

Each bet consists from 13 records for each punter.

The result is placed into the table with similar structure:

id SERIAL PRIMARY KEY,
bet_number INT,
result_number TINYINT, -- from 1 to 13
result TINYINT, -- from 0 (or 1) to 7
UNIQUE (bet_number, result_number)

So you join these tables by (bet_number, guess_number)=(bet_number, result_number), group by punter_id (or by (punter, bet_number, result_number) when you'll obtain the result for a lot of bets in one query), and calculate SUM( CASE guess & result WHEN 0 THEN 0 ELSE 1 END ). This sum is the amount of positive guesses per bet.