SQL Server – Identifying Bad Design or Complicated Query

database-designsql server

Problem: For fun I am building a Database for lottery results to get some statistics off of. As of now I am only interested in a Pick 3 style lotto and might introduce more later. How would you approach a query to produce the desired result below? Would there be a better table design?

Data

+----------------------------------------------+
| Date     | Time | NumOne | NumTwo | NumThree |
+----------------------------------------------+
| 01/02/14 | M    | 8      | 2      | 1        |
| 01/02/14 | E    | 8      | 8      | 8        |
| 01/03/14 | M    | 1      | 9      | 4        |
| 01/04/14 | E    | 0      | 7      | 5        |
+----------------------------------------------+

Desired Result

+----------------+
| Number | Count |
+----------------+
| 0      | 1     |
| 1      | 2     |
| 2      | 1     |
| 3      | 0     |
| 4      | 1     |
| 5      | 1     |
| 6      | 0     |
| 7      | 1     |
| 8      | 4     |
| 9      | 1     |
+----------------+

EDIT: Future Table

+-----------------------------------------------------------------------------------------------+
| TypeOfLottery | MinRegNum | MaxRegNum | NumOfRegPicks | MinSpeNum | MaxSpeNum | NumOfSpePicks |
+-----------------------------------------------------------------------------------------------+
| Pick3         | 0         | 9         | 3             | 0         | 0         | 0             |
| Pick4         | 0         | 9         | 4             | 0         | 0         | 0             |
| Powerball     | 1         | 51        | 6             | 1         | 45        | 1             |
+-----------------------------------------------------------------------------------------------+

Now That I think about it, I dont see any reason to have this table. If I just created a new table for each type of lotto that should work. If I wanted to go further the above table would give me the "Rules" for each lotto.

Best Answer

Saving your data differently would make that query trivial. Namely: one number per row. Something like:

results(Date, Time, Rank, Number)

where Rank would be 1 (or zero) for the first number, 2 for the second, etc. (Only if the order has importance, drop the rank if it doesn't.)

Then your query boils down to (ignoring reserved identifiers):

select number, count(*)
from results
-- where date = ...
group by number
order by number;

With your current schema, you can still get it fairly easily with a union:

select number, count(*)
from (
  select NumberOne as number from Results
  union all
  select NumberTwo as number from Results
  union all
  select NumberThree as number from Results
) AllNumbers
group by number
order by number;

(Or use a with AllNumbers as (...) CTE.)


When it comes to extending your application to more than three numbers, it's pretty clear which solution wins - you really don't want to add more columns for every type of draw, or produce a huge list of unions.

There's one issue I see though: a lot of data duplication for each row. More so if you need to add a column with the "type" of draw (to distinguish between "pick 3", pick 4" or whatever types of game you want).

You could mitigate that by only storing a "draw id", (optional) rank and number, and have another table where you store the "metadata" for each draw id. There probably are intermediate schemes that would fit your application better - depends on what trade-offs you can/want to make.