Sql-server – Best way to design this mileage table

database-design, schema, sql-server-2008

I will be populating a Miles Per Gallon (MPG) table. It's coming from an odometer source.

It's currently set up like so:

  id               (primary_key)
, truck_num
, start_date
, end_date
, start_miles
, end_miles
, start_fuel
, end_fuel
, miles
, gals
, mpg

There seems to be some redundancy: miles is just (end_miles - start_miles), and gals is likewise derived from the fuel readings.

Should we have those miles and gals columns precalculated and stored in the database? It would definitely make querying easier, but at the expense of space. Same question for having the mpg calculated. A computed column would slow things down, no?

What indexes would work best? There are about 3,000 trucks (records) inserted in a batch every week.

I'm using SQL Server 2008 R2.

Edit: A sample query that I'd be using:

-- find the average mpg per truck, year to date
select m.truck_num, avg(m.mpg) as avg_mpg
from mpg m
join truck t on t.truck_num = m.truck_num
where m.start_date >= @begin_of_year and m.end_date <= @today
group by m.truck_num  -- qualified: truck_num exists in both tables

Best Answer

Computed columns are your friend. Use them to capture simple calculations you're going to do anyway, and to guarantee that the computed values are correct.
Persist the results if you want to index or filter on them; don't persist them if you just need to pull the value once in a while.
Capture all data constraints using CHECK constraints.

Here is a pseudo-schema definition:

  id               PRIMARY KEY
, truck_num

, start_date
, end_date         CHECK (end_date > start_date)

, start_miles      CHECK (start_miles >= 0)
, end_miles        CHECK (end_miles > start_miles)

                   -- what if they refill the tank?
, start_gals       CHECK (start_gals >= 0)
, end_gals         CHECK (end_gals < start_gals AND end_gals >= 0)

-- all these should be computed
, miles = end_miles - start_miles
, gals = start_gals - end_gals
, mpg = miles/gals
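
The pseudo-schema above could be written in T-SQL roughly as follows. This is only a sketch: the table name dbo.mpg is taken from the question's sample query, while the column types, the IDENTITY key, and the constraint names are assumptions. Two SQL Server rules shape it: a CHECK that compares two columns has to be declared at table level, and a computed column cannot reference another computed column, so mpg repeats the underlying expressions instead of using miles/gals.

```sql
-- Sketch only: column types and constraint names are assumptions.
CREATE TABLE dbo.mpg (
    id          INT IDENTITY(1,1) NOT NULL,
    truck_num   INT           NOT NULL,
    start_date  DATE          NOT NULL,
    end_date    DATE          NOT NULL,
    start_miles DECIMAL(9,1)  NOT NULL,
    end_miles   DECIMAL(9,1)  NOT NULL,
    start_gals  DECIMAL(9,2)  NOT NULL,
    end_gals    DECIMAL(9,2)  NOT NULL,

    -- Computed columns; PERSISTED stores the value so it can be indexed or filtered on.
    miles AS (end_miles - start_miles) PERSISTED,
    gals  AS (start_gals - end_gals)   PERSISTED,
    -- Cannot reference miles/gals here, so the expressions are repeated.
    -- Division is safe because ck_mpg_gals guarantees start_gals > end_gals.
    mpg   AS CAST((end_miles - start_miles) / (start_gals - end_gals)
                  AS DECIMAL(12,2)) PERSISTED,

    -- Nonclustered so the clustered index can be chosen separately.
    CONSTRAINT pk_mpg PRIMARY KEY NONCLUSTERED (id),

    -- Multi-column checks must be table-level constraints.
    CONSTRAINT ck_mpg_dates CHECK (end_date > start_date),
    CONSTRAINT ck_mpg_miles CHECK (start_miles >= 0 AND end_miles > start_miles),
    CONSTRAINT ck_mpg_gals  CHECK (start_gals >= 0 AND end_gals >= 0
                                   AND end_gals < start_gals)
);
```

Note that ck_mpg_gals rejects any row where the tank was refilled during the period (end_gals >= start_gals), which is the open question flagged in the pseudo-schema; handling refills would need a different way of recording fuel used.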

As for indexing the table, here's what I think:

Cluster on start_date ASC. This will satisfy your WHERE clause. You are always inserting data by increasing start_date, meaning your writes will always be sequential under this clustering scheme. You are also always querying by start_date so you satisfy your biggest query pattern as well. (3,000 inserts per week is nothing. Because you have such a low volume of inserts, you could even cluster on start_date ASC, end_date ASC.)
Create a non-clustered index on truck_num and INCLUDE mpg. This should satisfy your SELECT, JOIN, and GROUP BY clauses. If you want to ORDER BY mpg, then make mpg part of the index key after truck_num instead of just INCLUDE-ing it.
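
As a sketch, the two indexes described above might look like this. It assumes the table is named dbo.mpg (as in the question's query), that mpg is a persisted computed column, and that the primary key was declared NONCLUSTERED, since a table can have only one clustered index. The index names are made up.

```sql
-- Clustered index on the date the data arrives in; index names are hypothetical.
CREATE CLUSTERED INDEX cx_mpg_start_date
    ON dbo.mpg (start_date ASC);

-- Covers the per-truck lookup and carries mpg along at the leaf level.
CREATE NONCLUSTERED INDEX ix_mpg_truck_num
    ON dbo.mpg (truck_num)
    INCLUDE (mpg);

-- Alternative if you need ORDER BY mpg within a truck:
-- CREATE NONCLUSTERED INDEX ix_mpg_truck_num_mpg
--     ON dbo.mpg (truck_num, mpg);
```

To cluster on both dates instead, the first statement's key list would become (start_date ASC, end_date ASC).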

When you're done, test your setup as follows:

Create a test table
Pump it full of test data
Create the indexes
Update statistics
Run your most common queries
Check their plans and run times
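
A rough T-SQL version of those steps is sketched below. It assumes a test copy of dbo.mpg already exists with an auto-generated id and computed miles, gals, and mpg columns, and that indexes are created after the load. The generated values (3,000 trucks, 52 weekly batches, the mileage and fuel ranges) are invented purely to fill the table, and the join to the truck table is left out so the script does not depend on it.

```sql
-- Pump the table full of made-up data: 3,000 trucks x 52 weeks.
;WITH n AS (
    SELECT TOP (156000)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS INT) AS i
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO dbo.mpg
    (truck_num, start_date, end_date, start_miles, end_miles, start_gals, end_gals)
SELECT
    i % 3000 + 1,                                   -- truck number 1..3000
    DATEADD(DAY, 7 * (i / 3000),     '20100104'),   -- week start
    DATEADD(DAY, 7 * (i / 3000) + 6, '20100104'),   -- week end
    1000 * (i / 3000),                              -- odometer at start
    1000 * (i / 3000) + 500 + i % 400,              -- 500..899 miles driven
    200,                                            -- fuel at start
    200 - (60 + i % 90)                             -- 60..149 gallons used
FROM n;

-- (Create the indexes here, then refresh statistics.)
UPDATE STATISTICS dbo.mpg;

-- Report I/O and timing for the query under test.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @begin_of_year DATE = '20100101',
        @today         DATE = '20101231';

SELECT m.truck_num, AVG(m.mpg) AS avg_mpg
FROM dbo.mpg AS m
WHERE m.start_date >= @begin_of_year
  AND m.end_date   <= @today
GROUP BY m.truck_num;
```

Comparing the logical reads and elapsed time reported in the Messages output before and after each index change, alongside the actual execution plan, shows whether an index is really being used.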

Related Solutions

Mysql – Best way to design tournament database

I'd start off by trying to fix all the predetermined information in the model itself, including:

dates/venues
structure (i.e. group/knockout stages)
rules (i.e. points scoring, tie-break rules)

Some of this information will be data in tables, some will be codified logic in views.

Something like this perhaps:

team(team_id, group_code enum('A', 'B', 'C', 'D'), name)
match(match_id, kickoff_at)
group_match(match_id, team_id_home, team_id_away, group_code)
knockout_match(match_id, knockout_code enum('Q1', 'Q2', 'Q3', 'Q4', 'S1', 'S2', 'F'))
result(match_id, score_home, score_away)

Information such as which teams play in Q1 never needs to be stored directly because it can be calculated from the group stage results. The only changes to make as the tournament progresses are inserts into the result table.

The best database design for this situation

A rule of thumb I use: it's bad database design if you have to alter a table to accommodate a code change or new feature that could have been planned for up front.

That said, I would use a variant of Option 1

Approvals
    Id PK
    ApprovalTypeID FK
    Status


ApprovalComments
    ID PK
    ApprovalId FK
    Comment

ApprovalTypes
    ID PK
    Name

Now when adding new types of objects that need approval, you only need to insert a row into ApprovalTypes instead of altering a table to add a column.
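
For illustration, registering a new approvable object type under this design might look like the following. The type name 'Invoice' and the 'Pending' status value are made up, and the sketch assumes the ID columns are auto-generated and that Status holds text.

```sql
-- New kind of object that needs approval: one row, no ALTER TABLE.
INSERT INTO ApprovalTypes (Name)
VALUES ('Invoice');

-- An approval of that type then refers to it by ID.
INSERT INTO Approvals (ApprovalTypeID, Status)
SELECT ID, 'Pending'
FROM ApprovalTypes
WHERE Name = 'Invoice';
```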
