TRC: Select an ID if it appears on all measurements

database-theoryrelational-calculusrelational-theory

I have these two "tables":

Station (station_id, city)            // Stations details
Rainfall (station_id, date, amount)   // Rainfall measurements

With Tuple Relational Calculus (TRC), I need to get the IDs of all stations that have related measurements on every date in which any measurement was taken.

So with the following tables I'm supposed to select 5 because this station has measurements on every date that has a measurement.

       Station                             Rainfall
       -------                             --------
--station_id-----city--      --station_id-----date-----amount--
      5           LA               5          01/01      4
      7           NY               7          02/02      8
                                   5          02/02      3

I've tried this but I'm not quite sure if this would work if there are no measurements at all (in this case I'm supposed to select all station IDs):

{ t | ∃s ∈ Station (t[station_id] = s[station_id] ∧ 
    ∀r ∈ Rainfall (∃q ∈ Rainfall (r[date] = q[date] ∧ s[station_id] = q[station_id])) ) }

I need your help in understanding if what I did is correct.

Best Answer

I had to look up wikipedia because I did not know Tuple Relational Calculus. I did only skim over that article. There are differnces in notation as far as I can see.

But after activating my knowledge about formal logic I think you are right because

∀r ∈ Rainfall (...)

is true if Rainfall does not contain any tuples. In this case your query reduces to

{ t | ∃s ∈ Station (t[station_id] = s[station_id]}

and this describes all tuples of Station

What I am missing in your notation is that is unclear which attributes the tuple t of the solution contains. So if you want to select station_id then the query should start with something like

{ t: {station_id} | ...}

if one uses the notataion of the wiki article.

Maybe you can supply some reference to your notation.

Related Solutions

Sql-server – What’s a good relational structure for units and complex unit conversions

There are a few things you want to factor into your design:

1. Measurements Need a Timestamp

Make sure all of your measurements have an indication of:

Scalar Value
Unit of Measurement
Date and Time the Measurement was Taken

This will allow you to work with measurements that need time-dependent conversion calculations.

2. Units of Measure Have Attributes

Each unit of measure has a few different attributes. The obvious ones are indicative, like a code and maybe a descriptive name. There are also a couple of critical other attributes to keep for each unit of measure. (i) Unit Type and (ii) Conversion Factor to the Base Unit.

The first tells you whether your unit of measure is a length, a weight, energy, power, currency, etc. etc. It should also tell you what the base unit of measure is. You should pick exactly one for each unit type. You can use things like kWh if you like, but I'd stick to the base SI units (as applicable) if I were you.

The second tells you what your unit of measure needs to be multiplied by to get it to the base. I mentioned that this is an attribute of your UOM, but in fact it needs to be in a child table. The business key of the child table that holds this base conversion factor is the combination of the UOM, its base unit type and a date/time. I would keep both an effective and an expiry date/time on the base conversion factor table. This allows you to quickly find the right rate that applies at any particular point in time. If it happens to be a rate that doesn't change, that's OK. Just use a min-collating effective date and a max-collating expiry date for the one record.

3. Trying to Table-Drive Everything Will Make You Nuts The last piece of the puzzle is determining the calculation for moving from one kind of unit to another kind of unit. You could try to table-drive this kind of calculation but in the end the tricky ones are going to make the design so general (read complicated and slow) that it will be impractical. Instead, create a code-table of conversion calculations and use it to link one kind of Unit Type to another kind of Unit Type. Perform the actual calculations in some code somewhere. Which piece of code you use for any given conversion is what the code table tells you. How the calculation is performed is just in the code. You can have one calculation each for the various easy things, like area needs two lengths and volume needs three lengths as well as the harder ones like work needs energy and time.

When you get the details of your design figured out you should blog it and come back here to post a link!

Find the highest graded student using Tuple Relational Calculus

Given the schema Students(id:integer,grade:integer), you can solve the problem in tuple relational calculus by using the negation operator (¬).

{T1.id | ∃T1 ∈ Students ¬(∃T2 ∈ Students (T2.grade > T1.grade))}

This would return the id of all students in T1 where there is no student in T2 with a higher grade.

Because T1 and T2 are from the same relation, this effectively returns the set of students with the top grade.

If there are no ties for the top grade, it will return just one id.

It looks like you were thinking in terms of relational algebra rather than tuple relational calculus.

Tuple relational calculus does not have a set difference operator, so you can't find the maximum by subtracting all non-maximums.

Reference

The Solutions Manual for the third edition of Database Management Systems by Ragu Ramakrishnan and Johannes Gerke helped me solve this.

Question 4.3 asks you to solve some problems using this schema:

Suppliers(sid:integer,sname:string,address:string)
Parts(pid:integer,pname:string,color:string)
Catalog(sid:integer,pid:integer,cost:real)

Question 4.3.11 is similar to yours.

Find the pids of the most expensive parts supplied by suppliers named Yosemite Sham.

The solution looks like this:

{T | ∃T1 ∈ Catalog(∃X ∈ Suppliers
(X.sname = 'Yosemite Sham' ∧ X.sid = T1.sid) ∧ ¬(∃S ∈ Suppliers
(S.sname = 'Yosemite Sham' ∧ ∃Z ∈ Catalog
(Z.sid = S.sid ∧ Z.cost > T1.cost))) ∧ T.pid=T1.pid)}

The solution here is more complicated because of the joins between Suppliers and Catalog, but the essence of the solution is the same.

Elide the joins to get this:

T1 ∈ Catalog ...
(... ¬(... ∃Z ∈ Catalog
(... Z.cost > T1.cost)))

In the reference solution, free variable T has the same pid as bound variable T1.

Taking the pid of T1 directly is a simpler way to get the same result.

Best Answer

Related Solutions

Sql-server – What’s a good relational structure for units and complex unit conversions

Find the highest graded student using Tuple Relational Calculus

Reference

Related Question