Find the highest graded student using Tuple Relational Calculus

database-theoryrelational-calculusrelational-theory

Suppose I have a table of students, containing their ID and their grade:

-----------------
| id  |  grade  |
-----------------
|  1  |    83   |
|  2  |    94   |
|  3  |    92   |
|  4  |    78   |

How do I write a Tuple Relational Calculus formula that refers to the student with the highest grade?

My attempt:

Thinking in terms of SQL, I would write a query that does a cartesian product of the table with itself, take every grade that is less than some other grade, and then subtract from the original table. However, in Tuple Relational Calculus, it's not possible to build sub-tables in sub-queries, which is why I'm stuck.

However, I've attempted something in this direction:

{ <id> | Ǝ grade1 ∈ students (id, grade) ⋀ Ǝ grade2 ∈ students (id, grade2) ⋀ grade1 > grade2}

I believe this would get me the lower grades, but how do I subtract this all from the original table of students? I'm not allowed to insert this statement into another TRC query. Thanks in advance for your help!

Best Answer

Given the schema Students(id:integer,grade:integer), you can solve the problem in tuple relational calculus by using the negation operator (¬).

{T1.id | ∃T1 ∈ Students ¬(∃T2 ∈ Students (T2.grade > T1.grade))}

This would return the id of all students in T1 where there is no student in T2 with a higher grade.

Because T1 and T2 are from the same relation, this effectively returns the set of students with the top grade.

If there are no ties for the top grade, it will return just one id.

It looks like you were thinking in terms of relational algebra rather than tuple relational calculus.

Tuple relational calculus does not have a set difference operator, so you can't find the maximum by subtracting all non-maximums.

Reference

The Solutions Manual for the third edition of Database Management Systems by Ragu Ramakrishnan and Johannes Gerke helped me solve this.

Question 4.3 asks you to solve some problems using this schema:

Suppliers(sid:integer,sname:string,address:string)
Parts(pid:integer,pname:string,color:string)
Catalog(sid:integer,pid:integer,cost:real)

Question 4.3.11 is similar to yours.

Find the pids of the most expensive parts supplied by suppliers named Yosemite Sham.

The solution looks like this:

{T | ∃T1 ∈ Catalog(∃X ∈ Suppliers
(X.sname = 'Yosemite Sham' ∧ X.sid = T1.sid) ∧ ¬(∃S ∈ Suppliers
(S.sname = 'Yosemite Sham' ∧ ∃Z ∈ Catalog
(Z.sid = S.sid ∧ Z.cost > T1.cost))) ∧ T.pid=T1.pid)}

The solution here is more complicated because of the joins between Suppliers and Catalog, but the essence of the solution is the same.

Elide the joins to get this:

T1 ∈ Catalog ...
(... ¬(... ∃Z ∈ Catalog
(... Z.cost > T1.cost)))

In the reference solution, free variable T has the same pid as bound variable T1.

Taking the pid of T1 directly is a simpler way to get the same result.

Related Solutions

TRC: Select an ID if it appears on all measurements

I had to look up wikipedia because I did not know Tuple Relational Calculus. I did only skim over that article. There are differnces in notation as far as I can see.

But after activating my knowledge about formal logic I think you are right because

∀r ∈ Rainfall (...)

is true if Rainfall does not contain any tuples. In this case your query reduces to

{ t | ∃s ∈ Station (t[station_id] = s[station_id]}

and this describes all tuples of Station

What I am missing in your notation is that is unclear which attributes the tuple t of the solution contains. So if you want to select station_id then the query should start with something like

{ t: {station_id} | ...}

if one uses the notataion of the wiki article.

Maybe you can supply some reference to your notation.

SQL and Relational Algebra – Beginner Questions Answered

Your sql query is wrong:

- Assume there is a student s1 who has passed one exam 
  after '2000-01-01' and none before. 
  Your query results in {s1} - {} = {s1}. This will be a false positive.

- Assume there is a student s1 that passed three exams 
  after '2000-01-01' and one exam before. 
  Your query results in {s1} - {s1} = {}. This will be a false negative.

I'm having a bit of trouble reading your algebra expressions, but the first one looks ok. It should not matter that you do the selection (topic = "motorcycle") after the join instead of joining on the selection.

The second one can't be right. Assume there's a newspaper that published both an article on motorcycle and an article on something else. Your expression will pick the article on something else and therefore return that newspaper (incorrectly).

Best Answer

Reference

Related Solutions

TRC: Select an ID if it appears on all measurements

SQL and Relational Algebra – Beginner Questions Answered

Related Question