Translating the Relational Algebra output

relational-algebra

I have the following relation Jobs(name, company, salary).

P1(salary) = πsalary(σcompany='pwc'(Jobs))
P2(salary) = πsalary(ρT1(s)(P1) thetaJoin(s>salary) P1)
P3(salary) = P1 − P2

I know that P1 just selects the salary of all people works in pwc.

I am having a hard time figuring out what P2 indicates.

I know that ρT1(s)(P1) renames P1 into T1 and the salary attribute to s. But then I am confused with the thetaJoin(s>salary) part because I thought s and salary would be the same.

P3 would just be the difference between P1 and P2

The final relation

Final(name) = πname(Jobs thetaJoin(salary>s) ρT2(s)(P3))

will depend on what P2 means.

Please help me doing this translation.

Best Answer

P2(salary) = πsalary(ρT1(s)(P1) thetaJoin(s>salary) P1)

From the Wikipedia entry for Relational Algebra:

The result of the θ-join is defined only if the headers of S and R are disjoint, that is, do not contain a common attribute.

The simulation of this operation in the fundamental operations is therefore as follows:

R ⋈θ S = σθ(R × S)

To ensure the headers are disjoint, salary must be renamed (as 's' here). To ensure the relations are uniquely named for this self-join, P1 must also be renamed (as 'T1' here).

In rough SQL terms:

DECLARE @P1 table (salary integer NOT NULL);
INSERT @P1 (salary) VALUES (1), (2), (3);

WITH T1 (s) AS (SELECT DISTINCT salary FROM @P1)
SELECT DISTINCT
    P1.salary
FROM @P1 AS P1 
JOIN T1
    ON T1.s > P1.salary;

-- or, closer to the fundamental operations:

WITH T1 (s) AS (SELECT DISTINCT salary FROM @P1)
SELECT DISTINCT
    P1.salary
FROM @P1 AS P1
CROSS JOIN T1
WHERE T1.s > P1.salary;

dbfiddle

| salary |
| ------ |
|      1 |
|      2 |

P2 therefore contains all the salary values that are smaller than some other salary value.

P1 minus P2 therefore contains the top salary at PWC.

See also Finding a Maximum Value with Relational Algebra

Related Question