Help with removing duplicate reversed pairs in relational algebra

relational-theory

I have a listing of data that contains duplicate reversed pairs, and I need to remove them. Call the columns name1 and name2. I have Tom and Mike as well as Mike and Tom, so this single pair is counted twice.

 name1 | name2
-------|-------
 Tom   | Mike
 Pete  | Jenny
 Bill  | Jenny
 Joe   | Mary
 Mike  | Tom
 Jenny | Pete
 Jenny | Bill
 Mary  | Joe
 Linda | Jenny

The list came from an initial match of students with guidance counselors, followed by the product of that student/counselor result with the counselor table. That produced a longer list, which I was able to reduce to the pairs above, but now I can't get rid of the duplicates.

While I might have made a mistake in combining the tables that produced the pairing, I am stuck with that table for now. This is a listing of students paired with guidance counselors.

Is there a way to de-dup the list, or do I need to start over?

Best Answer

Do you need to return an existing combination? For example, if there's only Tom,Mike, do you need to return exactly this, or is Mike,Tom also OK?

-- order of columns doesn't matter
SELECT DISTINCT
   CASE WHEN name1 > name2 THEN name2 ELSE name1 END as name1,
   CASE WHEN name1 < name2 THEN name2 ELSE name1 END as name2
FROM tab;

-- order of columns is maintained
SELECT DISTINCT name1,name2 -- DISTINCT might not be needed
FROM tab AS t1
WHERE NOT EXISTS(
  SELECT * FROM tab AS t2
  WHERE t1.name1 = t2.name2
    AND t1.name2 = t2.name1
    AND t1.name1 > t2.name1)
;
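To see how the two queries differ, here is a minimal sketch that runs them against the sample rows from the question. It assumes SQLite (via Python's sqlite3 module) and its default case-sensitive string comparison; other DBMSs may collate differently.

```python
import sqlite3

# Sample rows copied from the question; the table name `tab` follows the answer.
ROWS = [
    ("Tom", "Mike"), ("Pete", "Jenny"), ("Bill", "Jenny"), ("Joe", "Mary"),
    ("Mike", "Tom"), ("Jenny", "Pete"), ("Jenny", "Bill"), ("Mary", "Joe"),
    ("Linda", "Jenny"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (name1 TEXT, name2 TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", ROWS)

# Query 1: put the smaller name first, so reversed pairs collapse under DISTINCT.
normalized = set(conn.execute("""
    SELECT DISTINCT
       CASE WHEN name1 > name2 THEN name2 ELSE name1 END AS name1,
       CASE WHEN name1 < name2 THEN name2 ELSE name1 END AS name2
    FROM tab
""").fetchall())

# Query 2: keep a row unless its reversed twin exists and sorts first.
order_kept = set(conn.execute("""
    SELECT DISTINCT name1, name2
    FROM tab AS t1
    WHERE NOT EXISTS (
      SELECT * FROM tab AS t2
      WHERE t1.name1 = t2.name2
        AND t1.name2 = t2.name1
        AND t1.name1 > t2.name1)
""").fetchall())

print(sorted(normalized))
print(sorted(order_kept))
```

Both queries return five pairs. The difference shows up on Linda/Jenny, which has no reversed twin: the first query returns it as Jenny,Linda, while the second returns the row exactly as stored.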

See fiddle

If you want to delete those rows you might use the 2nd logic; the actual syntax depends on your DBMS:

DELETE --change to SELECT * to see which rows will be deleted
FROM tab AS t1
WHERE EXISTS(
  SELECT * FROM tab AS t2
  WHERE t1.name1 = t2.name2
    AND t1.name2 = t2.name1
    AND t1.name1 > t2.name1)
;
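As an illustration of the "syntax depends on your DBMS" caveat, here is a sketch of the same delete in SQLite, run through Python's sqlite3 module. It is an adaptation, not the statement above verbatim: the alias on the DELETE target is dropped and the outer table is referenced by its name `tab`, which is one portable way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (name1 TEXT, name2 TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", [
    ("Tom", "Mike"), ("Pete", "Jenny"), ("Bill", "Jenny"), ("Joe", "Mary"),
    ("Mike", "Tom"), ("Jenny", "Pete"), ("Jenny", "Bill"), ("Mary", "Joe"),
    ("Linda", "Jenny"),
])

# Delete the member of each reversed pair whose name1 sorts last.
# Inside the subquery, `tab` refers to the row being deleted and `t2`
# to the candidate reversed twin.
deleted = conn.execute("""
    DELETE FROM tab
    WHERE EXISTS (
      SELECT * FROM tab AS t2
      WHERE tab.name1 = t2.name2
        AND tab.name2 = t2.name1
        AND tab.name1 > t2.name1)
""").rowcount

remaining = set(conn.execute("SELECT name1, name2 FROM tab").fetchall())
print(deleted, sorted(remaining))
```

On the sample data this removes four rows and leaves one row per pair, plus the unpaired Linda,Jenny row.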

Related Solutions

Relational Algebra Question – Understanding the Basics

DISCLAIMER: I never learned relational algebra, but it looks interesting.

From the schema given and your question, this is what the SQL should be:

SELECT
    emp_mgr.person_name
FROM
    manages emp_mgr
    INNER JOIN employee emp ON emp_mgr.person_name  = emp.person_name
    INNER JOIN employee mgr ON emp_mgr.manager_name = mgr.person_name
WHERE
    emp.street = mgr.street AND
    emp.city = mgr.city
;
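The schema is not reproduced here, so the following is only a sketch under an assumed one: employee(person_name, street, city) and manages(person_name, manager_name). The rows are made up for illustration. It runs the query above in SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and hypothetical data, not taken from the original question.
conn.executescript("""
    CREATE TABLE employee (person_name TEXT, street TEXT, city TEXT);
    CREATE TABLE manages  (person_name TEXT, manager_name TEXT);
    INSERT INTO employee VALUES
        ('Alice', 'Main St', 'Springfield'),
        ('Bob',   'Main St', 'Springfield'),
        ('Carol', 'Oak Ave', 'Shelbyville'),
        ('Dave',  'Elm St',  'Springfield');
    INSERT INTO manages VALUES ('Alice', 'Bob'), ('Carol', 'Dave');
""")

# Employees who live on the same street and in the same city as their manager.
same_address = conn.execute("""
    SELECT emp_mgr.person_name
    FROM manages emp_mgr
        INNER JOIN employee emp ON emp_mgr.person_name  = emp.person_name
        INNER JOIN employee mgr ON emp_mgr.manager_name = mgr.person_name
    WHERE emp.street = mgr.street AND emp.city = mgr.city
""").fetchall()

print(same_address)
```

With this data only Alice is returned: she shares an address with her manager Bob, while Carol and her manager Dave live in different places.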

Here is another query that only uses JOINs, no WHERE clause:

SELECT
    emp.person_name
FROM
    (SELECT A.person_name,A.manager_name,B.street,B.city FROM manages A
    INNER JOIN employee B ON A.person_name = B.person_name) emp
    NATURAL JOIN
    (SELECT DISTINCT A.manager_name,B.street,B.city FROM manages A
    INNER JOIN employee B ON A.manager_name = B.person_name) mgr
;

The first query joins each managed employee to their own employee record and to their manager's record, then keeps the rows where street and city match.

The second query collects personnel records of employees (name,manager,street,city) and of managers (manager,street,city), then performs a NATURAL JOIN between the employees and their managers using (manager_name,street,city). The manager_name column has to appear on both sides; with only (street,city) in common, an employee would match any manager living at the same address, not just their own.
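A NATURAL JOIN matches on every column name the two inputs share, so the column lists of the two subqueries decide the join condition. The sketch below, again on an assumed schema with made-up rows, compares a join that shares only (street, city) with one that also shares manager_name. Alice is managed by Bob but happens to live at the same address as Dave, who manages someone else.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and hypothetical data, chosen so the two joins disagree.
conn.executescript("""
    CREATE TABLE employee (person_name TEXT, street TEXT, city TEXT);
    CREATE TABLE manages  (person_name TEXT, manager_name TEXT);
    INSERT INTO employee VALUES
        ('Alice', 'Main St', 'Springfield'),
        ('Bob',   'Oak Ave', 'Shelbyville'),
        ('Carol', 'Elm St',  'Springfield'),
        ('Dave',  'Main St', 'Springfield');
    INSERT INTO manages VALUES ('Alice', 'Bob'), ('Carol', 'Dave');
""")

# Shared columns: street, city. Matches an employee with ANY manager
# at the same address.
loose = conn.execute("""
    SELECT emp.person_name FROM
      (SELECT A.person_name, B.street, B.city FROM manages A
       INNER JOIN employee B ON A.person_name = B.person_name) emp
      NATURAL JOIN
      (SELECT A.manager_name, B.street, B.city FROM manages A
       INNER JOIN employee B ON A.manager_name = B.person_name) mgr
""").fetchall()

# Shared columns: manager_name, street, city. Matches an employee only
# with their own manager.
strict = conn.execute("""
    SELECT emp.person_name FROM
      (SELECT A.person_name, A.manager_name, B.street, B.city FROM manages A
       INNER JOIN employee B ON A.person_name = B.person_name) emp
      NATURAL JOIN
      (SELECT DISTINCT A.manager_name, B.street, B.city FROM manages A
       INNER JOIN employee B ON A.manager_name = B.person_name) mgr
""").fetchall()

print(loose, strict)
```

Here the (street, city) join returns Alice even though she does not live with her own manager, while the join that also carries manager_name returns nothing, which agrees with the WHERE-based query.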

If you can translate both queries back to relational algebra, I think you will have what you are looking for. I believe the second may be of more help.

SQL and Relational Algebra – Beginner Questions Answered

Your SQL query is wrong:

- Assume there is a student s1 who has passed one exam 
  after '2000-01-01' and none before. 
  Your query results in {s1} - {} = {s1}. This will be a false positive.

- Assume there is a student s1 that passed three exams 
  after '2000-01-01' and one exam before. 
  Your query results in {s1} - {s1} = {}. This will be a false negative.
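The original query is not shown here, so the following is only a sketch of the shape the two cases imply: the set of students with an exam after the cutoff, minus the set of students with an exam before it. The function name and the exam dates are hypothetical.

```python
from datetime import date

CUTOFF = date(2000, 1, 1)

def after_minus_before(exams):
    """exams: list of (student, exam_date) pairs. Returns the set difference."""
    after = {s for s, d in exams if d > CUTOFF}
    before = {s for s, d in exams if d < CUTOFF}
    return after - before

# Case 1: one exam after the cutoff, none before.
case1 = [("s1", date(2001, 5, 1))]

# Case 2: three exams after the cutoff, one before.
case2 = [
    ("s1", date(2001, 5, 1)),
    ("s1", date(2002, 5, 1)),
    ("s1", date(2003, 5, 1)),
    ("s1", date(1999, 5, 1)),
]

print(after_minus_before(case1))
print(after_minus_before(case2))
```

The difference only records whether a student has any exam on each side of the cutoff. It never sees how many exams there are, so a single earlier exam removes s1 in the second case no matter how many later ones exist.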

I'm having a bit of trouble reading your algebra expressions, but the first one looks ok. It should not matter that you do the selection (topic = "motorcycle") after the join instead of joining on the selection.

The second one can't be right. Assume there's a newspaper that published both an article on motorcycle and an article on something else. Your expression will pick the article on something else and therefore return that newspaper (incorrectly).
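A small sketch of that counterexample, assuming the second expression is meant to return newspapers that have no motorcycle article (the original expressions are not shown here). The newspaper names and topics are made up.

```python
# Hypothetical (newspaper, topic) pairs.
articles = [
    ("Gazette", "motorcycle"),
    ("Gazette", "gardening"),
    ("Herald", "politics"),
]

# Selecting articles with topic != "motorcycle" and projecting the newspaper
# keeps any newspaper that has at least one other article.
picks_other = {paper for paper, topic in articles if topic != "motorcycle"}

# Taking all newspapers and subtracting those with a motorcycle article
# keeps only newspapers that never covered the topic.
all_papers = {paper for paper, _ in articles}
with_motorcycle = {paper for paper, topic in articles if topic == "motorcycle"}
no_motorcycle = all_papers - with_motorcycle

print(sorted(picks_other), sorted(no_motorcycle))
```

The selection-based version returns the Gazette because of its gardening article, even though it also ran a motorcycle article. The difference-based version returns only the Herald.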
