Mysql – How to do a join and exclude certain records

join;mariadbMySQL

Table STUDENT has columns ID, LASTNAME, FIRSTNAME

Table TEACHER has columns ID, TEACHER_ID, STUDENT_ID

Students can have many teachers. Teachers can add students to their class.

I want to be able to list all students available to add for a teacher. Once the teacher adds that student, they are no longer available to see on their list; however, other teachers can still see the student so they can add them to their list.

What would the query look like for the view page? I would like for the view page to be the same page for all teachers and the data to only change.

SELECT ID, LASTNAME, FIRSTNAME
FROM STUDENTS
LEFT JOIN TEACHERS ON TEACHERS.ID = STUDENTS.ID
WHERE TEACHERS.ID <> $teacherID  ???

This is where I get confused. It doesn't return results except for the ones in TEACHERS table that don't have the $teacherID (that's not how I want it to work). How Do I pick all students except the one's that the current teacher has already added?

I'm coding in PHP and using DQL, but any help on just the query part would be great.

Hope that makes sense. I confuse myself.

Best Answer

A unique constraint on TEACHER_ID, STUDENT_ID would prevent duplicate STUDENT_ID for a teacher. In fact you could just make the the PK and drop ID.

This is TSQL so it may be wrong for mysql

SELECT S.ID
     , S.LASTNAME
     , S.FIRSTNAME
  FROM STUDENTS S
  LEFT JOIN TEACHERS T 
        ON T.STUDENT_ID = S.ID
       AND T.TEACHER_ID = $teacherID 
 WHERE T.TEACHER_ID is null

Related Solutions

Mysql – Individual queries run on 10ms, with UNION ALL they are taking 290ms+ (7.7M records MySQL DB). How to optimise

Firstly, I think your original query may not be "correct"; With reference to your SQLFiddle, it looks to me as though you should be returning rows with ID = 2, 3 and 4 (in addition to the row with ID = 1 you are getting from this half), because your existing logic appears as though you intended for these other rows to be included, as they explicitly meet the OR (date_from >= '2014-04-10 08:00:00') part of your second WHERE clause.

The GROUP BY teacher_id clause in your second part of your UNION is causing you to lose those rows. This is because you're not actually aggregating any columns in your select list, and in this case the GROUP BY will cause 'difficult to define' behaviour.

Also, while I can't explain the poor performance of your UNION, I can work around it for you by outright removing it from your query:

Rather than using two separate (and in parts, repeating) sets of logic to get rows from the same table, I've consolidated your logic into one query with the differences in your logic ORed together - ie if a row meets one or the other of your original WHERE clauses, it's included. This is possible because I've replaced the (INNER) JOIN you were using to find the closestDate with a LEFT JOIN.

This LEFT JOIN means we are now also able to distinguish which set of logic should be applied to a row; If the join works (closestDate IS NOT NULL) we apply your logic from the first half, but if the join fails (closestDate IS NULL) then we apply the logic from your second half.

So this will return all the rows that your query returned (in the fiddle), and it's also picking up those additional ones.

  SELECT
    *

  FROM 
    teacher_slots ts

    LEFT JOIN 
    (
      SELECT 
        teacher_id,
        DATE(MIN(date_from)) as closestDay

      FROM 
        teacher_slots

      WHERE   
        date_from >= '2014-04-10 08:00:00' 
        AND order_of_arrival = 0
        AND status = 0 
        AND city_id = 6015 
        AND subject_id = 1

      GROUP BY 
        teacher_id

    ) a
    ON a.teacher_id = ts.teacher_id
    AND a.closestDay = DATE(ts.date_from)

  WHERE 
    /* conditions that were common to both halves of the union */
    ts.status = 0
    AND ts.city_id = 6015
    AND ts.subject_id = 1

    AND
    (
      (
        /* conditions that were from above the union 
           (ie when we joined to get closest future date) */
        a.teacher_id IS NOT NULL
        AND ts.date_from >= '2014-04-10 08:00:00'
        AND ts.order_of_arrival = 0
      ) 
      OR
      (
        /* conditions that were below the union 
          (ie when we didn't join) */
        a.teacher_id IS NULL       
        AND ts.order_of_arrival = 1 
        AND 
        (
          (
            date_from <= '2014-04-10 08:00:00' 
            AND  
            date_to >= '2014-04-10 08:00:00'
          )

          /* rows that met this condition were being discarded 
             as a result of 'difficult to define' GROUP BY behaviour. */
          OR date_from >= '2014-04-10 08:00:00' 
        )
      )
    )

  ORDER BY 
   ts.date_from ASC;

Further, you can "tidy up" your query further so that you don't need to "plug in" your status, city_id and subject_id parameters more than once.

To do this, change the subquery a to also select those columns, and to also group on those columns. Then, the JOIN's ON clause would need to map those columns to their ts.xxx equivalents.

I don't think this will negatively effect performance, but couldn't be sure without testing on a large dataset.

So your join will look more like:

LEFT JOIN 
(
  SELECT 
    teacher_id,
    status,
    city_id,
    subject_id,
    DATE(MIN(date_from)) as closestDay

  FROM 
    teacher_slots

  WHERE   
    date_from >= '2014-04-10 08:00:00' 
    AND order_of_arrival = 0
  /* These no longer required here...
    AND status = 0 
    AND city_id = 6015 
    AND subject_id = 1
  */

  GROUP BY 
    teacher_id,
    status,
    city_id,
    subject_id

) a
ON a.teacher_id = ts.teacher_id
AND a.status = ts.status 
AND a.city_id = ts.city_id 
AND a.subject_id = ts.city_id
AND a.closestDay = DATE(ts.date_from)

MySQL Insert into table, where name = id, if not exist, then insert name and use that ID

Assuming you have some tables for Persons, Animals:

CREATE TABLE Person
  ( PersonID INT UNSIGNED NOT NULL AUTO_INCREMENT
  , PersonName VARCHAR(255) NOT NULL
  , CONSTRAINT Person_PK
      PRIMARY KEY (PersonID)
  , CONSTRAINT PersonName_UQ 
      UNIQUE (PersonName)
  ) ;

CREATE TABLE Animal
  ( AnimalID INT UNSIGNED NOT NULL AUTO_INCREMENT
  , AnimalName VARCHAR(255) NOT NULL
  , CONSTRAINT Animal_PK
      PRIMARY KEY (AnimalID)
  , CONSTRAINT AnimalName_UQ 
      UNIQUE (AnimalName)
  ) ;

and results:

CREATE TABLE Result
  ( RaceID INT UNSIGNED NOT NULL
  , Position INT UNSIGNED NOT NULL
  , PersonID INT UNSIGNED NOT NULL 
  , AnimalID INT UNSIGNED NOT NULL
  , Errors INT UNSIGNED NOT NULL DEFAULT 0
  , CompletionTime Time NULL DEFAULT NULL

  , CONSTRAINT Result_PK
      PRIMARY KEY (RaceID, Position)

  , CONSTRAINT Race_Person_UQ           -- assuming a Person cannot enter
      UNIQUE (RaceID, PersonID)                 -- a race twice

  , CONSTRAINT Race_Animal_UQ           -- assuming an Animal cannot enter
      UNIQUE (RaceID, AnimalID)                 -- a race twice

  , INDEX PersonID_IX (PersonID)                -- indexes for the Foreign Key
  , INDEX AnimalID_IX (AnimalID)                -- constraints:

  , CONSTRAINT Person_Result_FK     
      FOREIGN KEY (PersonID)
      REFERENCES Person (PersonID)
  , CONSTRAINT Animal_Result_FK     
      FOREIGN KEY (AnimalID)
      REFERENCES Animal (AnimalID)
  ) ;

I suggest you first bulk load the data (possibly with LOAD DATA from .txt or .csv files) in a table in MySQL (supplying race IDS. If you can't supply raceIDs but you have race names, the tables should be adjusted accordingly). You should have a Race table as well, this is just a sample procedure:

CREATE TABLE BulkData
  ( RaceID INT UNSIGNED NOT NULL
  , Position INT UNSIGNED NOT NULL
  , PersonName VARCHAR(255) NOT NULL
  , AnimalName VARCHAR(255) NOT NULL
  , Errors INT UNSIGNED NOT NULL DEFAULT 0          -- adjust datatypes according
  , CompletionTime Time NULL DEFAULT NULL           -- to your data
  ) ;

LOAD DATA INFILE '/results.txt' 
    INTO TABLE BulkData
    FIELDS TERMINATED BY ',' 
           ENCLOSED BY '"'
    LINES TERMINATED BY '\r\n' ;

Then you can manipulate them and insert them into the 2-3 tables. For Person:

INSERT INTO Person
    (PersonName)
SELECT DISTINCT
    b.PersonName
FROM
    BulkData AS b
WHERE NOT EXISTS
    ( SELECT 1
      FROM Person AS p
      WHERE p.PersonName = b.PersonName
    ) ;

Similar for Animal:

INSERT INTO Animal
    (AnimalName)
SELECT DISTINCT
    b.AnimalName
FROM
    BulkData AS b
WHERE NOT EXISTS
    ( SELECT 1
      FROM Animal AS a
      WHERE a.AnimalName = b.AnimalName
    ) ;

And then in Result:

INSERT INTO Result
    (RaceID, Position, PersonID, AnimalID, Errors, CompletionTime)
SELECT 
    b.RaceID, b.Position, p.PersonID, a.AnimalID, b.Errors, b.CompletionTime
FROM
    BulkData AS b
  JOIN
    Person AS p  ON p.PersonName = b.PersonName
  JOIN 
    Animal AS a  ON a.AnimalName = b.AnimalName
WHERE NOT EXISTS
    ( SELECT 1
      FROM Result AS r
      WHERE r.RaceID = b.RaceID
        AND r.PositionID = b.PositionID
    ) ;

If the importing results are satisfying, then you can empty the BulkData table and repeat the procedure with more files. The NOT EXISTS conditions will take care and not allow duplicates even if you try to load same data twice.

Best Answer

Related Solutions

Mysql – Individual queries run on 10ms, with UNION ALL they are taking 290ms+ (7.7M records MySQL DB). How to optimise

MySQL Insert into table, where name = id, if not exist, then insert name and use that ID

Related Question