PostgreSQL – Determining which isolation level is appropriate

isolation-level, optimization, postgresql, rdbms

This is a homework question.

For the following transactions state the isolation level that will
maximize throughput without lowering the integrity of the database.
Explain the answer.

Change the course identified by coursed_id = 'CPSC1350' from one
department to a different one.

The Courses table contains information about courses: their id, their name, the
department that offers them, the id of their instructor, and the maximum
number of students who can take them (max_size).

Courses(coursed_id: string, cname: string, dept: string, instructor_id: string, max_size: integer)
– Primary Key: coursed_id
– Foreign Key: instructor_id references Instructors
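For concreteness, the schema described above could be declared roughly as follows. This is only a sketch: the column types and the shape of the Instructors table are assumptions, since the question does not specify them.

```sql
-- Hypothetical DDL matching the description; column types are assumed.
CREATE TABLE Instructors (
    instructor_id varchar PRIMARY KEY
    -- other columns are not specified in the question
);

CREATE TABLE Courses (
    coursed_id    varchar PRIMARY KEY,
    cname         varchar,
    dept          varchar,
    instructor_id varchar REFERENCES Instructors (instructor_id),
    max_size      integer
);
```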

Assume that PostgreSQL is used.

I believe the transaction can be done using read committed because

  • dirty reads are not OK, because the transaction performs an update and therefore must first read the row
  • non-repeatable reads are OK, because it is unlikely that someone else is changing the value at the same time
  • phantoms are OK, because there are no standalone SELECT statements

Am I on the right track?

Best Answer

The question seems like a puzzle: it appears very simple, but maybe it is not so simple, or maybe it just pretends to be complex. I'll try my best to answer it as I understand it; I apologize if I have misunderstood some obvious hints.

With PostgreSQL, there is no real Read Uncommitted – you get Read Committed instead. Quoting the documentation:

In PostgreSQL, you can request any of the four standard transaction isolation levels, but internally only three distinct isolation levels are implemented, i.e. PostgreSQL's Read Uncommitted mode behaves like Read Committed. This is because it is the only sensible way to map the standard isolation levels to PostgreSQL's multiversion concurrency control architecture.

The question isn't entirely clear to me, because isolation levels only matter when there are concurrent queries, and no other queries are mentioned – so don't blame me too much if I miss something.

If we put the remaining isolation levels in the order Read Committed → Repeatable Read → Serializable, the overhead grows accordingly. So we should check them in that order and, once we are satisfied, there is no need to check the remaining levels (they would also be correct, just with more overhead). As far as I understand, UPDATE Courses SET dept = 'New department' WHERE coursed_id = 'CPSC1350' does the required work. From the PostgreSQL documentation describing the Read Committed isolation level:

UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row.
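To make the quoted behaviour concrete, here is a hypothetical interleaving of two concurrent Read Committed sessions updating the same course row. The department names are made up for illustration; the session markers are comments, not SQL.

```sql
-- Session A:
BEGIN;
UPDATE Courses SET dept = 'Mathematics' WHERE coursed_id = 'CPSC1350';
-- The matching row is now locked by session A.

-- Session B (meanwhile):
BEGIN;
UPDATE Courses SET dept = 'Computer Science' WHERE coursed_id = 'CPSC1350';
-- Blocks here, waiting for session A to commit or roll back.

-- Session A:
COMMIT;

-- Session B wakes up, re-evaluates its WHERE clause against A's committed
-- version of the row; it still matches, so B's update is applied on top.
COMMIT;
-- Final state: dept = 'Computer Science' (last committed writer wins).
```

This is exactly the "wait, then re-check the WHERE clause" behaviour described above, and it is sufficient for a single-row department change.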

I'd say I'm satisfied with this behaviour in this case, so I'll go with READ COMMITTED, unless there are other restrictions not mentioned in the question.
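Since Read Committed is PostgreSQL's default isolation level, the whole transaction can be as simple as the following sketch (the new department name is a placeholder):

```sql
BEGIN;  -- READ COMMITTED is PostgreSQL's default isolation level
-- SET TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- optional explicit form
UPDATE Courses
   SET dept = 'New department'
 WHERE coursed_id = 'CPSC1350';
COMMIT;
```

The optional SET TRANSACTION line makes the chosen level explicit if the database or session default has been changed.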