Postgresql – better join, in or in with join

join;postgresql

I have these tables:

calls: columns ['created_at', 'user_id']

users: columns ['id', 'username']

I need to get all users who have no calls on or after a given date.
How can I do that? Which way is better, and why?

I know three ways:

first:

SELECT id 
FROM users
WHERE id NOT IN (SELECT user_id FROM calls where calls.created_at >= date)

second (I think it is not right, but I am not sure):

SELECT id
FROM users
LEFT OUTER JOIN calls
   ON calls.user_id = users.id
WHERE calls.created_at >= date AND calls.user_id ISNULL

last:

SELECT id
FROM users
WHERE id NOT IN (
   SELECT user_id
   FROM calls
   INNER JOIN users AS call_users
      ON call_users.id = users.id
   WHERE calls.created_at >= date
)

DB: PostgreSQL

Best Answer

SQL is specifically meant to express simple queries more or less the way you would ask them as a question in English. For instance, your question is like:

"What are all the ids from users that have no associated call in calls with a created_at greater than or equal to date $1?"

SQL translation:

SELECT id FROM users
  WHERE NOT EXISTS (
    SELECT 1 FROM calls WHERE calls.user_id = users.id AND calls.created_at >= $1
  );

It's a rather direct translation except maybe for SELECTing 1, which is an arbitrary value.
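To make the translation concrete, here is a minimal sketch that can be pasted into psql in a fresh session. The temporary tables and sample rows are made up for illustration; they use the column names from the question (created_at, user_id), and the literal date stands in for $1. The NOT IN variant from the question is included only for comparison, assuming user_id may contain NULLs.

```sql
-- Hypothetical sample data; table and column names follow the question.
CREATE TEMP TABLE users (id int PRIMARY KEY, username text);
CREATE TEMP TABLE calls (created_at date, user_id int);

INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO calls VALUES
  ('2014-01-10', 1),     -- alice: only an old call
  ('2014-09-15', 2),     -- bob: a call after the cutoff
  ('2014-09-20', NULL);  -- a call with no known user

-- Users with no call on or after the cutoff date.
SELECT id
FROM users
WHERE NOT EXISTS (
  SELECT 1
  FROM calls
  WHERE calls.user_id = users.id
    AND calls.created_at >= DATE '2014-09-01'
)
ORDER BY id;
-- returns ids 1 and 3

-- NOT IN variant, for comparison.
SELECT id
FROM users
WHERE id NOT IN (
  SELECT user_id FROM calls WHERE created_at >= DATE '2014-09-01'
);
-- returns no rows with this data: the subquery yields a NULL user_id,
-- so "id NOT IN (...)" is never true
```

If calls.user_id is declared NOT NULL, both variants return the same rows; the NOT EXISTS form simply does not depend on that constraint.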

There is no reason to assume that PostgreSQL would produce a non-optimal execution plan for this query. The optimizer is certainly smart enough for that kind of query. In general, if there's a straightforward way to express your query, there's no reason to try to outsmart the optimizer.

When you doubt that the optimizer found the best possible plan (say for more complex queries), you can start from the output of EXPLAIN to figure out if/why it's not optimal and typically try to improve the query by trial and error.

If you want to compare how different variants of the same query are planned, compare the outputs of EXPLAIN ANALYZE of the queries.
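As a rough sketch of such a comparison, the script below builds throwaway tables and runs EXPLAIN ANALYZE on two variants of the query. The row counts, the generated data, and the cutoff date are arbitrary assumptions chosen only to give the planner something to work with; real plans depend on your data, indexes, and PostgreSQL version.

```sql
-- Run in a fresh session: these temporary tables are illustrative only.
CREATE TEMP TABLE users (id int PRIMARY KEY, username text);
CREATE TEMP TABLE calls (created_at date, user_id int NOT NULL);

-- 10,000 hypothetical users; only the first 5,000 ever place calls.
INSERT INTO users
SELECT g, 'user' || g
FROM generate_series(1, 10000) AS g;

-- 100,000 hypothetical calls spread over one year.
INSERT INTO calls
SELECT DATE '2014-01-01' + (g % 365), (g % 5000) + 1
FROM generate_series(1, 100000) AS g;

-- Refresh planner statistics so the estimates are meaningful.
ANALYZE users;
ANALYZE calls;

-- Variant 1: NOT EXISTS
EXPLAIN ANALYZE
SELECT id
FROM users
WHERE NOT EXISTS (
  SELECT 1
  FROM calls
  WHERE calls.user_id = users.id
    AND calls.created_at >= DATE '2014-09-01'
);

-- Variant 2: NOT IN
EXPLAIN ANALYZE
SELECT id
FROM users
WHERE id NOT IN (
  SELECT user_id FROM calls WHERE created_at >= DATE '2014-09-01'
);
```

When reading the two outputs, compare the node types chosen for the subquery, the estimated versus actual row counts, and the reported execution time. Note that EXPLAIN ANALYZE actually executes the statement.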

https://wiki.postgresql.org/wiki/Slow_Query_Questions is a good start too.

Related Solutions

MySQL: LEFT OUTER JOIN within reason

First consider a query that computes which rows of tablethree are actually relevant. Assuming that by "most recently entered result" you mean "most recent enddate", the following query would gather the appropriate rows:

SELECT sid, MAX(enddate) FROM `tablethree` GROUP BY sid

Now you can build a join to retrieve not only sid, but all of the data of tablethree:

SELECT a.*
FROM tablethree a
INNER JOIN (
  SELECT sid, MAX(enddate) AS enddate FROM `tablethree` GROUP BY sid
) b
ON a.sid = b.sid AND a.enddate = b.enddate

This is the result set you actually want to "left join in". You have to insert this into your original query:

SELECT t1.*
FROM tableone AS t1
INNER JOIN tabletwo AS t2
  ON t1.cid = t2.id
LEFT OUTER JOIN (
  SELECT a.*
  FROM tablethree a
  INNER JOIN (
    SELECT sid, MAX(enddate) AS enddate FROM `tablethree` GROUP BY sid
  ) b
  ON a.sid = b.sid AND a.enddate = b.enddate
) AS t3
  ON t3.sid = t2.sid
WHERE t1.fieldone = 1 
  AND t1.odate NOT BETWEEN t3.startdate AND t3.enddate

What should also work is the following:

SELECT t1.*
FROM tableone AS t1
INNER JOIN tabletwo AS t2
  ON t1.cid = t2.id
LEFT OUTER JOIN tablethree AS t3
  ON t3.sid = t2.sid
LEFT OUTER JOIN (
  SELECT sid, MAX(enddate) AS enddate FROM `tablethree` GROUP BY sid
) mostrecent
  ON t3.sid = mostrecent.sid AND t3.enddate = mostrecent.enddate

WHERE t1.fieldone = 1 
  AND t1.odate NOT BETWEEN t3.startdate AND t3.enddate
  AND mostrecent.enddate IS NULL

This includes both tablethree and the new SELECT as left joins, and sorts out the rows where mostrecent.enddate IS NULL (meaning those rows which are actually not most recent). This should lead to the same result, but MySQL may be able to compute this result a little faster. EXPLAIN on both queries should reveal possible differences in computation.

Postgresql – Left outer join not returning all rows in a grouping query

The WHERE o.date ... condition makes the outer join behave like an inner join, cutting out any rows of people that don't have at least one matching row in orders.

You'll have to move the condition about o.date from the WHERE clause to the joining ON:

SELECT p.id, p.name, coalesce(sum(o.price), 0.00) AS total
FROM people p
  LEFT OUTER JOIN orders o
    ON  p.id = o.person_id
    AND o.date BETWEEN date '2014-09-01' AND date '2014-09-30'
GROUP BY p.id;
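The difference between the two placements can be seen on a small data set. This is only a sketch: the temporary tables and rows below are invented for illustration, with column names taken from the query above.

```sql
-- Hypothetical sample data; run in a fresh session.
CREATE TEMP TABLE people (id int PRIMARY KEY, name text);
CREATE TEMP TABLE orders (person_id int, date date, price numeric);

INSERT INTO people VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES
  (1, '2014-09-10', 20.00),  -- Ann: order inside the range
  (2, '2014-08-05', 15.00);  -- Ben: order outside the range
                             -- Cy: no orders at all

-- Condition in WHERE: rows where o.date is NULL or out of range are
-- discarded after the join, so Ben and Cy disappear.
SELECT p.id, p.name, coalesce(sum(o.price), 0.00) AS total
FROM people p
  LEFT OUTER JOIN orders o
    ON p.id = o.person_id
WHERE o.date BETWEEN date '2014-09-01' AND date '2014-09-30'
GROUP BY p.id
ORDER BY p.id;
-- returns one row: (1, Ann, 20.00)

-- Condition in ON: non-matching orders are simply not joined, and every
-- person is kept with a NULL-extended row when nothing matches.
SELECT p.id, p.name, coalesce(sum(o.price), 0.00) AS total
FROM people p
  LEFT OUTER JOIN orders o
    ON  p.id = o.person_id
    AND o.date BETWEEN date '2014-09-01' AND date '2014-09-30'
GROUP BY p.id
ORDER BY p.id;
-- returns three rows: (1, Ann, 20.00), (2, Ben, 0.00), (3, Cy, 0.00)
```

Selecting p.name while grouping only by p.id works here because id is declared as the primary key of people.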

Test in: SQLFiddle
