PostgreSQL 9.2/9.3 – Get Distinct Results Based on Column and Condition


I have the SELECT below, which reads from a table with these columns (plus created_at, which the query also uses):

id, project_id, keyword_id, position

Data example:

id, project_id, keyword_id, position
1,  1,          1,          4
2,  1,          1,          5
3,  1,          1,          6
5,  1,          2,          7
6,  1,          2,          8
7,  1,          3,          5
8,  1,          3,          7

QUERY

SELECT
  pr.project_id,
  COUNT(DISTINCT pr.keyword_id) as total_keywords,
  COUNT(CASE WHEN position BETWEEN 4 AND 10 THEN position ELSE NULL END ) AS pos4,
  date(pr.created_at) AS created_at
FROM
  project_reports pr
GROUP BY pr.project_id, date(created_at)

I want pos4 to be the number of distinct keyword_id values whose position is between 4 and 10. With my SELECT, pos4 instead counts every row in that range.

Expected output:

p_id, total_keywords, pos4
1,    3,              3

Current output:

p_id, total_keywords, pos4
1,    3,              7

Best Answer

To count distinct keywords, the expression has to count keyword_id rather than position, and it needs DISTINCT so that each keyword is counted only once.

So your expression:

COUNT(CASE WHEN position BETWEEN 4 AND 10 THEN position ELSE NULL END )

becomes:

COUNT(DISTINCT CASE WHEN position BETWEEN 4 AND 10 THEN keyword_id ELSE NULL END)
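
This works because COUNT ignores NULLs (rows outside the range fall through to NULL) and DISTINCT collapses repeated keyword_id values. A quick way to check both expressions side by side is the sketch below. It inlines the sample rows from the question in a CTE instead of using the real table; rows_in_range is just an illustrative alias, and created_at is left out because the sample data does not include it.

```sql
-- Sample rows from the question, inlined so no table needs to exist.
WITH project_reports (id, project_id, keyword_id, position) AS (
  VALUES
    (1, 1, 1, 4),
    (2, 1, 1, 5),
    (3, 1, 1, 6),
    (5, 1, 2, 7),
    (6, 1, 2, 8),
    (7, 1, 3, 5),
    (8, 1, 3, 7)
)
SELECT
  project_id,
  -- Original expression: one count per row whose position is in range.
  COUNT(CASE WHEN position BETWEEN 4 AND 10 THEN position END) AS rows_in_range,
  -- Corrected expression: one count per distinct keyword in range.
  COUNT(DISTINCT CASE WHEN position BETWEEN 4 AND 10 THEN keyword_id END) AS pos4
FROM project_reports
GROUP BY project_id;
```

With the sample data, rows_in_range should come out as 7 and pos4 as 3, matching the current and expected output in the question. The ELSE NULL branch is omitted here only because a CASE without ELSE already yields NULL.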

So the final statement is:

SELECT
  pr.project_id,
  COUNT(DISTINCT pr.keyword_id) as total_keywords,
  COUNT(DISTINCT CASE WHEN position BETWEEN 4 AND 10 THEN keyword_id ELSE NULL END) AS pos4
FROM
  project_reports pr
GROUP BY pr.project_id;

SQLFiddle example: http://sqlfiddle.com/#!15/b1015/1
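
Note that the final statement above groups by project_id only, while the query in the question also grouped by date(created_at). If the per-day breakdown is still needed, the corrected expression should slot into the original query unchanged. A sketch, assuming created_at is a timestamp column on project_reports as the question's query implies:

```sql
SELECT
  pr.project_id,
  COUNT(DISTINCT pr.keyword_id) AS total_keywords,
  -- Distinct keywords that had at least one position in 4..10 on that day.
  COUNT(DISTINCT CASE WHEN pr.position BETWEEN 4 AND 10 THEN pr.keyword_id END) AS pos4,
  date(pr.created_at) AS created_at
FROM
  project_reports pr
GROUP BY pr.project_id, date(pr.created_at);
```

In this variant a keyword is counted once per day, so the same keyword can contribute to pos4 on several different dates.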