PostgreSQL – How to Remove Duplicate Values from Query Result

database-designgreatest-n-per-grouppostgresql

I am using Postgres 9.3.9. I have 2 tables :

f_agent :

CREATE TABLE f_agent (
  f_agent__id bigserial NOT NULL,
  f_agent__total_users integer NOT NULL,
  f_agent__dim_time_id integer NOT NULL,
  .... other unnecessary columns
);

dim_time :

CREATE TABLE dim_time (
  dim_time__id bigserial NOT NULL,
  dim_time__date date NOT NULL,
  dim_time__month_start_date date NOT NULL,
  dim_time__week_start_date date NOT NULL,
  dim_time__quarter_start_date date NOT NULL,
  dim_time__year_start_date date NOT NULL,
  dim_time__guid uuid NOT NULL DEFAULT uuid_generate_v4(),
);

Now, f_agent__dim_time_id field in f_agent is FK referencing dim_time__id of dim_time.

My dim_time table has a date and its related week_start_date, month_start_date etc. For example: if dim_time__date is 14-07-2015 then dim_time__week_start_date will be 13-07-2015 (monday is start of week) and start_of_month is 01-07-2015 (always the first).

I have to find f_agent__total_users from f_agent given a date range and a dimension (week / month / quarter).

What I've tried:

SELECT dim_time__week_start_date, f_agent__total_users 
FROM f_agent
JOIN dim_time 
    ON f_agent__dim_time_id = dim_time__id 
WHERE dim_time__week_start_date IN (
    SELECT generate_series('2015-07-06', '2016-07-03', '7 day'::interval)
);

The output:

 dim_time__week_start_date   f_agent__total_users
"2015-07-13";                      3
"2015-07-13";                     33
"2015-08-10";                     12

But I only need the first value in case 2 week_start_dates are same. The output I am expecting here is:

 dim_time__week_start_date   f_agent__total_users
"2015-07-13";                      3
"2015-08-10";                     12

How can I do this?

Best Answer

Another way - if your version is 9.3+ - is to use the LATERAL syntax:

SELECT * 
FROM 
  generate_series('2015-07-06', '2016-07-03', '7 day'::interval) AS d (dt)
  , LATERAL  
    ( SELECT fa.*, dt.*
      FROM dim_time AS dt
        INNER JOIN f_agent AS fa
          ON fa.f_agent__dim_time_id = dt.dim_time__id
      WHERE dim_time__week_start_date = d.dt
      ORDER BY dt.dim_time__date
      LIMIT 1
    ) AS dt ;

Tested at SQLfiddle.