Postgresql – Performance issue using view vs. direct statement

performancepostgresqlpostgresql-9.2query-performanceview

I have a table with structure similar to this:

table values
(
id1 int,
id2 int,
value real,
value_at timestamp,
constraint primary key (id1,id2,value_at)
)

I need to get the "current value" for a given combination (id1,id2) — that is, the record with highest value_at among the records with a given combination id1,id2.

The "normal" SQL query would be (example for id1,id2 = 100,200):

select ... from values 
where (id1,id2) = (100,200) 
      and value_at = (select max(value_at) from values 
                      where (id1,id2) = (100,200))

However, I want to have a view that hides this complexity:

select value from vw_current_value where (id1,id2) = (100,200)

My attempt is:

create view vw_current_value as 
select id1, id2, value from values as outer_v 
where value_at = (select max(value_at) from values as inner_v 
                  where outer_v.id1 = inner_v.id1 and outer_v.id2 = inner_v.id2)

It works, but getting a current value for a given (id1,id2) is 12x slower than what I get using the "direct" SQL statement (without using the view)
as reported by EXPLAIN ANALYZE (0.04 ms for the non-view, 0.48 ms for the view, with a table that has 250k records).

My question is: is there a more efficient way to write the view? Perhaps using some other SQL construct that I'm either overlooking or unfamiliar with?

Notice that I would like to:

  • Avoid a Pl/PgSQL procedure for this.
  • Use a view and not the "direct" select statement.
  • Stick to version 9.2.4 (however, if this is known to have much better performance in later versions, I would certainly like to know about it).

EDIT: As requested, here's the output of the two explain analyze for each of the queries (first one for the direct query, second for the query using the view):

Index Scan using pk_values on values  (cost=0.09..8.43 rows=1 width=24) (actual time=0.018..0.019 rows=1 loops=1)
  Index Cond: ((id1 = 100) AND (id2 = 200) AND (value_at = $1))
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.08..0.09 rows=1 width=0) (actual time=0.013..0.013 rows=1 loops=1)
          InitPlan 1 (returns $0)
            ->  Limit  (cost=0.00..0.08 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1)
                  ->  Index Only Scan Backward using pk_values on values  (cost=0.00..5.98 rows=74 width=8) (actual time=0.011..0.011 rows=1 loops=1)
                        Index Cond: ((id1 = 100) AND (id2 = 200) AND (value_at IS NOT NULL))
                        Heap Fetches: 0
Total runtime: 0.036 ms


Bitmap Heap Scan on values outer_v  (cost=5.05..262.46 rows=1 width=24) (actual time=0.457..0.458 rows=1 loops=1)
  Recheck Cond: ((id1 = 100) AND (id2 = 200))
  Filter: (value_at = (SubPlan 2))
  Rows Removed by Filter: 75
  ->  Bitmap Index Scan on pk_values  (cost=0.00..5.05 rows=74 width=0) (actual time=0.017..0.017 rows=76 loops=1)
        Index Cond: ((id1 = 100) AND (id2 = 200))
  SubPlan 2
    ->  Result  (cost=0.08..0.09 rows=1 width=0) (actual time=0.004..0.004 rows=1 loops=76)
          InitPlan 1 (returns $2)
            ->  Limit  (cost=0.00..0.08 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=76)
                  ->  Index Only Scan Backward using pk_values on values inner_v  (cost=0.00..6.02 rows=76 width=8) (actual time=0.003..0.003 rows=1 loops=76)
                        Index Cond: ((id1 = outer_v.id1) AND (id2 = outer_v.id2) AND (value_at IS NOT NULL))
                        Heap Fetches: 0
Total runtime: 0.478 ms

Best Answer

You might want to try using a window function for that:

create view vw_current_value as 
select id1, id2, value from (
  select id1, id2, value, 
      row_number() over (partition by id1, id2 order by value_at desc) rn
  from your_values_table
) t where t.rn = 1;

sqlfiddle

EDIT:

The above has an advantage of being standard SQL. However (thanks to @a_horse_with_no_name) Postgres also offers a proprietary alternative: DISTINCT ON(something), which reportedly performs better:

create view vw_current_value as 
select distinct on (id1, id2) id1, id2, value 
from your_values_table
order by id1, id2, value_at desc;

sqlfiddle 2

Comparison of the execution plans of the alternatives is left as an excercise for the reader.