While testing my answer to SQL – Getting data from first and last row of each group I noticed something strange. The first_value
and last_value
window functions appear to behave differently.
As expected, first_value(col1) over (order by col2)
finds the value of col1
from the row with the minimum value for col2
. But last_value(col1) over (order by col2)
seems to find the value for the current row. last_value
seems unrelated to the group or partition it's operating on.
For the query below:
id Session ID bal
0 00000002 100
1 00000002 120
2 00000002 140
3 00000001 900
4 00000001 800
5 00000001 500
The expression:
last_value(bal) over (partition by [Session ID] order by id) as lv
Returns something different than:
first_value(bal) over (partition by [Session ID] order by id DESC) as fv_desc
Here's the result (note that the value of lv
changes within one group):
lv fv_desc
500 500
800 500
900 500
140 140
120 140
100 140
Example at SQL Fiddle with some extra columns. Reversing the order by
or omitting partition by
doesn't seem to impact what last_value()
returns.
If I read the MSDN page correctly, it suggests that last_value()
should work as the opposite of first_value()
, which is different from what I observe during testing.
Does last_value()
do what it is supposed to do? Why does last_value()
allow you to specify a partition
and/or order by
clause when it doesn't seem to use either?
Best Answer
This happens because default window frame is
range between unbounded preceding and current row
, so thelast_value()
never looks beyond current row unless you change the frame.From MSDN:
Since
last_value()
has an optionalorder by
clause, its default window frame ends at the current row. So with the default frame, no matter which partition or ordering you choose,last_value()
returns the value from the current row.