This is a gaps-and-islands problem. Assuming there are no gaps or duplicates in the same id_set
set:
WITH partitioned AS (
SELECT
*,
number - ROW_NUMBER() OVER (PARTITION BY id_set) AS grp
FROM atable
WHERE status = 'FREE'
),
counted AS (
SELECT
*,
COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
FROM partitioned
)
SELECT
id_set,
number
FROM counted
WHERE cnt >= 3
;
Here's a SQL Fiddle demo* link for this query: http://sqlfiddle.com/#!1/a2633/1.
UPDATE
To return only one set, you could add in one more round of ranking:
WITH partitioned AS (
SELECT
*,
number - ROW_NUMBER() OVER (PARTITION BY id_set) AS grp
FROM atable
WHERE status = 'FREE'
),
counted AS (
SELECT
*,
COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
FROM partitioned
),
ranked AS (
SELECT
*,
RANK() OVER (ORDER BY id_set, grp) AS rnk
FROM counted
WHERE cnt >= 3
)
SELECT
id_set,
number
FROM ranked
WHERE rnk = 1
;
Here's a demo for this one too: http://sqlfiddle.com/#!1/a2633/2.
If you ever need to make it one set per id_set
, change the RANK()
call like this:
RANK() OVER (PARTITION BY id_set ORDER BY grp) AS rnk
Additionally, you could make the query return the smallest matching set (i.e. first try to return the first set of exactly three consecutive numbers if it exists, otherwise four, five etc.), like this:
RANK() OVER (ORDER BY cnt, id_set, grp) AS rnk
or like this (one per id_set
):
RANK() OVER (PARTITION BY id_set ORDER BY cnt, grp) AS rnk
* The SQL Fiddle demos linked in this answer use the 9.1.8 instance as the 9.2.1 one doesn't appear to be working at the moment.
Data model
You are saving avg. and max. speed per race, not per driver? Normally this would be something like:
CREATE TABLE race (
race_id serial PRIMARY KEY
,race_date timestamp
);
CREATE TABLE driver (
driver_id serial PRIMARY KEY
,driver_name text
,date_started timestamp
);
CREATE TABLE driver_race (
driver_id int REFERENCES driver(driver_id)
,race_id int REFERENCES race(race_id)
,average_speed numeric
,max_speed numeric
);
This is also how you should provide your data model: as CREATE
scripts ready to use for testing.
numeric
and decimal
are the same in Postgres.
Query
Your query extended to all drivers:
SELECT driver_id, d.driver_name, x.slope
FROM driver d
JOIN (
SELECT dr.driver_id
,regr_slope(dr.average_speed, EXTRACT(EPOCH FROM r.race_date)) AS slope
FROM race r
JOIN driver_race dr USING (race_id)
WHERE r.race_date BETWEEN '2013-3-1'::date AND '2013-3-31'::date
GROUP BY dr.driver_id
) x USING (driver_id)
ORDER BY slope DESC;
Simple alternative
A simple alternative would be to match the average speed of the first half of a time period to the average of the second half:
SELECT driver_id, d.driver_name
,round(x.avg1::numeric, 2) AS avg1
,round(x.avg2::numeric, 2) AS avg2
,round((x.avg2 / x.avg1 - 1) * 100, 2) AS pct_change
FROM driver d
JOIN (
SELECT dr.driver_id
,avg(CASE WHEN r.race_date BETWEEN '2013-3-01'::date AND '2013-3-15'::date
THEN dr.average_speed END) AS avg1
,avg(CASE WHEN r.race_date BETWEEN '2013-3-16'::date AND '2013-3-31'::date
THEN dr.average_speed END) AS avg2
FROM race r
JOIN driver_race dr USING (race_id)
WHERE r.race_date BETWEEN '2013-3-1'::date AND '2013-3-31'::date
GROUP BY dr.driver_id
) x USING (driver_id)
ORDER BY pct_change DESC;
The CASE
expression has no ELSE
branch, because the default NULL
serves just fine: avg()
ignores NULL
values.
-> SQLfiddle demo.
Best Answer
The inner most query (alias
t1
) creates a number each time the country changes). The second level query (aliast2
) then does a running sum on those flags, which essentially gives each consecutive set of countries a different number. The outer most query then groups by that number and sums the distance. Themin(country)
is necessary to make thegroup by
operator happy, but as all rows with the samegrp
have the same country anyway, it doesn't matter.SQLFiddle: http://sqlfiddle.com/#!15/fe341/1