PostgreSQL – How to Find Consecutive Free Numbers Using Window Functions

gaps-and-islandsgroup bypostgresqlwindow functions

I have some table with numbers like this (status is either FREE or ASSIGNED)

id_set  number  status         
-----------------------
1       000001  ASSIGNED
1       000002  FREE
1       000003  ASSIGNED
1       000004  FREE
1       000005  FREE
1       000006  ASSIGNED
1       000007  ASSIGNED
1       000008  FREE
1       000009  FREE
1       000010  FREE
1       000011  ASSIGNED
1       000012  ASSIGNED
1       000013  ASSIGNED
1       000014  FREE
1       000015  ASSIGNED

and I need to find "n" consecutive numbers, so for n = 3, query would return

1       000008  FREE
1       000009  FREE
1       000010  FREE

It should return only first possible group of each id_set (in fact, it would be executed only for id_set per query)

I was checking WINDOW functions, tried some queries like COUNT(id_number) OVER (PARTITION BY id_set ROWS UNBOUNDED PRECEDING), but that's all I got 🙂 I couldn't think of logic, how to do that in Postgres.

I was thinking about creating virtual column using WINDOW functions counting preceding rows for every number where status = 'FREE', then select first number, where count is equal to my "n" number.

Or maybe group numbers by status, but only from one ASSIGNED to another ASSIGNED and select only groups containing at least "n" numbers

EDIT

I found this query (and changed it a little bit)

WITH q AS
(
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY id_set, status ORDER BY number) AS rnd,
         ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS rn
  FROM numbers
)
SELECT id_set,
       MIN(number) AS first_number,
       MAX(number) AS last_number,
       status,
       COUNT(number) AS numbers_count
FROM q
GROUP BY id_set,
         rnd - rn,
         status
ORDER BY
     first_number

which produces groups of FREE/ASSIGNED numbers, but I would like to have all numbers from only first group which meets the condition

SQL Fiddle

Best Answer

This is a problem. Assuming there are no gaps or duplicates in the same id_set set:

WITH partitioned AS (
  SELECT
    *,
    number - ROW_NUMBER() OVER (PARTITION BY id_set) AS grp
  FROM atable
  WHERE status = 'FREE'
),
counted AS (
  SELECT
    *,
    COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
  FROM partitioned
)
SELECT
  id_set,
  number
FROM counted
WHERE cnt >= 3
;

Here's a SQL Fiddle demo* link for this query: http://sqlfiddle.com/#!1/a2633/1.

UPDATE

To return only one set, you could add in one more round of ranking:

WITH partitioned AS (
  SELECT
    *,
    number - ROW_NUMBER() OVER (PARTITION BY id_set) AS grp
  FROM atable
  WHERE status = 'FREE'
),
counted AS (
  SELECT
    *,
    COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
  FROM partitioned
),
ranked AS (
  SELECT
    *,
    RANK() OVER (ORDER BY id_set, grp) AS rnk
  FROM counted
  WHERE cnt >= 3
)
SELECT
  id_set,
  number
FROM ranked
WHERE rnk = 1
;

Here's a demo for this one too: http://sqlfiddle.com/#!1/a2633/2.

If you ever need to make it one set per id_set, change the RANK() call like this:

RANK() OVER (PARTITION BY id_set ORDER BY grp) AS rnk

Additionally, you could make the query return the smallest matching set (i.e. first try to return the first set of exactly three consecutive numbers if it exists, otherwise four, five etc.), like this:

RANK() OVER (ORDER BY cnt, id_set, grp) AS rnk

or like this (one per id_set):

RANK() OVER (PARTITION BY id_set ORDER BY cnt, grp) AS rnk

* The SQL Fiddle demos linked in this answer use the 9.1.8 instance as the 9.2.1 one doesn't appear to be working at the moment.