Sql-server – Recursive CTE with partition

recursive, sql-server

I have a table like this in MS SQL Server 2014:

ID|Race|Lap  
1 |21  |11  
2 |21  |NULL
3 |21  |NULL  
4 |21  |NULL  
5 |29  |65  
6 |29  |NULL  
7 |29  |NULL  
8 |29  |NULL

I am trying to fill in the Lap column: within each race, every row after the first should get the previous row's Lap plus 1, starting from the first (non-NULL) value. The partition is based on the Race column. The end result would look like this:

ID|Race|Lap  
1 |21  |11  
2 |21  |12
3 |21  |13  
4 |21  |14  
5 |29  |65  
6 |29  |66  
7 |29  |67  
8 |29  |68

There might be other ways of doing this, but I would rather stick with a recursive CTE. Is there any way to do this?

Best Answer

This would produce the expected result:

create table #demo (id int, race int, lap int);
insert into #demo values (1,21,11),(2,21,null),(3,21,null),(4,21,null),(5,29,65),(6,29,null),(7,29,null),(8,29,null);

-- CTE: number the rows with a missing lap within each race, in id order
with CTE as
(select id, race, ROW_NUMBER() over (partition by race order by id) as extra_lap from #demo where lap is null),
-- CTE2: the row per race that already has a lap (the starting value)
CTE2 as 
(select id, race, lap from #demo where lap is not null)
select id, race, lap from CTE2
union all
select CTE.id, CTE.race, CTE2.lap + CTE.extra_lap as lap from CTE join CTE2 on CTE.race = CTE2.race
order by id;

drop table #demo;
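
The question asked specifically for a recursive CTE, which the query above does not use. A minimal sketch of one follows, assuming the row with the lowest id in each race is the one that carries the starting lap. The CTE names numbered and filled are made up for illustration and are not from the original answer.

```sql
create table #demo (id int, race int, lap int);
insert into #demo values (1,21,11),(2,21,null),(3,21,null),(4,21,null),
                         (5,29,65),(6,29,null),(7,29,null),(8,29,null);

with numbered as (
    -- position of each row within its race, in id order
    select id, race, lap,
           ROW_NUMBER() over (partition by race order by id) as rn
    from #demo
),
filled as (
    -- anchor member: the first row of each race holds the known lap
    select id, race, lap, rn
    from numbered
    where rn = 1

    union all

    -- recursive member: the next row in the same race gets the previous lap + 1
    select n.id, n.race, f.lap + 1, n.rn
    from filled f
    join numbered n on n.race = f.race and n.rn = f.rn + 1
)
select id, race, lap
from filled
order by id
option (maxrecursion 0);  -- lift the default limit of 100 recursion levels

drop table #demo;
```

The recursion depth equals the number of rows in the largest race minus one, so the MAXRECURSION hint only matters once a race has more than about 100 rows.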

Related Solutions

Postgresql – Recursive CTE to find unique slug

Efficient rCTE

WITH RECURSIVE
  input AS (SELECT 'news-on-apple'::text AS slug)  -- input basic slug here once
, cte   AS (
   SELECT slug || '-' AS slug  -- append '-' once, if basic slug exists
        , 1 as suffix          -- start with suffix 1
   FROM   article
   JOIN   input USING (slug)

   UNION ALL
   SELECT c.slug, c.suffix + 1  -- increment by 1 ...
   FROM   cte     c
   JOIN   article a ON a.slug = c.slug || c.suffix  -- ... if slug-n already exists
   )
(
SELECT slug || suffix AS slug
FROM   cte
ORDER  BY suffix DESC  -- pick the last (free) one
LIMIT  1
)  -- parentheses required
UNION  ALL  -- if the basic slug wasn't taken, fall back to that
SELECT slug FROM input
LIMIT  1;

Better performance without rCTE

If you worry about thousands of articles competing for the same slug, or generally want to optimize performance, I'd consider a different, faster approach.

WITH input AS (SELECT 'news-on-apple'::text  AS slug
                    , 'news-on-apple-'::text AS slug1)  -- input basic slug here
SELECT i.slug
FROM   input        i
LEFT   JOIN article a USING (slug)
WHERE  a.slug IS NULL  -- doesn't exist yet.

UNION ALL
(  -- parentheses required
SELECT i.slug1 || COALESCE(right(a.slug, length(i.slug1) * -1)::int + 1, 1)
FROM   input        i
LEFT   JOIN article a ON a.slug LIKE (i.slug1 || '%')  -- match up to last "-"
                     AND right(a.slug, length(i.slug1) * -1) ~ '^\d+$' -- suffix numbers only
ORDER  BY right(a.slug, length(i.slug1) * -1)::int DESC
)
LIMIT  1;

If the basic slug isn't taken yet, the more expensive second SELECT is never executed. That is the same as above, but much more important here. Check with EXPLAIN ANALYZE; Postgres is smart that way with LIMIT queries. Related:
- Optimize a query on two big tables

Check for the leading string and the suffix separately, so the LIKE expression can use a basic btree index with text_pattern_ops, like:
```
CREATE INDEX article_slug_idx ON article (slug text_pattern_ops);
```
Detailed explanation:
- Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL

Convert the suffix to integer before you sort it or apply max(). Numbers in text representation are compared character by character ('9' sorts after '10'), so they don't work.
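
A small illustration of why the cast matters, using an inline VALUES list instead of the article table (the column name suffix and the alias t are only for this sketch):

```sql
SELECT max(suffix)      AS text_max  -- '9': text is compared character by character
     , max(suffix::int) AS int_max   -- 10: numeric comparison after the cast
FROM  (VALUES ('9'), ('10')) t(suffix);
```

The same applies to ORDER BY, which is why the query above casts the suffix to int before sorting.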

Optimize performance

For optimum performance, consider storing the suffix separately from the basic slug and concatenating the slug as needed: concat_ws('-', slug, suffix::text) AS slug

CREATE TABLE article (
   article_id serial PRIMARY KEY
 , title text NOT NULL
 , slug  text NOT NULL
 , suffix int
);

The query for a new slug then becomes:

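SELECT slug
    || COALESCE((
          SELECT '-'::text || (COALESCE(max(suffix), 0) + 1)::text  -- a NULL suffix (bare slug) counts as 0
          FROM   article a
          WHERE  a.slug = i.slug
          HAVING count(*) > 0), '') AS slug  -- no matching row: keep the bare slug
FROM  (SELECT 'news-on-apple'::text AS slug) i;  -- input basic slug here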

Ideally supported with a unique index on (slug, suffix).
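
A minimal sketch of such an index, assuming the article table defined above; the index names are made up for illustration. A plain unique index treats NULL values as distinct, so this sketch adds a partial unique index to make sure the bare slug (suffix IS NULL) can also exist only once.

```sql
-- one row per (slug, suffix) pair; also serves the max(suffix) lookup per slug
CREATE UNIQUE INDEX article_slug_suffix_uni ON article (slug, suffix);

-- assumption: the bare slug is stored with suffix NULL and must be unique too
CREATE UNIQUE INDEX article_slug_bare_uni ON article (slug) WHERE suffix IS NULL;
```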

Query for list of slugs

In any version of Postgres you can provide rows in a VALUES expression.

SELECT *
FROM   article
JOIN  (
   VALUES
     ('slug-foo'::text, 1)
   , ('slug-bar', 7)
   ) u(slug,suffix) USING (slug,suffix);

You can also use IN with a set of row-type expressions, which is shorter:

SELECT *
FROM   article
WHERE (slug,suffix) IN (('slug-foo', 1), ('slug-bar',7));

Details under this related question (as commented below):
- the <set clause>'s <multiple column assignment>

For long lists, the JOIN to a VALUES expression is typically faster.

In Postgres 9.4 or later you can also use the variant of unnest() that unnests multiple arrays in parallel.

Given an array of basic slugs and a corresponding array of suffixes (as per comment):

SELECT *
FROM   article
JOIN   unnest('{slug-foo,slug-bar}'::text[]
            , '{1,7}'::int[]) AS u(slug,suffix) USING (slug,suffix);
