Sql-server – Find first non-matching character between two strings

sql servert-sql

Posting this with solution provided.

The use case could be framed in a number of ways:

  • Return all characters that match at the start of two strings, until they do not match
  • How many characters match between two strings before one character doesn't match?
  • What is the first character that doesn't match between two strings?
  • etc.

For example:

If I have the strings 'Interesting' and 'Interested' they are similar for all the characters 'Interest' and then the first string ends in 'ing' while the second ends in 'ed'. Therefore, they have 8 characters in common, the first non-matching character is the 9th and the identical string is 'Interest'.

For the question in the title, the first non-matching character is number 9.

Best Answer

First you should have a Numbers table.

create table dbo.Number(N int not null);
go

alter table dbo.Number add constraint PK_Number_N primary key clustered (N);

go

;with C(N) as
(
  select 0
  union all
  select N+1
  from C
  where N < 11000
)
insert into dbo.Number(N)
select N
from C
option (maxrecursion 0);

Use the numbers table to split your strings into rows. Get the matching characters and position using intersect. Use row_number to get only the matching characters from the start and finally rebuild the string with a for xml trick or string_agg.

declare @Str1 varchar(20) = 'Interesting0a';
declare @Str2 varchar(20) = 'Interested00a';

with C1 as
(
  select N.N,
         substring(@Str1, N.N, 1) as C
  from dbo.Number as N
  where N.N between 1 and len(@Str1)
  intersect
  select N.N,
         substring(@Str2, N.N, 1)
  from dbo.Number as N
  where N.N between 1 and len(@Str2)
),
C2 as
(
  select C1.N,
         C1.C,
         row_number() over(order by C1.N) as rn
  from C1
)
select string_agg(C2.C, '') within group(order by C2.N)
from C2
where C2.N = C2.rn;

/*
select (
       select C2.C as '*'
       from C2
       where C2.N = C2.rn
       order by C2.N
       for xml path(''), type
       ).value('text()[1]', 'varchar(20)');
*/