SQL Server T-SQL – Find Partial Duplicates of a Field

sql serversql-server-2008t-sql

I have a table with a field for a member number. The table can contain records with duplicate values for this field and does.

I have a SQL query that will obtain any duplicate values for this field, however, I would now like to find records where there is another record with the same member number but only the first 8 digits out of the 9 that there are.

For example, I would like to return the below when the query is run;

MEMBNO
123456789
123456782

At the moment, my query only returns;

MEMBNO
123456789

where there are two or more records with this exact number.

Current query is;

SELECT
  basic.MEMBNO
FROM
  basic
GROUP BY
  basic.MEMBNO
HAVING
  COUNT(basic.MEMBNO) >1
ORDER BY
  basic.MEMBNO

A query I thought might work was;

SELECT
  basic.MEMBNO
FROM
  basic
GROUP BY
  basic.MEMBNO
HAVING
  COUNT(LEFT(basic.MEMBNO,8)) >1
ORDER BY
  basic.MEMBNO

I'm faily new to SQL and databases in general, so apologies if I haven't explained myself that well. Does it require the LIKE operator somewhere?

I'm using Microsoft SQL Server Report Builder 3.0.

Best Answer

You can identify the repeated 1->8 substrings using this:

SELECT LEFT(basic.MEMBNO,8)
FROM dbo.basic
GROUP BY LEFT(basic.MEMBNO,8)
HAVING COUNT(*) > 1;

So you can join with that to get the individual values:

;WITH x(m) AS 
(
  SELECT LEFT(basic.MEMBNO,8)
  FROM dbo.basic
  GROUP BY LEFT(basic.MEMBNO,8)
  HAVING COUNT(*) > 1
)
SELECT b.MEMBNO
  FROM dbo.basic AS b
  INNER JOIN x
  ON x.m = LEFT(b.MEMBNO, 8)
  ORDER BY b.MEMBNO;

You can also do it this way (I just wasn't sure when windowed COUNT was introduced):

;WITH x AS 
( 
  SELECT MEMBNO, c = COUNT(*) OVER (PARTITION BY LEFT(MEMBNO, 8))
  FROM dbo.basic
)
SELECT MEMBNO FROM x WHERE c > 1
ORDER BY MEMBNO;