Sql-server – Find and return items based on matching strings across 17 tables

datatypespattern matchingsql server

I must start by saying that I am at best a sql novice. I am doing some data analysis and consolidation for my company and need to find if items match across any of the tables. The info in the tables came from 3 different databases and a bunch of spreadsheets that had been used.

The information I'm looking to match up is circuit numbers. Some of the tables have the same design, but the information is not always entered in the same fields or even the same format. The tables do not have the same number of columns and sometimes the string that would match across them is a sub string in one of the other tables. The id's can be both numeric and alphanumeric. One other issue is that no one single table has enough good/correctly entered data to use as a primary table to compare the rest to.

What I am looking to achieve is to somehow search all of the tables and if there is a string that matches across any number of them, I would like to identify where it matched and create a table that contains the info from all columns in the row(s) that contained the match.

Below is a list of the tables, the number columns each has and the fields in that table that could potentially contain a circuit ID. I know this is confusing and I hope I was able to explain it and even more hopeful that someone may be able to help since doing a manual stare and compare will take months. I have tried everything that my limited knowledge allows from joins to unions to putting it in excel and using tools like vlookup, but the state of the data is creating too many stumbling blocks for me. It seems that I may need to use some sort of stored procedure to check through the tables and return the results, but this is something that is quite a bit over my head.

Table 1 - 152 columns - CktNum(nvarchar(8)), IXC(nvarchar(25)), BOC(nvarchar(25)), AML(nvarchar(10)), LLOCKTID(nvarchar(25)), assignnum(nvarchar(50))

Table 2 - 17 columns - IXC CKT ID(nvarchar(255)), BOC CKT ID(nvarchar(255)), Local CKTID(nvarchar(255))

Table 3 - 10 columns - Name(nvarchar(255)), Name1(nvarchar(255)), Name2(nvarchar(255))

Table 4 - 21 columns - Local Circuit ID(nvarchar(255)), IXC Circuit ID(nvarchar(255)), BOC Circuit ID(nvarchar(255))

Table 5 - 22 columns - Local Circuit ID(nvarchar(255)), IXC Circuit ID(nvarchar(255)), BOC Circuit ID(nvarchar(255))

Table 6 - 20 columns - Local Circuit ID(nvarchar(255)), IXC Circuit ID(nvarchar(255)), BOC Circuit ID(nvarchar(255))

Table 7 - 20 columns - Local Circuit ID(nvarchar(255)), IXC Circuit ID(nvarchar(255)), BOC Circuit ID(nvarchar(255))

Table 8 - 20 columns - Local Circuit ID(nvarchar(255)), IXC Circuit ID(nvarchar(255)), BOC Circuit ID(nvarchar(255))

Table 9 - 5 columns - Local ID(nvarchar(255)), Circuit ID(nvarchar(255)), ECCKT ID(nvarchar(255))

Table 10 - 5 columns - Local ID(nvarchar(255)), Circuit ID(nvarchar(255)), ECCKT ID(nvarchar(255))

Table 11 - 7 columns - Local ID(nvarchar(255)), Circuit ID(nvarchar(255)), ECCKT ID(nvarchar(255))

Table 12 - 5 columns - Local ID(nvarchar(255)), Circuit ID(nvarchar(255)), ECCKT ID(nvarchar(255))

Table 13 - 18 columns - Circuit ID(nvarchar(255)), Customer(nvarchar(255))

Table 14 - 17 columns - Circuit ID(nvarchar(255)), Customer(nvarchar(255))

Table 15 - 40 columns - ECCKT(nvarchar(255)), LO CKT ID(float), CKR(nvarchar(255))

Table 16 - 9 columns - Circuit ID(nvarchar(255)), Customer(nvarchar(255))

Table 17 - 17 columns - Circuit ID(nvarchar(255)), Customer(nvarchar(255)), Contact(nvarchar(255))

Best Answer

Is the application going to search for these one by one? Is the performance of this query important? Easy (ish) solution would be to create a stored procedure that accepted some arguments and then returned the results from searching each table.

Results from the stored procedure could include the table it found a match on, the primary key and the field that matched. Then the caller of that stored procedure could do more work based on that result set.

EDIT - Below is a very basic sample procedure for you.

CREATE PROCEDURE dbo.sp_FindCircuit
    (
    @Search NVARCHAR(100)
    )
AS
BEGIN

--Hold Results.
DECLARE @Results TABLE
    (
    TableName VARCHAR(100) NULL
    , FieldName VARCHAR(100) NULL
    , KeyValue INT NULL
    )

INSERT INTO @Results
        (TableName, FieldName, KeyValue)
SELECT 'Table 5'
    , 'Circuit ID'
    , <PrimaryKey>
FROM [Table 5]
WHERE CircuitID LIKE ('%' + @Search + '%')

--Repeat for each table.


--Return data.
SELECT TableName, FieldName, KeyValue
FROM @Results 

END