Sql-server – the best way to compare values of whole tables between many databases

sql serversql-server-2012

We are trying to compare the values of whole tables between many databases.
The user can enter in input the table names and the columns names, and the database names he wants to compare.

They can enter as many databases, tables and columns they wish. We want to compare rows per rows, only for the specified columns.

ex:
DatabaseA, DatabaseB, DatabaseC
Table1, Col1|Col2|Col3|Col4
Table2, Col1|Col4|Col5|Col6|Col20
...

So for an example, if I have:

DatabaseA
Table1, Col1|Col2|Col3|Col4 = 'Apple', 1, 10, 'ABC'

DatabaseB
Table1, Col1|Col2|Col3|Col4 = 'Banana', 1, 10, 'ABC'

There's a difference.

At first I was considering usign the CHECKSUM on each rows if each tables (excluding columns that where not specified by the user) and compare the CHECKSUM but I've read that it's not always unique.

Now I was thinking of using HASHBYTES instead.
Either by doing this:

    SELECT HASHBYTES('sha2_512', CONVERT(NVARCHAR(MAX), ISNULL(col1,'')) +
           HASHBYTES('sha2_512', CONVERT(NVARCHAR(MAX), ISNULL(col2,'')) +
           HASHBYTES('sha2_512', CONVERT(NVARCHAR(MAX), ISNULL(col3,'')) +
           HASHBYTES('sha2_512', CONVERT(NVARCHAR(MAX), ISNULL(col4,'')) +
    FROM Table1

Or by doing this:

   With Vals AS 
   (
       SELECT CONVERT(NVARCHAR(MAX), ISNULL(col1, '')) + 
              CONVERT(NVARCHAR(MAX), ISNULL(col2, '')) + 
              CONVERT(NVARCHAR(MAX), ISNULL(col3, '')) + 
              CONVERT(NVARCHAR(MAX), ISNULL(col4, ''))  AS Val
       FROM Table1
   )
   SELECT HASHBYTES ('sha2_512', Val) FROM Vals

What do you think? How would you do it? What would be the best way?

Thanks

Best Answer

This may not be exactly what you are looking to do, but you could use the EXCEPT statement to get the rows that are different:

SELECT Column1, Column2 FROM DatabaseName1.dbo.TableName1
EXCEPT
SELECT Column1, Column2 FROM DatabaseName2.dbo.TableName2

If you do it this way though, make sure you also do the opposite EXCEPT so you know the rows that don't exist in the other database's table:

SELECT Column1, Column2 FROM DatabaseName2.dbo.TableName2
EXCEPT
SELECT Column1, Column2 FROM DatabaseName1.dbo.TableName1

Hopefully that helps you a little.