Currently, I'm comparing two data sets that contain unique `StoreKey/ProductKey` combinations.

The 1st data set has the unique `StoreKey/ProductKey` combinations for sales from the beginning of January 2012 through the end of May 2014 (result = 450K lines). The 2nd data set has the unique `StoreKey/ProductKey` combinations for sales from the beginning of June 2014 until today (result = 190K lines).

I'm looking to find the `StoreKey/ProductKey` combinations that are in the 2nd set but not in the 1st set, i.e. new products sold from the beginning of June.

Up until now, I've dumped the two data sets into temp tables, created indexes on both keys for both tables, and used the `EXCEPT` operator to find the unique items.

Is there a more efficient way of doing this type of large comparison?
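For reference, the current approach looks roughly like this. This is a sketch, assuming a hypothetical source table `dbo.Sales` with a `SalesDate` column; the table and column names are illustrative, not from the original post:

```sql
-- Dump each period's unique key combinations into a temp table
SELECT DISTINCT StoreKey, ProductKey
INTO #OldSales
FROM dbo.Sales
WHERE SalesDate >= '20120101' AND SalesDate < '20140601';

SELECT DISTINCT StoreKey, ProductKey
INTO #NewSales
FROM dbo.Sales
WHERE SalesDate >= '20140601';

-- Index both temp tables on both keys
CREATE CLUSTERED INDEX IX_Old ON #OldSales (StoreKey, ProductKey);
CREATE CLUSTERED INDEX IX_New ON #NewSales (StoreKey, ProductKey);

-- Combinations in the new period that never appeared in the old period
SELECT StoreKey, ProductKey FROM #NewSales
EXCEPT
SELECT StoreKey, ProductKey FROM #OldSales;
```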

## Best Answer

Using `EXCEPT` is, in my opinion, the way to go here, but you might want to reconsider the use of the temp tables. By materializing them you are effectively duplicating your data, which will slow you down. If the indexes you need already exist on the source tables (as I suspect), just compare the appropriate `SELECT`s:
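A minimal sketch of that query, again assuming a hypothetical `dbo.Sales` table with a `SalesDate` column (adapt the names to your schema):

```sql
-- Query the source table directly; no temp tables, no duplicated data.
-- EXCEPT implies DISTINCT on both sides, so duplicates are removed for free.
SELECT StoreKey, ProductKey
FROM dbo.Sales
WHERE SalesDate >= '20140601'          -- 2nd set: June 2014 until today
EXCEPT
SELECT StoreKey, ProductKey
FROM dbo.Sales
WHERE SalesDate >= '20120101'
  AND SalesDate <  '20140601';         -- 1st set: Jan 2012 through May 2014
```

With a covering index on `(SalesDate, StoreKey, ProductKey)` (or `(StoreKey, ProductKey)` with `SalesDate` included), the optimizer can satisfy both sides from index seeks without touching the base table.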