Sql-server – Any way of optimizing joins between 2 databases with different collations

collationsql servert-sql

I'm creating a query for a report where we must do a cross database join. Unfortunately the fields we have to join on have two separate collations – "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_CI_AS". The join is very slow between these and is really impacting the speed of the report – is there any ways of optimizing this? Is changing the collation to match wise and/or difficult?

The join portion of the query is below

INNER JOIN EDE.dbo.DeliveryDetail D
              ON D.DeliveryNoteNo COLLATE database_default = Consignments.CustomerReference COLLATE database_default

Best Answer

If you can add columns to tables, you can try persisted computed columns, and index those. For example:

ALTER TABLE dbo.DeliveryDetail ADD DeliveryNoteNo_Collated COLLATE whatever PERSISTED;

CREATE INDEX IX_DeliveryDateNo_Collated ON dbo.DeliveryDetail(DeliveryNoteNo_Collated);

You're basically pre-baking converted versions of your fields once on insert (and kept updated for you automatically), and indexing those.

There are two really big drawbacks. First, these are new columns in your table: they'll take up more space, the indexes will take up more space, and if your apps are doing inserts without explicitly listing their column names, their inserts will fail.

Second, if the app's connection options aren't set correctly, your delete/update/insert operations can actually fail. NUMERIC_ROUNDABORT has to be set OFF, and ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL, and QUOTED_IDENTIFIER must all be set to ON. You'll definitely want to test for this in development first, and for bonus points, check sys.dm_exec_sessions in production to check everybody's session settings.

For more info: