I have a problem with Azure SQL cross-database queries, it looks like if we join data across databases the join is not distributed as input to the external database.
Currently I tried to implement a simple example and therefore I used this starting point:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-elastic-query-getting-started-vertical.
As preperation I created 199999 CustomerInformation entries and 200 OrderInformation entries.
Following query works as expected the CustomerInformation Table returns just one row:
SELECT o.OrderId, c.CustomerID, c.CustomerName
FROM OrderInformation o
JOIN CustomerInformation c ON o.CustomerID = c.CustomerID
WHERE c.CustomerID = 2
But any query with a join that does not specifiy the CustomerID returns all the rows from the Customers Database.
SELECT o.OrderId, c.CustomerID, c.CustomerName
FROM OrderInformation o
JOIN CustomerInformation c ON o.CustomerID = c.CustomerID
WHERE o.OrderId = 1870
Therefore the queries are really slow because the servers transfer to many rows. Do I miss some setting or is it not possible to separate this data with the suggested queries.
Thanks for your help.
Best Answer
Recreating the issue
Test data below
All records are returned by the remote query operator, to filter down to one when the inner join is applied.
Some more background on pushing down the predicate to a remote database:
Source
Above query shows an example of using parameters to define a range from where to read the data. The difference with your problem is that the filtering is happening on the local table.
If we know that this predicate can be pushed down to the remote table
You could look for a range of customerid's and filter on these
MAX()
andMIN()
functions ono.Customerid
are used to get the upper and lower boundaries forWHERE o.OrderId = 155
The complete result
Pushing down the predicates:
Execution plan of this query
It depends on your data & parameters passed how much this could filter out on the remote table. YMMV
Test data
Customerinformation (remote db)
On the database that will use the external table