I don't think this is empirically deterministic. My understanding is that an entire table could be brought over and processed on the calling server instead of being processed remotely on the linked server. If both servers are SQL Server 2008+ you should see scenarios like this:
SELECT x.foo FROM linked.db.dbo.x JOIN linked.db.dbo.y ON ...
And in these cases you will see that the join, filters etc. are processed on the remote server. But this should not be construed as a blanket statement! You may see slightly different remoting decisions in the plan for a slightly different scenario:
SELECT x.foo FROM localdb.dbo.x JOIN linked.db.dbo.y ON ...
These are for SQL Server 2008+ -> SQL Server 2008+ linked servers. Other linked servers (lower versions of SQL Server, other platforms altogether) may behave differently, and it may be version-dependent, it could even be specific driver-version dependent, or query-dependent (e.g. is a function used). And it of course could depend on the permissions of the linked server account - if you can't see remote statistics / cardinality for the distributed portion of the query, the local optimizer is going to be pretty blind and may not make the decision you want.
If you want to guarantee that processing occurs on the remote server before it starts streaming data across, and you have actually tried to get it work with four-part names and the plan does not try to process remotely, you can use OPENQUERY
instead of explicitly referencing four-part names. Never tried that in a view, though. :-)
If you come across a case where you expect processing to occur remotely and it's not, I'm more than happy to look at the actual plan and the stats profile to help understand why (or at the very least to pass it on to some of my connections who can help).
From my own personal experience, I've found much, much, much more reliable and predictable performance by doing whatever I can to make as many queries as possible entirely local. Linked servers are useful but they certainly don't lend themselves to real-time performance. While you may think it violates DRY in some sense, I would rather pay that cost once during a mirroring session or replication and have redundant data, than potentially pay for a sub-optimal remoting decision for every single cross-server query. And if the data can be slightly delayed there are even more options for moving data across to make it available locally (log shipping, SSIS, custom).
Just use the syntax that allows you to define an alias:
DELETE t1 FROM Server.Database.dbo.Table1 AS t1
WHERE NOT EXISTS
(
SELECT 1 FROM Server.Database.dbo.Table2 AS t2
WHERE t2.Code = t1.Code
);
Or you can use dynamic SQL to simplify:
DECLARE @sql NVARCHAR(MAX) = N'DELETE t1 FROM dbo.Table1 AS t1
WHERE NOT EXISTS
(
SELECT 1 FROM dbo.Table2 AS t2
WHERE t2.Code = t1.Code
);';
EXEC Server.Database..sp_executesql @sql;
This forces the query to be run on the other side, which can provide other benefits too...
Best Answer
You cant have 5 identifiers, this should do the trick
https://technet.microsoft.com/en-us/library/ms190406(v=sql.105).aspx