Sql-server – Sudden Query Performance Degradation On A Linked Server

linked-serverperformancequery-performancesql-server-2008sql-server-2008-r2

I have an application that uses a view on a SQL Server 2008 R2 instance. This view actually queries a linked server (SQL Server 2008) that has replicated data from our main ERP database. Yesterday, we began experiencing some timeout on the application that uses this view and we discovered that the view was taking over 1 minute to complete (before this it was a matter of seconds).

Here's what we know:

  • Running the query on the linked server directly returns results in a
    matter of milliseconds.
  • Creating a view on the linked server, and then having the view on the
    first server simply query the view on the linked server restores
    performance to its original state. (view source), (workaround)
  • When running the query on the first server, I notice that there are quite a few OLEDB waits (using sp_WhoIsActive). Because of this, I decided to check the linked server while the query was running (on the first server). I noticed that the primary wait was ASYNC_NETWORK_IO.

Because of this, I am inclined to believe that there is an issue with either the network, or the configuration on my linked server. I am not sure that there is anything wrong with it, but I am posting it too.
enter image description here

What other things should I be looking at to help me to get to the root cause of this?

EDIT

Query Plan: http://i.imgur.com/yn6ZC.png

IO Stats: http://i.imgur.com/JNX2K.png

Query Columns: http://i.imgur.com/Coxx5.png

Top Operations: http://i.imgur.com/IROrA.png

Plan Tree: http://i.imgur.com/Wco4K.png

Best Answer

If I'm reading the plan and operations correctly, it looks like those nested loops joins with the inner remote operations are the problem.

They were estimated to execute ~750 times apiece, but actually executed almost 6,000 times each. Normally that might not be such an issue, but if network latency is involved with every call, the query becomes extremely sensitive to network conditions. This would certainly explain the ASYNC_NETWORK_IO waits you're seeing.

In any event, I would stick with your workaround -- it's probably better to protect the tables behind a view anyway instead of trying to do all the joins remotely.