Sql-server – Measuring latency of a long-distance SQL connection

performancequery-performancesql serversql-server-2012

We have written a simple "client" app for a customer that connects to their SQLServer database, runs about 360 queries against a "property", and returns that data as a .Net object we call a "cProp".

The queries vary in complexity, but the very longest take perhaps 400ms. When we recently ran it for 60 properties, the entire run took about 3 minutes. The customer has decided to move their database to a remote server with 65ms between us. As a result, the same query required 51 minutes.

Now I'm 99% sure the problem is row retrieval, but I need to prove that. The DBA ran a trace on the process, but having never used it before, I was dismayed to find out how little information it contained. What I did see is that it took just under 4 minutes of server time. This reinforces my belief that the problem is in the row retrieval.

But how do I measure that? The trace returns nothing about the number of rows or total data size – am I doing it wrong? I also know that batching happens outside the server, so how do we even find out what's going on there? Generally speaking, how does one debug issues that are taking place "outside the server" so to speak.

Best Answer

Just to clarify some math here:

360 queries per "property"
60 properties * 360 = 21,600 queries
3 minutes in duration is about 120 queries per second

So in your best case, you're averaging 120 queries per second throughput.

After moving the SQL Server to somewhere that has a ping latency of 65ms, you're averaging 7 queries per second.

Your best bet is to

measure wait statistics while this query is running
- If you don't have a monitoring tool, a simple way to do this is running sp_BlitzFirst, an open source tool developed by Brent Ozar Unlimited. It'll take a snapshot of your wait stats (among other things), wait 5 seconds, and then take another snapshot, and show you a summary of the biggest issues
- You can also see the wait stats since the server started using a query like the one in Paul Randal's blog post: Wait statistics, or please tell me where it hurts
Once you've got those results, examine the highest wait stats, and use the SQL Skills wait stats library to dig into what they mean

Based on your description, it's likely that you'll see ASYNC_NETWORK_IO waits. This can indicate slow network transfer. But usually it means that the client application is doing row-by-row processing, and SQL Server is waiting to send rows (when it really wants to fire them over the network as fast as possible).

On a related note - in general, it would be far more effective to try to pull larger "groups" of data - rather than running 21,000 queries, and thus making 21,000 round trips over the (potentially slow) network. This is called a "chatty" interface or API, and is generally considered to be an anti-pattern.

Consider that 65ms * 21,600 queries is 1,404,000 ms, AKA over 23 minutes of overhead. That's a big chunk of overheard that could be removed simply by limiting the number of roundtrips.

Why is `SELECT` slower after `DELETE` completed?

That is because DELETE has only ghosted the rows. The table is just as big as when it had 12M rows, even though it has none. To count the rows (0), it takes just as much time it took to count 12M rows. In time the ghost cleanup process will garbage collect these ghosted records and deallocate pages that contained only ghosts, and your SELECTs will speed up. But right now if you check Skipped Ghosted Records/sec in perfmon is probably skyrocketing during SELECT COUNT(*). You could also speed up things by rebuilding the table: ALTER TABLE ... REBUILD.

TRUNCATE would had also taken care of this problem, as it leaves no ghosts behind.

Sql-server – SQL Server 2012 – Frequent SELECT and UPDATE on bit column. Index doesn’t help

An index on the bit column isn't going to help at all because of selectivity. You should consider a filtered index:

CREATE INDEX unindexed ON dbo.Documents(documentId) WHERE IsIndexed = 0;

You should also include an ORDER BY documentId in your SELECT.

Though you should test this in a staging environment because it may help your SELECT but may offset that with a more expensive UPDATE.

Best Answer

Related Solutions

Sql-server – SQL Server “empty table” is slow after deleting all (12 million) records

Why is SELECT slower after DELETE completed?

Sql-server – SQL Server 2012 – Frequent SELECT and UPDATE on bit column. Index doesn’t help

Related Question

Why is `SELECT` slower after `DELETE` completed?