SQL Server – Troubleshooting Unable to Complete Login Process

authenticationevent-notificationsql serversql-server-2008-r2

Issue
On one of our SQL Server witness servers (This is part of our Database Mirroring setup for SQL Server 2008 R2) we have just started getting the below event in the windows event log.

The description for Event ID 17052 from source MSSQLSERVER cannot be found.
Either the component that raises this event is not installed on your local
computer or the installation is corrupted. You can install or repair the
component on the local computer.

If the event originated on another computer, the display information had to
be saved with the event.

The following information was included with the event:

Severity: 16 Error:0, OS: 0 [Microsoft][SQL Server Native Client 10.0]Unable
to complete login process due to delay in opening server connection

the message resource is present but the message is not found in the
string/message table

These events are happening at varying times and so far we have had 22 errors starting on 27/08/2015. I have looked through our change log and I can't see any changes for that day that look relevant.

I have done some investigation and found the following article. This looks at the xml information in the below table; however none of the events in the result set correspond to the times of any of the failures.

SELECT CAST(record as xml) as record
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = 'RING_BUFFER_CONNECTIVITY'

To make matters worse the errors seem sporadic with no determinable pattern.

Event Log

Question
How can I find out what is causing the above events?


Version
Microsoft SQL Server 2008 R2 (SP1) – 10.50.2500.0 (X64)
Jun 17 2011 00:54:03
Copyright (c) Microsoft Corporation
Standard Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1) (Hypervisor)

Best Answer

Troubleshooting this wasn't the simplest of processes, as it turned out to be a third party application killing the user. However there were clear indications in the SQL log that there were problems.

There were around 20 memory errors a day in the SQL Server log along the lines of "A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 7254 seconds. Working set (KB): 12756, committed (KB): 61520, memory utilization: 20%."

Using a product called ExtraHop (You can use your favorite network packet capturing tool) we were able to trace all the traffic going into and out of the server during the time frame of the 17052 error. This showed us that our monitoring software was connecting to the server and that a simple SELECT 1 was taking two seconds to complete. A Change request was raised to increase the memory and then implemented and the errors went away.

I believe the monitoring software terminates its connection attempt to an SQL server; as the termination was coming from outside of SQL it would indeed be an unknown error.