Sql-server – Job fails intermittently overnight because of a Windows Authentication failure when logging on to SQL Server. How to troubleshoot?

sql-server-2005

The failing job is a Litespeed transaction log backup, but full and differential backups occasionally fail the same way.

The same setup works on other servers without issue. The error occurs only a few times overnight, but it does happen at some point every night:

Error in SQL Server Agent Log:

"Logging on to SQL Server 'SQLFOO' with Windows Authentication TWideSafeCallErrorException – Timeout expired C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn\slssqlmaint.exe"

Error in Litespeed log:

"Msg 49999, Level 19, State 1, Line 0 Failed to login to SQL Server".

Is there anything I can look for on the server that would give me a clue as to why there is an intermittent login error?

EDIT: More information: a scheduled query was discovered that runs for several hours across this time period, and all of the failures happen while it is running. The transaction log backups that do complete during this window take 10 minutes, compared to the usual time of less than a minute.

EDIT2: There are no transaction log backup failures when the scheduled job isn't running. One update script, while it is running, still causes the transaction log backup to fail with the logon authentication error. It's an update script that hits a ton of records, so I guess it's time to put on my junior_dev hat.

Best Answer

I've seen this happen both with network issues reaching the domain controller (DC) and with a system that had too many connections and not enough memory.
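
To check the connections-and-memory theory while the long-running job is active, something like the following could be run on the affected server. This is only a rough sketch, not a definitive diagnostic: it assumes SQL Server 2005 or later and a login with VIEW SERVER STATE permission, and the choice of counters and columns is my own. It cannot show domain controller problems, which would need to be checked in the Windows event logs instead.

```sql
-- Sketch only: assumes SQL Server 2005+ and VIEW SERVER STATE permission.
-- Run it while the long-running scheduled job is active.

-- 1. Current connection count and memory indicators.
--    LIKE is used on object_name because the prefix differs
--    between default and named instances.
SELECT RTRIM(object_name)  AS counter_object,
       RTRIM(counter_name) AS counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE (object_name LIKE '%General Statistics%'
       AND counter_name = 'User Connections')
   OR (object_name LIKE '%Memory Manager%'
       AND counter_name IN ('Total Server Memory (KB)',
                            'Target Server Memory (KB)',
                            'Memory Grants Pending'));

-- 2. Requests that are executing right now, longest-running first,
--    with their current wait and any session that is blocking them.
SELECT r.session_id,
       s.login_name,
       s.program_name,
       r.status,
       r.command,
       r.wait_type,
       r.wait_time,            -- milliseconds spent in the current wait
       r.blocking_session_id,  -- 0 when the request is not blocked
       r.total_elapsed_time,   -- milliseconds since the request started
       t.text AS sql_text
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
  ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE s.is_user_process = 1
ORDER BY r.total_elapsed_time DESC;
```

If Total Server Memory sits at or near Target Server Memory and Memory Grants Pending is above zero during the failure window, that would be consistent with memory pressure. A large User Connections value, or the big update script showing up at the top of the second result set, would point the same way as the observations in the question's edits.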