If the sequence is:
- DBCC CHECKDB reported errors.
- REPAIR_REBUILD fixed the database; no errors reported.
- DBCC CHECKDB is reporting errors again.
You probably have failed or failing disks in an array, or some other component of the I/O subsystem is broken or breaking. Personally, I'd want off the problem server ASAP!
- Take a tail-log backup.
- Restore last known good full backup and diff/log sequence to a different server.
- Switch service to the second server.
- Get a fresh cuppa and start diagnosis on the problem server.
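The failover steps above might look something like this in T-SQL. The database name and backup paths are placeholders; adapt them to your environment:

```sql
-- On the problem server: capture the tail of the log before moving off.
-- NO_TRUNCATE lets this work even if the database is damaged.
BACKUP LOG [ProblemDB]
    TO DISK = N'\\SafeShare\Backups\ProblemDB_tail.trn'
    WITH NO_TRUNCATE, NORECOVERY;

-- On the standby server: restore the last known good chain.
RESTORE DATABASE [ProblemDB]
    FROM DISK = N'\\SafeShare\Backups\ProblemDB_full.bak'
    WITH NORECOVERY;

RESTORE DATABASE [ProblemDB]
    FROM DISK = N'\\SafeShare\Backups\ProblemDB_diff.bak'
    WITH NORECOVERY;

-- Finish with the tail-log backup and bring the database online.
RESTORE LOG [ProblemDB]
    FROM DISK = N'\\SafeShare\Backups\ProblemDB_tail.trn'
    WITH RECOVERY;
```

Any intermediate log backups between the diff and the tail would be restored WITH NORECOVERY before the final RESTORE LOG ... WITH RECOVERY.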
Addendum: root cause of the problem:
As it turned out, we are using Diskeeper and (for corporate reasons) are still on an old version: 14.0.896. With this version we ran into messages like:
The operating system returned error 1784(failed to retrieve text for this error. Reason: 15100) to SQL Server during a write at offset 0x000000010a4000 in file 'D:\Application-Data\MSSQL\Data\<Database>.mdf:MSSQL_DBCC17'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Msg 3313, Level 21, State 2, Server <Server>, Line 1
During redoing of a logged operation in database '<Database>', an error occurred at log record ID (2349664:1503:4). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.
Msg 1823, Level 16, State 1, Server <Server>, Line 1
A database snapshot cannot be created because it failed to start.
Msg 1823, Level 16, State 2, Server <Server>, Line 1
A database snapshot cannot be created because it failed to start.
Msg 7928, Level 16, State 1, Server <Server>, Line 1
The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.
Msg 5030, Level 16, State 12, Server <Server>, Line 1
The database could not be exclusively locked to perform the operation.
Msg 7926, Level 16, State 1, Server <Server>, Line 1
Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details
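When the internal database snapshot cannot be created (Msg 7928, 5030, and 7926 above), CHECKDB can be told to skip the snapshot and take locks instead. A minimal sketch, with the database name as a placeholder; note this blocks concurrent activity, so it belongs in a maintenance window:

```sql
-- Force CHECKDB to use table locks rather than an internal snapshot.
-- Avoids the sparse-file snapshot entirely, at the cost of blocking.
DBCC CHECKDB (N'ProblemDB') WITH TABLOCK, NO_INFOMSGS;
```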
Found an explanation on Paul Randal's blog here
Link to the Diskeeper fix found here
Interesting Diskeeper white paper on database defragmentation here
DBCC CHECKDB makes extensive use of tempdb. It relies on a mechanism similar to snapshot isolation to make sure it always sees a transactionally consistent image of the database, independent of what else might be going on.
The error you receive is most likely because there is not enough free space on the drive that tempdb is on. However, the error also says to check any prior errors. Were there any?
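You can get a rough idea of whether tempdb space is the culprit before running the real checks. A sketch, with the database name as a placeholder:

```sql
-- Estimate the tempdb space CHECKDB would need, without running the checks.
DBCC CHECKDB (N'ProblemDB') WITH ESTIMATEONLY;

-- Compare against free space on the drive hosting tempdb
-- (xp_fixeddrives is undocumented but widely used; lists free MB per drive).
EXEC master.dbo.xp_fixeddrives;
```

If the estimate exceeds the free space on tempdb's drive, that explains the failure.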
Best Answer
First, find the person who put that job in place and have him/her tied to a chair listening to cats scratching a chalkboard while you go for lunch.
When done with that, compare your resource utilization and performance data for extended periods before and after the job runs. I suspect you'll see perf dips and a number of resources getting really busy after the job executes. You can temporarily suspend the job and monitor closely for a week to see what happens. I doubt you'll run into problems but as with all changes to a prod system, still keep a close eye for a while and make sure you capture perf, resource usage and waits data.
These sprocs are typically run in a test environment, especially for performance testing, to see how specific queries or the overall workload behave on a cold-start equivalent. They can also be used to see how specific parts of the system, especially storage, handle unexpected load spikes.
We once encountered a support engineer whose standard perf-tuning method involved doing these things on a production system; we couldn't get rid of him fast enough. Running all of them regularly on a production system is pretty close to rebooting your server daily to keep it healthy, and doing it in the middle of a busy day for best results. Really, really don't.
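For reference, and purely as an assumption about what such a "cold start" job contains (the thread doesn't list the exact commands), it is typically built from cache-flushing DBCC statements like these. Shown only to illustrate why running them on production hurts, not as a recommendation:

```sql
-- Typical cache-flushing commands such a job might run.
-- On production these discard compiled plans and the buffer pool,
-- forcing recompiles and physical reads until the caches warm back up.
DBCC FREEPROCCACHE;      -- clear the plan cache
DBCC DROPCLEANBUFFERS;   -- drop clean pages from the buffer pool
```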